
UNIX

Table of Contents

Preface
1. Operating Systems Basics
1.1 Introduction
1.2 What is an Operating system?
1.3 Functions of an Operating System
1.4 Process Management
1.5 Memory Management
1.6 Device (I/O) Management
1.7 File System Management
1.8 Flavors
2. UNIX – Overview
2.1 History
2.2 Flavors
2.3 Logging In
2.4 File and Directory Structure
2.5 Process
3. Shell Basics
3.1 Introduction
    Program Initiation
    Input-output Redirection
    Pipeline Connection
    Substitution of Filenames
    Maintenance of Variables
    Environment Control
3.2 Flavors
3.3 Interaction with Shell
3.4 Features
3.5 Program execution
3.6 Background Jobs
3.7 Batch jobs
3.8 Input, output and pipes
4. The Command Line
4.1 Creating and Manipulating Files and Directories
4.2 Controlling Permissions to Files
4.3 grep and find
4.4 Extracting data
4.5 Redirection and Piping
4.6 Sorting and Comparing
5. The vi editor
5.1 Command mode
5.2 Ex mode
5.3 Edit mode
6. Shell Programming
6.1 Variables
6.2 Command-Line arguments
6.3 Decision-making constructs
6.4 Looping Constructs
6.5 Reading data
7. Basics of UNIX Administration
7.1 Login Process and Run Levels
7.2 Processes
7.3 Archiving and backup
7.4 Security
8. Communication
8.1 Basic UNIX mail
8.2 Communicating with other users
8.3 Accessing the Internet
8.4 e-mail on the Internet
9. Makefile concepts
9.1 Introduction
9.2 Using Variables in Makefile
9.3 Writing Makefiles
9.4 Sample Makefile

Preface

This courseware is meant for beginners of the UNIX Operating System. The information presented provides an overview of the UNIX operating system and sufficient material to gain a considerable working knowledge. The content is divided into nine sections to enable easy understanding.

The first section provides the basics of Operating Systems that would help beginners who come from a non-computer science background. The structure and the functions of any operating system, in general, are discussed here. The different modules of the operating system, namely, process management, memory management, file system management and Input-Output management, and the different flavors of Operating Systems are discussed in brief.

The second section gives an overview of the UNIX Operating System, briefly discussing the history, flavors and structure of UNIX. The logging-in process and other basic concepts underlying the UNIX Operating System are also discussed here.

The concepts and features relating to the UNIX shell are discussed in the third section. The variety of shells available, interaction with the shell, the features associated with the shell, executing ordinary programs, batch jobs, background jobs, and the concepts of input, output and pipes are covered in this section of the courseware.

The various operations carried out on the command line are discussed in the fourth section. Creating and manipulating files and directories, controlling permissions, extracting data, redirection, piping, sorting, comparing and so on form the essence of this section.

The famous vi editor is discussed along with the various modes of operation in the fifth section.

The Shell Programming concepts and constructs are discussed in the sixth section.

The seventh section covers the basics of UNIX administration and security aspects of UNIX
Operating System.

The eighth section covers connecting to other systems, basic concepts of mailing in UNIX,
communicating with other users, accessing and e-mailing on the Internet.

The last section discusses creating an application on the UNIX platform.

Chapter 1
1. Operating Systems Basics
1.1 Introduction

The structure of a computer system can be depicted as in the figure below. The figure shows how
the operating system fits in a computer system.

Application Programs
|
System Programs
|
Operating System
|
Assembly Language
|
Micro program
|
Physical devices

Figure 1 – Computer System

Physical devices consist of IC chips, wires, power supplies, CRT and so on.

The micro program is located in ROM. This software directly controls the physical devices and provides a cleaner interface to the next layer; it could also be implemented in hardware. The micro program interprets machine or assembly language instructions and carries them out on the devices.

Assembly language is specific to the processor and it typically contains 50 to 300 instructions for
doing simple operations like moving data around, doing arithmetic and comparisons.

The operating system hides all the complexity and provides a clean interface to the system and application programs that reside on it.

System programs are those that are loaded on top of the operating system according to the user's convenience, such as compilers, editors and command interpreters.

Application programs are those that are written by the users to solve their problems.

1.2 What is an Operating system?

The term Operating System denotes those system program modules within a computer system that govern the control of equipment resources such as processors, main storage, secondary storage, I/O devices, and files, and that provide a base for all application programs such as word processors, database management systems, and so on.

The Operating System can be viewed as an extended machine and also as a resource manager; the
former being the top-down view and the latter being the bottom-up view.

Operating system as an extended machine (Top-down view)

• Hides the truth about the hardware from the programmer
• Provides a nice, simple view of named files that can be read and written
• Abstraction

Hence, the top-down view looks upon the operating system as providing an easy interface to the
user hiding all the hardware complexities.

Operating system as a resource manager (Bottom-up view)

• Manages all the pieces of a complex system
• Provides an orderly and controlled allocation of processors, memory, and I/O devices among the various programs competing for them
• Keeps track of who is using which resource, grants resource requests, accounts for usage, and mediates conflicting requests from different programs and users

Hence, the bottom-up view looks upon the operating system as the system that governs process management, memory management, file system management and I/O management.

1.3 Functions of an Operating System

The functions of an Operating System can be listed as follows:

• Provides an easy interface by hiding the hardware complexities
• Process Management
• Memory Management
• File System Management
• I/O Management

These functions will be discussed in brief in the subsequent sections.

1.4 Process Management

A process is a program in execution. Processor management is concerned with the management of the physical processors, specifically, the assignment of processors to processes. As a process executes, it changes state. The state of a process is defined in part by the current activity of that process. The process state diagram is as follows:

Figure 2 – Process States. (A new process is admitted to the Ready state; the scheduler dispatches a Ready process to Running; an interrupt returns a Running process to Ready; an I/O or event wait moves a Running process to Blocked; I/O completion returns a Blocked process to Ready; and a Running process exits on termination.)

The various functions of process management are as follows:

• Keep track of the status of processes using process tables. The Operating System maintains a process table with one entry per process. When a switch takes place, the values in the process table are saved and the values corresponding to the next process are loaded.
• Scheduling of processes, i.e., deciding which process will use the processor. The job scheduler chooses from all the jobs submitted to the system and decides which one will be allowed to take the processor.
• Allocate resources to a process by setting up the necessary hardware registers; the module that does this is often called the dispatcher.
• Reclaim resources when a process relinquishes processor usage, terminates, or exceeds the allowed amount of usage.
• Inter-process communication using system calls.
• Process synchronization using monitors, event counters and semaphores.

1.5 Memory Management

Memory management of an Operating System is concerned with the management of primary memory. The primary memory, or core memory, is the part of memory that the processors directly access for instructions and data.

The various functions of memory management are as follows:

• Keep track of the status of each location of primary memory, whether allocated to any
process or free
• Determining allocation policy for memory
• Allocation technique – Once it is decided to allocate memory, the specific location must
be selected and allocation information updated
• De-allocation technique and policy – handling the de-allocation of memory
• Paging – Swapping pages from the secondary memory to primary memory when a page
fault occurs

1.6 Device (I/O) Management

Management of I/O devices includes efficient management of actual I/O devices such as printers,
card readers, tapes, disks, control units, control channels and so on.

There are three basic techniques for implementing the device management.

Dedicated – A technique whereby a device is assigned to a single process.
Shared – A technique whereby a device is shared by many processes.
Virtual – A technique whereby a physical device is simulated on another.

The various functions of device management are as follows:

• Keep track of the status of I/O devices using unit control blocks for devices
• I/O scheduling using scheduling algorithms
• Allocation – physically assigning a device to a process. The corresponding channels and
control units must be assigned.
• De-allocation – may be done on either a process level or a job level. On a process level, a device may be assigned for as long as the process needs it. On a job level, a device is assigned for as long as the job exists in the system.

The method of deciding how devices are allocated depends on the flexibility of the device. Some devices cannot be shared (e.g., card readers) and must therefore be dedicated to a process. Others may be shared (e.g., disks), and hence offer more flexibility. Still others may be made into virtual devices. For example, the operation of punching on a card punch could be transformed into a write onto a disk (i.e., a virtual card punch), and at some later time a routine would copy the information onto a card punch. Virtual card reader, card punch, and printer devices are implemented by spooling routines. The virtual devices approach allows:

• Dedicated devices to be shared, hence more flexibility in scheduling these devices
• More flexibility in job scheduling
• Better matching of the speed of the device and the speed of requests for that device

1.7 File System Management

Information Management is concerned with the storage and retrieval of information entrusted to
the system. The modules of information system are collectively referred to as the file system.

The various functions of file system management are as follows:

• Keep track of all information in the system through various tables, the major one being
the file directory – sometimes called the VTOC (Volume Table of Contents). These
tables contain the name, location, and access rights of all information within the system
• Deciding policy for determining where and how information is stored and who gets
access to the information. Factors influencing this policy are efficient utilization of
secondary storage, efficient access, flexibility to users, and protection of access rights to
the information requested.
• Allocate the information, e.g., open a file. Once the decision is made to let a process
have access to information, the allocation modules must find the desired information,
make the desired information accessible to the process, and establish the appropriate
access rights.
• De-allocate the information, e.g., close a file. Once the information is no longer needed,
temporary table entries and other such resources may be released. If the user has updated
the information, the original copy of that information may be updated for possible use by
other processes.

1.8 Flavors

The different flavors of Operating systems are as follows.

MS-DOS
UNIX
WINDOWS
MVS/ESA
ATLAS
XDS-940
THE
RC 400
CTSS
MULTICS
MACH

The student is advised to explore more information about the operating systems mentioned above.

Chapter 2
2. UNIX – Overview
2.1 History

An open discussion on the workings of an operating system is never complete without discussing
what the operating system is and the history behind it. The purpose of this module is to explain
how UNIX came to be what it is today.

UNIX is a multi-user, multitasking, multithreading computer operating system that enables different people to access a computer at the same time and to run more than one program simultaneously. Since its humble beginning nearly 40 years ago, it has been redefined and refined time and time again. Networking capabilities enhance the suitability of UNIX for the workplace, and support for DOS and Windows is coming in the 32-bit workstation markets. UNIX was designed to be simple and flexible. The key components include a hierarchical directory tree that divides files into directories, and real-time processing.

Exploring History

The UNIX Operating System came into being more or less by accident. In the late 1960s, an operating system called MULTICS was designed by the Massachusetts Institute of Technology to run on GE mainframe computers. Built on banks of processors, MULTICS enabled information sharing among users, although it required huge amounts of memory and ran slowly.

Ken Thompson, working for Bell Labs, wrote a crude computer game to run on the mainframe. Because the performance the mainframe gave was poor and the cost of running it was high, Ken, with the help of Dennis Ritchie, rewrote the game to run on a DEC computer and, in the process, wrote an entire operating system as well. Several accounts of the system's origin have circulated, but the common thread is that it is a derivative of MULTICS. In 1970, Thompson and Ritchie's operating system came to be called UNIX, and Bell Labs kicked in financial support to refine the product.

By 1972, around 10 computers were running UNIX, and in 1973, Thompson and Ritchie rewrote the kernel from Assembly language into C – the brainchild of Ritchie. Since then, UNIX and C have been intertwined, and the growth of UNIX is partially due to the ease of porting the C language to other platforms. AT&T, the parent company of Bell Labs, offered UNIX in source-code form to government institutions and to universities for a fraction of its worth. In 1979, it was ported to the popular VAX computers from Digital, further cementing its way into many universities.

Thompson, in 1975, moved to the University of California, where he recruited a graduate student named Bill Joy to help enhance the system. In 1977, Joy mailed out free copies of his system modifications. When AT&T began releasing UNIX as a commercial product, system numbers were used (System III, System V and so on). The refinements done at the University were released as the Berkeley Software Distribution, or BSD (2BSD, 3BSD, and so on). These included the vi editor and the C shell. AT&T versions accepted 14 characters for file names; Berkeley expanded this to 255. Towards the end of the 1970s, ARPA (the Advanced Research Projects Agency) of the DoD (Department of Defense) adopted the Berkeley version of UNIX.

Bill Joy, in the meantime, left the campus setting and became one of the founding members of Sun Microsystems. Sun workstations used a derivative of BSD known as the Sun Operating System, or SunOS.

As there was a lack of uniformity among UNIX versions, several steps were taken to correct the problem. In 1988, Sun Microsystems and AT&T joined forces to rewrite UNIX into System V release 4.0. Other companies, like IBM and Digital Equipment, countered by forming their own standards group to come up with a guideline for UNIX. Both groups incorporated BSD in their guidelines but still managed to come up with different versions of System V.

In 1992, a joint product, UNIXWARE, was announced by UNIX System Laboratories, a spin-off company of AT&T; UNIXWARE, which combines UNIX with features of NetWare, was then marketed. In the early 1990s, Berkeley announced that no more editions of BSD would be forthcoming.

As enhancements in PC operating systems became interesting, Microsoft created XENIX – a form of UNIX designed for the desktop PC – and later sold it to the Santa Cruz Operation (SCO). As the personal computer has matured, UNIX has come into favor as an operating system for it. SCO UNIX, as well as Sun's and Novell's entries, provide excellent operating systems for multi-user environments.

2.2 Flavors

Since it began to escape from AT&T's Bell Laboratories in the early 1970s, the success of the
UNIX operating system has led to many different versions. Universities, research institutes,
government bodies and computer companies all began using the powerful UNIX system to
develop many of the technologies which today are part of the IT environment. Computer Aided
Design, Manufacturing Control Systems, Laboratory Simulations and the Internet began life with
and because of UNIX systems.

Soon all the large vendors, and many smaller ones, were marketing their own versions of the
UNIX system optimized for their own computer architectures and boasting many different
strengths and features.

The chronological picture of UNIX is shown in the figure below.

Figure 3 – UNIX Chronology

The different flavors of UNIX are shown in the picture. Various giants like SCO, IBM, AT&T,
Siemens, Berkeley, SUN Microsystems, DEC, and HP redefined the preliminary versions adding
many features and thus leading to many flavors of UNIX. The different flavors include Multics,
System III, BSD, Sun OS, Solaris, Sinix, Ultrix, HP UX, AIX, XENIX, SCO-UNIX and so on.

2.3 Logging In

A UNIX session begins with the login prompt, which appears on the screen. The system administrator must assign a login name and password to you before you can gain admittance to the system. After you have attained both, enter the login name at the prompt, as in the following:

login : user1 <Enter>

Next, a password : prompt appears. Enter your password here.

After you correctly enter the login and password, the shell prompt appears. The shell prompt can be a dollar sign ($) or a percent sign (%). UNIX contains various shells, or command interpreters. The % indicates that the C shell is in use, and the $ is used by most of the other shells. A # sign, on the other hand, indicates that you have logged in with administrative permissions.

To end your session with the operating system, you need to leave the shell and return to the login.
This task is known as logging out or signing off. Several methods are available depending on
your software. The preferred method is the exit command. An alternative is pressing Ctrl+D,
which signifies the end of data input. Other commands that might exist on your system and
perform the same function are logout, logoff, or log.
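A complete session might therefore look like this (user1 is only an illustrative login name; the exact prompts and the available logout commands vary between systems):

login : user1 <Enter>
password :
$ date
$ exit
login :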

2.4 File and Directory Structure

The UNIX operating system uses an inverted tree structure, much the same as many other operating systems. In an inverted tree, one main directory branches into several subdirectories. Each subdirectory can then further branch into more subdirectories. This structure, although novel at the time UNIX first became available, is now familiar and commonplace among most operating systems. The purpose of each of the standard subdirectories, and the method by which UNIX maintains files and directories, however, is vitally important.

Root directory and its branches

The root directory is the beginning, or first layer, of the file system. Symbolized as a forward slash (/), the root directory is the point from which all other subdirectories branch. A root user, also called the super user, has the ability to change anything related to the file system without question. This user can bring the system up, shut it down, and do everything in between. By no small coincidence, the home directory of this user is the root directory – from here, all information filters down. The figure below illustrates this by presenting the classic diagram of the UNIX subdirectory file system.

Figure 4 – Inverted tree of UNIX – Classic Diagram. (The root directory (/) branches into bin, dev, etc, lib, lost+found, shlib, tcb, tmp, usr and var; usr in turn branches into subdirectories such as bin, include, lib, man and spool.)

Only one file – unix – should be within a normal system's root directory. This is the actual, bootable, operating system file. In its absence, the operating system cannot come up after a restart. This file is also known as the kernel. In addition to the unix file, there are a number of subdirectories.

Every system has unique ones created by an administrator for specific purposes, but the default
ones created when a new operating system is installed are as follows.

bin
dev
etc
lib
lost+found
shlib
tcb
tmp
usr
var
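You can see these directories on your own system with the ls command (the exact list varies from one installation and UNIX flavor to another):

$ ls /

bin dev etc lib lost+found shlib tcb tmp unix usr var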

2.5 Process

UNIX is a multi-user, multi-tasking operating system, which means that more than one user can use the system and each user can run more than one program at the same time. In UNIX, each task that the system is executing is a process. Normally, UNIX has to handle several processes at the same time; however, the CPU can handle only one job at a time, so UNIX uses the time-sharing concept to solve the problem. Time-sharing means that the kernel maintains a list of current tasks, or processes, and allocates a small time slice to each process. The kernel switches to the next process in the queue once the current process has completed or its allocated time has elapsed. The kernel also assigns a priority number to each of the processes; based on the priority, the processes are scheduled to run on the CPU. Typically, the kernel switches from process to process very rapidly. This gives the user the impression that all the processes are running simultaneously. If the system is executing many processes that do not fit simultaneously into main memory, the kernel swaps the blocked and sleeping processes to the disk and makes room for the running processes. Every process is assigned a unique identification number by the kernel, called the process id (pid), which the kernel uses to identify each process. The information about the processes – their states, pids, memory areas, the terminal from which each program was executed, the owner of each process and some other system information – is maintained in the process table by the kernel.

After the login process, the kernel invokes a shell for that user. The shell is the parent process
that runs till the user logs out of the system. All the commands, jobs, or processes that run under
the shell or get invoked from the shell are child processes of the parent shell. Each process gets
killed after it is completed. All the child processes should be killed before the parent dies.
Another shell can also be invoked from an existing shell. The child shell should be killed before
exiting from the parent shell.

Manipulation of Processes

Process Status

The ps command displays the contents of the process table. Without any options, this command displays a four-column output containing the status of the processes of the particular user terminal only. The first column is the pid of the process, the second column is the terminal number, the third column is the time consumed by the process up to the moment the command was issued, and the last column displays the command or program name itself.
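For example, a plain ps might produce output such as this (the values and exact column headings are only illustrative and differ between UNIX versions):

$ ps <enter>

  PID TTY      TIME COMMAND
 2765 tty01    0:01 sh
 2801 tty01    0:00 ps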

The ps command with the option –e displays all the active processes. The –l option displays, among other details, the priority associated with each process, the state of the process and the amount of CPU resources used by the process. The –l option along with the –e option displays this information for all the processes that are active in the system.

Process Scheduling

There are also some process scheduling commands, namely nice, sleep, at, and nohup.

nice

It is also possible for an individual user to run a process with a reduced priority; only the super user can increase the priority of a process. This is achieved with the nice command.

Example: $ nice <command> <enter>
         $ nice -15 vi first <enter>

The priority gets reduced by 15 units. If the number is not specified, the priority gets reduced by 10 units.

sleep

To suspend the execution of any process started by the user for a short interval, the command
sleep is used. The time period for sleep is given in terms of seconds and up to a maximum of
65535 seconds.

Example: $ sleep <seconds> <enter>
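For instance, to print a reminder after a pause of one minute (the message text is only an illustration):

$ sleep 60; echo Time is up <enter>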

at

The at command is used for scheduling one or more commands to be executed at a specified time. It can be used to execute a command or program at a later time, even after the user has logged out from the system.

Example: $ at <time> <enter>
         <command>
         <command>
         …………..
         Ctrl+D
         $

$ at 8pm
$ at 2001 sat
$ at 1300 mon week
$ at 3pm sep 30

nohup

The kernel terminates all active processes by sending them a hangup signal at the time of logout. The nohup command is used to continue the execution of a process even after the user has logged out. With nohup, any command can be specified.

Example: $ nohup <command>

Any command that is specified with nohup will continue running even after the user has logged
out.
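For example, a long-running job could be started this way (longjob is a hypothetical program; on most systems, if the output is not redirected, nohup writes it to a file named nohup.out):

$ nohup longjob <enter>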

Background Processing

UNIX can schedule the execution of more than one program at the same time. In fact, the system can execute only one program at any one instant, but it switches between processes so quickly, usually in fractions of seconds, that most of the time all the programs seem to be running at the same time. If a particular program seems to be taking a lot of time to complete and the user would like to begin another task, he can schedule that program in the background. The main advantage of running programs in the background is that the shell prompt reappears immediately. Initiating a background process is easy on a UNIX system: simply type the ampersand character at the end of the command line invoking the process that has to be run in the background. The shell will respond by printing the process id of the background process and immediately display the prompt. The user can start a number of background processes, but depending upon the system, the kernel may limit this to 30 or 50 per user. Generally, processes that do not require inputs from the user are started in the background; otherwise it is impossible to tell which program is accepting the inputs, the process in the background or the one in the foreground. As for foreground processes, the shell attaches the standard input, output and error files of background processes to the terminal and keyboard. Therefore, it is possible to get output from the background processes mixed with the output of the foreground process. This can be avoided by redirecting the standard input, output and error files of the background processes.

Example: $ backprocess& <Enter>
         2777
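To keep the output of such a process from mixing with foreground output, its output and error streams can be redirected to files (backprocess, out.log and err.log are only illustrative names; the 2> form is Bourne and Korn shell syntax):

$ backprocess > out.log 2> err.log & <Enter>
2778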

At the time of logout, all the active processes, including the background processes, will be terminated. To ensure that the background processes continue, the nohup command is used before the program name.

Example: $ nohup backprocess& <Enter>
         2779

More than one command, separated by semicolons, can also be issued on a single line; the prompt returns only after the execution of the last command in the sequence.
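To run such a sequence as a whole in the background, the commands can be grouped in parentheses (bigfile and sorted are only illustrative names):

$ (sort bigfile > sorted; echo done) & <Enter>
2780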

Terminating Processes

Occasionally, a need may arise to stop an executing process. This can be done by sending a software termination signal to the process, using the kill command. The kill command takes the process id as its argument and terminates the process.

$ kill <pid>

The pid is displayed by the shell immediately after it places a process in the background. It can also be obtained through the ps command. Some programs are designed to ignore interrupts. In such cases, this form of the kill command will not terminate the process. However, we can request the kill command to send a sure-kill signal instead. This signal always terminates any process owned by the user issuing the kill command. The sure-kill signal is requested by including a minus nine option (-9) along with the kill command.

$ kill -9 2777

Killing the shell process will log the user out of the system. Only the owner of a process or the super user can kill it.

Chapter 3
3. Shell Basics
3.1 Introduction
You can do many things without having an extensive knowledge of how they actually work. For
example, you can drive a car without understanding the physics of the internal combustion
engine. A lack of knowledge of electronics doesn't prevent you from enjoying music from a CD
player. You can use a UNIX computer without knowing what the shell is and how it works.
However, you will get a lot more out of UNIX if you do.

Three shells are typically available on a UNIX system - Bourne, Korn, and C shells.

As the shell of a nut provides a protective covering for the kernel inside, a UNIX shell provides a
protective outer covering. When you turn on, or "boot up," a UNIX-based computer, the program
unix is loaded into the computer's main memory, where it remains until you shut down the
computer. This program, called the kernel, performs many low-level and system-level functions.
The kernel is responsible for interpreting and sending basic instructions to the computer's
processor. The kernel is also responsible for running and scheduling processes and for carrying
out all input and output. The kernel is the heart of a UNIX system. There is one and only one
kernel.

As you might suspect from the critical nature of the kernel's responsibilities, the instructions to
the kernel are complex and highly technical. To protect the user from the complexity of the
kernel, and to protect the kernel from the shortcomings of the user, a protective shell is built
around the kernel. The user makes requests to a shell, which interprets them, and passes them on
to the kernel. The remainder of this section explains how this outer layer is built.

Once the kernel is loaded to memory, it is ready to carry out user requests. First, though, a user
must log in and make a request. For a user to log in, however, the kernel must know who the user
is and how to communicate with him. To do this, the kernel invokes two special programs, getty
and login. For every user port—usually referred to as a tty—the kernel invokes the getty program.
This process is called spawning. The getty program displays a login prompt and continuously
monitors the communication port for any type of input that it assumes is a user name.

When getty receives any input, it calls the login program. The login program establishes the
identity of the user and validates his right to log in. The login program checks the password file.
If the user fails to enter a valid password, the port is returned to the control of a getty. If the user
enters a valid password, login passes control by invoking the program name found in the user's
entry in the password file. This program might be a word processor or a spreadsheet, but it
usually is a more generic program called a shell.

Suppose four users have logged into the system. Of the four active users, two are using the Bourne shell, one is using the Korn shell, and one has logged into a spreadsheet. Each user has
received a copy of the shell to service his requests, but there is only one kernel. Using a shell does
not prevent a user from using a spreadsheet or another program, but those programs run under the
active shell. A shell is a program dedicated to a single user, and it provides an interface between
the user and the UNIX kernel.

You don't have to use a shell to access UNIX. In the previous example, one of the users has been
given a spreadsheet instead of a shell. When this user logs in, the spreadsheet program starts.
When he exits the spreadsheet, he is logged out. This technique is useful in situations where
security is a major concern, or when it is desirable to shield the user from any interface with
UNIX. The drawback is that the user cannot use mail or the other UNIX utilities.

Because any program can be executed from the login—and a shell is simply a program—it is
possible for you to write your own shell. In fact, three shells, developed independently, have
become a standard part of UNIX. They are

• The Bourne shell, developed by Stephen Bourne
• The Korn shell, developed by David Korn
• The C shell, developed by Bill Joy

This variety of shells enables you to select the interface that best suits your needs or the one with
which you are most familiar.
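To find out which of these has been assigned as your login shell, you can usually examine the SHELL environment variable (the path shown here is only an example):

$ echo $SHELL

/bin/sh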

Functions

It doesn't matter which of the standard shells you choose, for all three have the same purpose - to
provide a user interface to UNIX. To provide this interface, all three offer the same basic
functions:

• Command line interpretation

• Program initiation

• Input-output redirection

• Pipeline connection

• Substitution of filenames

• Maintenance of variables

• Environment control

• Shell programming

Command Line Interpretation

When you log in, starting a special version of a shell called an interactive shell, you see a shell
prompt, usually in the form of a dollar sign ($), a percent sign (%), or a pound sign (#). When you
type a line of input at a shell prompt, the shell tries to interpret it. Input to a shell prompt is
sometimes called a command line.

The basic format of a command line is

command arguments

command is an executable UNIX command, program, utility, or shell program. The arguments are
passed to the executable.

Most UNIX utility programs expect arguments to take the following form:

options filenames

For example, in the command line

$ ls -l file1 file2

there are three arguments to ls, the first of which is an option, while the last two are file names.

One of the things the shell does for the kernel is to eliminate unnecessary information. For a
computer, one type of unnecessary information is whitespace. Therefore, it is important to know
what the shell does when it sees whitespace. Whitespace consists of the space character, the
horizontal tab, and the new line character. Consider this example:

$ echo part  A  part  B  part  C

part A part B part C

Here, the shell has interpreted the command line as the echo command with six arguments and has removed the extra whitespace between the arguments. If, for example, you were printing headings for a report and you wanted to keep the whitespace, you would have to enclose the data in quotation marks, as in

$ echo 'part  A  part  B  part  C'

part  A  part  B  part  C

The single quotation marks prevent the shell from looking inside the quotes. Now the shell interprets this line as the echo command with a single argument, which happens to be a string of characters including whitespace.

Program Initiation

When the shell finishes interpreting a command line, it initiates the execution of the requested
program. The kernel actually executes it. To initiate program execution, the shell searches for the
executable file in the directories specified in the PATH environment variable. When it finds the
executable file, a subshell is started for the program to run. You should understand that the
subshell can establish and manipulate its own environment without affecting the environment of
its parent shell. For example, a subshell can change its working directory, but the working
directory of the parent shell remains unchanged when the subshell is finished.
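A quick way to observe this is to run cd inside an explicit subshell (sh -c runs a command line in a new shell; /usr/mydir is only an example directory):

$ pwd
/usr/mydir
$ sh -c 'cd /tmp; pwd'
/tmp
$ pwd
/usr/mydir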

Input-output Redirection

It is the responsibility of the shell to make input-output redirection happen, and it does the redirection before it executes the program. Consider these two examples, which use the wc word-count utility on a data file with 5 lines:

$ wc -l fivelines

5 fivelines

$ wc -l <fivelines

5

This is a subtle difference. In the first example, wc understands that it is to go out and find a file named fivelines and operate on it. Since wc knows the name of the file, it displays it for the user. In the second example, wc sees only data and does not know where it came from, because the shell has done the work of locating and redirecting the data to wc; so wc cannot display the file name.

Pipeline Connection

Since pipeline connections are actually a special case of input-output redirection in which the
standard output of one command is piped directly to the standard input of the next command, it
follows that pipelining also happens before the program call is made. Consider this command
line:

$ who | wc -l

5

Here, rather than displaying the output of who on your screen, the shell has directed it straight to the input of wc. Pipes are discussed in Chapter 4.

Substitution of Filenames

Metacharacters can be used to reference more than one file in a command line. It is the
responsibility of the shell to make this substitution. The shell makes this substitution before it
executes the program. For example,

$ echo *

file1 file2 file3 file3x file4

Here, the asterisk is expanded to the five filenames, and they are passed to echo as five arguments. If you wanted to echo an asterisk, you would enclose it in quotation marks.
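For example:

$ echo '*'

*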

Maintenance of Variables

The shell is capable of maintaining variables. Variables are places where you can store data for
later use. You assign a value to a variable with an equal (=) sign.

$ LOOKUP=/usr/mydir

Here, the shell establishes LOOKUP as a variable, and assigns it the value /usr/mydir. Later, you
can use the value stored in LOOKUP in a command line by prefacing the variable name with a
dollar sign ($). Consider these examples:

$ echo $LOOKUP

/usr/mydir

$ echo LOOKUP

LOOKUP

Like filename substitution, variable name substitution happens before the program call is made.
The second example omits the dollar sign ($). Therefore, the shell simply passes the string to
echo as an argument. In variable name substitution, the value of the variable replaces the variable
name.

For example, in

$ ls $LOOKUP/filename

the ls program is called with the single argument /usr/mydir/filename.

Environment Control

When the login program invokes your shell, it sets up your environment, which includes your
home directory, the type of terminal you are using, and the path that will be searched for
executable files. The environment is stored in variables called environmental variables. To change
the environment, you simply change a value stored in an environmental variable. For example, to
change the terminal type, you change the value in the TERM variable, as in

$ echo $TERM

vt100

$ TERM=ansi

$ echo $TERM

ansi
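In the Bourne and Korn shells, a variable assigned in this way becomes visible to programs you subsequently run only after it is exported:

$ TERM=ansi
$ export TERM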

3.2 Flavors

Most contemporary versions of UNIX provide all three shells—the Bourne shell, C shell, and
Korn shell — as standard equipment. Choosing the right shell to use is an important decision
because you will spend considerable time and effort learning to use a shell, and more time
actually using it. The right choice will allow you to benefit from the many powerful features of
UNIX with a minimum of effort. Of course, no one shell is best for all purposes. If you have a
choice of shells, then you need to learn how to choose the right shell for the job.
The shell has three main uses:

• As a keyboard interface to the operating system
• As a vehicle for writing scripts for your own personal use
• As a programming language to develop new commands for others

Each of these three uses places different demands on you and on the shell you choose.
Furthermore, each of the shells provides a different level of support for each use.

The first point to keep in mind when choosing a shell for interactive use is that your decision
affects no one but yourself. This gives you a great deal of freedom: you can choose any of the
three shells without consideration for the needs and wishes of others. Only your own needs and
preferences will matter.

The principal factors that will affect your choice of an interactive shell are as follows:

Learning. It is a lamentable fact of life that as the power and flexibility of a tool increases, it
becomes progressively more difficult to learn how to use it. The much-maligned VCR, with its
proliferation of convenience features, often sits with its clock unset as silent testimony. So too it
is with UNIX shells. There is a progression of complexity from the Bourne shell, to the C shell, to
the Korn shell, with each adding features, shortcuts, bells and whistles to the previous. The cost
of becoming a master is extra time spent learning and practicing. You'll have to judge whether
you'll really use those extra features enough to justify the learning time. Keep in mind though that
all three shells are relatively easy to learn at a basic level.

Command editing. The C shell and the Korn shell offer features to assist with redisplaying and
reusing previous commands; the Bourne shell does not. The extra time savings you can realize
from the C shell or the Korn shell command editing features depends greatly on how much you
use the shell. Generations of UNIX users lived and worked before the C and Korn shells were
invented, demonstrating that the Bourne shell is eminently usable, just not as convenient for the
experienced, well-practiced C shell or Korn shell user.

Wildcards and shortcuts. Once again, your personal productivity (and general peace of mind)
will be enhanced by a shell that provides you with fast ways to do common things. Wildcards and
command aliases can save you a great deal of typing if you enter many UNIX commands in the
course of a day.

Portability. If you will sit in front of the same terminal every day, use the same UNIX software
and applications for all your work, and rarely if ever have to deal with an unfamiliar system, then,
by all means choose the best tools that your system has available. If you need to work with many
different computers running different versions of UNIX, as system and network administrators
often must, you may need to build a repertoire of tools (shell, editor, and so on) that are available

on most or all of the systems you'll use. Don't forget that being expert with a powerful shell won't
buy you much if that shell isn't available. For some UNIX professionals, knowing a shell
language that's supported on all UNIX systems is more important than any other consideration.

Prior experience. Prior experience can be either a plus or a minus when choosing a shell. For
example, familiarity with the Bourne shell is an advantage when working with the Korn shell,
which is very similar to the Bourne shell, but somewhat of a disadvantage when working with the
C shell, which is very different. Don't let prior experience dissuade you from exploring the
benefits of an unfamiliar shell.

The table rates the three shells using the preceding criteria, assigning a rating of 1 for best choice,
2 for acceptable alternative, and 3 for poor choice.

Shell     Learning   Editing   Shortcuts   Portability   Experience

Bourne    1          3         3           1             3
C         2          2         1           3             2
Korn      3          1         2           2             1

Bourne Shell

The Bourne shell is your best choice for learning because it is the simplest of the three to use,
with the fewest features to distract you and the fewest syntax nuances to confuse you. If you
won't be spending a lot of time using a command shell with UNIX, then by all means develop
some proficiency with the Bourne shell. You'll be able to do all you need to, and the productivity
benefits of the other shells aren't important for a casual user. Even if you expect to use a UNIX
command shell frequently, you might need to limit your study to the Bourne shell if you need to
become effective quickly.

The Bourne shell is lowest in the productivity categories because it has no command editor and
only minimal shortcut facilities. If you have the time and expertise to invest in developing your
own shell scripts, you can compensate for many of the Bourne shell deficiencies, as many shell
power users did in the years before the C shell and the Korn shell were invented. Even so, the
lack of command editing and command history facilities means you'll spend a lot of time retyping
and repairing commands. For intensive keyboard use, the Bourne shell is the worst of the three. If
you have any other shell, you'll prefer it over the Bourne shell.

The C shell and the Korn shell were invented precisely because of the Bourne shell's low
productivity rating. They were both targeted specifically to creating a keyboard environment that
would be friendlier and easier to use than the Bourne shell, and they are here today only because
most people agree that they're better.

However, portability concerns might steer you toward the Bourne shell despite its poor
productivity rating. Being the oldest of the three shells (it was written for the very earliest
versions of UNIX), the Bourne shell is available virtually everywhere. If you can get your job
done using the Bourne shell, you can do it at the terminal of virtually any machine anywhere.

This is not the case for the C and Korn shells, which are available only with particular vendors'
systems or with current UNIX releases.

The Bourne shell has a rating of 3 for prior experience because prior experience using the Bourne
shell is no reason to continue using it. You can use the Korn shell immediately with no additional
study and no surprises, and you can gradually enhance your keyboard skills as you pick up the
Korn shell extensions. If you have access to the Korn shell, you have no reason not to use it.

C Shell

The C shell rates a 2 for learning difficulty, based simply on the total amount of material available
to learn. The C shell falls between the Bourne shell and the Korn shell in the number and
complexity of its facilities. Make no mistake—the C shell can be tricky to use, and some of its
features are rather poorly documented. Becoming comfortable and proficient with the C shell
takes time, practice, and a certain amount of inventive experimentation. Of course, when
compared to the Bourne shell only on the basis of common features, the C shell is no more
complex, just different.

The C shell rates a passing nod for command editing because it doesn't really have a command
editing feature. Its history substitution mechanism is complicated to learn and clumsy to use, but
it is better than nothing at all. Just having a command history and history substitution mechanism
is an improvement over the Bourne shell, but the C Shell is a poor second in comparison to the
simple and easy command editing of the Korn shell.

With the Korn shell, you can reuse a previously entered command, even modify it, just by
recalling it (Esc-k if you're using the vi option) and overtyping the part you want to modify. With
the C shell, you can also reuse a previous command, but you have five different forms for
specifying the command name (!!, !11, !-5, !vi, or !?vi?), additional forms for selecting the
command's arguments (:0, :^, :3-5, :-4, :*, to name a few), and additional modifiers for changing
the selected argument (:h, :s/old/new/, and so forth). Even remembering the syntax of command
substitution is difficult, not to speak of using it.
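As a brief illustration, !! reruns the previous command, and the quick-substitution form ^old^new reruns it with one change (the C shell echoes the resulting command before executing it):

% echo hello world
hello world
% !!
echo hello world
hello world
% ^hello^goodbye
echo goodbye world
goodbye world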

On the other hand, if you like to use wildcards, you'll find that the C shell wildcard extensions for
filenames are easier to use—they require less typing and have a simpler syntax—than the Korn
shell wildcard extensions. Also, its cd command is a little more flexible. The pushd, popd, and
dirs commands are not directly supported by the Korn shell (although they can be implemented in
the Korn shell by the use of aliases and command functions). Altogether, the C shell rates at the
top of the heap in terms of keyboard shortcuts available, perhaps in compensation for its only
moderately successful command editing. Depending on your personal mental bent, you might
find the C shell the most productive of all three shells to use. We have seen that those already
familiar with the C shell have not been drawn away in droves by the Korn shell in the past.

For portability considerations, the C shell ranks at the bottom, simply because it's a unique shell
language. If you know only the C shell, and the particular system you're using doesn't have it,
you're out of luck. A C shell user will almost always feel all thumbs when forced to work with the
Bourne shell, unless she is bilingual and knows the vagaries and peculiarities of both.

The C shell gets a 2 for prior experience. If you already know it and want to continue using it,
there is no compelling reason why you shouldn't. On the other hand, you may be missing a good
bet if you decide to ignore the Korn shell. Unless you feel quite comfortable with the C shell's history substitution feature and use it extensively to repair and reuse commands, you might find
the Korn shell's command editing capability well worth the time and effort to make the switch.
Anyone accustomed to using the Korn shell's command editing capability feels unfairly treated
when deprived of it—it's that good. If you haven't already experimented with the Korn shell and
you have the chance, I would strongly recommend spending a modest amount of time gaining
enough familiarity with it to make an informed choice. You might be surprised.

Altogether, the C shell is a creditable interactive environment with many advantages over its
predecessor, the Bourne shell, and it is not clear that the Korn shell is a compelling improvement.
Personal preference has to play a role in your choice here. However, if you're new to UNIX, the C
shell is probably not the best place for you to start.

Korn Shell

In terms of time and effort required to master it, the Korn shell is probably the least attractive.
That's not because it's poorly designed or poorly documented, but merely because it has more
complex features than either of the other two shells. Of course, you don't have to learn everything
before you can begin using it. The Korn shell can be much like good music and good art, always
providing something new for you to learn and appreciate.

For productivity features, the Korn shell is arguably the best of the three shells. Its command
editor interface enables the quick, effortless correction of typing errors, plus easy recall and reuse
of command history. It's hard to imagine how the command line interface of the Korn shell could
be improved without abandoning the command line altogether.

On the down side, the Korn shell provides equivalents for the C shell's wildcard extensions, but
with a complicated syntax that makes the extensions hard to remember and hard to use. You can
have the pushd, popd directory interface, but only if you or someone you know supplies the
command aliases and functions to implement them. The ability to use a variable name as an
argument to cd would have been nice, but you don't get it. The Korn shell's command aliasing and
job control facilities are nearly identical to those of the C shell. From the point of view of
keyboard use, the Korn shell stands out over the C shell only because of its command editing
feature. In other respects, its main advantage is that it provides the C shell extensions in a shell
environment compatible with the Bourne shell; if Bourne shell compatibility doesn't matter to
you, then the Korn shell might not either.

Speaking of Bourne shell compatibility, the Korn shell rates a close second to the Bourne shell for
portability. If you know the Korn shell language, you already know the Bourne shell, because ksh
is really a superset of sh syntax. If you're familiar with the Korn shell, you can work reasonably
effectively with any system having either the Bourne or Korn shells, which amounts to virtually
one hundred percent of the existing UNIX computing environments.

Finally, in terms of the impact of prior experience, the Korn shell gets an ambiguous rating of 2.
If you know the Bourne shell, you'll probably want to beef up your knowledge by adding the
extensions of the Korn shell and switching your login shell to ksh. If you already know ksh, you'll
probably stick with it. If you know csh, the advantages of ksh may not be enough to compel you
to switch.

If you're a first-time UNIX user, the Korn shell is the best shell for you to start with. The
complexities of the command editing feature will probably not slow you down much; you'll use
the feature so heavily its syntax will become second nature to you before very long.

If you develop any shell scripts, you'll probably want to write them in the same shell language
you use for interactive commands. As with interactive use, the language you use for personal
scripts is largely a matter of personal choice.

If you use either the C shell or the Korn shell at the keyboard, you might want to consider using
the Bourne shell language for shell scripts, for a couple of reasons. First, personal shell scripts
don't always stay personal; they have a way of evolving over time and gradually floating from
one user to another until the good ones become de facto installation standards. Writing shell
scripts in any language but the Bourne shell is somewhat risky because you limit the machine
environments and users who can use your script. Of course, for the truly trivial scripts, containing
just a few commands that you use principally as an extended command abbreviation, portability
concerns are not an issue.

If you're not an experienced UNIX user and shell programmer, you probably know only one of
the three shell languages. Writing short, simple shell scripts to automate common tasks is a good
habit and a good UNIX skill. To get the full benefit of the UNIX shells, you almost have to
develop some script writing capability. This will happen most naturally if you write personal
scripts in the same language that you use at the keyboard.

For purposes of comparison, the table below describes the shell features that are available in only
one or two of the three shells.

Feature sh csh ksh

Arithmetic expressions - X X
Array variables - X X
Assignment id=string X - X
case statement X - X
cdpath searches SysV X X
clobber option - X X
Command aliases - X X
echo -n option - X -
export command X - X
foreach statement - X -
getopts built-in command - - X
glob command - X -
Hash table problems, rehash and unhash commands - X -
Job control (bg, fg, ...) - X X
let command - - X
limit, unlimit commands - X -
nice shell built-in - X -
nohup shell built-in - X -

notify shell built-in - X -
onintr command - X -
print command - - X
pushd, popd commands - X -
RANDOM shell variable - - X
repeat shell built-in - X -
select statement - - X
setenv, unsetenv commands - X -
SHELL variable specifies command to execute scripts - X -
switch statement - X -
until statement X - X
set -x X - X
set optionname - X -
Set-uid scripts - - X
Shell functions SysV - X
Substring selectors :x - X -
trap command X - X
typeset command - - X
ulimit command X - X
Undefined variable is an error - X -
! special character - X -
@ command - X -
*(...) wildcards - - X
$(...) command expression - - X
{...} wildcards - X -
|& coprocessing - - X
>& redirection - X -

3.3 Interaction with Shell

Before you can interact with the shell, you must have a session with your UNIX host, which
typically involves connecting to your system and logging in. A login session is established by
entering your user name and password. The system then proceeds through a login procedure and
starts the shell that the system administrator configured for you.

An interactive login session can be established to the UNIX server through a wide variety of
methods and from an equally wide variety of clients. These methods include connections through
a modem, direct connection from a terminal, and network connections over Token Ring and
Ethernet. The exact method used to connect to your UNIX system depends upon the type of
communications interface used to make the connection.

After you are connected, you see a login prompt. In response to that prompt, enter your user
name. Typically you also are prompted to enter your password. The combination of your login
name and password identifies you to the system and restricts access to the server to authorized
people only.

Issuing Commands

Issuing commands to the shell is as simple as typing a command and pressing <enter>. Each
command is terminated by the Enter key, which signals that the shell should process the
instructions the user has typed and execute them. Although Enter is most commonly used, the
semicolon can be used as a command terminator, too. The following example uses the semicolon:

who; date; ls

This executes the who command, followed by the date command, and then the ls command. Each
command is executed in the order it appears on the command line, which is equivalent to typing
the commands on separate lines.

The table lists what each of the special characters is used for.

Character Use

Space Separates the command name, its options, and its arguments

Tab Separates the command name, its options, and its arguments

New line Terminates a command

Prompts

The shell informs the user that it is ready to accept input through a prompt. The prompt on the
Bourne shell is traditionally a dollar sign ($); for the C shell, a percent sign (%); and for the Korn
shell, typically a history number followed by a dollar sign (such as 4$). In fact, each shell has
several different prompts that
inform you of different shell conditions, as illustrated in the given table:

Prompt Name Bourne shell C shell Korn Shell

PS1 $ not used !$ or $


PS2 > not used >
PS3 not used not used #?
Prompt not used % not used

The names PS1, PS2, PS3, and prompt are the names of the shell variables that the respective
shells use to identify each prompt. Using these variables, the shell prompt can be changed.
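For example, a Bourne or Korn shell user can change the primary prompt by assigning a new
value to PS1 (the prompt text here is purely illustrative); a C shell user would use set prompt
instead:

$ PS1="ready> "
ready> date
Wed Jun 29 09:53:47 EDT 1994
ready>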

The PS1 prompt used in the Bourne and Korn shell is the standard prompt. The standard prompt
for the C shell is quite different; its name is prompt, and it is a percent sign. The PS2 prompt is
used in the Bourne and Korn shell to indicate that the shell is expecting more input. The PS3
prompt in the Korn shell prompts the user to enter a choice based upon the information provided
in another Korn shell command.
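The Korn shell command referred to here is select, which prints a numbered menu and then
issues the PS3 prompt to ask for a choice. A minimal sketch, with illustrative menu items:

PS3="Choose a fruit: "
select FRUIT in apple banana cherry
do
        echo "You picked $FRUIT"
        break
done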

Example of PS1 and PS2 prompts:

$ date'
>

In the above example, the user typed an apostrophe after the date command and then pressed
Enter. This caused the shell to print the second prompt (>) because it was looking for the second
apostrophe to complete the command. It is also useful to know the additional control keys that
the shell uses.

Special Key Name Explanation

Ctrl+D End of File Logs out or ends input operation

Ctrl+H Backspace Erases the previous key typed or backspace

Ctrl+\ Quit Interrupts the current command and produces a core dump

Ctrl+C Interrupt Interrupts the current command (the DEL key on some systems)

Ctrl+S XOFF Pauses output on the display

Ctrl+Q XON Restarts output on the display

You can find out what special keys are used on your system with the stty command, which is
used to query and configure your terminal connection settings. The stty output looks like the
following:

$ stty -a
line = NTTYDISC; speed 38400 baud
erase = ^h; kill = ^u; min = 6; time = 1; intr = ^c; quit = ^\; eof = ^d;

The -a option to stty instructs the command to print the information regarding your connection.
The information that you are looking for is erase, intr, and quit. From this output, erase is Ctrl+H,
intr is Ctrl+C, and quit is Ctrl+\.

Handling Mistakes

The shell also informs you when you make mistakes. If the command you enter is not available or
doesn’t exist, the shell informs you with a message like the following one:

$ daet
daet: not found

This message tells you that the shell can't find the command you asked it to execute. When you
make a mistake typing a command, press the Backspace key to move the cursor back over your
mistake so that you can correct it. It is important to remember that the shell does not know how to
interpret the control codes sent by the arrow keys on your keyboard.

3.4 Features

The shell has a number of features that can be used on a regular basis without the need for the
programming language. The shell, depending upon which one you are using, might provide for
wild cards, background job execution, job control, and more.

Wild Cards

Wild Cards can be used to perform filename substitution. The Wild Cards used in filename
substitution are the asterisk (*), the question mark (?), and the character class ([..]). Wild Cards
can be used anywhere in the filename and can be combined to produce complex patterns; a
filename matches only if it satisfies the entire pattern.

The asterisk matches any character zero or more times. For example, if you enter the command
ls -l *, the shell will substitute all of the files in the directory for the asterisk.

$ ls -l *
-rw-- 1 chare users 390 Aug 31 20:15 aba
-rw-- 1 chare users 390 Aug 31 20:14 abc
-rw-- 1 chare users 108 Aug 31 20:15 bab
-rw-- 1 chare users 418 Aug 31 20:15 debbie
$

But the asterisk can be used in many other ways. What if the command ls -l a* were used on
these files? It would list the files aba and abc only.

From this example, you can see that you could also look for file names using the asterisk first,
then some text, or by enclosing the asterisk between some text. Following are illustrations of
these two cases:

$ ls -l *b
-rw-- 1 chare users 108 Aug 31 20:15 bab
$ ls -l de*ie
-rw-- 1 chare users 418 Aug 31 20:15 debbie
$

The given table provides some further examples and explanations for the different types of wild
cards.

Example Description

a* Matches all files starting with a, followed by zero or more characters


*.doc Matches any file names that end in .doc
text.* Matches any file names that start with text
t*.doc Matches any file names that start with t and end with .doc

Question Mark Wild Card Examples

a? Matches any file with two letters, the first one being an a
?.doc Matches any file that has one letter followed by .doc
??? Matches any three character file name

Combination Wild Cards

a??.* Matches any file that starts with a, is followed by two letters, a
period, and any other characters

*.?? Matches any text followed by a period and two more letters

The question mark matches exactly one character. Just as the * matches zero or more characters,
there must be exactly one character where the question mark is used. The command ls -l ? lists
only those files that have a one-character name. The command ls -l ?? lists only those files that
have two characters in their name. The asterisk and the question mark characters can be
combined to match very specific patterns, as shown under the combination wild cards.

The character class lists files that contain letters from a group of characters; it matches exactly
one character of the group. The character class is used by listing the specific characters between
left and right brackets ([ ]). The character class is typically used with either the ? or * wild cards.
The table given below illustrates the uses of the character class, along with more wild cards.

Characters Description

[abc]?? Matches any three-letter filename that starts with the letter a, b, or c

[Abc]* Matches any file name starting with the letter A, b, or c

[abc][xyz]* Matches any file name that starts with a, b, or c, has a second letter of
x, y, or z, and is followed by other letters

[abcd] Matches any single-character file name that is a, b, c, or d

[abc][ghi] Matches any two-character file name whose first letter is a, b, or c and
whose second is g, h, or i

More Wild card examples

ls *[!o] Matches any file that doesn't end with an o

ls *[\!o] The same, with the ! escaped as required in the C shell

cat chap.[0-7] Matches the files chap.0 through chap.7

ls [aft]* Matches any file starting with the letter a, f, or t

ls t?? Matches any three-letter file name starting with t

Quotes

The shell also understands the different quotes that are available and that serve different purposes.
The quotation mark characters are as follows:

'text' protects the contents
"text" expands the contents
`text` executes as a command

The single quotes instruct the shell to protect and not interpret the text between them. These
quotes are used in shell programs, which are called shell scripts.

The double quotes are used to group words together to form an argument to a command, or to
form a sentence. These quotes are generally used in shell scripts, but also can be used by the
ordinary user.
Consider the following example:

$ tp this is a test
Number of arguments = 4
Argument 1 = this
Argument 2 = is
Argument 3 = a
Argument 4 = test
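The tp program used in these examples is not a standard UNIX command; presumably it is a
small demonstration script along the lines of this Bourne shell sketch:

:
# tp - report how many arguments the shell passed, and what they are
echo "Number of arguments = $#"
i=1
for ARG
do
        echo "Argument $i = $ARG"
        i=`expr $i + 1`
done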

No quotes are around the text provided to the tp command as arguments. As a result, tp receives
four arguments. Consider the next example:

$ tp "this is a test"
Number of arguments = 1
Argument 1 = this is a test

In this example, the argument this is a test is enclosed in double quotes. This instructs the shell
to group the words together and treat them as one argument. In this case, the double quotes and
single quotes are equivalent. They differ in how they allow the shell to expand a shell variable.
The shell has variables that can be used in shell programs and to control the execution of other
programs and the user's environment; such variables are called environment variables. Consider
the following example, in which a variable named TERM is defined. To see the contents of a
shell variable, reference the variable using its name preceded by a dollar sign.

$ tp "$TERM"
Number of arguments = 1
Argument = ansi

In this example, the shell expands the variable $TERM and gives the value of the variable to the
program tp. The next example illustrates how the single quote is different from the double quote.

$ tp '$TERM'
Number of arguments = 1
Argument = $TERM

In the preceding example, the shell cannot see that the $TERM in the quotes is supposed to be a
variable, so it prints the literal $TERM instead.

The back tick quotes are used to put the output of the command in a shell variable. This is useful
because information that might be needed again in a shell program can be easily accessed. Even
at the command line this can be useful. For example, if you have a long directory path, you can
store your current directory in a variable and change to that directory again without typing the
path. For example, to save your current working directory in a variable called PWD, type the
following command:

$ PWD=`pwd`

This causes the shell to execute the command pwd and put the output of the command into the
variable PWD. The contents of the shell variable PWD are printed by using the command echo.

$ echo $PWD
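Putting the pieces together, a typical session might look like this (the directory names are
illustrative):

$ PWD=`pwd`
$ echo $PWD
/home/chare/book
$ cd /tmp
$ cd $PWD
$ pwd
/home/chare/book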

Variables and the Environment

Shell variables are a storage place for information, just like variables in any programming
language. Variable names are case sensitive. They can be any length, can be upper or lowercase,
and can contain numbers and the underscore character; variable names cannot start with a
number, nor can they contain the characters special to the shell, such as the asterisk, dollar sign,
question mark, and brackets. The following are all valid variable names:

TERM
PATH
path
Visual_id
testDir

Access to the value stored in the variable is gained by prefixing a dollar sign to the name of the
variable.
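For example, in the Bourne or Korn shell you can create a variable and examine its value like
this:

$ TERM=ansi
$ echo $TERM
ansi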

Examining Existing Variables

To see the environment variables configured on your system, use the env or printenv command:

$ env
PATH=:/usr/ucb:/bin:/usr/bin
LOGNAME=chare
SHELL=/usr/bin/ksh
HOME=/home/chare
TERM=ansi

The user environment on this system lists the five shell variables. Environmental variables are
typically defined in the system setup files, which are processed when a user logs in, or in the
user’s own setup files.

The PATH variable in the preceding example indicates that the shell looks in the current directory,
then /usr/ucb, then /bin, and finally /usr/bin. If the command you entered doesn't exist in any of
these directories, then the shell reports that it cannot find the command.

When a user logs in to a UNIX system, the system records the user’s login name and saves it in
the LOGNAME variable. This variable is used by many programs to determine who is running
the command.

The SHELL variable defines the name of the shell that the user is currently using as his login
shell and is defined when the user logs in to the system.

The HOME variable defines what the user’s home directory is. This variable is often used by
UNIX commands to find out where information to this user should be written.

The last variable is the TERM variable. Many UNIX commands depend upon knowing the terminal
type the user is using. The terminal type is determined when the user logs in either through a
system default or through customization of the user’s startup files. Some systems prompt the user
to enter the terminal type they are using.

C Shell Variables

The C shell was developed at the University of California at Berkeley. When compared to the
Bourne shell, the C shell has a different command structure for its programming language.
Variables are created by using the command set, followed by the variable name and the value, as
in the following example:

set variable = value


set a = seashell

C shell variables are accessed by preceding the variable name with the dollar sign. The C shell
supports a different mechanism for assigning environment variables. The C shell uses the
command setenv, followed by the name of the variable and the value to be assigned. This places
the variable in the environment.

% setenv TERM vt220

3.5 Program execution

The shell uses the command line to collect from you the name of the command and the options
and arguments you want to pass to it. The shell goes through several steps to execute a program,
as illustrated in the figure below.

[Figure 5 - Program Execution: a flowchart. The user enters a command, such as date. The shell
checks the system for the command. If it is not found, an error message is displayed; if it is
found, the shell loads the command and runs it.]

As illustrated in this figure, the user types the command date. The shell locates the command by
looking in each directory listed in the PATH variable. If the command is not found, the shell
prints an error message. If the command is located, the shell determines if the user can execute
the command and if so, loads the command into memory and requests the kernel to execute it.
The end result is the output from the command, which, in this example, is the date.

The shell is a command, also. This means that you can execute a shell when needed. A shell
executed through the sh command is called a subshell. A subshell inherits its parent's
environment.

Program Grouping

Programs can also be grouped together to control how they are processed. Grouping is
accomplished using the (..) and {..} constructs. The (..) construct causes the commands that are
enclosed between the parentheses to be executed in a sub shell. The (..) construct starts a new
version of the shell and executes the command in this new sub shell. This can be useful to group
a series of commands.

$ (date; who; ls) | wc
20 29 163
$

In this example, the date, who, and ls commands are executed in a subshell, and the output of
those commands is then processed by the command wc, which counts the lines, words, and
characters in the output stream. The advantage is that the data output from all three commands is
merged together and sent to the wc command.

The {..} construct instructs the shell to run the commands enclosed between the braces in your
current shell. The following example illustrates the differences between these constructs.
$ var=1024                      set the value of var
$ echo $var                     print it
1024
$ (var=2048)                    in a subshell, set var to 2048
$ echo $var                     print it
1024
$ (var=8192; echo $var)         in a subshell, set var to 8192 and print it
8192
$ echo $var                     print the current value of var
1024
$ { var=2048; }                 in this shell, change var to 2048
$ echo $var                     print it
2048
$

The semicolon in the { var=2048; } example is required because the {..} construction is
considered a command, and when two commands are grouped on the same command line, they
must be separated by a semicolon.

3.6 Background Jobs

Typically, commands that you execute run in the foreground, meaning that the program you are
executing has control over your terminal. When you run an interactive command such as
vi, or others, you interact with the command. This is foreground execution. Background
execution enables you to start a non-interactive command and send it to the background. This
allows the command to continue execution and frees your login session.

To execute a non-interactive command in the background, the command and its options and
arguments are typed on the command line, followed by an ampersand (&).

$ long-running-command &
# PID
$

In the above example, long-running-command will execute until it completes, regardless of
how long it takes. The # PID that is returned is the UNIX process ID number assigned to this
command. Background execution is not suitable for commands that require keyboard
interaction.
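As a concrete illustration, a lengthy sort can be sent to the background (the file names and the
process ID shown are illustrative):

$ sort big_file > sorted_file &
1234
$

The shell prints the process ID and immediately issues the next prompt while sort continues to
run.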

3.7 Batch jobs

With batch, you have little control if you want to automate a job or want it executed on a regular
basis. But there is yet another command in this series, called cron. cron is designed to execute
commands at specific times, based upon a schedule. This schedule is known as a crontab file
because of the command that is used to submit it. The cron command is not actually executed by
a user. It is started when the system is booted and remains running, until the system is shut down.

crontab file

The crontab file is used to provide the job specifications that cron uses to execute the commands.
A user has one crontab file that can contain as many jobs as required. For example,

minutes hours day of month month day of week command

Each line in the crontab file looks like this. There can be no blank lines, but comments are
allowed, using the shell comment character, #. The first five fields are integer fields, and the
sixth is the command field, which contains the command information to be executed. Each of
the five integer fields contains information in several formats and has a range of legal values.
The legal values are listed in the following table.

minutes hours day of month month day of week

0-59 0-23 1-31 1-12 0-6 (0 = Sunday)

Each of these five integer fields has a series of formats that are allowable for their values. These
formats include the following:

• A number in the respective range. For example, a single digit.
• A range of numbers separated by a hyphen, indicating an inclusive range. For example, 1-10.
• A list of numbers separated by commas, meaning all of these values. For example, 1,5,10,30.
• A combination of the previous two types. For example, 1-10,20-30.
• An asterisk, meaning all legal values.

Some sample commands from crontab file:

0 * * * 0-6 echo "\007" >> /dev/console; date >> /dev/console; echo >> /dev/console

0,15,30,45 * 1-31 * 1-5 /usr/local/collect.sys > /usr/spool/status/unilabs

0,10,20,30,40,50 * * * * /usr/local/runq -v9 2>/dev/null

5,15,25,35,45,55 * * * * /usr/lib/uucp/uucico -r1 -sstealth 2>/dev/null

These four entries are from a crontab file on a real system. They illustrate the different values
that each of the integer fields can contain. The first line means that the command will be executed
at minute 0 of every hour, because the minutes field is zero and the hours field, which is the
second from the left, is an asterisk, meaning all legal values. This is done for every day of the
month, for every month of the year; both of these fields, the third and fourth, contain asterisks
also. The fifth integer field contains the values 0-6, which indicates that this is done for each day
of the week.

In the second line, the minutes field indicates that this command is executed every 15 minutes.
The asterisk in the second field means that this is done for every hour of the day. The third field,
which is the day of the month, indicates that this command is to be run on every day of the month.
When you look at the fifth field, this command is restricted to Monday through Friday. So, if the
day of the month is in the range 1 to 31, and the day of the week is in the range of Monday to
Friday, then the command will be executed.

The following example creates a number of problems:

* * * * * any command

This has the effect of overloading your system very quickly because the command is executed
every minute of every hour, of every day, of every month. Depending upon the command, this
could bring your system to its knees very quickly. Please be careful to avoid crontab entries like
the preceding one.

Creating a crontab File

crontab files are created with your favorite text editor. Do not use word processors, because they
insert special characters and control codes. The file must be plain text, such as that which is
created with the vi editor. Each field should be separated by a space or a tab. Blank lines are
not allowed; they cause the crontab command to complain, as you will see subsequently. The
comment symbol is the same as that used for the shells, the pound symbol (#). You should save
your crontab file in your home directory.

Submitting the crontab File

The crontab file, once created, is submitted to cron through the use of the command crontab.
The only argument crontab needs to submit a cron job is the name of the file that contains your
specifications. An example follows:

$ cat cronlist
#
# This job is executed to remind me to go home
#
15 17 * * 1-5 mail -s "time to go home" </dev/null
$ crontab cronlist
$

This demonstrates a successful submission to cron. But it is not uncommon to make mistakes, as
shown here:

$ cat cronlist
#
# This job is executed to remind me to go home
#
15 17 * * 1-5 mail -s "time to go home" </dev/null

$ crontab cronlist

crontab: error on previous line; unexpected character found in line.

In the preceding example, there is a blank line at the end of the file. When crontab reads the file
to ensure that its format is correct, it sees the blank line and reports that there is an error in the
file. This is the same error message that crontab will print if there aren’t enough fields on the
line. However, integers that are outside their boundaries are reported, as seen in the following:

$ cat cronlist
#
# This job is executed to remind me to go home
#
99 17 * * 1-5 mail -s "time to go home" </dev/null
$ crontab cronlist
99 17 * * 1-5 mail -s "time to go home" </dev/null
crontab: error on previous line; number out of bounds
$

Making Changes

The crontab command has two options that can be used to make changes to your submitted jobs.
The first is -r. This option instructs crontab to remove the existing crontab file. The contents of
the file that are used by cron are destroyed. The second option is -l, which lists the jobs that are
currently known to cron for the invoking user. These options are illustrated in the following:

$ crontab -l
#
# This job is executed to remind me to go home
#
15 17 * * 1-5 mail -s "time to go home" </dev/null
$ crontab -r
$ crontab -l
crontab: can't open your crontab file.
$

This example demonstrates that by using crontab -l you were able to see the commands that you
previously submitted to cron. With the -r option, you removed those specifications, which you
can tell because no output was generated when you next ran the crontab -l command. In the
final part of this example, crontab tells you that it can't open your crontab file.
The files used by cron are stored in the directory /usr/spool/cron/crontabs. If you want to make
changes to your crontab file, the following steps are used.

1. crontab -l > $HOME/cronlist

This retrieves the job specifications that cron currently has and saves them in the file
cronlist in your home directory.

2. vi cronlist

Once you have this information, you must edit it to reflect the changes you want to see.
Edit the file using your favorite text editor, make the changes, and save the file.

3. crontab cronlist

Next execute the crontab command with your newly changed cronlist file. This has the
effect of replacing the current information with your new specification.

3.8 Input, output and pipes

When you log in to UNIX, three files, or data streams, are opened: standard input, standard
output, and standard error. Standard input is typically the keyboard. Standard output is the
output device. Standard error is the error message stream. The standard output and error are
separated so that programmers, users, and system administrators can all take advantage of the
shell’s powerful redirection facilities.

Input and output

When a command is executed, any output that the programmer wanted the user to see is written
to standard output. Error messages, such as those printed when an invalid option is used, are
generally written to standard error. Both standard output and standard error are generally
printed on the terminal.

:
#
# @(#) termlist v1.0 - show a list of supported terminals
# copyright chris hare, 1989
#
# This script will occasionally generate some different looking results,
# which is dependent upon how the termcap file is set up
#
# Get the system and release name
#
SYS=`uname -s`
REL=`uname -r`
echo "Supported Terminals for $SYS $REL"
echo "___________________________________"

grep '^..|.*|.*' /etc/termcap |
sed 's/:\\ //g
s/^..|//g
s/:.*://g' |
sort -d | awk '{ FS="|"; printf "%-15s\t%-40s\n", $1, $NF }'

Any error messages are written to standard error, which appears the same as standard output
because they both print on the terminal device.
The input/output redirection facilities enable you to change to where the input, output, and error
streams are connected. This increases the level of flexibility in the operation of the system.

Redirection allows the output of a command to be written in a place other than what is typical.
You perform output redirection by writing the command and following it with a > and the name
of the file to which the output should be written.

Following are examples of standard output redirection. The object on the right-hand side of the
> sign must be a file, as in the following syntax:

command > file

The following example runs the command who and saves the output in a file.
$ who > /tmp/save
$ cat /tmp/save
chare console Jun 25 16:36

Using the > file construct instructs the shell to create the file if it doesn't exist. If the file does
exist, its previous contents are lost. If you want to append information to the existing file, use >> rather
than >, as in the next example:
$ date >> /tmp/save
$ cat /tmp/save
chare console Jun 25 16:36
Wed Jun 29 09:53:47 EDT 1994

Notice that the file /tmp/save still has the information from the previously executed who
command and the output from the date command.

You can redirect where the standard error messages are written, but it is done somewhat
differently than with standard output. You must add a file descriptor number, as in the following
line.
command 2> file

Three file descriptors are associated with these data streams: zero (0) is standard input; one (1) is
standard output; and two (2) is standard error.
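For example, the error message from the mistyped command shown earlier can be captured in a
file by redirecting descriptor 2:

$ daet 2> /tmp/errors
$ cat /tmp/errors
daet: not found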

Standard input is generally associated with your keyboard. You can also redirect from where
standard input comes by typing the command, followed by the < and a file, which has the input
needed for the command. The following is an example:

command < file

Input redirection is used infrequently when compared with output redirection, but can be used to
provide input to an interactive command. The following is an example of input redirection.
$ cat list
banana
apple
The sort command accepts its input and sorts the data.

Example:
$ sort < list
apple
banana

Using exec

Another method of input/output redirection is to use the exec command. The following example
performs the redirection only for this one command:

who > /tmp/save

But if you want to catch the output of a number of commands in a file, you can redirect the entire
data stream. The following example redirects the standard output stream to the file /tmp/stdout:

$ exec > /tmp/stdout


$ date
$ who
$ ls
$ exec > /dev/tty
$ pwd
/tmp

In this example, the standard output is to be saved in the file /tmp/stdout. The shell still
displays the prompt because the prompt is written on standard error. The execution of the who,
date, and ls commands appears to generate no output. Then you restore the standard output to the
terminal by redirecting it to /dev/tty, a special name connected to the terminal port you are
using.

In the C shell, input and output redirection is slightly different, but the mechanisms work
similarly. The C shell does not allow the use of descriptor numbers. Rather, an ampersand (&) is
appended to the redirection symbol. This has the effect of redirecting both standard output and
standard error to the same place. The following example redirects standard output and standard
error in the C shell.
% who -Z >& /tmp/save
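For comparison, a Bourne shell user names the descriptors explicitly to get the same effect;
2>&1 sends standard error (descriptor 2) to wherever standard output (descriptor 1) is going:

$ who -Z > /tmp/save 2>&1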

Pipes

A pipe (|) is a mechanism that connects the output of one command to the input of another. By
using pipes, you can build very powerful commands without having to learn a high-level
programming language. Commands must meet the following requirements to be used in a pipe.

• The command must write to standard output.

• The other command in the pipe must read from standard input.

Pipes have a command on each side of the | symbol, as in the following example:

command | command

This is called a pipeline. Pipelines can be long and involved, or consist of only one or two
commands. The following program is an example of a complicated pipeline that uses the facilities
of a number of common UNIX commands.

#
# Get the system and release name
#
SYS=`uname -s`
REL=`uname -r`
echo "Supported Terminals for $SYS $REL"
echo "==================================="

grep '^..|.*|.*' /etc/termcap |
sed 's/:\\ //g
s/^..|//g
s/:.*://g' |
sort -d | awk '{ FS="|"; printf "%-15s\t%-40s\n", $1, $NF }'

Pipelines do not need to be this complicated. To illustrate standard input, output, and error and
their interaction with pipes, look at the following small shell program. The first program in this
example is called ax1 and is as follows:

$ cat ax1
# ax1 shell program
echo "This is being sent to standard output."
exit 0

ax1 simply prints the message This is being sent to standard output on the screen. If you were to
run this at the command line, you would see the following:

$ ax1
This is being sent to standard output.
$
The second command is called ax2 and is as follows:

$ cat ax2
# ax2 shell program
while read LINE
do
echo "this came from standard input"
echo "-> $LINE"
done

ax2 is a little more complicated. It reads standard input and prints each line that is read on the
standard output device. If you run ax2 from the command line, the following happens:

$ ax2
this is a test
this came from standard input
-> this is a test
line 2
this came from standard input
-> line 2
this is for a sample pipeline
this came from standard input
-> this is for a sample pipeline
$

Because the ax2 command is reading from standard input, you must tell it when there is no more
input to process. This is done using the Ctrl+D character.
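Connecting the two programs ties the demonstration together; presumably the pipeline would be
run like this:

$ ax1 | ax2
this came from standard input
-> This is being sent to standard output.
$

The standard output of ax1 becomes the standard input of ax2, and no Ctrl+D is needed because
ax2 sees end-of-file when ax1 exits.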

Pipes are useful in many everyday situations. To find out how many files are in a directory, use
the command ls | wc -l. If you want to view all the files in a directory, use ls -l | more.

Chapter 4

4. The Command Line


4.1 Creating and Manipulating Files and Directories

A file is a sequence of characters, or bytes. When reading these bytes, the UNIX kernel places no
significance on them at all; it leaves the interpretation of these bytes up to the application or
program that requested them. The data files used by application programs are saved by the
programs in a specific format that the application knows how to interpret.

Understanding Directories

A directory is a file, meaning that it is just a sequence of bytes. The information contained in a
directory is the name of the file and an inode number. The inode is like an index card that holds
the specific information about a file or directory, including the following:

• Owner of the file

• Group that owns the file
• File size
• File permissions
• Data block addresses on the disk where the information is stored.

The directory structure is part of the heart and soul of UNIX. Its exact layout is dependent upon
the version of UNIX you are using.

Special Files

UNIX also supports several types of special files, which fall into one of the following categories:

• Character device files

• Block device files
• Hard links
• Symbolic links

Character device files read and write data one character at a time. Block device files access a
block of data at a time. The block is generally either 512 or 1,024 bytes. The kernel actually does
read or write the device a character at a time, but the information is handled through a buffering
process that only provides the information when there is a full block. Disks appear as both block
and character devices because, depending on the command in use, they function as either type of
device.
The hard link is a special type of file that allows a single file to have multiple names. Hard links
have the following two restrictions:

• The file and the second name must be part of the same file system.
• Hard links can only provide a second name for files. They do not work for directories.

Symbolic links serve the same purpose as hard links, but unlike hard links, they can span file
systems and they can point to directories.
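A minimal sketch with the standard ln command (the file and directory names are illustrative):

$ ln report report.2              give the same file a second name (same file system only)
$ ln -s /usr/spool/news news      create a symbolic link; it may span file systems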

Moving around a Directory

Moving around in the directory structure requires that you know two commands: cd and pwd.
The cd command is used to change from one directory to another.

cd

The cd command is used to move from one place to another in the file system. Used without an
argument, cd returns you to your home directory, which it finds in the value of the environment
variable HOME.

$ pwd
/tmp
$ cd
$ pwd
/home/chare
$

When using cd with an argument, you add the name of the directory that you want to access. You
can use either a full path name or a relative pathname to get to that directory. The following are
some examples of using cd with both types of pathnames:

$ cd /tmp
$ pwd
/tmp
$ cd
$ pwd
/home/chare
$ cd book
$ pwd
/home/chare/book
$ cd /usr
$ pwd
/usr
$

pwd

pwd is an acronym for Print Working Directory. The pwd command accepts no arguments and
prints the name of the directory you are currently working in. For example,

$ pwd
/home/chare/gopher2.012/doc
$

Understanding Absolute and Relative Pathnames

Absolute pathnames always start with a slash. Following is a list of some absolute pathnames:

/
/usr/spool/mqueue
/home/chare
/usr/mmdf

When using absolute pathnames with cd, enter the cd command followed by the name of the
directory. Relative pathnames do not begin with a slash because the term relative means relative
to the current directory.
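For example, the parent directory, written as two dots, is a relative pathname, so you can move
up one level like this (directory names as in the earlier examples):

$ pwd
/home/chare/book
$ cd ..
$ pwd
/home/chare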

Listing Files and Directories

There are two commands that can help you navigate through the files and directories in your
UNIX system.

ls Command

The ls command lists the contents of a directory. It accepts arguments and has a plethora of
options that affect its operation. The following code shows some sample output of the ls
command:

$ ls
CrossRoads . DECterm book
Mail gophermail.tar
News uyhl.pc
$

The ls command has a list of options that can alter the way it lists the files. The most common
options are as follows:

-a lists all files, including hidden files

-C lists in columns
-F shows the file type
-l lists in long format
-d shows directory names only
-R does a recursive listing

The ls -a command lists all files in the directory, including hidden files. Every directory has two
dot files. The current directory is indicated with the single dot (.), and the parent directory is
shown as dot-dot (..). The following code illustrates the ls -a command:

$ ls
gamma1 infra_red1 uvA uvB xray1 xray2

$ ls -a
. .white gamma1 uvA xray1
.. .xray3 infra_red1 uvB xray2

From the above example, the first time the ls command was issued no options were indicated and
not all the files were listed. The second example added the -a option to the command, and now
you see the hidden files as well.

The -C option to ls is the standard mode of operation for BSD systems; it is used to list the files
in columns. The -C option often is combined with the -F option to see the executable files and
directories in the file list. The following commands show an example of these two options:

$ ls -C
gamma1 infra_red1 uvA uvB xray1 xray2

$ ls -CF
gamma1 infra_red1* uvA/ uvB xray1 xray2

In the second part of the previous code, some files are suffixed with an asterisk or a slash. The
asterisk indicates that the file is executable. The slash indicates that the name is a directory.

The -l option lists the files and the directories in a long format. This format provides most of the
information the user needs. When looking at your output from ls -l, the time field shows either
the month, day, and time of the last modification or, for older files, the month, day, and year.

Sometimes, you must search through many files to find the existing directories. The ls -d option
is used for this purpose. The -d option instructs ls to display only the directory names and not
their contents. The following code illustrates the use of ls -d in the /usr/lib directory:

$ pwd
/usr/lib

$ ls -l tmac
total 4
-rw-r--r-- 1 bin bin 55 Jun 6 1993 tmac.an
-rw-r--r-- 1 bin bin 91 Jun 6 1993 tmac.m
-rw-r--r-- 1 bin bin 65 Jun 6 1993 tmac.osd
-rw-r--r-- 1 bin bin 58 Jun 6 1993 tmac.ptx

$ ls -d tmac
tmac

$ ls -ld tmac
drwxrwxrwx 2 root users 96 Jun 6 1993 tmac

In the above code, the initial command shows the contents of the directory tmac. The second
time the ls command was issued, only the -d option was used, which does not list the contents of
the directory tmac. Finally, the ls command was issued with both the -l and -d options. This
prints all the information on the tmac directory.

The last option is the -R option, which instructs ls to perform a recursive listing of the files in the
directory. The following code shows using ls -R in a small directory structure:

$ ls -lR uvA
total 12
drwxr-xr-x 2 chare users 48 Aug 24 17:53 micro_light
-rw-r--r-- 1 chare users 438 Aug 24 17:52 test1
-rwxr-xr-x 1 chare users 45 Aug 24 17:53 test3

uvA/micro_light:
total 1
-rw-r--r-- 1 chare users 29 Aug 24 17:53 sam

Three additional options, used less frequently than the six just described, can come in handy for
changing the order in which the list is displayed.

• -r Reverses the display.

• -t Shows the files in order of time modified, with the most recently modified files
appearing first.
• -u Shows the files in order of last access time, with the most recently accessed files
listed first.

These options can be used individually, or in conjunction with any of the other available options.

The File Command

The file command uses a file called /etc/magic, or on some systems /usr/lib/file/magic, to tell it
what to look for in a file to determine what type of file it is. This is called the file signature. How
well the file command works on your system is determined by the magic file and how many
entries your vendor included in it. The file command should be followed by a file or list of files
that you want to examine. The following example shows a sample of the file command:

$ file *
FileCabinet: directory
MW.INI: data
a.out: mc68k executable
a1: ascii text
b1: c program text
brk: commands text
city: English text
hello.c: c program text
list: empty

$

Creating a File

Files can be created by application programs through output redirection, or as temporary storage
by commands such as language compilers.

Using the touch Command to Create a File

This command creates an empty file and can be used to update the file access time. The syntax of
the command is as follows:

$ touch filename

The touch command creates a file in the current directory with the name specified on the
command line, as the sketch below illustrates.
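A minimal sketch, using a hypothetical file name:

$ touch new_file
$ ls -l new_file
-rw-r--r-- 1 chare users 0 Aug 24 18:01 new_file
$

touch is also used for updating the access times on existing files. The following code illustrates
the changes that occur on the access times using touch: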

$ touch /etc/passwd
touch: cannot change times on /etc/passwd
$ ls -l list
-rw-r--r-- 1 chare users 29 Aug 24 17:53 list
$ date
Wed Aug 24 18:01:14 EST 1994
$ touch list
$ ls -l list
-rw-r--r-- 1 chare users 29 Aug 24 18:01 list
$

Editor

You can create files on UNIX by using a text editor. The editors that are typically part of UNIX
distributions are ed, ex, vi, and emacs.

Output Redirection

Files can be created using output redirection. The output redirection enables you to change where
the output of a command is sent. You can send the output to a file using the appropriate symbol
for your shell.
Example:
$ cal > /tmp/output
$ cat /tmp/output
August 1994
S M Tu W Th F S
1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
$ ls -l /tmp/output
-rw-r- -r-- 1 chare users 29 Aug 24 17:53 /tmp/output
$

The output of the cal command is being redirected into the file /tmp/output, thereby creating a
new file. The redirection symbol is the > between the command and the file name.

Making a Copy

Creating a file also can be done by making a copy of an existing file. This process involves the
use of the command cp, which requires at least two arguments as illustrated in the following
examples.

$ cp old_file new_file

$ cp file1 file2 ... directory

The first line copies the old file to the new file.

$ ls -l
total 3
-rw-r--r-- 1 chare users 29 Aug 24 17:04 sam
-rw-r--r-- 1 chare users 27 Aug 24 17:53 list
-rw-r--r-- 1 chare users 23 Aug 24 17:03 sample
$ cp list new_file
$ ls -l
total 4
-rw-r--r-- 1 chare users 29 Aug 24 17:04 sam
-rw-r--r-- 1 chare users 27 Aug 24 17:53 new_file
-rw-r--r-- 1 chare users 27 Aug 24 17:53 list
-rw-r--r-- 1 chare users 23 Aug 24 17:53 sample
$
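The second form copies one or more files into an existing directory; a sketch using the files
above and an illustrative target directory:

$ cp sam sample /tmp
$ ls /tmp
sam
sample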

Reading a File

Aside from using an editor to look at a file, a number of commands enable you to view the
contents of a file without using an editor. The commands used for this purpose are cat, more, and
pg.

The Cat Command

The cat command is used to view a small text file or to send a text file to another program
through a pipe. The cat command has no facility to view the file in manageable chunks. The only
way to do this is to use the Ctrl+S key to suspend the output and Ctrl+Q to restart the output
flow. If you are connecting through a network, it is possible that the control commands will not
be processed quickly enough to avoid the "loss" of data off the screen.

The use of the cat command both to view a file and to send it to another program through a pipe
is demonstrated in the following code:

$ cat fruits
apple
orange
lemon
lime
banana
kiwi
cherry
$ cat fruits | sort
apple
banana
cherry
kiwi
lemon
lime
orange
$

In the first part, the cat command lists the contents of the file to the screen. In the second part of
the code, the cat command is used to give sort some input to sort.

To find out if the file has more information than you can handle on-screen, you can use one of
the paging commands described next. You can also use the command wc to find out how big the
file is.

The wc command is a word counter. It scans a given file and counts the number of lines, words,
and characters in file. The wc command uses the whitespace to tell when a word starts and ends.
This tells you whether you can use cat, or if you should use more or pg. The wc command uses
the following format:

$ wc file

In this format, wc reports all three counts: lines, words, and characters, as demonstrated in the
following code. The example also shows how using the -l option instructs the wc command to
count only the number of lines.

$ wc a.out
39 541 21959 a.out
$ wc -l fruits
7 fruits
$

If the output from wc indicates that the file is more than 20 or 22 lines, then it is probably a good
idea to use either more or pg to view the file.

The More Command

One alternative to the cat command is the more command, which has its roots in BSD UNIX.
Because of its popularity, many other UNIX vendors started including the more command in their
own distributions. The format of the more command is as follows:

$ more file

The more command displays the file one screen at a time, making it useful for viewing large files.
As each screen is displayed, more pauses the display and prints a prompt on the last line of the
screen.

Example:
$ more test3
total 220
drwxr-xr-x 2 chare users 48 Aug 24 1996 micro_light
-rw-r- -r-- 1 chare users 29 Aug 24 1996 sam
drwxr-xr-x 2 chare users 43 Aug 24 17:53 abd
-rw-r- -r-- 1 chare users 25 Aug 24 17:04 samply
drwxr-xr-x 2 chare users 58 Aug 24 17:59 egjkrhgo
drwxr-xr-x 2 chare users 68 Aug 24 07:53 ujheue
-rw-r- -r-- 1 chare users 79 Aug 24 07:04 iuyeu
-rw-r- -r-- 1 chare users 99 Aug 24 17:04 uke
--More---(15%)

The last line in the display shows the --More---(15%) prompt. By this, more tells you that it is
waiting for a command and that you have viewed 15 percent of the total file.

The more command has flexibility built in to it- from searching for text, to moving forward, to
starting vi at the line that you are viewing.

The pg Command

The pg command is like the more command, but has a System V background. It enables the
user to view a file one screen at a time. The command format for pg is the same as for more, as
shown in the following example:
$ pg file

The following list shows the help screen from pg command:

h help
q or Q quit
<blank> or \n next page
l next line
d or ^D display half of a page more
. or L redisplay current page
f skip the next page forward
n next file
p previous file
$ last page
w or z set window size and display next page
s savefile save current file in savefile
/pattern/ search forward for pattern
?pattern? or ^pattern^ search backward for pattern
!command execute command

Most commands can be preceded by a number, as in +1\n (next page); -1\n (previous page); 1\n
(page 1).

The previous list shows some similarities between pg and more, but there are some significant
differences. For example, to view the next screen in the file, press Enter, not the spacebar. To
view the next line in the file, use the l key, not the Enter key. The following code shows the use
of pg to view a file, and how to use the l command:

$ pg test3
total 220
drwxr-xr-x 2 chare users 32 May 16 1993 Clipboard
-rw-r--r-- 1 chare users 126 Jun 5 1993 Environment
drwxr-xr-x 6 chare users 272 May 3 07:47 Filecabinet
-rw-r----- 1 chare users 63 Jul 29 1993 MW.INI
drwx------ 2 chare users 32 Apr 30 06:37 Mail
drwxr-xr-x 2 chare users 32 May 16 1993 Wastebasket
-rwxr-xr-x 1 chare users 21959 May 8 07:01 a.out
:l
-rw-r--r-- 1 chare users 29 Aug 24 17:04 sam
:l
-rw-r--r-- 1 chare users 25 Aug 24 16:04 hgf
:l
-rw-r--r-- 1 chare users 28 Aug 24 19:04 ggy
:

With pg command, you can easily move backward line by line as well as forward line by line.

Removing a file

When a file is removed, the inode number in the file's directory entry is changed to zero. This
means that there is no way to connect the filename to the actual information. After this is done,
you cannot recover the information unless you have a backup copy of the file saved somewhere.

The rm command

Removing a file is done with the rm command, which has three options: -i, -f, and -r. The format
of the rm command is as follows:

$ rm file1 file2 file3 ...

Like most UNIX commands, you can specify any number of files on the rm command line. The
following code shows removing a file and verifying that it has been deleted.

$ ls -l
total 13
-rw-r--r-- 1 chare users 29 Aug 24 17:04 sam
-rw-r--r-- 1 chare users 24 Aug 24 16:04 output
drwxr-xr-x 2 chare users 53 Aug 24 15:53 abd
-rw-r--r-- 1 chare users 99 Aug 24 17:04 ijm
-rw-r--r-- 1 chare users 09 Aug 24 18:04 jhm
-rw-r--r-- 1 chare users 49 Aug 24 19:04 sajh
-rwsr-xr-x 1 chare users 79 Aug 24 15:04 samjh
$ rm output
$ ls -l
total 2
-rw-r--r-- 1 chare users 29 Aug 24 17:04 sam
drwxr-xr-x 2 chare users 53 Aug 24 15:53 abd
-rw-r--r-- 1 chare users 99 Aug 24 17:04 ijm
-rw-r--r-- 1 chare users 09 Aug 24 18:04 jhm
-rw-r--r-- 1 chare users 49 Aug 24 19:04 sajh
-rwsr-xr-x 1 chare users 79 Aug 24 15:04 samjh
$

Use the pwd command to ensure that you know where you are in the directory structure. This
helps prevent you from removing something that you don’t really want removed. Next, list the
files, and then type the rm command.

The following code demonstrates the best way to remove files using wild cards. Use of the -i
option puts rm into interactive mode. For each file on the command line, rm prompts you with
the name of the file. If you type y and press Enter, the file is removed.
Example:
$ ls -l
total 33
-rw-r--r-- 1 chare users 29 Aug 24 17:04 sam
-rw-r--r-- 1 chare users 24 Aug 24 16:04 output
drwxr-xr-x 2 chare users 53 Aug 24 15:53 abd
-rw-r--r-- 1 chare users 99 Aug 24 17:04 ijm
-rw-r--r-- 1 chare users 09 Aug 24 18:04 jhm
-rw-r--r-- 1 chare users 49 Aug 24 19:04 sajh
-rwsr-xr-x 1 chare users 79 Aug 24 15:04 samjh
$ rm -i *
rm: abd directory
ijm: ? y
jhm: ? y
output: ?
sajh: ?
sam: ? y
samjh: ? y
$ ls -l
total 3
-rw-r--r-- 1 chare users 24 Aug 24 16:04 output
drwxr-xr-x 2 chare users 53 Aug 24 15:53 abd
-rw-r--r-- 1 chare users 49 Aug 24 19:04 sajh
$

If you type n (or nothing) and press Enter, the file is not removed. Remember that when a file is
deleted, it cannot be recovered without the use of a backup.

The -f option forces the removal of a file, regardless of the permissions. The use of this option
requires that the user be the owner of the file, or root.

The -r option can perform recursive removals of a directory structure, which includes all files and
directories in that structure. The rm -r command is very dangerous. Do not use it with wild
cards unless you are prepared to live with the consequences.

The rm -r command accepts files or directories to be removed. If the argument is a directory,
then rm looks through the directory, removes everything under it, then removes the directory
itself. The following code shows an rm -r command in action:

$ ls -l
total 6
-rw-r--r-- 1 chare users 29 Aug 24 17:04 sam
-rw-r--r-- 1 chare users 24 Aug 24 16:04 output
drwxr-xr-x 2 chare users 53 Aug 24 15:53 abd
-rw-r--r-- 1 chare users 99 Aug 24 17:04 ijm
-rw-r--r-- 1 chare users 09 Aug 24 18:04 jhm
$ cd new_dir
$ ls -l
total 1
drwxr-xr-x 2 chare users 112 Aug 24 19:03 test1
$ rm -ri test1
directory test1: ? y
test1/a.out: ? y
test1/a1: ? y
test1/a2: ? y
test1/a3: ? y
test1/a4: ? y
test1: ? y
$

Simply using the rm -r command provides no output or error messages unless you do not have
the permission to remove the directory tree. In that case, rm reports an error message indicating
that you don't have the needed permission.

Creating the Directory

To make a directory, use the command mkdir, which accepts multiple arguments, each one being
the name of the directory you want to create. Unless indicated by providing a full pathname, the
directories are created in your current directory.

The mkdir command

The syntax for the mkdir is as follows:

$ mkdir directory_name

The same rules for file names apply to directory names. You can insert a space into a directory
name, but it makes the name difficult to use. Avoid spaces in directory names;

use an underscore instead. The directory name also cannot already exist as either a directory or
a file; both cases result in an error message. The following shows some sample directories being
created:

$ ls -l
total 5
-rw-r--r-- 1 chare users 29 Aug 24 17:04 sam
-rw-r--r-- 1 chare users 24 Aug 24 16:04 output
drwxr-xr-x 2 chare users 53 Aug 24 15:53 abd
-rw-r--r-- 1 chare users 99 Aug 24 17:04 ijm
$ mkdir new_dir
$ ls -ld new_dir
drwxr-xr-x 2 chare users 199 Aug 24 17:04 new_dir
$ mkdir sam
mkdir: file exists
$ mkdir /etc/chare
mkdir: permission denied
$ mkdir "space dir"
$ ls -ld space dir
space not found
dir not found

In this code, you see a directory listing of the files in the current directory. The user makes a
directory new_dir, and then tries to make another directory sam. Of course this fails, and mkdir
reports that it failed because a file called sam already exists.
The user then tries to create a directory in /etc. This typically is not permitted because the /etc
directory, as you may recall, is used by the system as a place to store system administration
commands and system configuration files. Next is a directory name with a space in it. Using a
space in the directory name can lead to all kinds of confusion, which is why you should avoid
using spaces in both file and directory names.

Removing a Directory

Removing a directory is accomplished by using the rmdir or rm –r commands.

The rmdir command

The rmdir command also can accept multiple directory names like the mkdir command, but it has
one requirement: the directory must be empty, containing no hidden files, files, or
subdirectories at all. If your directory has subdirectories that themselves have subdirectories,
removing them can be a tedious process. The following code illustrates the removal of a directory
using rmdir:

$ ls -l
total 7
-rw-r--r-- 1 chare users 29 Aug 24 17:04 sam
-rw-r--r-- 1 chare users 24 Aug 24 16:04 output
drwxr-xr-x 2 chare users 53 Aug 24 15:53 abd
-rw-r--r-- 1 chare users 99 Aug 24 17:04 ijm
-rw-r--r-- 1 chare users 09 Aug 24 18:04 jhm
$ rmdir /etc
rmdir: /etc: permission denied
$ rmdir space dir
rmdir: space nonexistent
rmdir: dir nonexistent
$ rmdir "space dir"
$ mkdir new_dir/test1
$ rmdir new_dir
rmdir: new_dir not empty
$

In this example, the user tried to remove the /etc directory; for the same reason that the user
couldn’t create a directory there earlier, the user can’t remove /etc. The user then tried to remove
the directory with a space in its name, but forgot about the quotation marks. This looked
like two arguments to rmdir, which then complained that it couldn’t find a directory named space
or dir. With the quotation marks in place, the directory was removed. However, the
directory new_dir couldn’t be removed because the directory isn’t empty; for that there is the
rm –r command explained earlier.

The mv Command

The mv command does two things: it moves a file from one place to another, and it renames a file.
The cp command is used to copy files from one directory to another; the disadvantage of this is
that your file then takes up twice as much disk space. You can move the file from one place to
another instead, which saves disk space.

You can accomplish the same thing as the mv command does by copying the file and removing
the original. The syntax to move a file from one directory to another is as shown in the
following:

$ mv file file file ... directory

This enables the user to enter at least one file name followed by the directory to move the files to.
If more than one file is specified, and the last name is not a directory, the move fails and mv
reports an error. The following code shows the use of the mv command:

$ ls –l
total 6
-rw-rw-rw- 1 chare users 656 Aug 24 18:54 ax1
drwxr-xr-x 2 chare users 32 Aug 24 18:52 micro_light
-rw-rw-rw- 1 chare users 605 Aug 24 18:40 sample
-rw-rw-rw- 1 chare users 60 Aug 24 18:41 test
$ mv s* t* /usr/tmp
$ ls s* t*
s* not found
t* not found
$ ls /usr/tmp
sample test
$

Notice the wild cards used in this example. In the mv command, separate substitutions are made
for the s* files and the t* files. The same patterns are given to the next ls command, which
reports that no matching files remain in the current directory; listing /usr/tmp confirms that
sample and test were moved there.

The second format of the mv command enables you to change the name of a file. You can choose
to change the name and leave the file in the current directory, or you can change the name and
move it to another directory. The syntax for these cases is as follows:

$ mv old_name new_name

$ mv old_name /new_dir/new_name

The first example changes the file name and keeps it in the current directory. In the second
example, the file name is changed and the file is put into another directory.
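A short session putting both forms to work; the file names ax1, junk, and starter.doc are assumed
here purely for illustration:

$ mv ax1 junk
$ mv starter.doc /tmp/junk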

The first example with mv renames the file ax1 to junk. The second example shows the file
starter.doc being moved to the /tmp directory and renamed junk at the same time. If you
were to list the files in the current directory, you would see that ax1 is missing, but a file named
junk is there instead. When you look at the /tmp directory, you find a file named junk. If a
problem arises, such as the target directory not existing, mv informs you with an appropriate
error message.

The ln command

If you need to have the same file known by different names, this is possible without having to
make a copy. A hard link enables you to give a file another name. This is done by creating an
entry in a directory; no additional disk space is consumed.

Creating a hard link is done with the command ln. The syntax of the command is as follows:

$ ln old_name new_name

The new name can be either an absolute or relative pathname. The following code illustrates the
use of the ln command to create a hard link:

$ ls -l old_one
-rw-r--r-- 1 chare users 202 Aug 24 19:16 old_one
$ ln old_one new
$ ls -l new old_one
-rw-r--r-- 2 chare users 202 Aug 24 19:16 new
-rw-r--r-- 2 chare users 202 Aug 24 19:16 old_one
$

This code shows an ls –l listing of a file called old_one in the directory. Notice the link count,
which is the number immediately following the permissions. When the example began, the link
count was one. After the ln command was used to create another name for this file, the next ls
shows a link count of 2, meaning that two directory entries in the system point to this file. The
second kind of link, often used in Network File System (NFS) environments, is the symbolic link.
Symbolic or soft links are most often used to attach a different name to a directory, but they
can also be used for files.

If dir1 and dir2 are directories, the following creates a soft link:

$ ln -s dir1 dir2

This code shows an example of creating symbolic links using the ln command. The syntax for ln
–s is the same as for ln. In this example, a symbolic link to a directory is created. For dir2,
which is a symbolic link to dir1, the ls –l output reports a line like the following:

dir2 -> dir1
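A minimal session illustrating this; assume dir1 already exists, and note that the listing details
shown here are illustrative:

$ ln -s dir1 dir2
$ ls -l
drwxr-xr-x 2 chare users 32 Aug 24 19:20 dir1
lrwxrwxrwx 1 chare users 4 Aug 24 19:21 dir2 -> dir1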

The copy command

The copy command is used for copying a directory and its contents. It has a number of options:
–r, –o, –m, and –v. These instruct copy to do a recursive copy, retain the owner, retain the last
modification date, and be verbose about what it is doing.

$ copy -romv ../andrewg .

examine directory ../andrewg/mail400
examine directory ../andrewg/mail400/mtaexe
copy file ../andrewg/mail400/mtaexe/mta-admin
copy file ../andrewg/mail400/mtaexe/mta
examine directory ../andrewg/mail400/uaexe
copy file ../andrewg/mail400/uaexe/mail400
examine directory ../andrewg/gopher
copy file ../andrewg/gopher/wsg-10.exe
$

The ../andrewg notation tells the shell to go up one level and find the directory andrewg; copy
then copies it into the current directory. The remainder of the example shows the output of the
copy command. When using the –v option, copy informs you when it copies a file, or when it
looks at a directory to see what there is to copy.

There are two other ways to copy a directory structure. One is to use the command cp –R on
systems that support it, as shown below:

$ cp –R source destination

Here, the source directory structure is copied to a new directory called destination.
The second method uses the command tar. The syntax of the command follows:

$ cd source; tar cf - . | (cd dest; tar xfBp -)

This uses the tar command to create an archive of the directory and send it through a pipe to a
subshell which, in turn, goes to the destination directory and uses tar to extract the archive. The
result of this command is a copy of the original structure.
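For instance, to duplicate a project directory into an existing backup directory, the same pipeline
can be used; both path names here are assumed purely for illustration:

$ cd /home/chare/project; tar cf - . | (cd /home/chare/backup; tar xfBp -)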

4.2 Controlling Permissions to Files

Defining Permissions

File and directory permissions form the combination lock around your data. However, even if
you completely secure your information so that no other user can access it, on most systems the
system administrator can still open and read your files. Even though you can prevent the
majority of people from accessing your files, you cannot prevent all of them.

The Password File

The basic information about who you are is stored in a file called the password file, which is
typically found at /etc/passwd. A password file entry consists of seven fields:

Username chare
Password A/49wrhyu
UID 1003
GID 104
Comment Chris Hare
Home Directory /home/chare
Shell /usr/bin/ksh
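Assembled into its actual colon-delimited form, this entry appears in /etc/passwd as a single line:

chare:A/49wrhyu:1003:104:Chris Hare:/home/chare:/usr/bin/ksh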

These entries are inserted when your account is created by the system administrator. The
password file is colon delimited, which means that a colon separates each field in the file. The
first entry is your login, or username. Some examples of usernames are as follows:

chare
terri
jimh
rfh

All of these are valid user names. The second field in your password entry is the encrypted
password. When you log in, the password that you type is encrypted and compared with this field
to see if there is a match.

The third field is your actual user number, or UID. This number uniquely identifies you to the
system. The fourth field is your group number, or GID. This is your login group, and it identifies
to which group of users you belong.

The fifth field is a comment field, which typically holds your full name. The sixth field in the
password file is your home directory; when you log in to UNIX, it places you in this location in
the directory structure. The seventh field is your login shell. There are a number of shells
available; the value of this field determines whether you will be using the Bourne shell, the Korn
shell, or the C shell.

The Group File

The group file contains information about groups of users. The group file has a colon separating
each field in the file, and there are four fields.

Groupname tech
Password *
GID 104
user list chare, andrewg, patc

The first field is the name of the group. The second field is for a group password; it usually
contains a placeholder such as * or x. The third field is the actual group number, which identifies
the group. The fourth field is a list of comma-separated user names.

User ID

Each user is assigned a unique user id or UID that identifies him to the system. Every process
you run, every file you create, is stamped with your UID. This UID is associated only with your
user name.

Group ID

A group is a collection of users who are grouped together so that they can access a common
set of files or directories. This means that users who do not own a file, but are members of the
group that owns it, can still be allowed access to the file.

The id Command

The command id is used to give the information about the user. As illustrated in the following
code, id tells you your user name, UID, and GID. If your effective UID or GID is different, these
also are listed in the output of id.

$ id
uid=1003(chare) gid=104(tech)
$

Understanding the Permission Bits

Nine permission bits are associated with each file and directory: three for the owner of the file,
three for the members of the group that owns the file, and three for everyone else. If you
are not the owner, but you belong to the group that owns the file, then the group
permissions control your access.

File Permissions

For each group of users, there is a set of permission bits. These bits correspond to being able to
read, write, and execute the file, so the permission field is divided into three components. Read
permission gives the user the capability to open the file and view its contents, with commands
like cat, more, and vi. Write permission gives a user the capability to open the file and modify its
contents. Execute permission gives the user the capability to execute the file as a command.

The following example illustrates permissions in action to control access to files:

$ ls -l output
--w------- 1 chare users 236 Aug 24 20:13 output
$ cat output
cat: cannot open output
$ ls -l output2
-r--r--r-- 1 chare users 236 Aug 24 20:13 output2
$ echo "new data" > output2
ksh: output2: cannot create
$ output2
ksh: output2: cannot execute
$

The file output in the first ls example has no read permission for anyone, so cat cannot open it.
The user then tried to overwrite the contents of the file output2 using shell redirection; because
that file has read permission but not write permission, the action is denied, as is the user’s
request to execute the file.

Directory Permissions

When a user has read permission on a directory, the user can list the contents of a directory. With
write permission, the user can create new files, or delete existing files from the directory. The
execute bit on a directory does not mean that you can execute the directory but that you can use
cd to go into the directory, or use a file in the directory. The following code illustrates the
permissions on a directory in action:

$ ls -l a
a/list: Permission denied
$ ls -l
total 4
dr--r--r-- 2 chare users 48 Aug 24 21:09 a
drwxr-xr-x 2 chare users 32 Aug 24 21:08 a2
drwxr-xr-x 2 chare users 32 Aug 24 21:08 micro_light
-rw-r--r-- 1 chare users 38 Aug 24 21:09 output
$ cd a
ksh: a: bad directory
$ touch a/new
touch: a/new cannot create
$ cp output a2
$

On most UNIX systems, the read bit enables the user to list the contents of the directory, but on
DEC Ultrix systems, it takes more than the read bit to list the files. The execute permission bit
restricts access to the directory by controlling whether you can use the cd command to go into it.
On SCO systems, if you have the execute bit set but not the read bit, you can cd into the directory
and use a file if you know its name. The write bit enables users to create or remove files in the
directory.

Interactions

Interaction between the directory and the file permissions can create problems. When a user
wants to create a file, the permissions on the directory are checked; if the user has write
permission on the directory, then the file can be created. The following code illustrates a file that
has write permission for all users, sitting in a directory with no write permission.

$ ls -l
total 4
dr--r--r-- 2 chare users 48 Aug 24 21:09 a
drwxr-xr-x 2 chare users 32 Aug 24 21:08 a2
drwxr-xr-x 2 chare users 32 Aug 24 21:08 micro_light
-rw-r--r-- 1 chare users 38 Aug 24 21:09 output
$ ls -ld a2
dr-xr-xr-x 2 chare users 48 Aug 24 21:12 a2
$ ls a2
output
$ rm a2/output
rm: a2/output not removed. Permission denied
$ ls -l a2
total 1
-rw-r--r-- 1 chare users 38 Aug 24 21:12 output
$ date > a2/output
$ ls -l a2
total 1
-rw-r--r-- 1 chare users 29 Aug 24 21:14 output
$

Here, the user cannot remove the file using the rm command because the directory is not
writable, but the file can be essentially removed anyway: its contents can be replaced through
redirection because of the write permission on the file itself.

The chmod Command

Changing the permissions on a file or directory is done with the chmod command. The syntax of
the command is as follows:

$ chmod mode file(s)

The mode is the permissions that you want to assign. You can write the mode in two ways: one is
called symbolic and the other absolute. The symbolic format uses letters to represent the different
permissions, and the absolute format uses a numeric notation with octal digits representing the
different permission levels.

Bear in mind that only the owner of the file can change the permissions associated with it.
Remember, though, that the super-user, root, can alter the permissions as well.

Symbolic

The symbolic mode uses letters to represent the different permissions that can be assigned, as
outlined in the given table

Symbol Meaning

r read

w write

x execute or search

There are different groups of users to which you want to grant permissions. These are the owner,
members of the same group, and all other users.

Symbol Meaning

u Owner or user of the file

g members of the same group

o all other users

a all users

Three operators are used to indicate what is to be done with the permission and the user group.

Symbol Meaning

+ Add the permission

- Remove the permission

= Set the permissions equal to this

To define a mode using the symbolic format, you need to decide which users you affect. After you
select which users are to be affected, you need to decide whether you are adding or removing the
permission, and then which permission you are working with. Several examples are shown in
the following code:

$ ls -l
total 8
dr--r--r-- 2 chare users 48 Aug 24 21:09 a
dr-xr-xr-x 2 chare users 48 Aug 24 21:12 a2
-rw-r--r-- 1 chare users 25 Aug 24 21:28 alpha.code
drwxr-xr-x 2 chare users 32 Aug 24 21:08 micro_light
-rw-r--r-- 1 chare users 38 Aug 24 21:08 output
-rw-r--r-- 1 chare users 29 Aug 24 21:28 test2
-rw-r--r-- 1 chare users 12 Aug 24 21:28 test_1
drwxr-xr-x 2 chare users 32 Aug 24 22:19 uVAro
$ chmod -r test2
$ ls -l test2
--w------- 1 chare users 29 Aug 24 21:28 test2
$ chmod g+rwx test2
$ ls -l test2
--w-rwx--- 1 chare users 29 Aug 24 21:28 test2
$ chmod =r test2
$ ls -l test2
-r--r--r-- 1 chare users 29 Aug 24 21:28 test2
$ chmod u+rwx,g+r,o+r test2
$ ls -l test2
-rwxr--r-- 1 chare users 29 Aug 24 21:28 test2
$ ls -l test_1
---x--x--x 1 chare users 12 Aug 24 21:28 test_1
$

The first example demonstrates the removal of read permission from the file test2, which results
in the permissions being write-only for the owner. The second example illustrates the addition of
read, write, and execute permissions for the group owners; with the group option, the permission
change doesn’t affect any other users. The third example shows how to use the = operator, which
instructs chmod to set the permissions on the file to exactly those listed.

The fourth example illustrates how to make multiple changes at once; you could instead execute
chmod three separate times to make the desired changes. If chmod for any reason cannot access
the file or make the requested change, an error message is printed to indicate the problem.

Absolute

The absolute method requires that the permissions for all users be specified, even for those that
are not changing. This method uses a series of octal digits to represent each of the permissions;
the octal values are added together to give the actual permission.

Octal representation:

-rw-r--r--

owner: rw- = 4+2 = 6
group: r-- = 4
other: r-- = 4

Notice that the read permission has an octal value of 4, write has a value of 2, and execute a
value of 1. To calculate the permissions, add the octal values for each group of users. You can
run the chmod command using the octal value of 644 for the permissions instead of the symbolic
values. The following code shows examples using the absolute method of chmod.

$ ls -l
total 5
drwxrwxrwx 2 chare users 48 Aug 24 21:12 a2
-rw-rw-rw- 1 chare users 25 Aug 24 21:28 alpha.code
-rw-rw-rw- 1 chare users 29 Aug 24 21:28 test2
-rw-rw-rw- 1 chare users 12 Aug 24 21:28 test_1
drwxrwxrwx 2 chare users 32 Aug 24 22:19 uvA
$ chmod 200 test2
$ ls -l test2
--w------- 1 chare users 29 Aug 24 21:28 test2
$ chmod 270 test2
$ ls -l test2
--w-rwx--- 1 chare users 29 Aug 24 21:28 test2
$ chmod 444 test2
$ ls -l test2
-r--r--r-- 1 chare users 29 Aug 24 21:28 test2
$ chmod 744 test2
$ ls -l test2
-rwxr--r-- 1 chare users 29 Aug 24 21:28 test2
$ chmod 111 test_1
$ ls -l test_1
---x--x--x 1 chare users 12 Aug 24 21:28 test_1
$

The symbolic and absolute commands are equivalent, so let’s compare them.

chmod -r test2               chmod 200 test2

chmod g+rwx test2            chmod 270 test2

chmod =r test2               chmod 444 test2

chmod u+rwx,g+r,o+r test2    chmod 744 test2

chmod -r,-w,a+x test_1       chmod 111 test_1

In the first example, in which the permission mode is 200, you are assigning write only to the
owner. In the second example, you assign write only to the owner, and read, write, and execute
for the members of the same group, with no permissions for other users.

Default File Permissions- umask

The default permissions for a file or directory are established when the file or directory is created.
The default permissions are controlled by a value called the umask, which is displayed when the
command is run without arguments:

$ umask

022

The umask command enables you to change the default permissions applied when you create a file
or directory. The umask is applied to the default permissions for a file and a directory. For example,
the default permissions for a file are 666, or read and write access for everyone on the system.
This is not optimal, so the umask, which here is 022, is applied.

666
022
---
644

The operation here is not subtraction, although it appears that way; the umask bits are masked
out of the default permissions (a bitwise AND with the complement of the mask). This example
results in read and write for the owner, with read-only for everyone else. The following example
applies a umask value of 011.

666
011
---
666

In this case, a umask value of 011 has no effect because the execute bits are not turned on in the
default permissions. Suppose you want read and write for the owner, with no access rights for
anyone else; in this case the umask value will be 066.

666
066
---
600

When the umask value is used, it removes the read and write bits for the group and other users,
which leaves the read and write bits for the owner intact.

But the umask applies to directories as well, so if you are going to customize the value, you must
consider the impact on the directories. The default permissions for a directory are 755, which
gives the owner read, write and search, with read and search for all other users. Like the file, the
actual default is 777, so any user can do anything.

777
022
---
755

The preceding example shows that the umask is working correctly. How about in the next case:

777
011
---
766

This example means that the group and other users can list files and create or delete them, but
they cannot cd to this directory.

777
066
---
711

A umask value of 066 demonstrates how to allow people access to a directory while preventing
them from creating, listing, or removing files.

In the situation where you want to prevent access by anyone except yourself, you need to remove
the read, write, and execute/search bits for all users except the owner. This is accomplished using
a value of 077, which changes the default permission on the directory to 700.

777
077
---
700

In the above example, directories are protected by preventing access for any user but the owner.
The umask value of 077 is used to protect your files.

The following example illustrates how to change the umask, and shows that files and directories
created afterwards have the new permissions.

$ ls -l
total 5
drwxrwxrwx 2 chare users 48 Aug 24 21:12 a2
-rw-rw-rw- 1 chare users 25 Aug 24 21:28 alpha.code
-rw-rw-rw- 1 chare users 29 Aug 24 21:28 test2
-rw-rw-rw- 1 chare users 12 Aug 24 21:28 test_1
drwxrwxrwx 2 chare users 32 Aug 24 22:19 uvA
$ umask
022
$ umask 077
$ umask
077
$ mkdir new_dir
$ ls -ld new_dir
drwx------ 2 chare users 32 Aug 25 01:20 new_dir
$ touch new_file
$ ls -l new_file
-rw------- 1 chare users 0 Aug 25 01:21 new_file
$

Advanced Permissions

Several advanced permissions are available in UNIX: Set User ID, or SUID, and Set Group ID,
or SGID. Two types of identification numbers are carried for you by the system: your real UID
and your effective UID. The real UID matches the user name you logged in with; your effective
UID is, in effect, your alias.

For example, the file /etc/passwd, which you know contains some information about your account,
is protected by a set of file permissions. The following string shows those permissions:

-rw-r--r-- 1 root 1364 Apr 14 10:45 /etc/passwd

Now, if the /etc/passwd file is not writable by anyone but root, then how can you change your
password? Look at the passwd command, which is usually found in /bin, but may be located
elsewhere, as in the following example:

-rwsr-xr-x 3 root 303104 Mar 19 1991 /usr/bin/passwd

Notice that the permissions bits are different on this program. An ‘s’ is where an ‘x’ would be in
the owner’s permissions. This is an SUID program. If you could run the id command while you
were running the passwd command, you would see that your effective id is root. So, the SUID bit
means that while you are running the program, you look like the owner of the program.

The second permission is SGID. An example of an SGID command is as follows:

-rwsr-sr-x 1 root kmem 180224 Apr 5 1991 /usr/bin/mail

This command, /usr/bin/mail, is an SGID program. This means that when the user runs
/usr/bin/mail, he seems to be root and belong to a group called kmem.

The Sticky Bit

The third of the advanced permissions is called the sticky bit. Historically, this was used as a
memory management tool: when set on a program file, it instructed the kernel to keep a copy of
the program in memory, even if no one was using the program at the time.

On directories, the sticky bit has a different meaning than on files. The following example
illustrates how the sticky bit is shown in ls:

$ ls -ld /tmp

drwxrwxrwt 1 root root 512 Aug 15 14:03 /tmp

The sticky bit is shown by the letter t in the x position of the ‘other’ permission bits. On a
directory such as /tmp, this means that even though the directory has read and write permission
for all users, you cannot remove a file in it if you don’t own it.
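You can set the sticky bit yourself with chmod. A minimal sketch, assuming a new directory
named shared (the listing details shown are illustrative):

$ mkdir shared
$ chmod 1777 shared
$ ls -ld shared
drwxrwxrwt 2 chare users 32 Aug 25 10:00 shared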

The chown Command

Every user on the system is assigned a unique UID. Suppose you no longer need a particular file
and want to give it to someone else; you can do this only if you are the owner of the file.

To accomplish the owner change, use the command chown, which has the following command
syntax:

$ chown user file(s)

The user name used in the chown command can be either a numerical UID, or the textual login
name. The following code exemplifies chown in action:

$ touch new
$ /etc/chown patc new
chown: can’t change ownership of new: Not owner
$

As the above example shows, some versions of chown are restricted: if the user is not root, the
super-user, the command cannot be executed successfully.

$ chown andreewg new


chown: unknown user id andreewg
$ chown andrewg new
$ ls - l new
-rw-r--r-- 1 andrewg group 25 Aug 24 21:28 new
$

This example shows a sample chown from a version of System V UNIX; in this case, the use of
chown is not restricted.

The chgrp Command

Like chown, a command exists to enable users to change the group to which a file belongs.
Consider the following example: if you want to allow a group of users to access a file, you may
change the file’s group to be the same as theirs. As with the chown command, only the owner of
the file may change the group.

$ chgrp gopher new
chgrp: you are not a member of the gopher group
$ chgrp tech new
$ ls -l new
-rw-r--r-- 1 chare tech 25 Aug 24 21:28 new
$

The chgrp command gives a user the opportunity to change the group that owns a file. As
illustrated in the previous code, the capability to change the group on a file requires that you also
be a member of the target group.

4.3 grep and find

Understanding the difference between grep and find

grep is equivalent to the FIND command under DOS; both commands look for text in a
file. grep is one of three commands in the grep family, namely grep, egrep, and fgrep.

The find command, by contrast, looks for file names in the directory structure based upon a wide
range of criteria such as file name, file size, permissions, and owner.

Using Regular Expressions

A regular expression is a method of describing a string using a series of metacharacters, as in
the shell. The metacharacters are assigned a special meaning when used in the regular expression
context, and some of the metacharacters used in regular expressions overlap with those of the
shell.

Wild Cards and Metacharacters

The table given below shows the wild cards and metacharacters.

Character Description

c Any non-special character; c matches itself
\c Turns off any special meaning for character c
^ Anchors the expression to the beginning of a line
^c Any line in which c is the first character
$ Anchors the expression to the end of a line
c$ Any line in which c is the last character
. Any single character
[…] Any one of the characters in …; ranges like a-z are legal
[^…] Any single character not in …; ranges are legal
\n What the nth \(…\) expression matched
r* Zero or more occurrences of r
r+ One or more occurrences of r
r? Zero or one occurrence of r
r1r2 r1 followed by r2
r1|r2 r1 or r2
\(r\) Tagged regular expression r
(r) Regular expression r

The ^ operator anchors a pattern to the beginning of a line. For example, the pattern:

^the

matches occurrences in which the word ‘the’ is at the beginning of a line. In fact, the t must be
the first character on the line.

The $ operator anchors patterns to the end of a line, as illustrated in the following example:

the$

Although a special character that represents a newline does exist, no metacharacter is used to
match only a newline, and the $ operator does not count the newline in its search. In the
preceding example, the word the is matched only when the letter e is in fact the last character on
the line.

The single character wild card is a period (.). Consider the following example:

th.

This example matches any string that has the letters t and h followed by any single character.
Any number of periods can be put together to match a longer string.

^the..

..th$

In these two examples, you are still looking for occurrences of th and some letters; the first
expects to find two characters after the, and the second expects two characters before th.

The next meta character is a character class. The character class in regular expressions is the
same as in the shell. Any single character in the group indicated in the class is matched. The table
given below shows some sample classes that can be used in either the […] or [^…].

Example Description

[abc] Matches one of a or b or c

[a-z] Matches one of any lowercase letter between a and z

[A-Z] Matches one of any uppercase letter between A and Z

[0-9] Matches any one number between 0 and 9

[^0-9] Matches any character other than between 0 and 9

[-0-9] Matches any character between 0 and 9, or a “-“

[0-9-] Matches any character between 0 and 9 or a “-“

[^-0-9] Matches any character other than between 0 and 9 and “-“

[a-zA-Z0-9] Matches any alphabetic or numeric character


The [^…] operator indicates that you do not want to match certain letters, as in the following
example:

th[^ae]n

This example does not match words like then and than, but does allow a match for thin.
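For instance, assuming a file named words that contains the lines then, than, and thin, one per
line:

$ grep 'th[^ae]n' words
thin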

The next operator in regular expressions is the closure operator, or the *. This applies to the
preceding single-character pattern and matches zero or more successive occurrences of it.
Consider the following pattern:

char*

Because the * applies only to the r, this matches cha, char, charr, charrr, and so on. Consider
another example:

[a-z]* matches zero or more lowercase letters. The class is applied afresh at each position, so
each character matched may be a different letter; it does not have to repeat the first letter
matched.

grep Command

grep stands for Global Regular Expression Print. The command itself is a filter: it is designed to
accept input and filter it to reduce the output. grep compares the pattern with each line of its
input; if the pattern matches, then the line containing the match is printed, otherwise no output
is generated. The following code illustrates using grep to extract information from the password
file:

$ grep chrish /etc/passwd
$ grep chare /etc/passwd
chare:A/49w7Ab:1003:104:Chris Hare:/home/chare:/usr/bin/ksh

This illustrates the format of the grep command, which is as follows:

grep pattern file(s)

In the first line in the preceding example, the pattern is chrish, and the file is /etc/passwd. grep
reads the contents of the file looking for chrish; because grep did not print anything, you know
that the file does not contain chrish. In the second line, the pattern is chare, and in this case grep
prints one matching line from /etc/passwd.

Using grep

The table lists the options available for grep:

Options Description

-b Prints the line with the block number

-c Prints count of matching lines only

-e pattern Used when the pattern starts with a -

-f filename Takes the pattern from the file

-h Does not print the file name

-i Ignores case of letters in the comparison

-l Prints only file names with matching lines

-n Prints line numbers

-s Suppresses file error messages

-w Searches for the expression as a word

-y Ignores case of letters in the comparison

-v Prints non matching lines

-x Prints only lines matched in their entirety

Counting occurrences (-c)

How many directors are there in emp1.lst and emp2.lst? The –c option counts the matching
lines, and the following example reveals that there are two of them in each file:

$ grep -c 'director' emp?.lst


emp1.lst:2
emp2.lst:2

Displaying line numbers (-n)

The –n option can be used to display the line numbers containing the pattern, along with the lines:

$ grep -n 'marketing' emp.lst
3:5347|sumit |d.g.m |marketing |19/04/43|8000
11:3254|dfdfg|director |marketing |12/04/35|8900

The line numbers are shown at the beginning of each line, separated from the actual line by a
colon.

Deleting Lines (-v)

The –v option selects all but the lines containing the pattern. Thus you can create a file otherlist
containing all but directors:

$ grep -v 'director' emp.lst > otherlist


$ wc -l otherlist
11 otherlist

Displaying Filenames (-l)

The –l option displays only the names of files where a pattern has been found:

$ grep -l 'manager' *.lst


design.lst
emp.lst

Ignoring Case (-i)

When you look for a name, but are not sure of the case, grep offers the –i option, which ignores
case in pattern matching:

$ grep -i 'agarwal' emp.lst
3564|sudhir Agarwal |executive|personnel |06/07/47|5400

This locates the name “Agarwal”, but it can’t match the names “agrawal” and “aggarwal” that are
spelled in a similar manner while possessing some minor differences. With the –e option (SCO
UNIX), you can match all three spellings by giving each pattern to grep.

Printing the Neighbourhood

GNU grep in Linux has an option that enables locating not only the line matching the pattern,
but also a certain number of lines above and below it. For instance, the command

grep -5 “do loop” update.sql

locates the string “do loop” and displays five lines on either side of it.

Printing a specific Number of Lines (-N)

The –N option lets you know quickly whether a pattern occurs in a file or not. The following
command displays the first occurrence of “Bill Gates” and exits to the shell:

grep -N 1 "Bill Gates" nt_unix.txt

The argument with –N controls the number of occurrences; so, -N 2 would list two occurrences.

egrep: Extending grep

The egrep command offers all the options of grep. egrep’s set includes some additional
characters not used by either grep or sed.

Expression Significance

ch+ Matches one or more occurrences of character ch

ch? Matches zero or one occurrence of character ch

exp1|exp2 Matches expression exp1 or exp2

(x1|x2)x3 Matches expression x1x3 or x2x3

Searching for Multiple Patterns

How do you now locate both “sengupta” and “dasgupta” from the file, a thing that grep can only
do using multiple -e options? Delimit the two expressions with the |, and the job is done:

$ egrep 'sengupta|dasgupta' emp.lst
2365|barun sengupta |director |personnel |11/05/47|7800
1265|s.n. dasgupta |manager |sales |12/09/63|5600

egrep thus handles the problem easily, but offers an even better alternative. You can group
patterns using a pair of parentheses, as well as the pipe:

$ egrep '(sen|das)gupta' emp.lst
2365|barun sengupta |director |personnel |11/05/47|7800
1265|s.n. dasgupta |manager |sales |12/09/63|5600

The –f option: storing patterns in a file

If there are many patterns, egrep offers the –f option to take them from a file. Let’s fill up
a file with some patterns:

$ cat pat.lst
admin|accounts|sales

This file must contain the patterns, suitably delimited in the same way as they are specified in the
command line. When you execute egrep with the –f option in this way:

egrep -f pat.lst emp.lst

The command takes the expressions from pat.lst.

egrep enhances the power of grep by accepting both alternative patterns, as well as patterns from
the file.

fgrep: Multiple string Searching

fgrep, like egrep, accepts multiple patterns, both from the command line and from a file; but
unlike grep and egrep, it doesn’t accept regular expressions. So, if the pattern to search for is a
simple string, or a group of them, fgrep is recommended.

Alternative patterns in fgrep are specified by separating one pattern from another by the newline
character. You may specify these patterns in the command line itself, or store them in a file in
this way:

$ cat pat1.lst
sales
personnel
admin

Now you can use fgrep:

fgrep -f pat1.lst emp.lst

find Command

find is not a filter. It cannot be in the middle or at the end of a pipe. It can provide information at
the top of the pipe. find is used to search the UNIX file system looking for files given a certain
criteria. The command syntax for find is as follows:

find path predicate-list

The path is where find starts the search; it must be specified. The predicate list consists of
the search criteria and the actions you want performed on the located files. The examples in the
following code illustrate find in action. The –print option prints the names of the files that are
found. When find exits, it sets a return code that tells the shell whether it found any files.

$ find /usr -atime 10 -print


/usr/bin/id
/usr/bin/egrep
/usr/lib/ua/uasig
/usr/include/grp.h
/usr/include/pwd.h
/usr/include/stdio.h
/usr/include/sys/types.h
/usr/include/time.h

Apart from the –print option, the code also illustrates how to use the –atime option to find files
by access time; three separate times are stored in the inode for each file. The find command in
the previous code looks for files that were last accessed ten days ago. The following code
illustrates two more options. The –name option requires an argument. This argument can use the
file substitution wild cards, but if it does, it must be enclosed in quotes. This example looks for
files that end in .ps. Notice that it starts from the current directory. After the files are found, the
names are printed, and then the files are written to the device specified with the –cpio option, in
this case /dev/rmt0h. When all the file names have been printed, the number of blocks written to
the device is printed.

$ find . -name "*.ps" -print -cpio /dev/rmt0h
./a.out.ps
./a1.ps
./a2.ps
./a3.ps
./a4.ps
.
50 blocks
$

find also has two Boolean-style operators, which provide the capability for find to match files
based on more than one criterion. For example, the following code illustrates that files named
core or ending in .bak are to be removed:

$ find / \( -name core -o -name "*.bak" \) -exec rm -f {} \;

The –ctime option is only different in that it looks at the date the inode information was last
changed; this generally means the permissions on the file.

$ find / -ctime 3 -print


/usr/adm
/usr/spool/lp/request/Epson
/usr/spool/uucp/SYSLOG
/usr/spool/uucp/o.Log-WEEK
$

Using the –exec option, you can execute any command on the found files:

$ find . -name "*.ps" -exec ls -l {} \;

-rw------- 1 chare users 6356 Aug 25 19:25 ./a.out.ps
-rw------- 1 chare users 66 Aug 25 19:25 ./a1.ps
-rw------- 1 chare users 356 Aug 25 19:25 ./a2.ps
-rw------- 1 chare users 56 Aug 25 19:25 ./a3.ps
-rw------- 1 chare users 6356 Aug 25 19:25 ./a4.ps
$

This option instructs find to execute the named command on each file found. The tricky part is
the syntax for the option, which involves the command to be executed and its own options, a pair
of curly braces, and a command terminator, as shown here:

-exec command {} \;

The curly braces instruct find to insert the name of the found file. For example, if the find
statement matches the file sampler.ps, the instruction

-exec ls -l {} \;

executes the command:

ls -l sampler.ps

The \; is used to terminate the instruction as it is passed to the shell.

The following code examines finding files based upon the group that owns the file. This example
looks for files that have a group ownership of mail.

$ find / -group mail -print


/bin/mail
/bin/rmail
/u/chare/FileCabinet/choreo/policy/usr/itools/frame/dead.letter
/usr/mail
/usr/mail/uucp
/usr/mail/chare
/usr/spool/uucp/LOGDEL
/usr/spool/uucp/SYSLOG

/usr/spool/uucp/Log-WEEK
/usr/spool/uucp/o.Log-WEEK
/usr/spool/uucp/LOGFILE
/usr/local/elm
/usr/local/filter
$

The following example looks at the –ok option, which is like the –exec option. The difference is
that while the –exec option simply executes the command, the –ok option asks the user whether
the command should be run.

$ find . -name "*.ps" -ok ls -l {} \;

< ls ... ./a.out.ps >? n
< ls ... ./a1.ps >? y
-rw------- 1 chare users 290 Aug 25 19:25 ./a1.ps
< ls ... ./a2.ps >? y
-rw------- 1 chare users 390 Aug 25 19:25 ./a2.ps
< ls ... ./a3.ps >? n
< ls ... ./a4.ps >? n
$

The –type option enables you to locate the files based upon the file type. The given table lists the
valid arguments for the –type option.

Symbol Description
b block special file

c character special file

d directory

f regular file

p named pipes and FIFOs

As illustrated in the following example, you can find files that are only directories:

$ find . -type d -print
./Filecabinet
./wastebasket
./Clipboard
$

4.4 Extracting data

Extracting data involves commands that allow the manipulation of data directly in files.
Specifically, this section discusses controlling which part of a file you look at, using the
commands head and tail; how to extract information from a file using cut, and how to put it back
together using paste. The join command is like paste, but is used to join lines of text based upon
a common field in each file.

head Command

The head command is used to print the top of the file. The command syntax for head is as
follows:

$ head file(s)

or

$ head -num file(s)

The first format of the command prints the top ten lines of each of the named files.
You can specify a line count and display, say, the first three lines of the file; use the - symbol
followed by a numeric argument:

$ head -3 emp.lst
2365|barun sengupta |director |personnel |11/05/47|7800
1265|s.n. dasgupta |manager |sales |12/09/63|5600
6456|Damarla |GM |production |19/04/65|6000

You can use head to find out the record length by word counting the first line of the file:

$ head -1 emp.lst | wc -c
58

tail command

The tail command displays the end of the file. It provides the last ten lines when used without
arguments. The last three lines are displayed in this way:

$ tail -3 emp.lst
8764|sudhir Agarwal |executive |personnel |06/07/67|7500
8765|sanju |g.m |marketing |12/05/78|8000
7657|nity |g.m |sales |24/09/34|9000

tail has a –f option that enables you to monitor the growth of a file. The system administrator
often uses tail –f to view a log file as it is written by the installation process of many
software packages.

tail -f /oracle/app/oracle/product/7.3.2/orainst/install.log

The $ prompt doesn’t return even after the writing is over; with this option, you have to abort the
process to exit to the shell. Use the interrupt key applicable on your system.

cut and paste commands

The cut command is used to cut the information from files, based upon character position, or field
within the file. The syntax for the cut command is as follows:

$ cut options files

The features of the cut command will be illustrated with specific reference to the file shortlist,
which stores the first five lines of emp.lst:

$ head -5 emp.lst
2365|Barun Sen Gupta |Director |Personnel |11/05/47|7800
1265|Das Gupta |Manager |Sales |12/09/63|5600
6456|Damarla |GM |Production |19/04/65|6000
8764|Sudhir Agarwal |Executive |Personnel |06/07/67|7500
8765|Sanju |GM |Marketing |12/05/78|8000

cut can be used to extract specific columns from this file, say those signifying the name and
designation. The name starts from column number 6 and goes up to column number 22, while the
designation data occupies columns 24 through 32. Use cut with the –c option for cutting
columns:

$ cut -c 6-22,24-32 shortlist


Barun Sen Gupta Director
Das Gupta Manager
Damarla GM
Sudhir Agarwal Executive
Sanju GM

Files don’t always contain fixed-length records, in which case it is better to cut fields rather than
columns. Two options are used for this purpose: –d for the field delimiter, and –f for specifying
the field list. This is how you cut the second and third fields:

$ cut -d \| -f 2,3 shortlist | tee cutlist1


Barun Sen Gupta |Director
Das Gupta |Manager
Damarla |GM
Sudhir Agarwal |Executive
Sanju |GM

To cut out fields numbered 1, 4, 5 and 6, and save the output in cutlist2, follow a similar
procedure:

cut -d “|” -f 1,4- shortlist > cutlist2

The paste Command

paste is a special type of concatenation in that it pastes files laterally (side by side), rather than
one below the other. cut was used to create two files, cutlist1 and cutlist2, containing two cut-out
portions of the same file. Using paste, you can join them back laterally:

$ paste cutlist1 cutlist2
Barun Sen Gupta |Director	2365|Personnel |11/05/47|7800
Das Gupta |Manager	1265|Sales |12/09/63|5600
Damarla |GM	6456|Production |19/04/65|6000
Sudhir Agarwal |Executive	8764|Personnel |06/07/67|7500
Sanju |GM	8765|Marketing |12/05/78|8000

By default, paste uses the tab character for pasting files, but you can specify a delimiter of your
choice with the –d option:

$ paste -d \| cutlist1 cutlist2

Barun Sen Gupta |Director |2365|Personnel |11/05/47|7800
Das Gupta |Manager |1265|Sales |12/09/63|5600
Damarla |GM |6456|Production |19/04/65|6000
Sudhir Agarwal |Executive |8764|Personnel |06/07/67|7500
Sanju |GM |8765|Marketing |12/05/78|8000

Even though paste uses at least two files for concatenating lines, the data for one file can be
supplied through the standard input. If, for instance, cutlist2 doesn’t exist, you can provide the
character stream by cutting out the first, fourth, fifth and sixth fields from shortlist, and piping the
output to paste:

$ cut -d \| -f 1,4- shortlist | paste -d "|" cutlist1 -

Barun Sen Gupta |Director |2365|Personnel |11/05/47|7800
Das Gupta |Manager |1265|Sales |12/09/63|5600
Damarla |GM |6456|Production |19/04/65|6000
Sudhir Agarwal |Executive |8764|Personnel |06/07/67|7500
Sanju |GM |8765|Marketing |12/05/78|8000

You can also reverse the order of pasting by altering the location of the - sign:

cut -d “|” -f 1,4- shortlist |paste -d “|” - cutlist1

join Command

The join command takes lines from two files and joins them together, based upon a common field
or key. To use join, both files must share a common piece of data, called the key or primary field.
The files must be in the same sorted order. Let’s look at the source file for an example:

File1
$ cat list1.txt
BC 604
ALTA 403
SASK 306
MAN 204

File2
$ cat list2.txt
BC British Columbia
ALTA Alberta
SASK Saskatchewan
MAN Manitoba

$ join list1.txt list2.txt


BC 604 British Columbia
ALTA 403 Alberta
SASK 306 Saskatchewan
MAN 204 Manitoba

The preceding code shows the join command merging the two files on their common first field.
After sorting the files into list1.sort and list2.sort, running $ join list1.sort list2.sort > list
followed by $ cat list gives the same output as above.

4.5 Redirection and Piping

Redirection is a process of sending the output of a command to a place other than the terminal. A
pipe allows the output of one command to be sent directly to the input of another command.

Standard Input, Standard Output and Standard Error

For every command, three files are opened: standard input, standard output, and standard error.
Standard input is where the input for a command comes from, e.g., the keyboard. Standard
output is where the output of the command is sent during processing or after the process is
complete, e.g., the video display connected to a terminal or workstation. Standard error is
separate from standard output, even though it generally sends its output to the same place; it
provides a mechanism for sending error messages to the user executing the command.
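In the Bourne and Korn shells, standard output and standard error can be captured separately
with > and 2>. A brief sketch (the file names here are assumed, and the exact wording of the
error message varies between systems):

$ ls sample nosuchfile > out 2> err
$ cat out
sample
$ cat err
nosuchfile not found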

Redirection

Redirection is used to connect the output of a command to a file or take input for the command
from a file.

Example: $ command > file


$ ls > dirlist

Input and output redirection can be used at the same time; they are not mutually exclusive as the
following example illustrates.

Example: $ sort < infile > outfile

Here, sort reads data to be sorted from the file “infile” and puts the sorted data in the output file
“outfile”. Unlike pipes, redirection does not connect commands to one another. Redirection can
be used on the command line, in shell scripts, and in the cron command.

Pipes

Pipes are a method of connecting the output of one command to the input of another, i.e., it
connects the output of the command on the left of the pipe with the input of the command on the
right. Any number of commands can be grouped together in a pipe.

Example: $ ls | sort | pg
$ grep word file | wc

Using redirection (output) will involve a temporary file for storing the output and then redirecting
(input) it to another command. The pipe can avoid the temporary file.

Example:
Using redirection,
$ ls > list
$ sort < list

Using pipes, the same can be accomplished by the command.

$ ls | sort

The advantage of using pipes is avoiding a large number of temporary files in the case of large
shell scripts.

Example: $ cat file | sort


$ sort file
$ sort < file

All these commands accomplish the same thing. The difference is that the first command must
start the cat command, open the file, and pipe the output to sort, whereas the second and third
commands simply open the file and sort it. Since the end result can be achieved with one
command, it is inefficient to use the first form.

tee command

The tee command is used in the pipeline to save the output in a file. This command sends a copy
of its input to standard output, and another copy to the file named on the command line.

Example: $ ls | tee list | sort

The output of ls will get saved in the file list and it also goes as the input to sort. If there is no
pipe, then the output of ls would go to the standard output.

tee is a useful command when the end result of the pipe is not what you are expecting. To solve
this type of problem, inserting the tee command at various places in the pipeline enables you to
look at the output of the different commands and determine what the problem is.
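For example, to check what grep is passing down the pipeline (using the emp.lst file from the
following sections):

$ grep 'director' emp.lst | tee middle | wc -l
2
$ cat middle
2365|barun |director |personnel |11/05/47|7800
9876|jai |director |production |12/03/50|7000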

4.6 Sorting and Comparing

UNIX sort performs its usual function of sorting lines of text, and it has several options. When
the command is invoked without options, the entire line is used as the sort key.

Consider the emp.lst file discussed in the previous section:

$ sort emp.lst

2233|anbu |GM |sales |12/12/52|6000
2365|barun |director |personnel |11/05/47|7800
9876|jai |director |production |12/03/50|7000

Sorting starts with the first character of each line, and proceeds to the next character only when
the characters in two lines are identical.

Like cut and paste, sort also works on fields, and the default field separator is the space character.
The –t option, followed by the delimiter, overrides the default.

$ sort -t \| +1 emp.lst
2233|anbu |GM |sales |12/12/52|6000
2365|barun |director |personnel |11/05/47|7800
9876|jai |director |production |12/03/50|7000

The argument +1 indicates that sorting should start after skipping the first field. To sort on the
third field, you should use:

sort -t “|” +2 emp.lst

The sort order can be reversed with the –r option. The following sequence reverses a previous
sorting order:

$ sort -t "|" -r +1 emp.lst

9876|jai |director |production |12/03/50|7000
2365|barun |director |personnel |11/05/47|7800
2233|anbu |GM |sales |12/12/52|6000

This sequence can be written as :

sort -t “|” +1r emp.lst

Since sort is also a filter, the sorted output can be redirected to a file with the > operator; the
same can also be done with the –o option:

sort -o sortedlist +3 emp.lst


sort -o emp.lst emp.lst

And if you want to check whether the file has actually been sorted, you can use the –c option,
which reports the first line found out of order:

$ sort -c emp.lst
sort: disorder: 9876|jai |director |production |12/03/50|7000

Sorting on a Secondary key: You can sort on more than one field by providing a
secondary key to sort. If the primary key is the third field, and the secondary key is the second
field, you can use:

$ sort -t "|" +2 -3 +1 emp.lst
2365|barun |director |personnel |11/05/47|7800
9876|jai |director |production |12/03/50|7000
2233|anbu |GM |sales |12/12/52|6000

This sorts the file by designation and name. -3 indicates stoppage of sorting after the third field,
and +1 indicates its resumption after the first field. To resume sorting from the first field, use +0.

Sorting on columns: You can also specify a character position within a field to be the
beginning of sort. If you are to sort the file according to the year of birth, then you need to sort
on the seventh and eighth positions within the fifth field:

$ sort -t “|” +4.6 -4.8 emp.lst

+4.6 signifies the starting sort position: the seventh column of the fifth field. Similarly, -4.8
implies that sorting should stop after the eighth column of the same field.

Numeric Sort: When sort acts on numerals, strange things can happen. When you sort a file
containing only numbers, you get a curious result:

$ sort numfile
10
2
24
4

This is probably not what you expected. This can be overridden by the option –n:

$ sort -n numfile
2
4
10
24

Removing Duplicate Lines: The –u option lets you purge duplicate records from a file.
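A quick sketch, assuming a file named names that contains a repeated line:

$ cat names
alice
bob
alice
$ sort -u names
alice
bob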

There are some commands whose purpose is to determine what the differences are between two
files that appear to be the same.

diff Command

The diff command is used to determine the difference that exists between two files. The syntax is
as follows:
$ diff options file1 file2

The options are used to refine the output presented by diff. If one argument is a directory, diff
looks in that directory for a file with the same name as the other argument and compares the two
files; otherwise, both arguments must be file names. The following code shows the output of diff:

$ cat dfile1
This is a small file which will be used to test diff
$ cat dfile2
This is a SMALL file that can be used to illustrate how diff operates
$ diff dfile1 dfile2
1c1
< This is a small file which will be used to test diff
---
> This is a SMALL file that can be used to illustrate how diff operates
$

In this example, the first bit of output is a line of three characters, 1c1. This is an ed-style
command, which is used to synchronize the two files; in this case, the command says that line 1
of the first file would be replaced with line 1 of the other file. The lines after this command are
the lines affected by it. The lines in the first file are preceded by a <, and the lines in the second
file are preceded by >.
The –e option instructs diff to create an ed command script that recreates file2 from file1. For
example:

$ diff -e dfile1 dfile2 > cfile


$ cat cfile
1c
This is a SMALL file that can be used to illustrate how diff operates
.
$
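To actually apply the script and regenerate dfile2 from dfile1, one common idiom is the
following sketch; the appended 1,$p instruction makes ed print the edited buffer:

$ (cat cfile; echo '1,$p') | ed - dfile1 > dfile2.new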

comm Command

The comm command selects or rejects lines common to two files. For example:

$ sort c1 > c1s


$ sort c2 > c2s
$ cat c1s
apple
banana
tomato
$ cat c2s
apples
kiwi
$ comm c1s c2s
apple
        apples
banana
        kiwi
tomato
$

The comm command prints three columns of output. The first column contains lines that appear
only in the first file; the second column contains lines that appear only in the second file; and the
third column contains lines that appear in both files.

uniq Command

uniq is meant to find the repeated lines in a file. This command reads the input and compares
adjacent lines looking for duplicate entries. Usually, the second and subsequent copies of a line
are removed, and the remainder is written to either standard output or a file. The syntax is as
follows:

$ uniq options input-file output-file

The output-file is optional. For example:

$ sort c3>c3s
$ cat c3s
apples
apples
banana
kiwi
tomato
$ uniq c3s
apples
banana
kiwi
tomato
$
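Two other commonly used options are -c, which prefixes each line with a count of its
occurrences, and -d, which selects only the lines that are duplicated:

$ uniq -c c3s
2 apples
1 banana
1 kiwi
1 tomato
$ uniq -d c3s
apples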

Chapter 5

5. The vi editor
No matter what work you do with the UNIX system, you will eventually write some C programs
or shell (or perl) scripts. You may have to edit some of the system files at times. If you are
working on databases, you will also need to write SQL query scripts, procedures and triggers.
For all this, you must learn to use an editor, and UNIX provides a very versatile one – vi.

vi is a full-screen editor now available with all UNIX systems, and is widely acknowledged as
one of the most powerful editors available in any environment. Another contribution of the
University of California, Berkeley, it owes its origin to William (Bill) Joy, a graduate student who
wrote this unique program. It became extremely popular, leading Joy to later remark that he
wouldn’t have written it had he known that it would become famous!

vi offers cryptic, and sometimes mnemonic, internal commands for editing work. It makes
complete use of the keyboard, where practically every key has a function. vi has innumerable
features, and it takes time to master most of them. You don’t need to do that anyway. As a
beginner, you shouldn’t waste your time learning the frills and nuances of this editor. Editing is a
secondary task in any environment, and a working knowledge is all that is required initially.

Linux features a number of “vi” editors, of which vim (improved) is the most common. Apart
from vi, there are xvi, nvi and elvis that have certain exclusive functions not found in the
Berkeley version. All of them, barring nvi, let you split up the screen into multiple windows.

The Three Modes

A vi session begins by invoking the command vi with (or without) a file name:

vi visfile

You are presented a full empty screen, each line beginning with a tilde. This is vi’s way of
indicating that they are non-existent lines. For text editing, vi uses 24 of the 25 lines that are
normally available in a terminal. The last line is reserved for some commands that you can enter
to act on the text. This line is also used by the system to display messages. The filename appears
in this line with the message “visfile” [New file].

When you open a file with vi, the cursor is positioned at the top left-hand corner of the screen.
You are said to be in the command mode. This is the mode where you can pass commands to act
on the text, using most of the keys of the keyboard. Pressing a key doesn’t show it on screen, but
may perform a function like moving the cursor to the next line, or deleting a line. You can’t use
the command mode to enter or replace text.

There are two command mode functions that you should know right at this stage – the spacebar
and the backspace key. The spacebar takes you one character ahead, while the backspace key (or
<Ctrl-h>) takes you a character back. Backspacing in this mode doesn't delete the text at all.

To enter text, you have to leave the command mode and enter the input mode. There are ten keys
which, when pressed, take you to this mode, and whatever you enter shows up on the screen.

Backspacing in this mode, however, erases all characters that the cursor passes through. To leave
this mode, you have to press <Esc> key.

Sometimes you have to save your file, switch to editing another file, or make a global
substitution in the file. Neither of the two modes can do this work for you. You then have to use
the ex mode or line mode, where you can enter an instruction in the last line of the screen. Some
command mode functions also have ex mode equivalents. In this mode, you can see only one line
at a time, as you do when using EDLIN in DOS.
With this knowledge, we can summarize the three modes in which vi works:

1. Input Mode – Where any key pressed is entered as text.
2. Command Mode – Where keys are used as commands to act on text.
3. ex Mode – Where ex mode commands can be entered in the last line of the screen to act on text.

The relationship between these three modes is depicted in the figure.

[Figure 6 – vi Modes: keys such as i and a take you from Command mode to Edit (input) mode,
and <Esc> takes you back; a : takes you from Command mode to ex mode, and <Enter> returns
you to Command mode.]

5.1 Command mode
This is the mode you come to when you have finished entering or changing your text. When you
press a key in the command mode, it doesn’t show up on the screen, but simply performs its
function. That is why you can see changes on the screen without knowing the command that has
caused them.

5.2 Ex mode

SAVING TEXT AND QUITTING - THE ex MODE

When you edit a file using vi or, for that matter, any editor, the original file isn't disturbed as
such; the editor works on a copy of it that is placed in a buffer. From time to time, you should
save your work by writing the buffer contents to disk. You may also need to quit vi with or
without saving the changes.

Saving your Work

To enter any command in this mode, type a : (which appears at the last line of the screen), then
the corresponding ex mode command, and finally the <Enter> key. To save a file and remain in
the editing mode, use the w (write) command:

:w<Enter>
"sometext", 8 lines, 275 characters

The message shows the name of a file, along with the number of lines and characters saved.

Saving and quitting

The above command keeps you in the command mode so that you can continue editing.
However, to save and quit the editor, use the x command instead:

:x<Enter>
"sometext", 8 lines, 303 characters
$ _

5.3 Edit mode

Input Mode – Adding and Replacing Text

If you are a beginner to vi, it’s better you issue the following command after invoking vi, and
before you start editing:

:set showmode<Enter>

Enter a : (the ex mode prompt), followed by the word showmode, and then the <Enter> key. This
is a command in the ex mode, and when you enter the :, you will see it appearing in the last line
of the screen. This command sets one of the parameters of the vi environment, and displays a
suitable

message whenever the input mode is invoked. The message appears at the bottom line of the
screen and is quite self-explanatory. This showmode setting is not necessary when using vim in
Linux, which sets it by default.

Before you attempt to enter text into the file, you need to change the default command mode to
input mode. There are several methods of entering this mode, depending on the type of input you
wish to key in, but in every case the mode is terminated by pressing the <Esc> key.

Insertion of Text

The simplest type of input is insertion of text. Whether the file contains any text or not, when vi
is invoked, the cursor is always positioned at the first character of the first line. To insert text at
this position, press

i # Existing text will be shifted right

The character doesn't show up on the screen, but pressing this key changes the mode from
command to input. Since the showmode setting was made at the beginning (with :set showmode),
you will see the words INSERT MODE at the bottom right corner of the screen.
Further key depressions will result in text being entered and displayed on the screen. Start
inserting a few lines of text, each line followed by <Enter>.

The lines containing text, along with the "empty" lines (actually non-existent lines, as shown
with a ~ against each), approximate the screen shown here. The cursor is now positioned at the
last character of the last line. This is known as the current line, and the character where the
cursor is stationed is known as the current cursor position. If you notice a mistake in this line,
you can use the backspace key to erase any inserted text, one character at a time. The input mode
is terminated by pressing the <Esc> key, which takes you back to the command mode.

You started insertion with i, which puts text to the left of the cursor position. If the i command is
invoked with the cursor positioned on existing text, text on its right will be shifted further without
being overwritten. The insertion of text with i is shown in the figure, along with the position of
the cursor.

There are other methods of inputting text. To append text to the right of the cursor position, use

a # Existing text will also be shifted right

followed by the text you wish to key in. After you have finished editing, press <Esc>.

With i and a, you can append several lines of text in this way. They also have uppercase
counterparts performing similar functions: I inserts text at the beginning of a line, while A
appends text at the end of a line.

Opening a New Line

You can also open a new line by positioning the cursor at any point in a line and pressing

o # Opens a new line below the current line

This inserts an empty line below the current line.

O also opens a line, but above the current line. In either case, the showmode setting tells you
that you are in the input mode. You are free to enter as much text as you choose, spanning
multiple lines if required. Press the <Esc> key after completing text input.

Replacing Text

Text is replaced with the r, R, s and S keys. To replace a single character with another, you
should use

r # No <Esc> required

followed by the character that replaces the one under the cursor. You can replace only a single
character in this way.

vi momentarily switches from the command mode to the input mode when r is pressed. It returns
to the command mode as soon as the replacing character is entered. There is no need to press the
<Esc> key when using r, followed by the character, since vi expects a single character anyway.

To replace more than a single character, use

R # Replaces text as cursor moves right

followed by the text. Existing text will be overwritten as the cursor moves forward. This
replacement is, however, restricted to the current line only.

The s character replaces a single character with text irrespective of its length. S replaces the
entire line irrespective of the cursor position.

Vi Quick Reference:

Chapter 6
6. Shell Programming
6.1 Variables

Like every programming language, the shell offers the facility to define and use variables in the
command line. These variables are called shell variables. Shell variables are assigned with the =
operator, but evaluated by prefixing the variable name with a $.
Example:

$ x=37
$ echo $x
37

All shell variables take on the generalized form variable=value. They are of the string type,
which means that the value is stored in ASCII rather than in binary format. When the shell reads
the command line, it interprets any word preceded by a $ as a variable, and replaces the word by
the value of the variable.

All shell variables are initialized to null strings by default. For example:

$ echo $xys
$ _

Null strings can also be assigned explicitly by either of the following:

x=
x=''

To assign multi word strings to a variable, you should quote the value:

$ msg='you have mail' ; echo $msg


you have mail

A variable name can consist of letters, numerals and the underscore character. The first
character must be a letter. The shell is sensitive to case; the variable x is different from X.

The shell also lets you enclose a variable name within a pair of curly braces. Here is an
alternative form of evaluating the variable fname:

$ echo ${fname}
emp.sh

This form has certain advantages; you can tag a string to it without needing to quote it. This way
you can generate a second set of file names by affixing the character x to each one:

$ echo ${fname}x

emp.shx

Variables are concatenated by placing them adjacent to one another.

$ x=abcd ; y=efgh
$ z=$x$y
$ echo $z
abcdefgh

Applications of Shell Variables

If shell variables are used in an intelligent manner, they can speed up your interaction with the system.
You can easily assign the path name /usr/kumar/progs/data to a variable, and then use its
shorthand representation:

$ pn='/usr/kumar/progs/data'
$ cd $pn
$ pwd
/usr/kumar/progs/data

A shell variable can be used to replace even the command itself. When a command is assigned to
a variable, the variable should be evaluated by simply specifying the $-prefixed variable as the
only word in the command line:

$ count=”wc unit01 unit02”


$ $count
436 6463 37986 unit01
892 8273 48420 unit02
1318 14736 86406 total

You can also use the feature of command substitution to set variables. For instance, if you were
to set the complete pathname of the present directory to a variable mydir, you could use

$ mydir=`pwd`
$ echo $mydir
/usr/kumar
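POSIX shells also offer the $(command) form of command substitution, which reads more easily
and can be nested:

$ mydir=$(pwd)
$ echo $mydir
/usr/kumar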

6.2 Command-Line arguments

Shell procedures accept arguments in the command line. This non-interactive method of
specifying arguments is quite useful for scripts requiring few inputs. It also forms the basis of
developing tools that can be used with redirection and pipelines.

When arguments are specified with a shell procedure, they are assigned to certain special
"variables", or rather, positional parameters. The first argument is read by the shell into the parameter $1, the
second argument into $2, and so on. In addition to these positional parameters, there are a few
other special parameters used by the shell. The next script illustrates these features:

$ cat emp2.sh
echo “program: $0
The number of arguments specified is $#
The arguments are $*”
grep “$1” $2
echo “\nJob Over”

The parameter $* stores the complete set of positional parameters as a single string. $# is set to
the number of arguments specified. This lets you design scripts that check whether the right
number of arguments have been entered. The command itself is stored in the parameter $0.

Invoke this script with the pattern “director” and the file name emp1.lst as the two arguments:

$ emp2.sh director emp1.lst


program: emp2.sh
The number of arguments specified is 2
The arguments are director emp1.lst
1006|chanchal singhvi |director |sales |03/09/38|6700
6521|lalit chowdury |director |marketing |26/09/45|8200

Job Over

In this way, the first word is assigned to $0, the second word to $1, and the third word to $2. You
can use more positional parameters in this way up to $9 (and using the shift statement, you can go
beyond).
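The following minimal sketch (the script name shiftall.sh is used here only for illustration) shows
how shift discards $1 and moves every remaining argument one position to the left, so a loop can
walk through any number of arguments:

$ cat shiftall.sh
while [ $# -gt 0 ]
do
echo "argument: $1"
shift
done
$ shiftall.sh a b c
argument: a
argument: b
argument: c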

6.3 Decision-making constructs

In any programming language, the capability to make decisions and alter the program flow based
upon those decisions is a requirement to perform any work.

The if Command

The if statement takes two-way decisions, depending on the fulfillment of a certain condition. In
the shell, the statement uses the following forms:

if condition is true
then
execute commands
else
execute commands
fi

if evaluates a condition that accompanies its “command line”. If the condition is fulfilled, the
sequence of commands following it is executed. Every if must have a corresponding fi. The else
statement, if present, specifies the action in case the condition is not fulfilled. This statement is
not always required.

All UNIX commands return an exit value. In the next example, grep is executed first, and if uses
its return value to control the program flow:

$ if grep "director" emp.lst
> then echo "pattern found – Job Over"
> else echo "pattern not found"
> fi

Numeric Comparison with test

When you utilize if to evaluate expressions, the test statement is often used as its control
command. test uses certain operators to evaluate the condition on its right, and returns either a
true or false exit status, which is then used by if for taking decisions.

The relational operators have a different form when used by test. They always begin with a -
(hyphen), followed by a two-character word, and are enclosed on either side by whitespace. The
complete set of operators is shown in the given table:

Operator Meaning
-eq Equal to
-ne Not equal to
-gt Greater than
-ge Greater than or equal to
-lt Less than
-le Less than or equal to

test doesn't display any output, but simply returns a value, which is assigned to the parameter $?.
For example:

$ x=5; y=7; z=7.2
$ test $x -eq $y ; echo $?
1
$ test $x -lt $y ; echo $?
0
$ test $z -gt $y ; echo $?
1
$ test $z -eq $y ; echo $?
0

You can now use the test in the command line of the if conditional. The next script uses three
arguments to take a pattern, as well as the input and output filenames. First it checks whether the
right number of arguments have been entered:

$ cat emp3.sh
if test $# -ne 3
then
echo "you have not keyed in 3 arguments"
exit 3
else
if grep "$1" $2 > $3
then
echo "pattern found – Job Over"
else
echo "pattern not found – Job Over"
fi
fi

Here, you have two if constructs, each terminated with its own fi; one if is nested within the
other. When you run this script with one, or even no argument, the first condition evaluates to
true and the script aborts. With three arguments, grep is executed, and the second if construct
tests grep's exit status and echoes a suitable message.

Shorthand for test: test is so widely used that, fortunately, there exists a shorthand method of
executing it. A pair of square brackets enclosing the expression can replace the word test. Thus
the following two forms are equivalent:

test $x -eq $y

[ $x -eq $y ]

Note that [ and ] are treated as words, so a space is required after the [ and before the ].

if – elif: Multi-way Branching

if also permits multi-way branching; you can evaluate more conditions if the previous condition
fails. The format is if-then-elif-then-else-fi, where you can have as many elif as you want, while
the else remains optional. For example:

if [ $# -ne 3 ] ; then
echo "You have not keyed in 3 arguments" ; exit 3
elif grep "$1" $2 > $3 2>/dev/null ; then
echo "pattern found – Job Over"
else
echo "pattern not found – Job Over" ; rm $3
fi

All these scripts have a serious shortcoming: they don't indicate why a pattern wasn't found. The
"pattern not found" message appears even if the file doesn't exist at all, and the redirection of the
diagnostic stream with 2> ensures that grep's complaints are not seen on the terminal.

Test: String Comparisons

test can also compare strings, using another set of operators. Equality is tested with =, while the
C-type operator != checks for inequality. The table lists the string handling tests:

Test Exit Status

-n stg True if string stg is not a null string

-z stg True if string stg is a null string

s1 = s2 True if string s1 equals s2

s1 != s2 True if string s1 is not equal to s2

stg True if string stg is assigned and not null

You can use the string comparison features in the next script to check whether the user actually
enters a string, or presses the <Enter> key:

$ cat emp4.sh
echo "Enter the string to be searched: \c"
read pname
if [ -z "$pname" ] ; then
echo "You have not entered the string" ; exit 1
else
echo "Enter the file to be used: \c"
read flname
if [ ! -n "$flname" ] ; then
echo "You have not entered the filename" ; exit 2
else
grep "$pname" "$flname" || echo "pattern not found"
fi
fi

The script checks whether the user actually enters something when the script pauses at two
points: once while accepting the pattern, and then while accepting the filename. Note that the
check for a null string is made with [ -z "$pname" ], as well as with [ ! -n "$flname" ]. The script
aborts if either of the inputs is a null string:

$ emp4.sh
Enter the string to be searched: director
Enter the file to be used: <Enter>
You have not entered the filename
$ emp4.sh
Enter the string to be searched: director
Enter the file to be used: emp1.lst
1006|iuyrhb |director |sales |03/09/38|6700
6532|hg udhf |director |marketing |26/09/45|8200

test also permits the checking of more than one condition in the same line, using the -a (AND)
and -o (OR) operators. You can now simplify the above script to illustrate this feature:

if [ -n "$pname" -a -n "$flname" ] ; then
grep "$pname" "$flname" || echo "pattern not found"
else
echo "At least one input was a null string"
exit 1
fi

The test output is true only if both variables are non-null strings, i.e., the user enters some
non-whitespace characters each time the script pauses.

Test: File Tests

test can also be used to check various file attributes. For example, you can test whether a file has
the necessary read, write or execute permission. The table lists the file-related tests:

Test Exit Status

-e file True if file exists

-f file True if file exists and is a regular file

-r file True if file exists and is readable

-w file True if file exists and is writable

-x file True if file exists and is executable

-d file True if file exists and is a directory

-s file True if file exists and has a size greater than zero
Any of the test options can be negated by the ! operator. Thus, [ ! -f file ] negates [ -f file ]. The
file testing syntax used by test is quite compact, and you can test some attributes of the file
emp.lst at the prompt:

$ ls -l emp.lst
-rw-rw-rw- 1 kumar group 870 Jun 8 15:52 emp.lst
$ [ -f emp.lst ] ; echo $?
0
$ [ -x emp.lst ] ; echo $?
1
$ [ ! -w emp.lst ] || echo "False that file is not writable"
False that file is not writable

Using these features, you can design a script that accepts a filename as argument, and then
performs a number of tests on it:

$ cat filetest.sh
if [ ! -e $1 ] ; then
echo "File does not exist"
elif [ ! -r $1 ] ; then
echo "File is not readable"
elif [ ! -w $1 ] ; then
echo "File is not writable"
else
echo "File is both readable and writable"
fi

Test the script with two filenames.

$ filetest.sh emp3.lst
File does not exist

$ filetest.sh emp.lst
File is both readable and writable

The case Conditional

The case statement is the second conditional offered by the shell. The statement matches an
expression against more than one alternative, and uses a compact construct to permit multi-way
branching. The general syntax of the case statement is as follows:

case expression in
pattern1) execute commands ;;
pattern2) execute commands ;;
pattern3) execute commands ;;

……
esac

case matches the expression first against pattern1, and if successful, executes the commands
associated with it. If it doesn't match, it falls through and tries pattern2, and so on. Each
command list is terminated by a pair of semicolons, and the entire construct is closed with
esac.

For example, you can devise a script menu.sh, which accepts values from 1 to 5, and performs
some action depending on the number keyed in:

$ cat menu.sh
echo " MENU\n
1. List of files\n2. Processes of user\n3. Today's date
4. Users of system\n5. Quit to UNIX\nEnter your option: \c"
read choice
case "$choice" in
1) ls -l ;;
2) ps -f ;;
3) date ;;
4) who ;;
5) exit ;;
esac

The five menu choices are displayed with a multi-line echo statement. case matches the value of
the variable $choice for strings 1, 2, 3, 4 and 5. It then relates each value to a command that has
to be executed.

The same logic can also be implemented using the if statement, but the case certainly is more
compact. However, case can’t handle relational and file tests: it can only match strings. It’s also
most effective when the string is fetched by command substitution.

Matching Multiple Patterns

case can also match more than one pattern. Programmers frequently encounter a logic that has to
test a user response for both y and Y ( or n and N). To implement this logic with if, you need to
use the compound condition feature:

if [ "$choice" = "y" -o "$choice" = "Y" ]

case, on the other hand, has quite a compact solution. Like egrep, it uses the | to delimit multiple
patterns. Thus the expression y|Y can be used to match both upper and lower case:

echo "Do you wish to continue? (y/Y): \c"
read answer
case "$answer" in
y|Y) ;;
n|N) exit ;;
esac

Wild Cards: case Uses Them

Like the shell's wild-cards, case also uses the filename matching metacharacters *, ? and the
character class. However, in case they are used for string matching only. You can match a four-
character string with the model ????, and if it must contain numerals only, [0-9][0-9][0-9][0-9]
will be just right. For example:

case "$answer" in
[yY][eE]*) ;;
[nN][oO]) exit ;;
*) echo "Invalid response"
esac

The null Command

The : command is a no-op, which means that the shell does nothing when it encounters this
command. It is often used as the first line of a Bourne or Korn shell program. The null command
can also be used to hold a place in shell scripts: for example, as you test a script and insert if
statements, you can use the null command as the command to be executed in the if statement
while you test things.

if [ ! "$1" ]
then
: # we’ll add this code later
fi
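Because : does nothing and always returns a true exit status, it also provides a compact way of
creating an empty file, or truncating an existing one to zero length (somefile is just an illustrative
name):

: > somefile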

The && and || Constructs

The && executes the command following the && when the previous command returns true. For
example:

who | grep "chare" 2>/dev/null && echo "chare is logged on"

If the first command returns true, as it would if the user chare is logged in, the echo statement is
printed informing the user running the command that the user is on the system. The || is used
when the command on the left of the symbol returns a false value.

who | grep "chare" 2>/dev/null || echo "chare is not logged on"

In the above example, if the user chare is not logged on, the echo statement is printed. These are
used with the test command to execute a command if the test returns true or false.

With &&, the second part of the statement executes only if the first part is successful.

[ -f $file ] && more $file

The preceding line executes the more command on the specified file if the test command says
that it is a regular file.

6.4 Looping Constructs

At this point, you need the ability to execute the same commands over and over again within a
script. The commands that you can use are while, for, and until.

The while Command

The while statement should be quite familiar to most programmers. It repeatedly performs a set of
instructions till the control command returns a true exit status. The general form of this command
is as follows:

while condition is true


do
execute commands
done

The set of instructions enclosed by do and done are to be performed as long as the condition
remains true. For example, the emp5.sh script accepts a code and description in the same line,
and then writes out the line to newfile. It then prompts you for more entries:

$ cat emp5.sh
# Program: emp5.sh

answer=y
while [ “$answer” = “y” ]
do
echo “Enter the code and description: \c”
read code description
echo “$code | $description” >> newlist
echo “Enter any more (y/n)? \c”
read anymore
case $anymore in
y*|Y*) answer=y ;;
n*|N*) answer=n ;;
*) answer=y ;;
esac
done

There are situations when a program needs to read a file that is created by another program. The
monitfile.sh script periodically monitors the disk for the existence of the file, and then executes
the program once the file has been located.

$ cat monitfile.sh
while [ ! -r invoice.lst ]
do
sleep 60
done

invoice_alloc.pl

The loop executes as long as the file invoice.lst can't be read. When the file becomes readable,
the loop terminates and the program invoice_alloc.pl is executed. This shell script is an ideal
candidate to be run in the background, like this:

monitfile.sh &

The sleep command is used to introduce a delay in shell scripts.

The until Command

With the while command, the loop is executed until the expression evaluates false. The format of
the until command is as follows:
until expr
do
commands
done

If the expression evaluates as false, the commands between do and done are executed. When the
expression evaluates true, the commands are no longer executed.
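As a minimal sketch, the loop below runs until the counter exceeds 5, printing one line per pass
(expr performs the arithmetic, in keeping with the Bourne shell):

i=1
until [ $i -gt 5 ]
do
echo "pass $i"
i=`expr $i + 1`
done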

The for Command

The for command is used for processing a list of information. The syntax of the command is as
follows:

for var in word1 word2 word3


do
commands
done

Each word on the command line is assigned in turn to the variable var. The commands between
the do and done statements are then executed. This process continues until no more words are
left to process. When the last word has been assigned and the commands processed, the for loop
is terminated, and execution continues at the first command following the done.

for var in one test three four five


do
echo “var = $var”
done

When you run this program, the for command assigns each word to the variable var, and then
runs the echo command. It looks like the following:

$ for1
var = one
var = test
var = three
var = four

var = five

This command has several special formats, which indicate the use of the positional parameters
currently assigned, or of the files in the current directory.

for var
do
commands
done

is equivalent to

for var in $*
do
commands
done

In both cases, the variables $1 to $9 are assigned to the variable var in turn. This is useful for
processing the arguments on the command line, as shown in the following example:

for var in $*
do
echo “var = $var”
done

In addition to processing command-line arguments and positional parameters, you can also
process the filename substitution wildcards:

for file in *
do
echo "Found $file"
done

This example produces a list of the files in the current directory, one per line. Now you can add
the for command to the rat program, so that you can work on multiple command line arguments.
This new version of rat is as follows:

:
# rat version 3
#
# if we have no files, then report an error
#
if [ ! "$1" ]
then
echo "Usage: `basename $0` file"
exit 1
fi
#
# Loop around each argument on the command line
#

for file in $*
do
#
# If the argument is a directory, then report it as such
if [ -d "$file" ]
then
echo "$file is a directory"
#
# If the argument is executable, then run the command
#
elif [ -x "$file" ]
then
$file
#
# If the argument is a regular file, then display it
elif [ -f "$file" ]
then
echo "_________$file___________"
cat $file
else
echo "sorry, I don't know what to do with this file."
fi
done
$

The preceding example changes the rat program to include the for command. This enables you to
specify more than one file on the command line and have the rat program process them.
Another use of the for command is to rename a number of files at the same time:

for i in *.doc
do
mv $i `basename $i doc`abc
done

Here basename strips the trailing doc from each name, and abc is tagged on in its place, so
file.doc becomes file.abc.

6.5 Reading data

One command (read) is used to read data directly from the user or another source, and a second
command (exec) can be used to adjust where input comes from, or where output goes, for a
specific script.

The read Command

This command accepts on its command line a list of variables into which the information is to be
placed. The syntax for read is as follows:

read vars

The read command works by waiting for the user to enter some text, which it accepts up to the
newline.

$ read num street


2435 103rd street
$ echo “>$num< >$street<”
>2435< >103rd street<
$

Here you see where the user is prompted to enter some text, which will be assigned to the
variables num and street. The first word is assigned to the variable num and the remaining words
are assigned to the street variable. It is important to note that using only one variable on the line
for read has the effect of assigning the entire line to the variable.
For example:

$ read info
CF – 18A Hornet
$ echo $info
CF – 18A Hornet
$

Now read also can be used to accept input from a pipe in a loop. For example:

ls fi* | while read file


do
echo $file
done

In this example, the ls command provides a list of files that is fed to the while loop through a
pipe. Each file name is read by the read command and saved in the variable file. The read
command knows when no more data exists; it then returns a non-zero exit status, and the loop
terminates.

The exec Command

The exec command can be used to do several things depending on the arguments given to it.
These things include the following:

• Replace the shell with a specific command
• Redirect input or output permanently
• Open a new file for the shell using a file descriptor

exec is most commonly used to replace the current shell with another program. This is done in
the user's login file to replace the login shell with an application. This means that when the user
exits the application, their session on the system is terminated. In effect, the application or
command now replaces the login shell.

$ exec /bin/date
Fri Aug 12 22:46:52 EST 1994
Welcome to AT&T UNIX
Please login:

The preceding code is an example of using exec to replace the current login shell with the
program /bin/date. After the command exits, the user must log in again.

The second use for exec is to redirect where input or output is to be sent on a semipermanent
basis. Inside the shell script, you can include a line such as the following:

exec > /tmp/trace

This instructs the shell that all text destined for standard output will instead be written to the file
/tmp/trace. This remains in effect until the script exits, or until the following command is
executed to send the output back to the terminal:

exec > /dev/tty

Standard error can be redirected using the following command:

exec 2> /tmp/log

And finally, input also can be redirected using the following format:

exec < /tmp/commands

Unless these commands are typed directly at the command line, they remain in effect only until
the shell exits.
The final case for exec is when you want other files to open for input or output besides the
standard three that each shell gets. In the script, you need to use the following format:

exec n< file (or exec n> file)

in which n is the file descriptor number. Remember that 0, 1, and 2 are already used. The
maximum value that a file descriptor can have is nine.
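The following minimal sketch, assuming a readable file /tmp/commands, opens descriptor 3 for
reading, reads one line from it, and then closes the descriptor:

exec 3< /tmp/commands    # open /tmp/commands on descriptor 3
read line <&3            # read one line through descriptor 3
echo "first line: $line"
exec 3<&-                # close descriptor 3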

6.6 Functions
Instead of having multiple occurrences of the same code in a file, or slowing execution by
waiting for other scripts to start, you can define a function in the shell script that executes when
you need it.

A shell function is defined following a specific syntax:

function_name ()
{
commands
}

The function name can be any series of letters and characters. In some respects, a function is
like an alias in the Korn shell or C shell. However, functions are available in most Bourne shells,
as well as in the Korn shell.

# @(#) search – search for a file from $HOME
# Usage: search filename
search ( )
{
if test $# -lt 1
then
echo "Usage: search FILENAME"
return 1
fi

FILE=$1

echo "searching..."
find $HOME -name "$1" -print

echo "search complete."
return 0
}

The preceding example is a sample shell function that accepts as an argument the name of a file
to search for. The function uses the find command to locate and print the name of the file. Note
that no exit command is used in this function. Functions are executed by the current shell, so an
exit command would have the effect of logging the user off the system. As a result, use the
return command, which exits the function and provides a return code to the caller. The return
and exit commands serve the same purpose, but the first is for a function, and the second is for
a script.
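Once the function definition has been read by the current shell (from the .profile, for example),
it is invoked like any other command. The file name and output below are hypothetical:

$ search chapter1.doc
searching...
/home/kumar/docs/chapter1.doc
search complete.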

Chapter 7

7. Basics of UNIX Administration


7.1 Login Process and Run Levels

The Login Process

For a direct connection to the system, the login process involves the following commands: getty,
login, and init.

Initialization

The initialization for a terminal session is started by the init command. init is responsible for
starting the getty command to listen on a terminal port for an incoming connection; init does not
handle network terminal ports. init accomplishes this task by looking at the /etc/inittab file on
System V UNIX systems, and the /etc/ttys file on BSD-based systems. The /etc/inittab entry
which follows shows a sample initialization line used to start getty.

11:23:respawn:/etc/getty tty11 9600 # VT420 workstation

This example shows a sample entry from a System V inittab, indicating that this getty should be
respawned each time it exits. That means when the user's login shell exits, a new getty is
started.

Login Phase 1: getty

When you press Enter on your terminal, the system responds with a login prompt. This prompt
informs you that the system is ready for you to log in. The program responsible for printing the
login prompt is the command getty. getty prompts for the user's login name, as shown in the
following:

Welcome to UNIX
Please Login:

This login name has been assigned by the system administrator. The name can be up to eight
characters in length.

Login Phase 2: login

The second phase of the process involves the command login. login prompts the user to enter the
password. The password isn’t printed on the screen for security reasons.

Welcome to UNIX
Please Login: chare
Password:

If the user enters the password incorrectly, the system responds with a generic message, login
incorrect.

Login Phase 3: Login Shell

The third phase of the process is entered after the user has entered the correct password for the
login. This phase sets up the parameters of the user's environment; for example, the login
command starts the user's login shell as specified in the /etc/passwd file. This process is
illustrated in the following:

Welcome to the AT&T UNIX pc


Please login: chare
Password:
Starting login procedure …
Please type the terminal name and press RETURN: vt100
48% of the storage space is available.

Logging In through telnetd and rlogin

When users want to log in to a system through a TCP/IP network, there are two primary access
methods: one uses the telnet protocol, and the other uses the rlogin protocol. telnet and rlogin
each use a process on the UNIX system called a server. This server is started when an incoming
connection for the specified protocol is received. In both cases, these commands prompt for the
user's login name and the password.

The Global Initialization Files

There are two global initialization files used for the Bourne, Korn, and C shells. These
initialization files are executed for each user who logs in to the system.

The /etc/profile File

This shell script is executed for users of the Bourne and Korn shells only; it runs for each such
user when he logs in. The profile is a shell script, so learning how to read it will help you create
your own profile and customize your environment.

The output shown in the following list illustrates what the /etc/profile is doing:

Welcome to the AT&T UNIX pc

Please login: chare

Password:

Starting login procedure …

Please type the terminal name and press RETURN: vt100

48% of the storage space is available.

Logged in on Thu Sep 1 18:18:15 EST 1994 on /dev/tty000

Last Logged in on Thu Sep 1 18:17:44 EST 1994 on /dev/tty000

Customizing the User Login Files

There are a number of files that are user-dependent and can be modified to create an environment
more suited to the user’s tasks.

The .profile File

The .profile file is equivalent to the /etc/profile file, except that it enables the user to customize
and alter the configuration established by the system administrator. A user profile illustrates
adding to the configuration that was performed in the /etc/profile script.

To show that the processing of both of these files is done, the .profile prints several lines. One
lists the keys that are configured for the erase, interrupt, and kill commands, which are part of the
terminal driver.

The C Shell .login File

When a user whose shell is the C shell logs in to the system, a file called .login is executed. This
file can contain any valid UNIX or C shell command. The .login file typically contains commands
regarding the terminal driver, such as the erase character and the interrupt character.

The C shell .logout File


The .logout file is executed when the login shell is terminated. When the user types the command
logout or exit, the shell terminates and executes the commands in the .logout file. The following
illustrates the execution of the commands in the .logout file:

%
%logout
logged out chare at Thu Sep 1 18:20:56 EST 1994

please login:

Welcome to UNIX

The C Shell .cshrc File

The .cshrc file is executed each time a new C shell is started. The .cshrc file typically contains
commands that are loaded for each shell. Examples are prompts and information regarding

aliases. An alias is another name for a command. For example, if your system doesn't have the
command lc, you could create an alias in the C shell to define lc as ls -CF.

A .logout File for the Bourne and Korn Shell

The Bourne and Korn shell do not have the equivalent of a .logout file, although this facility can
be easily mimicked by using the shell’s capability to trap signals.

In fact, the files could be the same for both shells, as long as there were no shell-specific
commands in them. You can simulate the .logout file by adding the following line to the .profile
file in the user's home directory. This line catches, or traps, the moment the user logs out, and
executes the commands in the .checkout file.

trap "$HOME/.checkout" 0

Run Levels

Under SV flavors, run levels have become to a machine what permissions are to a user.
Operating at one run level restricts a machine from performing certain tasks; running at another
enables these functions to run. There are eight accepted run levels:

• 0 (Shutdown state requiring a manual reboot). When changing to this level, files are
synchronized with the disk, and the system is left in a state where it is safe to power it off.
• Q or q (on systems where these are not equal to zero). These force the system to reread the
inittab file and take into account any changes that have occurred since the last boot.
• 1 (Single user mode). This mode is also known as S or s mode on many systems. It allows
only one user, at only one terminal, to log in. If the change is to 1, then the one terminal
allowed to log in is the one defined as the console. If the change is to S/s, then the only
terminal allowed to log back in is the one that changed the run level.
• 2 (Multiple user mode). The traditional state allowing more than one user to log in at a
time. This level is where background processes start up, and additional file systems – if
present – are mounted.
• 3 (Network mode). The same as level 2, only with networking or remote file sharing
enabled.
• 4 (User defined)
• 5 (Hardware state)
• 6 (Shutdown and automatic reboot). Performs the same action as changing to run level 0,
then rebooting the machine.

7.2 Processes

There are three facilities that exist in UNIX to run jobs when you aren't around, when the
load permits, or over and over again. They are at, batch, and cron.

The at command is used to run a single command at a specific time. Syntax of the at
command:

at time [date] [increment]

The time is mandatory on the command line, but you may optionally specify the date on which
the command should be run, or an increment from the current time.

Setting the Time

The time component for at may be specified as one, two, or four digits. If only one or two
digits are used, the time is assumed to be in hours. In the case of four digits, a colon may be
used between the hours and the minutes. An example of the at command:

$ date
Sat Aug 6 20:02:49 EDT 1994
$ at 2004
date
Ctrl+D
Job 8765465.a at Sat Aug 6 20:04:00 1994
$

In the preceding list, you checked the date and time to ensure that you set your job up
appropriately. The command line to at indicates that you want to run the job at 2004, or 8:04
p.m. Once all the commands to execute have been entered, press Ctrl+D on a new line. This
informs at that no more commands are going to be entered. Before terminating, at prints out a
job number for the newly queued job. This job number can be used to find out information
about the job in the future.

Controlling Output

If you are logged on, the output could be sent to your terminal, but that might cause other
problems with the application you could be using at the time. Normally, at saves all output
from the commands queued unless the output is redirected somewhere else. The output is
then mailed to the user who queued the job after it has completed.

Being More Precise

The at command will also accept a date to be more precise on when the commands should be
scheduled. The date information can be a full date with month, day, and year, or a day of the
week. Two special keywords, today and tomorrow, are recognized. If no date is given and
the hour is greater than the current hour, today is assumed by at. If the hour is less than the
current hour, tomorrow is assumed. The following code illustrates valid date formats for use
with at:

at 22:05 today
at 1 pm tomorrow
at 11:30 Jan 24
at 1201 am Jan 1, 1995
at 4:30 pm Tuesday

Listing at Jobs

The –l option will instruct at to list the contents of the at queue. This will list only the jobs
that have been queued by the invoking user. For example:

$ at –l
5423654.a Sat Aug 13 20:43:00 1994
4434356.a Sun Jan 1 00:01:00 1995
$

Removing at Jobs

The at command also provides the –r option, which is used to remove queued jobs from the at
command queue. For example

$ at -l
5423654.a Sat Aug 13 20:43:00 1994
4434356.a Sun Jan 1 00:01:00 1995
$ at -r 5423654.a
$ at -l
4434356.a Sun Jan 1 00:01:00 1995

Interactive versus batch

With batch, the commands are executed at a time when the system is free enough to handle
such requests. Commands are entered at the command line, with a Ctrl+D entered on a new
line to terminate the command list. This is illustrated in the following:

$ batch
date
who
df
pwd
Ctrl+D
Job 78657.a at Sat Aug 6 21:23:06 1994
$

Here you have scheduled these commands to be executed by way of batch. The output of the
commands is saved and returned to the user through the electronic mail system on the UNIX
system. If the commands have been written to save their output somewhere else, or if the output
is redirected, there will be no mail message.

7.3 Archiving and backup

Users often accidentally delete their own files and then rush to the administrator to restore them.
The administrator has to plan his backups carefully so that he doesn’t back up the same file over
and over again even though it has not been accessed for ages.

The two most popular programs for backups are tar and cpio. Both combine a group of files into
a single file (called an archive), with suitable headers preceding the contents of each file.

Backup Strategy

Some files are modified more often than others, while some are not accessed at all. How often
the data changes in the system influences the backup strategy and determines the frequency of
backup. There should be a complete backup of all files once a week, and a daily incremental
backup of only those files which have been changed or modified.

In modern times where one tape cartridge can back up several gigabytes of data at once,
incremental backups can now prove quite meaningless for some. For installations maintaining a
few hundred megabytes of data, a daily complete backup can now make a lot of sense. You
should use cron to schedule your backups.
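As a sketch, the following crontab entries (the script names are hypothetical) would run a
complete backup at 2 a.m. every Sunday, and an incremental backup at 3 a.m. on the other days
of the week:

0 2 * * 0 /usr/local/bin/fullback.sh
0 3 * * 1-6 /usr/local/bin/incback.sh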

tar or cpio

tar: The Tape Archiver


tar is one of several commands that can be used to save data to an archive device. Typically tar
is used with tapes or floppy disks. The syntax of tar is as follows:

$ tar key files

The key includes any arguments that go along with the options. The key consists of a function
that instructs tar what you want it to do, whereas the additional options alter the way that tar
does the work. The key or option list that is given to tar does not have to contain a leading
hyphen. These commands are equivalent:

$ tar –x
$ tar x

tar is used with a number of key options. They are listed in the given table:

Option Significance (Key Options)

-c Creates a new archive

-x Extracts files from archive

-t Lists contents of archive

-r Replaces the file at end of archive

-u Like r, but only if files are newer than those in archive

Option Significance (Non-key options)

-v Verbose option-lists files in long format

-w Confirms from user about action to be taken

-f dvc Uses pathname dvc as name of device instead of the default

-b n Uses blocking factor n, where n is restricted to 20

-m Changes modification time of file to time of extraction

-k num Multi-volume backup (SCO UNIX only)

-M Multi-volume backup (Linux only)

-z Compresses while copying (Linux only)

-Z Decompresses while extracting (Linux only)

Backing Up Files

tar accepts directory and filenames directly on the command line. The –c key option is used to
copy files to the backup device:

# tar -cvf /dev/rdsk/f0q18dt /home/sales/SQL/*.sql

This backs up all SQL scripts with their absolute pathnames to the floppy diskette. In the
verbose output, the single character a before each pathname indicates that the file is being
appended; the verbose option (-v) also shows the number of blocks used by each file.

When files are copied in this way with absolute pathnames, the same restrictions apply: they can
only be restored in the same directory. However, if you choose to keep open the option of
installing the files in a different directory, then you should first cd to /home/sales/SQL, and
then use a relative pathname:

cd /home/sales/SQL
tar -cvf /dev/rdsk/f0q18dt ./*.sql

The command will also execute faster if used with a block size of 18:

tar -cvfb /dev/rdsk/f0q18dt 18 *.sql

Since both -f and -b have to be followed by an argument, the first word (/dev/rdsk/f0q18dt)
after the option string -cvfb denotes the argument for -f, and the second (18) lines up with
-b.

Restoring Files

Files are restored with the –x option. When no file or directory name is specified, it restores all
files from the backup device. The following command restores the files just backed up:

# tar -xvfb /dev/rdsk/f0q18dt 18

Displaying the Archive

The –t option simply displays the contents of the device without restoring the files. When
combined with the –v option, they are displayed in a long format:

# tar -tvf /dev/rdsk/f0q18dt

Appending to the Archive

A file can be added to the archive with the –u option. Since this is a copying operation, the –c
option can’t be used in combination with this option. The unusual thing is that an archive can
contain several versions of the same file:

# tar -uf /dev/rdsk/f0q18dt ./func.sh


# tar -tvf /dev/rdsk/f0q18dt

Interactive Copying and Restoration

tar offers the facility of interactive copying and restoration. When used with the –w option, it
prints the name of the file, and prompts for the action to be taken. With this facility, the earlier
version of the file can be restored easily:

# tar -xvwf /dev/rdsk/f0q18dt ./func.sh

When there are several versions of a single file, it is better to include the verbose option so that
the modification times can be seen.

Compress and Copy

GNU tar in Linux, while lacking some facilities, offers several features, notable among which
is a simultaneous compression facility during backup. The -z option is used for compressing,
and -Z for decompressing during restoration:

tar -cvzf /dev/rct0 .

tar -xvZf /dev/rct0

cpio: Copy Input Output

The cpio command can be used to copy files to and from a backup device. It uses standard input
to take the list of filenames, and then copies them, with their contents and headers, into a single
archive that is written to the standard output. This means that cpio can be used with redirection
and piping. It uses two options, -o (output) and -i (input). Other options can be used with either
of these two options.

Backing Up Files

ls can be used to generate a list of filenames for cpio to use as input. The –o option is used to
create an archive, which can be redirected to a device file.

# ls | cpio -ov > /dev/rdsk/f0q18dt


array.pl
calendar

The -v option makes cpio operate in the verbose mode, so that each filename is seen on the
terminal as it is being copied. All cpio needs as input is a list of files to be backed up.

Incremental Backups: find can also produce a file list, so any files that satisfy its selection
criteria can also be backed up. You will frequently need to use find and cpio in combination
to back up selected files, for instance, those that have been modified in the last two days:

find . -type f -mtime -2 -print | cpio -ovB > /dev/rdsk/f0q18dt

Since the path list of find is a dot, the files are backed up with their relative pathnames.
The -B option sets the block size to 5120 bytes for input and output. For higher block sizes,
the -C option has to be used.

Multi-Volume Backups: When the created archive in the backup device is larger than the capacity
of the device, cpio prompts for inserting a new diskette into the drive.

Restoring Files

A complete archive, or selected files from it, can be restored with the -i option. To restore all
the files that were backed up with a previous cpio command, the shell's redirection operator (<)
must be used to take input from the device:

# cpio -iv < /dev/rdsk/f0q18dt


array.pl

calendar

Other options

 The –r option lets you rename each file before starting the copying process.
 The –f option, followed by an expression, causes cpio to select all files except those in
the expression:

cpio -ivf "*.C" < /dev/rdsk/f0q18dt

 The –c option tells cpio to use ASCII characters, rather than the binary format,
for creating headers.

Displaying the Archive

The –t option displays the contents of the device without restoring the files. This option must be
combined with the –i option:

# cpio -itv </dev/rdsk/f0q18dt

7.4 Security

We need to address two types of security: physical and logical. Physical security is concerned
with where the machine is located and the access controls to the machine. Logical security
addresses security in the software, such as user names and passwords.

The physical security issues are as follows:

 Physical location of the machine


 Possibility of removal of the machine
 Access to distribution and backup media

To address these issues, distribution and backup media should be identified and kept under lock
and key to prevent unauthorized access.

The logical security issues are as follows:

 User Account Management


 Password Management
 Educate the users about the importance of security

Chapter 8
8. Communication

8.1 Basic UNIX mail


Computer systems that communicate with each other to pass UNIX mail actually use a language,
or protocol, called the Simple Mail Transfer Protocol (SMTP). Communication between UNIX
and non-UNIX mail systems, which generally use non-SMTP protocols, requires an e-mail
gateway. The gateway acts as a translator, speaking both languages.

UNIX Mail Concepts

UNIX mail is built on the concept of a mailbox, which is the repository for your mail messages.
Messages that you receive and want to keep for later reference are stored in mail files. The
directory in which a mail file resides is called a mail folder.

Starting mail

To start mail, enter mail at your shell’s command line. If you have mail in your mail box, the
mail headers will appear on the screen.

% mailx
% mailx
mailx version 5.0 Mon Sep 27 07:25:51 PDT 1993 Type ? for help.
"/var/mail/steve": 10 messages 2 new 0 unread
N 41 sun-managers-relay Fri Aug 26 16:58 41/1869 DLT on solaris 1.x or 2.x
N 42 ron Fri Aug 26 17:08 17/435 my phone number
{mail}&

mail displays the version and date of the mail program that you are running, and the name of the
mailbox you are reading. mail then displays the total number of messages, and the number of new
and unread messages. The first column indicates the status of the message.

N New, unread messages


O Old messages
R New messages that have been read
U Unread messages

The second column shows the message number. The third column shows the sender's name. The
fourth column shows the date and time that the message arrived in your mailbox, and the size of
the message, in number of lines as well as number of bytes. The last column shows the subject
line of the mail message. At the end of the header display, the mail program shows the command
prompt ({mail}&) and waits for you to enter a command.

Reading Your Mail

To read a specific message shown from the mail header display, enter the message number at the
command prompt. The message is displayed one page at a time for you to read. Your mail
program may be set to use a number of paging commands, such as more or pg, so the specific key
used to move to the next page may differ.

The displayed message consists of two parts, the message header and the body of the message.
The message header contains information such as who sent the message, when the message was
sent, who the recipient is, what the subject of the message is, and who, if anyone, is on the carbon
copy list.

Composing Mail

To compose a message to be sent by mail, enter the following, where username@mailaddress is
the e-mail address of the recipient:

{Mail}& mail username@mailaddress
Subject: My address

You may now type your message. Once you have finished your message, press Ctrl+D on a new
line. Your message is then delivered, and you are back at the {mail}& prompt.

Mail Headers

At any {mail}& prompt, enter h to redisplay the mail headers. To scroll forward one screen of
messages, enter z; the next group of mail headers is displayed. To scroll backward one screen
of messages, enter z-. If you want to display the headers for the group of messages containing a
specific message number, enter h followed by the message number.
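
For example (the message number is illustrative):

{mail}& z       Display the next screen of headers
{mail}& z-      Go back one screen of headers
{mail}& h 15    Display the headers for the group containing message 15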

Replying to Mail

The mail program provides a number of choices on how to reply: you may reply to the sender of
the message, or you may reply to all of the recipients of the message as well as the sender.

{mail}& r      Reply to the sender of the message after you have read it.

{mail}& r 13   Reply to the sender of the specified message number.

{mail}& R      Reply to the sender and all of the recipients.

Deleting Messages

To delete the message that you read, enter the following

{mail}& d

This message is now marked for deletion. The message will be permanently removed from your
mailbox when you quit mail. Multiple messages may be deleted with one command by
specifying the message numbers, separated by spaces, on the command line.
Example:

{mail}& d 2 4 5

A series of messages may also be deleted by inserting a dash between the starting and ending
message numbers.
Example:

{mail}& d 8-46

Deleted mail messages can be undeleted using the u (undelete) mail command.
Example:

{mail}& u 11

Saving Mail Messages

To save a mail message into a specific mail file, type s followed by the message number and the
mail file name. For example, enter the following to save message 6 into a mail file called UNIX:

{mail}& s 6 UNIX

To view the mail file, you specify a mail file name on the UNIX command line. For example, if
you want to view the messages in your personnel mail file, enter the following:

% mail -f personnel

Once you have finished reading your mail and deleting or saving messages, you need to end your
mail session. Use q to quit and commit your deletions (x exits without saving any changes):

{mail}& q
%

Advanced features

You can also send mail from the UNIX command line by composing a specific mail message, or
by redirecting the output from another UNIX program into mail.

Mailing a Single Message

Mailing a single message from the command line is as simple as running the mail command and
specifying the address of the recipients on the command line.
Example:

% mail steve
Subject: Phone number
Our new office phone number is 7548776. Please update your records.
.
Cc:
%
You can also have the output of a UNIX command mailed to you or to a specific user; this
capability is often used to mail the results of a program to yourself or to another user.
Example:

% ls -al | mail -s "Steve's home directory" joe@sun.com

In this example, the user is sending a directory listing of the current directory to a user
joe@sun.com. The –s option enables a subject: heading to be added to the message.

When you are composing a mail message, you may want to include the text from another message
for the recipient to view or reference. To include a mail message in the current message, enter the
following on a blank line:

~m message number

If you want to include a text file in your mail message, enter the following:

~r filename

Decoding an Encoded mail Message

If you receive an encoded mail message, its contents are valuable to you only if you can extract
the file. If the file has been encoded with the uuencode command, you should be able to read its
initial encoding line, which has the following format:

begin xyz filename

where xyz is the UNIX numerical representation of the file modes, and filename is the name of
the extracted file. The first step in decoding the mail message is to save the message to a
temporary mail file. Once the temporary mail file is created, you must run the uudecode
command from your UNIX shell. The uudecode command's only argument is the name of the
file containing the encoded message. The uudecode command will search for the begin statement
in the body of the message, discard all of the mail headers, decode the encoded file, and then
create a file with permissions and a file name that match those of the begin statement.
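
A minimal sketch of the procedure (the message number and temporary file name are
illustrative):

{mail}& s 4 tempfile
{mail}& q
% uudecode tempfile

Because uudecode discards everything before the begin statement, saving the whole message,
mail headers included, is sufficient.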

Handling Large files in mail

Many mail delivery systems have a limit as to the size of a mail message that they will process.
Two common ways of mailing large files are to compress them and to split them.

File compression uses various algorithms to detect repeated patterns in files and to represent
these patterns by shorter character strings. To send a compressed binary file, you must compress
the file before beginning the encoding process.
Example:

% ls -al csh*
-rw-r--r--  1 steve  2887 Aug 28 09:23 cshrc
% compress cshrc
% ls -al csh*
-rw-r--r--  1 steve  1699 Aug 28 09:23 cshrc.Z
% uuencode cshrc.Z cshrc.Z > cshrc.Z.uu
% ls -al csh*
-rw-r--r--  1 steve  1699 Aug 28 09:23 cshrc.Z
-rw-r--r--  1 steve  2368 Aug 28 09:24 cshrc.Z.uu

After a file is in compressed, uuencoded format, you can include it in your mail message and send
it. Compressing a file with the compress command creates a new file with a .Z extension.
The split command divides a file into individual files of a user-supplied length, each of which
can then be included in its own mail message. The wc command can be used to count the number
of lines in the encoded file.
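
For example, assuming a large encoded file called bigfile.uu (the file name and counts are
illustrative):

% wc -l bigfile.uu
    2800 bigfile.uu
% split -l 1000 bigfile.uu part.
% ls part.*
part.aa  part.ab  part.ac

Each part. file can then be mailed separately; the recipient reassembles the pieces with cat
before running uudecode.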

8.2 Communicating with other users

This section covers the means by which an administrator, or user, can communicate with another
user, other than by e-mail.

Writing to Other Users

The way to send a message to another user currently on the system is with the write command.
Use who to find out whether a user is on the system, and then invoke write.
Example:

$ who
hanna ttya Aug 27 06:35
evan ttyb Aug 31 19:24
$

$ write cliff
cliff is not logged on
$

$ write evan

At this point, the other user’s terminal beeps twice and displays the following:
Message from {you} on {node name} {terminal} [ {date} ]…
Message from hanna on NRP (ttya) [ Tue Aug 31 06:41:53 ]…

The prompt disappears from both terminals. Begin typing your message; the write utility copies
the lines from your terminal to the terminal of the other user every time the Return key is pressed.

Using talk

talk is an interactive form of write that exists in many versions of UNIX. If a network is present,
you can write or talk to a user on a system other than the one you are on by specifying the user
and machine name, separated by the at sign (@).
Example:
$ write cliff@NEWRIDER

To prevent other users from writing messages to you, and interrupting that very important job on
which you are working, use mesg n. mesg permits or denies messages. When no parameter is
given, it tells you the status of the mesg flag, as in the following example:
$ mesg
is y
$

With an argument of n, it prevents messages by revoking the write permission on the user's
terminal. With an argument of y, it reinstates the permission.
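
Example:

$ mesg n
$ mesg
is n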

To see who is accepting or denying messages through the use of write, use the who -T
command.

$ who -T
hanna + ttya Aug 2 08:42
root - ttyb Aug 5 19:24

Here, a - sign between the user name and the terminal port indicates that the user is not accepting
messages, while a + sign indicates that he can receive messages.

When there is a message for everyone, you can use wall to send a message, but it only goes to the
users currently logged on. If they are not currently logged on, you can send a mail to them, but
that means you have to send mail to everyone.

With news, only one file is created and all users read the same file. This reduces the amount of
clutter on a system. When a user logs in, if the news command is in his login routine, the
contents of any files in /usr/news or /var/news are displayed on his screen. Once he has seen a
news item, his name is removed from the list of users needing to see it, and the file is not shown
to him again. If Delete is pressed while a news item is being shown, its display stops and the
next item is started.

Using echo

To utilize echo, you need to know which terminal the user is using and how to address it. who
will show you where the user is, and tty gives the full address.

Example:

$ who
user1 tty8 Dec 30 08:30
user2 tty4 Dec 30 07:20
$

Although the terminal is abbreviated tty8, the true address is slightly longer.

$ tty
/dev/tty8
$

Using this complete address, a message can be sent with the following command.

$ echo “Hello” > /dev/tty8

To have a beep sound also,

$ echo “\7\7 Hello” > /dev/tty8

To send the output of a command to another screen, backquotes (command substitution) are used.

$ echo `date` > /dev/tty8

8.3 Accessing the Internet
The Internet is a community of people who do things their way because, for better or worse, that
is the way they like them. Internet Protocols are the low-level tools that bind the machines on the
Internet into a useful whole. IPs specify the kinds of communications that can occur between
machines and how connections can be made to allow those communications. To be on the
Internet, a machine must support IP. The most important of these protocols is the Transmission
Control Protocol / Internet Protocol pair (TCP/IP).

The Internet treats all of its communications as packets of data, each of which has an address.
Machines on the Internet maintain tables that describe addresses of local and remote machines,
and routes for packets.

The Internet has its roots in a networking initiative and its associated protocols. The Internet and
its protocols grow and evolve in response to user needs and currently available resources.
Mechanisms are available for formalizing, altering, and replacing IPs. The protocols themselves
are stated in documents freely available from various sites on the Internet. You can see a list of
the ones on your machine by entering the command:

% cat /etc/protocols

File Encoding

A file might be encoded on the Internet for different reasons: to assure its privacy, to encapsulate
it in an archive, to compress it, or to send a binary file using an ASCII transmission method like
mail or Netnews.

As long as both the sender and the receiver have e-mail, binary files can be exchanged by using
the uuencode and uudecode commands. These programs convert an arbitrary stream of bytes into
ASCII and back again.
To use uuencode, type the following:

% uuencode file label > out_file

where file is the file to be encoded and label is the name the file will have when it is decoded
with uudecode.
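
For example (the file names are illustrative), to mail a compiled program and recover it on the
receiving side:

% uuencode a.out myprog > myprog.uu
% mail joe@sun.com < myprog.uu

The recipient saves the received message to a file and runs uudecode on it, which recreates the
binary under the name myprog.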

Using ping

ping provides the Internet version of a "hello, are you there" query. ping sends network packets
to a machine on the Internet, which you designate either by name or by address. In the example
below, ping sends five packets, and each one arrives at its destination successfully.
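
A sketch of a typical session (the host name, address, and timings are illustrative):

% ping -s bigbiz.com
PING bigbiz.com: 56 data bytes
64 bytes from bigbiz.com (192.9.200.1): icmp_seq=0. time=3. ms
64 bytes from bigbiz.com (192.9.200.1): icmp_seq=1. time=2. ms
...
----bigbiz.com PING Statistics----
5 packets transmitted, 5 packets received, 0% packet loss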

finger Command

The finger command can be used over the Internet to show you information about users on other
machines. The exact information you receive depends on the command options you use and on
what the user you are asking about has made available in his .plan and .project files. For
example, if you want to see who is currently logged on to a machine on the Internet, you can type
the following:

% finger @bigbiz.com

The result is:

Login   Name    TTY   Idle   When        Where
buff    boon    p2    1      Tue 19:26   bigbiz.com
ijfj    sanju   q3    21     Mon 08:45   bigbiz.com

finger can also show information about a specific user, whether or not that user is currently
logged in. Entering

% finger buff@bigbiz.com

might show the following:

Login name: buff
Directory: /user/mnmt/buff
On since Jan 26 14:32:05 on ttyrc
4 days 23 hours Idle Time
Plan:
Manager, Big biz company
Mail Code: 2746
Extension: 134 (phone: 6425458)
Office: HU, 43665e
Motto: “If it’s big business, it’s our business!”

8.4 e-mail on the Internet
Sending and receiving messages electronically is certainly one of the most visible and attractive
benefits of computer networking. The propagation of e-mail requires different mechanisms than
its creation and use. Electronic mail can originate from hand-held computers, home computers,
desktops, terminals, and so on. All of these machines can be on networks other than the Internet,
and certainly not every one of them uses UNIX and mail; each uses its own mail programs.

E-mail Address Directories

No single directory can be searched for an individual's e-mail address. Two resources on the
Internet might help you locate an e-mail address: the whois program and the Knowbot
Information Service offered by the Corporation for National Research Initiatives (CNRI).

Using whois

The whois program searches a database for matches to a name you type in at the command
line. To search for all records that match the string bon, type the following:

$ whois bon

The result is:


$ whois bon
Using default whois server rs.inter.net

Bon, Naida (NB76)   nab@sss.com   +1 xxx xxx xxx
Bon, Paul (PB61)    pwb@sss.com   (xxx) xxxx-xxx

To get help on the current state of the whois command, enter:

$ whois help

Using CNRI’s knowbot

It is possible to type in the target string for which you seek matches just once and have many
different databases searched. By submitting a single query to KIS (the Knowbot Information
Service), a user can search a set of remote "white pages" services and see the results of the search
in a uniform format.

Start up a telnet session. At the prompt, enter the knowbot address.

% telnet
telnet> open info.cnri.reston.va.us

At the KIS prompt, type the following:

>query bon

You see the output shown previously, and much more besides, since KIS searches several
different databases.

Mail Lists

Mailing lists consist of users with a common interest in some topic and a desire to read
everything that other people have to say on the topic. Some lists are private, of course, but many
are open to anyone who wishes to join.

Mail Servers

Mail servers are programs that distribute files or information. They respond to e-mail messages
that conform to a specific syntax by mailing the requested files or information back to the
sender.

Chapter 9
9. Makefile concepts
9.1 Introduction
You need a file called a makefile to tell make what to do. Most often, the makefile tells make
how to compile and link a program.

In this chapter, we will discuss a simple makefile that describes how to compile and link a text
editor application that consists of eight C source files and three header files. The makefile can
also tell make how to run miscellaneous commands when explicitly asked (for example, to
remove certain files as a clean-up operation).

When make recompiles the editor, each changed C source file must be recompiled. If a header file
has changed, each C source file that includes the header file must be recompiled to be safe. Each
compilation produces an object file corresponding to the source file. Finally, if any source file has
been recompiled, all the object files, whether newly made or saved from previous compilations,
must be linked together to produce the new executable editor.

A simple makefile consists of "rules" with the following shape:

target ... : prerequisites ...
        command
        ...
A target is usually the name of a file that is generated by a program; examples of targets are
executable or object files. A target can also be the name of an action to carry out, such as `clean'
(see section Phony Targets).

A prerequisite is a file that is used as input to create the target. A target often depends on several
files.

A command is an action that “make” carries out. A rule may have more than one command, each
on its own line. Usually a command is in a rule with prerequisites and serves to create a target file
if any of the prerequisites change. However, the rule that specifies commands for the target need
not have prerequisites. For example, the rule containing the delete command associated with the
target `clean' does not have prerequisites.

A rule, then, explains how and when to remake certain files that are the targets of the particular
rule. make carries out the commands on the prerequisites to create or update the target. A rule can
also explain how and when to carry out an action.

A makefile may contain other text besides rules, but a simple makefile need only contain rules.
Rules may look somewhat more complicated than shown in this template, but all fit the pattern
more or less.

Here is a straightforward makefile that describes the way an executable file called edit depends
on eight object files that, in turn, depend on eight C source and three header files.

In this example, all the C files include `defs.h', but only those defining editing commands include
`command.h', and only low level files that change the editor buffer include `buffer.h'.

edit : main.o kbd.o command.o display.o \
       insert.o search.o files.o utils.o
        cc -o edit main.o kbd.o command.o display.o \
              insert.o search.o files.o utils.o

main.o : main.c defs.h
        cc -c main.c
kbd.o : kbd.c defs.h command.h
        cc -c kbd.c
command.o : command.c defs.h command.h
        cc -c command.c
display.o : display.c defs.h buffer.h
        cc -c display.c
insert.o : insert.c defs.h buffer.h
        cc -c insert.c
search.o : search.c defs.h buffer.h
        cc -c search.c
files.o : files.c defs.h buffer.h command.h
        cc -c files.c
utils.o : utils.c defs.h
        cc -c utils.c
clean :
        rm edit main.o kbd.o command.o display.o \
              insert.o search.o files.o utils.o

We split each long line into two lines using backslash-newline; this is like using one long line, but
is easier to read.

To use this makefile to create the executable file called `edit', type: make

To use this makefile to delete the executable file and all the object files from the directory, type:
make clean

In the example makefile, the targets include the executable file `edit', and the object files `main.o'
and `kbd.o'. The prerequisites are files such as `main.c' and `defs.h'. In fact, each `.o' file is both a
target and a prerequisite. Commands include `cc -c main.c' and `cc -c kbd.c'.

When a target is a file, it needs to be recompiled or relinked if any of its prerequisites change. In
addition, any prerequisites that are themselves automatically generated should be updated first. In
this example, `edit' depends on each of the eight object files; the object file `main.o' depends on
the source file `main.c' and on the header file `defs.h'.

A shell command follows each line that contains a target and prerequisites. These shell commands
say how to update the target file. A tab character must come at the beginning of every command
line to distinguish command lines from other lines in the makefile.

The target `clean' is not a file, but merely the name of an action. Since you normally do not want
to carry out the actions in this rule, `clean' is not a prerequisite of any other rule. Consequently,
make never does anything with it unless you tell it specifically. Note that this rule not only is not
a prerequisite, it also does not have any prerequisites, so the only purpose of the rule is to run the
specified commands. Targets that do not refer to files but are just actions are called phony
targets.

By default, make starts with the first target (not targets whose names start with `.'). This is called
the default goal. Goals are the targets that make strives ultimately to update.

In the simple example of the previous section, the default goal is to update the executable
program `edit'; therefore, we put that rule first.

Thus, when you give the command:

make

make reads the makefile in the current directory and begins by processing the first rule. In the
example, this rule is for relinking `edit'; but before make can fully process this rule, it must
process the rules for the files that `edit' depends on, which in this case are the object files. Each of
these files is processed according to its own rule. These rules say to update each `.o' file by
compiling its source file. The recompilation must be done if the source file, or any of the header
files named as prerequisites, is more recent than the object file, or if the object file does not exist.

The other rules are processed because their targets appear as prerequisites of the goal. If some
other rule is not depended on by the goal (or anything it depends on, etc.), that rule is not
processed, unless you tell make to do so (with a command such as make clean).

Before recompiling an object file, make considers updating its prerequisites, the source file and
header files. This makefile does not specify anything to be done for them--the `.c' and `.h' files are
not the targets of any rules--so make does nothing for these files. But make would update
automatically generated C programs, such as those made by Bison or Yacc, by their own rules at
this time.

After recompiling whichever object files need it, make decides whether to relink `edit'. This must
be done if the file `edit' does not exist, or if any of the object files are newer than it. If an object
file was just recompiled, it is now newer than `edit', so `edit' is relinked.

Thus, if we change the file `insert.c' and run make, make will compile that file to update `insert.o',
and then link `edit'. If we change the file `command.h' and run make, make will recompile the
object files `kbd.o', `command.o' and `files.o' and then link the file `edit'.

9.2 Using Variables in Makefile
In our example, we had to list all the object files twice in the rule for `edit' (repeated here):

edit : main.o kbd.o command.o display.o \
       insert.o search.o files.o utils.o
        cc -o edit main.o kbd.o command.o display.o \
              insert.o search.o files.o utils.o

Such duplication is error-prone; if a new object file is added to the system, we might add it to one
list and forget the other. We can eliminate the risk and simplify the makefile by using a variable.

Variables allow a text string to be defined once and substituted in multiple places later.

It is standard practice for every makefile to have a variable named objects, OBJECTS, objs,
OBJS, obj, or OBJ which is a list of all object file names. We would define such a variable
objects with a line like this in the makefile:

objects = main.o kbd.o command.o display.o \
          insert.o search.o files.o utils.o

Then, in each place where we want to put a list of the object file names, we can substitute the
variable's value by writing `$(objects)'.

Here is how the complete simple makefile looks when you use a variable for the object files:

objects = main.o kbd.o command.o display.o \
          insert.o search.o files.o utils.o

edit : $(objects)
        cc -o edit $(objects)
main.o : main.c defs.h
        cc -c main.c
kbd.o : kbd.c defs.h command.h
        cc -c kbd.c
command.o : command.c defs.h command.h
        cc -c command.c
display.o : display.c defs.h buffer.h
        cc -c display.c
insert.o : insert.c defs.h buffer.h
        cc -c insert.c
search.o : search.c defs.h buffer.h
        cc -c search.c
files.o : files.c defs.h buffer.h command.h
        cc -c files.c
utils.o : utils.c defs.h
        cc -c utils.c
clean :
        rm edit $(objects)

It is not necessary to spell out the commands for compiling the individual C source files, because
make can figure them out: it has an implicit rule for updating a `.o' file from a correspondingly
named `.c' file using a `cc -c' command. For example, it will use the command `cc -c main.c -o
main.o' to compile `main.c' into `main.o'. We can therefore omit the commands from the rules for
the object files.

When a `.c' file is used automatically in this way, it is also automatically added to the list of
prerequisites. We can therefore omit the `.c' files from the prerequisites, provided we omit the
commands.

Here is the entire example, with both of these changes, and a variable objects as suggested above:

objects = main.o kbd.o command.o display.o \
          insert.o search.o files.o utils.o

edit : $(objects)
        cc -o edit $(objects)

main.o : defs.h
kbd.o : defs.h command.h
command.o : defs.h command.h
display.o : defs.h buffer.h
insert.o : defs.h buffer.h
search.o : defs.h buffer.h
files.o : defs.h buffer.h command.h
utils.o : defs.h

.PHONY : clean
clean :
        -rm edit $(objects)

This is how we would write the makefile in actual practice.

9.3 Writing Makefiles

The information that tells make how to recompile a system comes from reading a database called
the makefile.

What Makefiles Contain

Makefiles contain five kinds of things: explicit rules, implicit rules, variable definitions,
directives, and comments. Rules, variables, and directives are described at length in later
chapters.

• An explicit rule says when and how to remake one or more files, called the rule's targets.
It lists the other files that the targets depend on, called the prerequisites of the target, and
may also give commands to use to create or update the targets. See section Writing Rules.
• An implicit rule says when and how to remake a class of files based on their names. It
describes how a target may depend on a file with a name similar to the target and gives
commands to create or update such a target. See section Using Implicit Rules.
• A variable definition is a line that specifies a text string value for a variable that can be
substituted into the text later. The simple makefile example shows a variable definition
for objects as a list of all object files (see section Variables Make Makefiles Simpler).
• A directive is a command for make to do something special while reading the makefile.
These include:
o Reading another makefile (see section Including Other Makefiles).
o Deciding (based on the values of variables) whether to use or ignore a part of the
makefile (see section Conditional Parts of Makefiles).
o Defining a variable from a verbatim string containing multiple lines (see section
Defining Variables Verbatim).
• `#' in a line of a makefile starts a comment. It and the rest of the line are ignored, except
that a trailing backslash not escaped by another backslash will continue the comment
across multiple lines. Comments may appear on any of the lines in the makefile, except
within a define directive, and perhaps within commands (where the shell decides what is
a comment). A line containing just a comment (with perhaps spaces before it) is
effectively blank, and is ignored.
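
A small illustration of comment handling (the contents are arbitrary):

# This comment ends with a backslash, \
so it continues onto this line.
objects = main.o kbd.o        # an inline comment after a definition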

What Name to Give Your Makefile

By default, when make looks for the makefile, it tries the following names, in order: `makefile'
and `Makefile'.

Normally you should call your makefile either `makefile' or `Makefile'.

If make finds none of these names, it does not use any makefile. Then you must specify a goal
with a command argument, and make will attempt to figure out how to remake it using only its
built-in implicit rules.

If you want to use a nonstandard name for your makefile, you can specify the makefile name with
the `-f' or `--file' option. The arguments `-f name' or `--file=name' tell make to read the file name
as the makefile. If you use more than one `-f' or `--file' option, you can specify several makefiles.

All the makefiles are effectively concatenated in the order specified. The default makefile names
`makefile' and `Makefile' are not checked automatically if you specify `-f' or `--file'.

Including Other Makefiles

The include directive tells make to suspend reading the current makefile and read one or more
other makefiles before continuing. The directive is a line in the makefile that looks like this:

include filenames...

filenames can contain shell file name patterns.

Extra spaces are allowed and ignored at the beginning of the line, but a tab is not allowed. (If the
line begins with a tab, it will be considered a command line.) Whitespace is required between
include and the file names, and between file names; extra whitespace is ignored there and at the
end of the directive. A comment starting with `#' is allowed at the end of the line. If the file names
contain any variable or function references, they are expanded.

For example, if you have three `.mk' files, `a.mk', `b.mk', and `c.mk', and $(bar) expands to bish
bash, then the following expression

include foo *.mk $(bar)

is equivalent to

include foo a.mk b.mk c.mk bish bash


When make processes an include directive, it suspends reading of the containing makefile and
reads from each listed file in turn. When that is finished, make resumes reading the makefile in
which the directive appears.

One occasion for using include directives is when several programs, handled by individual
makefiles in various directories, need to use a common set of variable definitions or pattern rules.

Another such occasion is when you want to generate prerequisites from source files
automatically; the prerequisites can be put in a file that is included by the main makefile. This
practice is generally cleaner than that of somehow appending the prerequisites to the end of the
main makefile as has been traditionally done with other versions of make.

If the specified name does not start with a slash, and the file is not found in the current directory,
several other directories are searched. First, any directories you have specified with the `-I' or
`--include-dir' option are searched (see section Summary of Options). Then the following
directories (if they exist) are searched, in this order: `prefix/include' (normally
`/usr/local/include'), `/usr/gnu/include', `/usr/local/include', `/usr/include'.

If an included makefile cannot be found in any of these directories, a warning message is
generated, but it is not an immediately fatal error; processing of the makefile containing the
include continues. Once it has finished reading makefiles, make will try to remake any that are
out of date or don't exist. See section How Makefiles Are Remade. Only after it has tried to find a
way to remake a makefile and failed will make diagnose the missing makefile as a fatal error.

If you want make to simply ignore a makefile which does not exist and cannot be remade, with no
error message, use the -include directive instead of include, like this:

-include filenames...

This acts like include in every way except that there is no error (not even a warning) if any of
the filenames do not exist. For compatibility with some other make implementations, sinclude is
another name for -include.
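
For example, to read an optional dependency file that may not have been generated yet (the file
name is illustrative):

-include depend.mk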

The Variable MAKEFILES

If the environment variable MAKEFILES is defined, make considers its value as a list of names
(separated by whitespace) of additional makefiles to be read before the others. This works much
like the include directive: various directories are searched for those files. In addition, the default
goal is never taken from one of these makefiles and it is not an error if the files listed in
MAKEFILES are not found.

The main use of MAKEFILES is in communication between recursive invocations of make. It
usually is not desirable to set the environment variable before a top-level invocation of make,
because it is usually better not to mess with a makefile from outside. However, if you are running
make without a specific makefile, a makefile in MAKEFILES can do useful things to help the
built-in implicit rules work better, such as defining search paths.

Some users are tempted to set MAKEFILES in the environment automatically on login, and
program makefiles to expect this to be done. This is a very bad idea, because such makefiles will
fail to work if run by anyone else. It is much better to write explicit include directives in the
makefiles.

How Makefiles Are Remade

Sometimes makefiles can be remade from other files, such as RCS or SCCS files. If a makefile
can be remade from other files, you probably want make to get an up-to-date version of the
makefile to read in.

To this end, after reading in all makefiles, make will consider each as a goal target and attempt to
update it. If a makefile has a rule which says how to update it (found either in that very makefile
or in another one) or if an implicit rule applies to it, it will be updated if necessary. After all
makefiles have been checked, if any have actually been changed, make starts with a clean slate
and reads all the makefiles over again. (It will also attempt to update each of them over again, but
normally this will not change them again, since they are already up to date.)

If you know that one or more of your makefiles cannot be remade and you want to keep make
from performing an implicit rule search on them, perhaps for efficiency reasons, you can use any
normal method of preventing implicit rule lookup to do so. For example, you can write an explicit
rule with the makefile as the target, and an empty command string.
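
A minimal sketch of that idiom (here the makefile is called `Makefile'):

# An explicit rule with an empty command string; make will never
# search for an implicit rule to remake this makefile.
Makefile : ;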

If the makefiles specify a double-colon rule to remake a file with commands but no prerequisites,
that file will always be remade. In the case of makefiles, a makefile that has a double-colon rule
with commands but no prerequisites will be remade every time make is run, and then again after
make starts over and reads the makefiles in again. This would cause an infinite loop: make would
constantly remake the makefile, and never do anything else. So, to avoid this, make will not
attempt to remake makefiles which are specified as targets of a double-colon rule with commands
but no prerequisites.

If you do not specify any makefiles to be read with `-f' or `--file' options, make will try the default
makefile names; see section What Name to Give Your Makefile. Unlike makefiles explicitly
requested with `-f' or `--file' options, make is not certain that these makefiles should exist.
However, if a default makefile does not exist but can be created by running make rules, you
probably want the rules to be run so that the makefile can be used.

Therefore, if none of the default makefiles exists, make will try to make each of them in the same
order in which they are searched for until it succeeds in making one, or it runs out of names to try.

When you use the `-t' or `--touch' option, you would not want to use an out-of-date makefile to
decide which targets to touch. So the `-t' option has no effect on updating makefiles; they are
really updated even if `-t' is specified. Likewise, `-q' (or `--question') and `-n' (or `--just-print') do
not prevent updating of makefiles, because an out-of-date makefile would result in the wrong
output for other targets. Thus, `make -f mfile -n foo' will update `mfile', read it in, and then print
the commands to update `foo' and its prerequisites without running them. The commands printed
for `foo' will be those specified in the updated contents of `mfile'.

However, on occasion you might actually wish to prevent updating of even the makefiles. You
can do this by specifying the makefiles as goals in the command line as well as specifying them
as makefiles. When the makefile name is specified explicitly as a goal, the options `-t' and so on
do apply to them.

Thus, `make -f mfile -n mfile foo' would read the makefile `mfile', print the commands needed to
update it without actually running them, and then print the commands needed to update `foo'
without running them. The commands for `foo' will be those specified by the existing contents of
`mfile'.

How make Reads a Makefile

“make” does its work in two distinct phases. During the first phase it reads all the makefiles,
included makefiles, etc. and internalizes all the variables and their values, implicit and explicit
rules, and constructs a dependency graph of all the targets and their prerequisites. During the
second phase, make uses these internal structures to determine what targets will need to be rebuilt
and to invoke the rules necessary to do so.

It's important to understand this two-phase approach because it has a direct impact on how
variable and function expansion happens; this is often a source of some confusion when writing
makefiles. We say that expansion is immediate if it happens during the first phase: in this case
make will expand any variables or functions in that section of a construct as the makefile is
parsed. We say that expansion is deferred if it is not performed immediately. Expansion of a
deferred construct is not performed until either the construct appears later in an immediate
context, or until the second phase.

Variable Assignment

Variable definitions are parsed as follows:

immediate = deferred
immediate ?= deferred
immediate := immediate
immediate += deferred or immediate

define immediate
deferred
endef

For the append operator, `+=', the right-hand side is considered immediate if the variable was
previously set as a simple variable (`:='), and deferred otherwise.
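
A small illustration of the two flavors (the variable names are arbitrary):

x := foo          # immediate: the value "foo" is fixed at parse time
y = $(x) bar      # deferred: $(x) is expanded each time y is used
x := later

# $(y) now expands to "later bar"; had y been defined with `:=',
# it would have captured "foo bar" when the makefile was parsed.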

Conditional Syntax

All instances of conditional syntax are parsed immediately, in their entirety; this includes the
ifdef, ifeq, ifndef, and ifneq forms.

Rule Definition

A rule is always expanded the same way, regardless of the form:

immediate : immediate ; deferred
        deferred

That is, the target and prerequisite sections are expanded immediately, and the commands used to
construct the target are always deferred. This general rule is true for explicit rules, pattern rules,
suffix rules, static pattern rules, and simple prerequisite definitions.

9.4 Sample Makefile

# Put all your source files here.
SRC=main.c source1.c source2.cpp
OBJ1=$(SRC:.c=.o)
OBJ=$(OBJ1:.cpp=.o)

# This is the name of your output file.
OUT=runme

# This specifies all your include directories.
INCLUDES=-I/usr/local/include -I.

# Put any flags you want to pass to the C compiler here.
CFLAGS=-g -O2 -Wall

# And put any C++ compiler flags here.
CCFLAGS=$(CFLAGS)

# CC specifies the name of the C compiler; CCC is the C++ compiler.
CC=cc
CCC=CC

# Put any libraries here.
LIBS=-L/usr/local/lib -lm
LDFLAGS=

##### RULES #####

# All rules are in the format:
#
#   item: [dependency list]
#           command
#
# This means that "item" depends on what's in the dependency list; in other
# words, before "item" can be built, everything in the dependency list must
# be up to date.
# Note that the whitespace before "command" MUST be a tab, not a set of spaces!

.SUFFIXES: .o .c .cpp

default: dep $(OUT)
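
# In the suffix rules below, $< expands to the name of the source file
# being compiled and $@ to the name of the file being built.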

.c.o:
        $(CC) $(INCLUDES) $(CFLAGS) -c $< -o $@

.cpp.o:
        $(CCC) $(INCLUDES) $(CCFLAGS) -c $< -o $@

$(OUT): $(OBJ)
        $(CC) $(OBJ) $(LDFLAGS) $(LIBS) -o $(OUT)

depend: dep

dep:
        makedepend -- $(CFLAGS) -- $(INCLUDES) $(SRC)

clean:
        /bin/rm -f $(OBJ) $(OUT)
