1. Introduction to Solaris
1.1. History of UNIX
UNIX originated as a research project at AT&T Bell Labs in 1969. In 1976, it was made available at no charge to
universities and thus became the basis for many operating systems classes and academic research projects.
As the UNIX OS offered by AT&T evolved and matured, it became known as System V (five) UNIX. As the
developer of UNIX, AT&T licensed other entities to produce their own versions of the UNIX OS. One of the more
popular of these licensed UNIX variants was developed by The University of California at Berkeley Computer
Science Research Group. The Berkeley UNIX variant was dubbed Berkeley Software Distribution (BSD) UNIX.
The BSD version of UNIX rapidly incorporated networking, multiprocessing, and other innovations, which
sometimes led to instability. In an academic environment these temporary instabilities were not considered major
problems, and researchers embraced the quickly evolving BSD UNIX environment. In contrast, corporate computing centers were wary of converting to an OS with a history of instability.
Unlike BSD UNIX, AT&T's System V UNIX offered stability and standardization. New capabilities were introduced at a slower rate, often after evaluating the results of introducing the same capabilities in the BSD UNIX releases. Corporate computing centers tended to favor the stability of AT&T's version of UNIX over that of BSD UNIX.
Solaris can seem foreign to BSD system administrators. SunOS is the heart of the Solaris Operating Environment (OE). Like all OSs, SunOS is a collection of
software that manages system resources and schedules system operations.
Command Line Interface
A command line interface (CLI) enables users to type commands in a terminal or console window to interact with
an operating system. Users respond to a visual prompt by typing a command on a specified line, and receive a
response back from the system. Users type a command or series of commands for each task they want to
perform.
Graphical User Interfaces
A graphical user interface (GUI) uses graphics, along with a keyboard and a mouse, to provide an easy-to-use
interface to a program. A GUI provides windows, pull-down menus, buttons, scrollbars, iconic images, wizards,
other icons, and the mouse to enable users to interact with the operating system or application. The Solaris 10
operating environment supports two GUIs, the Common Desktop Environment (CDE) and the GNOME desktop.
Common Desktop Environment
The Common Desktop Environment (CDE) provides windows, workspaces, controls, menus, and the Front Panel
to help you organize and manage your work. You can use the CDE GUI to organize your files and directories,
read, compose and send email, access files, and manage your system.
GNOME Desktop
GNOME (GNU Network Object Model Environment) is a GUI and set of computer desktop applications. You can
use the GNOME desktop, panel, applications, and tool set to customize your working environment and manage
your system tasks. GNOME also provides an application set, including a word processor, a spreadsheet program,
a database manager, a presentation tool, a Web browser, and an email program.
The Internet address, similar to a telephone number, enables hosts to communicate with one another. For
example, in the case of a long distance telephone call, the caller dials the area code, exchange number, and line
number in order to communicate with a specific telephone location. In the same way, a host's Internet address
describes where a host is on the Internet, which in turn allows network traffic to be directed to the host.
In the previous illustration, the hosts are connected to the Internet network number 192.168.0.0. The Internet address assigned to a host can, and often does, change over the host's lifetime.
Host Ethernet Address
An Ethernet address functions like a passport number in that it is a unique and permanent hardware address
assigned by the hardware manufacturer. Hosts are identified via such addresses on the Ethernet, which in turn
enables them to communicate.
For example, the /export filesystem of the boot server contains a directory known as root. The /export/root
directory contains a sub-directory for each diskless client supported by the server.
The diskless client swap is used in a fashion similar to the root area. Special swap files, one per supported
diskless client, are located under the /export/swap file system. Take a look at the illustration:
[Figure: Diskless Client and Boot Server Configuration. On the 192.168.0.0 network, the boot server's disk holds the /export/root and /export/swap file systems, which provide the root and swap areas for each diskless client.]
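The per-client layout under /export can be sketched in a few lines of Python. The client names below are hypothetical, and a temporary directory stands in for the server's real /export file system:

```python
import os
import tempfile

# A scratch directory stands in for the boot server's /export file system.
export = tempfile.mkdtemp()
os.makedirs(os.path.join(export, "root"))
os.makedirs(os.path.join(export, "swap"))

for client in ("clientA", "clientB"):                        # hypothetical client names
    os.mkdir(os.path.join(export, "root", client))           # per-client root area
    open(os.path.join(export, "swap", client), "w").close()  # per-client swap file

print(sorted(os.listdir(os.path.join(export, "root"))))
```

Each supported client gets its own subdirectory under root and its own swap file, mirroring the figure above.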
[Figure: A standalone host configuration. The host's local disk holds its own root and swap areas.]
2. Solaris Installation
2.1. Solaris Software Installation
2.1.1. The Solaris 10 OE Installation and Upgrade Options
There are a number of ways to install the Solaris 10 OE on your system. They include:
1. Solaris installation program
2. Solaris installation program over the network
3. Custom Jumpstart
4. Solaris Flash archives
5. WAN boot
6. Solaris Live Upgrade
7. Solaris Zones
Solaris Zones
After the Solaris OS is installed, you can install and configure zones. In a zones environment, the global zone is
the single instance of the operating system that is running and is contained on every Solaris system. The global
zone is both the default zone for the system and the zone that is used for system-wide administrative control.
A non-global zone is a virtualized operating system environment. Solaris Zones are a software partitioning
technology used to virtualize operating system services and provide an isolated and secure environment for
running applications. When you create a zone, you produce an application execution environment in which
processes are isolated from all other zones. This isolation prevents processes that are running in one zone from
monitoring or affecting processes that are running in any other zones. Even a process running in a non-global
zone with superuser credentials cannot view or affect activity in any other zones. A process running in the global
zone with superuser credentials can affect any process in any zone.
The global zone is the only zone from which a non-global zone can be configured, installed, managed, or
uninstalled. Only the global zone is bootable from the system hardware. Administration of the system
infrastructure, such as physical devices, routing, or dynamic reconfiguration (DR), is only possible in the global
zone. Appropriately privileged processes running in the global zone can access objects associated with any or all
other zones.
SPARC systems
1. Ensure that you have the following media.
a. For a DVD installation, the Solaris 10 Operating System for SPARC platforms DVD
b. For a CD installation, use Solaris 10 Software CDs.
2. Verify that your system meets the following minimum requirements:
a. Memory: 128 Mbytes or greater
b. Disk space: 12 Gbytes or greater
c. Processor speed: 200 MHz or greater
x86 systems
1. Ensure that you have the following media.
a. If you are installing from a DVD, use the Solaris 10 Operating System for x86
platforms DVD.
b. If you are installing from CD media, use Solaris 10 Software CDs.
c. Check your system BIOS to make sure you can boot from CD or DVD media.
2. Verify that your system meets the following minimum requirements:
a. Memory: 128 Mbytes or greater
b. Disk space: 12 Gbytes or greater
c. Processor speed: 120 MHz or greater, with hardware floating point
The table below summarizes the different software groups and their recommended disk space requirements.

Entire Solaris Software Group Plus OEM Support (6.7 Gbytes): Contains the packages for the Entire Solaris Software Group plus additional hardware drivers, including drivers for hardware that is not on the system at the time of installation.

Entire Solaris Software Group (6.5 Gbytes): Contains the packages for the Developer Solaris Software Group and additional software that is needed for servers.

Developer Solaris Software Group (6.0 Gbytes): Contains the packages for the End User Solaris Software Group plus additional support for software development. The additional software development support includes libraries, include files, man pages, and programming tools. Compilers are not included.

End User Solaris Software Group (5.0 Gbytes): Contains the packages that provide the minimum code that is required to boot and run a networked Solaris system and the Common Desktop Environment.

Core System Support Software Group (2.0 Gbytes): Contains the packages that provide the minimum code that is required to boot and run a networked Solaris system.

Reduced Network Support Software Group (2.0 Gbytes): Contains the packages that provide the minimum code that is required to boot and run a Solaris system with limited network service support. This group provides a multiuser text-based console and system administration utilities. It also enables the system to recognize network interfaces, but does not activate network services.
Installation Media
This release has one installation DVD and several installation CDs. The Solaris 10 Operating System DVD
includes the content of all the installation CDs.
Solaris Software - 1: This CD is the only bootable CD. From this CD, you can access both the Solaris installation
graphical user interface (GUI) and the console-based installation. This CD also enables you to install selected
software products from both the GUI and the console-based installation.
For both CD and DVD media, the GUI installation is the default (if your system has enough memory). However,
you can specify a console-based installation with the text boot option. The installation process has been
simplified, enabling you to select the language support at boot time, but select locales later.
To install the OS, simply insert the Solaris Software - 1 CD or the Solaris Operating System DVD and type one of
the following commands.
For the default GUI installation (if system memory permits), type boot cdrom.
For the console-based installation, type boot cdrom - text.
Regular File
The illustration above shows what a regular file contains and how one can be created.
Directory File
Directories hold a list of file names and the inode numbers associated with them.
3.1.4. Links
There are 2 types of links:
Symbolic link
Hard Link
Symbolic Link:
A symbolic link is a file that points to another file. It contains the path name of the file to which it points. The size of a symbolic link always matches the number of characters in the path name it contains. In the following example, the symbolic link called /dev/dsk/c0t0d0s0 points to the physical device ../../devices/pci@1f,0/pci@1,1/ide@3/dad@0,0:a. The size of the symbolic link is 46 bytes because that path name contains 46 characters.
# cd /dev/dsk
# ls -l c0t0d0s0
lrwxrwxrwx 1 root root 46 Oct 22 11:22 c0t0d0s0 -> ../../devices/pci@1f,0/pci@1,1/ide@3/dad@0,0:a
Creating a Symbolic Link:
The ln command with the -s option creates symbolic links:
# ln -s file1 link1
# ls -l link1
lrwxrwxrwx 1 root root 5 Oct 22 15:56 link1 -> file1
From the output, we see that link1 points to file1. (Symbolic links are similar to shortcuts in Windows.)
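The size-equals-path-length rule is easy to verify with an ordinary file; the scratch names below are illustrative, not Solaris device paths:

```python
import os
import tempfile

d = tempfile.mkdtemp()
open(os.path.join(d, "file1"), "w").close()
os.symlink("file1", os.path.join(d, "link1"))   # the link stores the text "file1"

# lstat() examines the link itself rather than following it;
# the link's size is the length of the stored path name.
size = os.lstat(os.path.join(d, "link1")).st_size
print(size)   # 5, because "file1" is 5 characters
```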
HARD LINKS
A hard link is the association between a file name and an inode. Information in each inode keeps count of the
number of file names associated with it. This is called a link count. In the output from the ls -l command, the
link count appears between the column of file permissions and the column identifying the owner. In the following
example, the file called alice uses one hard link.
# cd dir1
# touch alice
# ls -l
total 0
-rw-r--r-- 1 root root 0 Oct 22 16:18 alice
A new hard link for a file name increments the link count in the associated inode. For example:
# ln alice humpty-dumpty
# ls -l
total 0
-rw-r--r-- 2 root root 0 Oct 22 16:18 alice
-rw-r--r-- 2 root root 0 Oct 22 16:18 humpty-dumpty
# ls -li
total 0
16601 -rw-r--r-- 2 root root 0 Oct 22 16:18 alice
16601 -rw-r--r-- 2 root root 0 Oct 22 16:18 humpty-dumpty
# find . -inum 16601
./alice
./humpty-dumpty
The ln command creates new hard links to regular files. Unlike symbolic links, hard links cannot span file
systems.
3.2. Devices
A device file provides access to a device. The inode information of device files holds numbers that point to the
devices. For example:
# cd /devices/pci@1f,0/pci@1
# ls -l
total 4
drwxr-xr-x 2 root sys 512 Oct 22 13:11 pci@2
crw------- 1 root sys 115, 255 Oct 22 16:04 pci@2:devctl
drwxr-xr-x 2 root sys 512 Oct 22 13:11 scsi@1
crw------- 1 root sys 50, 0 Oct 22 16:04 scsi@1:devctl
crw------- 1 root sys 50, 1 Oct 22 16:04 scsi@1:scsi
A long listing shows two numbers:
major number
minor number
A major device number identifies the specific device driver required to access a device. A minor device number
identifies the specific unit of the type that the device driver controls.
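As a sketch of how the two numbers fit together, Python's os module can pack a (major, minor) pair into a single device number and unpack it again. The values 50 and 1 mirror the scsi@1:scsi entry in the listing above; the bit layout of the packed number is platform-specific.

```python
import os

# Pack a (major, minor) pair into one device number, then recover the parts.
dev = os.makedev(50, 1)
print(os.major(dev), os.minor(dev))
```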
Device files fall into two categories: character-special devices and block-special devices. Character-special
devices are also called character or raw devices. Block-special devices are often called block devices. Device
files in these two categories interact with devices differently.
DEFINITIONS
Sector: The smallest addressable unit on a platter. One sector can hold 512 bytes of data. Sectors are also
known as disk blocks.
Track: A series of sectors positioned end-to-end in a circular path.
Cylinder: A stack of tracks.
Disk Slices
In Solaris, disks are logically divided into individual partitions known as disk slices. Disk slices are groupings of
cylinders that are commonly used to organize data by function. For example, one slice can store critical system
files and programs while another slice on the same disk can store user-created files.
Note Slices are sometimes referred to as partitions. Certain interfaces, such as the format utility, refer to slices
as partitions.
The figure shows the eight-character string that represents the full name of a disk slice.
Controller number: Identifies the host bus adapter (HBA), which controls communications between the system
and disk unit. The HBA takes care of sending and receiving both commands and data to the device. The
controller number is assigned in sequential order, such as c0, c1, c2, and so on.
Target number: Target numbers, such as t0, t1, t2, and t3, correspond to a unique hardware address that is
assigned to each disk, tape, or CD-ROM. Some external disk drives have an address switch located on the rear
panel. Some internal disks have address pins that are jumpered to assign that disk's target number.
Disk number: The disk number is also known as the logical unit number (LUN). This number reflects the number
of disks at the target location.
Slice number: A slice number ranging from 0 to 7.
The figures shown below apply only to SPARC machines. For x86 machines, there is no target number in the IDE architecture. On an IDE disk in an x86 machine, to refer to slice 3 on the primary master disk, the slice name would be c0d0s3.
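A short sketch in Python makes the naming convention concrete. This parser is illustrative, not a Solaris utility; it simply decomposes a cXtXdXsX name into its components, treating the target as optional for x86 IDE names:

```python
import re

# Illustrative parser for Solaris slice names (cXtXdXsX); the target
# component (tX) is optional, as on x86 IDE disks.
SLICE = re.compile(r"^c(\d+)(?:t(\d+))?d(\d+)s([0-7])$")

def parse_slice(name):
    m = SLICE.match(name)
    if m is None:
        raise ValueError("not a slice name: " + name)
    controller, target, disk, slice_no = m.groups()
    return {
        "controller": int(controller),
        "target": None if target is None else int(target),
        "disk": int(disk),
        "slice": int(slice_no),
    }

print(parse_slice("c0t0d0s0"))   # SPARC-style name, with a target
print(parse_slice("c0d0s3"))     # x86 IDE-style name, no target
```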
IDE Configuration
Use the prtconf command to display the system's configuration information, including the total amount of memory installed and the configuration of system peripherals, formatted as a device tree. The prtconf command lists all possible instances of devices, whether or not each device is attached to the system. To
view a list of only attached devices on the system, perform the command:
# prtconf | grep -v not
System Configuration: Sun Microsystems sun4u
Memory size: 512 Megabytes
System Peripherals (Software Nodes):
SUNW,Ultra-5_10
scsi_vhci, instance #0
options, instance #0
pci, instance #0
pci, instance #0
ebus, instance #0
power, instance #0
su, instance #0
su, instance #1
fdthree, instance #0
SUNW,CS4231, instance #0
network, instance #0
SUNW,m64B, instance #0
ide, instance #0
sd, instance #2
dad, instance #1
pseudo, instance #0
The format Command
Use the format command to display both logical and physical device names for all currently available disks. To
view the logical and physical devices for currently available disks, perform the command:
# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c0t0d0 <ST320413A cyl 38790 alt 2 hd 16 sec 63>
/pci@1f,0/pci@1,1/ide@3/dad@0,0
Specify disk (enter its number):
4. Install the peripheral device. Make sure that the address of the device being added does not conflict with
the address of other devices on the system.
5. Turn on the power to all external devices.
6. Turn on the power to the system. The system boots to the login window.
7. Verify that the peripheral device has been added by issuing either the prtconf or format command.
After the disk is recognized by the system, begin the process of defining disk slices.
Note If the /reconfigure file was not created before the system was shut down, you can invoke a manual
reconfiguration boot with the programmable read-only memory (PROM) level command: boot -r.
Using the devfsadm Command
Many systems are running critical customer applications on a 24-hour, 7-day-a-week basis. It might not be
possible to perform a reconfiguration boot on these systems. In this situation, you can use the devfsadm
command.
The devfsadm command performs the device reconfiguration process and updates the /etc/path_to_inst
file and the /dev and /devices directories during reconfiguration events.
The devfsadm command attempts to load every driver in the system and attach all possible device instances. It
then creates the device files in the /devices directory and the logical links in the /dev directory. In addition to
managing these directories, the devfsadm command also maintains the /etc/path_to_inst file.
To restrict the operation of the devfsadm command to a specific device class, use the -c option.
# devfsadm -c device_class
The values for device_class include disk, tape, port, audio, and pseudo. For example, to restrict the devfsadm
command to the disk device class, perform the command:
# devfsadm -c disk
Use the -c option more than once on the command line to specify multiple device classes. For example, to
specify the disk, tape, and audio device classes, perform the command:
# devfsadm -c disk -c tape -c audio
To restrict the use of the devfsadm command to configure only devices for a named driver, use the -i option.
# devfsadm -i driver_name
The following examples use the -i option.
To configure only those disks supported by the dad driver, perform the command:
# devfsadm -i dad
To configure only those disks supported by the sd driver, perform the command:
# devfsadm -i sd
To configure devices supported by the st driver, perform the command:
# devfsadm -i st
To print the changes made by the devfsadm command to the /dev and /devices directories perform the
command:
# devfsadm -v
To invoke cleanup routines that remove unreferenced symbolic links for devices, perform the command:
# devfsadm -C
At the time UNIX was first released, most machines that ran UNIX used 16-bit hardware. Consequently, a 16-bit unsigned integer could address 65,536 sectors on the disk drive. As a result, the disk drive could be no larger than 65,536 sectors * 512 bytes/sector, or roughly 32 MB.
When large disk drives (300 MB or more) became available, provisions were made to allow their use on UNIX systems. The solution was to divide such drives into multiple logical drives, each consisting of 32 MB of data. By creating several logical drives on a single physical drive, the entire capacity of the 300-MB disk drive could be utilized. These logical drives became known as partitions. Each disk drive may have 8 partitions. Under Solaris, the partitions are numbered 0 through 7.
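The 32-MB figure follows directly from the arithmetic:

```python
# A 16-bit unsigned integer can address 2**16 sectors of 512 bytes each.
SECTOR_SIZE = 512
MAX_SECTORS = 2 ** 16                       # 65,536 sectors
limit = MAX_SECTORS * SECTOR_SIZE
print(limit, "bytes, i.e.", limit // (1024 * 1024), "MB")
```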
Let us look at the other advantages of partitioning.
On a system with multiple partitions, the administrator has more control over disk space usage. A user shouldn't be able to run the system out of disk space with an errant job. Due to the buffering characteristics of the Solaris disk drivers, it is often desirable to create multiple partitions and place the most active file systems in the middle of the disk drive. This allows for more optimal I/O performance, as the disk read-write heads are certain to pass over these partitions quite often. It is also usually desirable to spread the swap space across several drives, and the easy way to do this is to create multiple partitions on the system's disks.
Another reason to create a multiple-partition server is to enable control over exported file systems. When the disks are partitioned, the administrator has better control over which files are exported to other systems. This is because NFS allows the administrator to export, for example, the /opt partition to all hosts without giving those hosts access to the /usr file system on the server. On a single-partition system, the administrator would have to export the entire disk to other systems, thereby giving the client machines access to files that might be sensitive in nature.
Partitions are also called slices in Solaris.
A Solaris slice can be used as:
Filesystem
Swap Space
Raw Device
Disk-based file systems are stored on physical media such as hard disks, CD-ROMs, and diskettes. Each type of
disk-based file system is customarily associated with a particular media device, as follows:
UFS with hard disk
HSFS with CD-ROM
PCFS with diskette
UDF with DVD
Network-Based File Systems
The network file system allows users to share files among many types of systems on the network. The NFS file
system makes part of a file system on one system appear as though it were part of the local directory tree.
Virtual File Systems
Virtual file systems are memory-based file systems that provide access to special kernel information and
facilities. Most virtual file systems do not use file system disk space. However, the CacheFS file system uses a file
system on the disk to contain the cache. Also, some virtual file systems, such as the temporary file system
(TMPFS), use the swap space on a disk. CacheFS software provides the ability to cache one file system on
another. In an NFS environment, CacheFS software increases the client per server ratio, reduces server and
network loads, and improves performance for clients on slow links, such as Point-to-Point Protocol (PPP). You
can also combine a CacheFS file system with the AutoFS service to help boost performance and scalability.
Temporary File System
The temporary file system (TMPFS) uses local memory for file system reads and writes. Typically, using memory
for file system reads and writes is much faster than using a UFS file system. Using TMPFS can improve system
performance by saving the cost of reading and writing temporary files to a local disk or across the network. For
example, temporary files are created when you compile a program. The OS generates much disk activity or
network activity while manipulating these files. Using TMPFS to hold these temporary files can significantly speed
up their creation, manipulation, and deletion. Files in TMPFS file systems are not permanent. These files are
deleted when the file system is unmounted and when the system is shut down or rebooted.
TMPFS is the default file system type for the /tmp directory in the Solaris OS. You can copy or move files into or
out of the /tmp directory, just as you would in a UFS file system. The TMPFS file system uses swap space as a
temporary backing store. If a system with a TMPFS file system does not have adequate swap space, two
problems can occur: The TMPFS file system can run out of space, just as regular file systems do. Because
TMPFS allocates swap space to save file data (if necessary), some programs might not execute because of
insufficient swap space.
The process file system (PROCFS) resides in memory and contains a list of active processes, by process
number, in the /proc directory. Information in the /proc directory is used by commands such as ps. Debuggers
and other development tools can also access the address space of the processes by using file system calls.
Caution Do not delete files in the /proc directory.
Additional virtual file systems supported in Solaris are:
CTFS
CTFS (the contract file system) is the interface for creating, controlling, and observing contracts. A contract
enhances the relationship between a process and the system resources it depends on by providing richer error
reporting and (optionally) a means of delaying the removal of a resource. The service management facility (SMF)
uses process contracts (a type of contract) to track the processes that compose a service, so that a failure in a
part of a multi-process service can be identified as a failure of that service.
MNTFS
Provides read-only access to the table of mounted file systems for the local system
OBJFS
The OBJFS (object) file system describes the state of all modules currently loaded by the kernel. This file system
is used by debuggers to access information about kernel symbols without having to access the kernel directly.
SWAPFS
Used by the kernel for swapping
DEVFS
The devfs file system manages devices in this Solaris release. Continue to access all devices through entries in
the /dev directory, which are symbolic links to entries in the /devices directory.
Bootblock: This stores the bootable objects that are necessary for booting the system. Although space is reserved for the boot block in all the cylinder groups, only the first cylinder group contains actual boot information.
Superblock: This stores all the information about the filesystem, like:
Size and status of the file system
Label, which includes the file system name and volume name
Size of the file system logical block
Date and time of the last update
Cylinder group size
Number of data blocks in a cylinder group
Summary data block
File system state
Path name of the last mount point
Because the superblock contains critical data, multiple copies of it (called backup superblocks) are made when the file system is created.
A summary information block is kept within the superblock. The summary information block is not replicated, but
is grouped with the primary superblock, usually in cylinder group 0. The summary block records changes that take
place as the file system is used. In addition, the summary block lists the number of inodes, directories, fragments,
and storage blocks within the file system.
Inodes: An inode contains all the information about a file except its name, which is kept in a directory. An inode is
128 bytes. The inode information is kept in the cylinder information block, and contains the following:
The type of the file
The mode of the file (the set of read-write-execute permissions)
The number of hard links to the file
The user ID of the owner of the file
The group ID to which the file belongs
The number of bytes in the file
An array of 15 disk-block addresses
The date and time the file was last accessed
The date and time the file was last modified
The date and time the file was created
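Most of these fields are visible through an ordinary stat call. In the minimal Python sketch below, a scratch temporary file stands in for a real file; note that the file's name is absent from the output, because it lives in the directory entry rather than in the inode:

```python
import os
import stat
import tempfile

# Create a scratch file and inspect its inode-backed metadata.
fd, path = tempfile.mkstemp()
os.close(fd)
st = os.stat(path)

print("regular file:", stat.S_ISREG(st.st_mode))      # file type
print("permissions:", oct(stat.S_IMODE(st.st_mode)))  # mode bits
print("hard links:", st.st_nlink)                     # link count
print("owner/group:", st.st_uid, st.st_gid)
print("size:", st.st_size)                            # bytes in the file
print("accessed/modified:", st.st_atime, st.st_mtime)
os.unlink(path)
```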
Data Blocks
Data blocks, also called storage blocks, contain the rest of the space that is allocated to the file system. The size
of these data blocks is determined when a file system is created. By default, data blocks are allocated in two
sizes: an 8-Kbyte logical block size, and a 1-Kbyte fragment size.
Free Blocks
Blocks that are not currently being used as inodes, as indirect address blocks, or as storage blocks are marked as
free in the cylinder group map. This map also keeps track of fragments to prevent fragmentation from degrading
disk performance.
As files are created or expanded, they are allocated disk space in either full logical blocks or portions of logical
blocks called fragments. When disk space is needed for a file, full blocks are allocated first, and then one or
more fragments of a block are allocated for the remainder. For small files, allocation begins with fragments. The
ability to allocate fragments of blocks to files, rather than just whole blocks, saves space by reducing
fragmentation of disk space that results from unused holes in blocks. You define the fragment size when you
create a UFS file system. The default fragment size is 1 Kbyte. Each block can be divided into 1, 2, 4, or 8
fragments, which results in fragment sizes from 8192 bytes to 512 bytes (for 4-Kbyte file systems only). The lower
bound is actually tied to the disk sector size, typically 512 bytes.
For multiterabyte file systems, the fragment size must be equal to the file system block size.
Note The upper bound for the fragment is the logical block size, in which case the fragment is not a fragment at
all. This configuration might be optimal for file systems with very large files when you are more concerned with
speed than with space. When choosing a fragment size, consider the trade-off between time and space: A small
fragment size saves space, but requires more time to allocate. As a general rule, to increase storage efficiency,
use a larger fragment size for file systems when most of the files are large. Use a smaller fragment size for file
systems when most of the files are small.
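The possible fragment sizes follow directly from dividing the logical block:

```python
# Fragment sizes that result from dividing a logical block into
# 1, 2, 4, or 8 fragments; 512-byte fragments arise only with a
# 4-Kbyte block size, as noted above.
for block_size in (8192, 4096):
    sizes = [block_size // n for n in (1, 2, 4, 8)]
    print(block_size, "->", sizes)
```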
Minimum Free Space
The minimum free space is the percentage of the total disk space that is held in reserve when you create the file
system. The default reserve is ((64 Mbytes/partition size) * 100), rounded down to the nearest integer and limited
between 1 percent and 10 percent, inclusively. Free space is important because file access becomes less and
less efficient as a file system gets full. As long as an adequate amount of free space exists, UFS file systems
operate efficiently. When a file system becomes full, using up the available user space, only root can access the
reserved free space.
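The default reserve rule quoted above can be sketched in a few lines of Python; the function name is ours, not a Solaris interface:

```python
# ((64 Mbytes / partition size) * 100), rounded down to the nearest
# integer and limited to between 1 and 10 percent, inclusively.
def min_free_percent(partition_mbytes):
    pct = (64 * 100) // partition_mbytes   # integer division rounds down
    return max(1, min(10, pct))

print(min_free_percent(512))      # small partition: capped at 10 percent
print(min_free_percent(102400))   # 100-Gbyte partition: floor of 1 percent
```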
Commands such as df report the percentage of space that is available to users, excluding the percentage
allocated as the minimum free space. When the command reports that more than 100 percent of the disk space in
the file system is in use, some of the reserve has been used by root. If you impose quotas on users, the amount
of space available to them does not include the reserved free space. You can change the value of the minimum
free space for an existing file system by using the tunefs command.
Optimization Type
The optimization type parameter is set to either space or time.
Space: When you select space optimization, disk blocks are allocated to minimize fragmentation, and disk use is optimized.
Time: When you select time optimization, disk blocks are allocated as quickly as possible, with less emphasis on their placement. When sufficient free space exists, allocating disk blocks is relatively easy, without resulting in too much fragmentation. The default is time.
You can change the value of the optimization type parameter for an existing file system by using the tunefs
command.
Number of Inodes (Files)
The number of bytes per inode specifies the density of inodes in the file system. The number is divided into the
total size of the file system to determine the number of inodes to create. Once the inodes are allocated, you
cannot change the number without re-creating the file system. The default number of bytes per inode is 2048
bytes (2 Kbytes) if the file system is less than 1 Gbyte. If the file system is larger than 1 Gbyte, the following
formula is used:
If you have a file system with many symbolic links, they can lower the average file size. If your file system is going
to have many small files, you can give this parameter a lower value. Note, however, that having too many inodes
is much better than running out of inodes. If you have too few inodes, you could reach the maximum number of
files on a disk slice that is practically empty.
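The division itself can be sketched as follows, assuming the default density of 2048 bytes per inode for file systems under 1 Gbyte; the helper name is ours:

```python
# Inode count = file-system size divided by the bytes-per-inode density.
def inode_count(fs_bytes, bytes_per_inode=2048):
    return fs_bytes // bytes_per_inode

print(inode_count(512 * 1024 ** 2))   # a 512-Mbyte file system
```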
Maximum UFS File and File System Size
The maximum size of a UFS file system is about 16 Tbytes of usable space, minus about one percent overhead.
A sparse file can have a logical size of one terabyte. However, the actual amount of data that can be stored in a
file is approximately one percent less than 1 Tbyte because of the file system overhead.
Maximum Number of UFS Subdirectories
The maximum number of subdirectories per directory in a UFS file system is 32,767. This limit is predefined and
cannot be changed.
Note on UFS logging
In some operating systems, a file system with logging enabled is known as a journaling file system.
UFS Logging
UFS logging bundles the multiple metadata changes that comprise a complete UFS operation into a transaction.
Sets of transactions are recorded in an on-disk log. Then, they are applied to the actual UFS file systems
metadata. At reboot, the system discards incomplete transactions, but applies the transactions for completed
operations. The file system remains consistent because only completed transactions are ever applied. This
consistency holds even when a system crashes; without logging, a crash can interrupt system calls and introduce
inconsistencies into a UFS file system.
UFS logging provides two advantages. First, if the file system is already consistent due to the transaction log, you
might not have to run the fsck command after a system crash or an unclean shutdown. Second, UFS logging
matches or exceeds the performance of nonlogging file systems. This improvement can occur because a file
system with logging enabled converts multiple updates to the same data into a single update, thus reducing the
number of overhead disk operations required.
The UFS transaction log has the following characteristics:
Is allocated from free blocks on the file system
Sized at approximately 1 Mbyte per 1 Gbyte of file system, up to a maximum of 64 Mbytes
Continually flushed as it fills up
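The sizing rule above can be sketched in shell. This is a simplified model of the 1-Mbyte-per-Gbyte rule and the 64-Mbyte cap; the actual log is sized by the kernel:

```shell
# Simplified model of UFS transaction log sizing: about 1 Mbyte of log
# per 1 Gbyte of file system, capped at 64 Mbytes.
log_size() {
    fs_gb=$1
    log_mb=$fs_gb
    [ "$log_mb" -gt 64 ] && log_mb=64
    echo "$log_mb"
}
log_size 4     # 4-Gbyte file system:   prints 4
log_size 200   # 200-Gbyte file system: prints 64 (capped)
```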
Planning UFS File Systems
When laying out file systems, you need to consider possible conflicting demands. Here are some suggestions:
Distribute the workload as evenly as possible among different I/O systems and disk drives. Distribute the
/export/home file system and swap space evenly across disks.
Keep pieces of projects or members of groups within the same file system.
Use as few file systems per disk as possible. On the system (or boot) disk, you should have three file
systems: root (/), /usr, and swap space. On other disks, create one or at most two file systems; with
one file system preferably being additional swap space. Fewer, roomier file systems cause less file
fragmentation than many small, overcrowded file systems. Higher-capacity tape drives and the ability of
the ufsdump command to handle multiple volumes make it easier to back up larger file systems.
If you have some users who consistently create very small files, consider creating a separate file system
with more inodes. However, most sites do not need to keep similar types of user files in the same file
system.
The table above displays the Solaris filesystems and their mount points.
The /etc/vfstab file lists all the file systems to be automatically mounted at system boot time. The file format
includes seven fields per line entry. By default, a tab separates each field, but any whitespace can be used for
separators. The dash (-) character is used as a placeholder for fields when text arguments are not appropriate.
To add a line entry, you need the following information:
device to mount The device to be mounted. For example, a local ufs file system /dev/dsk/c#t#d#s#, or a
pseudo file system /proc.
device to fsck The raw or character device checked by the file system check program (fsck), if applicable. A
pseudo file system has a dash (-) in this field.
mount point The name of the directory that serves as the attach point in the Solaris OE directory hierarchy.
FS type The type of file system to be mounted.
fsck pass Indicates whether the file system is to be checked by the fsck utility at boot time. A 0 (zero) or a
nonnumeric value in this field indicates no. A 1 in this field indicates the fsck utility gets started for that entry
and runs to completion. A number greater than 1 indicates that the device is added to the list of devices to have
the fsck utility run. The fsck utility can run on up to eight devices in parallel. This field is ignored by the
mountall command.
mount at boot Enter yes to enable the mountall command to mount the file systems at boot time. Enter no to
prevent a file system mount at boot time.
mount options A comma-separated list of options passed to the mount command. A dash (-) indicates the
use of default mount options.
Note For / (root), /usr, and /var (if it is a separate file system) file systems, the mount at boot field value is
specified as no. The kernel mounts these file systems as part of the boot sequence before the mountall command
is run.
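A minimal sketch of what such entries can look like, with an awk one-liner that picks out the mount points marked yes in the mount-at-boot field. The device names and slice layout here are hypothetical examples, not taken from any particular system:

```shell
# Write a hypothetical /etc/vfstab fragment, then list the mount points
# whose mount-at-boot field (field 6) is "yes".
cat > /tmp/vfstab.sample <<'EOF'
#device to mount   device to fsck      mount point   FS type  fsck pass  mount at boot  mount options
/dev/dsk/c0t0d0s7  /dev/rdsk/c0t0d0s7  /export/home  ufs      2          yes            -
/proc              -                   /proc         proc     -          no             -
EOF
awk '!/^#/ && $6 == "yes" { print $3 }' /tmp/vfstab.sample   # prints /export/home
```

Note how the /proc pseudo file system uses the dash placeholder in the device-to-fsck and fsck-pass fields, as described above.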
Introducing the /etc/mnttab File
The /etc/mnttab file is really an mntfs file system that provides read-only information directly from the kernel
about mounted file systems on the local host. Each time a file system is mounted, the mount command adds an
entry to this file. Whenever a file system is unmounted, its entry is removed from the /etc/mnttab file.
Mount Point The mount point or directory name where the file system is to be attached within the / (root) file
system (for example, /usr, /opt).
Device Name The name of the device that is mounted at the mount point. This block device is where the file
system is physically located.
Mount Options The list of mount options in effect for the file system.
dev=number The major and minor device number of the mounted file system.
Date and Time The date and time that the file system was mounted to the directory hierarchy.
In this example, the default action mounts the file system with the following options: read/write, setuid, intr,
nologging, largefiles, xattr, and onerror. The following list explains the default options for the mount
command.
read/write Indicates whether reads and writes are allowed on the file system.
setuid Permits the execution of setuid programs in the file system.
intr/nointr Allows and forbids keyboard interrupts to kill a process that is waiting for an operation on a
locked file system.
nologging Indicates that logging is not enabled for the ufs file system.
largefiles Allows for the creation of files larger than 2 Gbytes. A file system mounted with this option can
contain files larger than 2 Gbytes.
xattr Supports extended attributes not found in standard UNIX attributes.
onerror=action Specifies the action that the ufs file system should take to recover from an internal
inconsistency on a file system. An action can be specified as:
panic Causes a forced system shutdown. This is the default.
lock Applies a file system lock to the file system.
umount Forcibly unmounts the file system.
The /etc/vfstab file provides you with another important feature. Because the /etc/vfstab file contains the mapping
between the mount point and the actual device name, the root user can manually mount a file system by
specifying just the mount point.
# mount /export/home
Trusted certificate A certificate that holds a public key that belongs to another entity. The trusted certificate is
named as such because the keystore owner trusts that the public key in the certificate indeed belongs to the
identity identified by the subject or owner of the certificate. The issuer of the certificate vouches for this trust by
signing the certificate. Trusted certificates are used when verifying signatures, and when initiating a connection to
a secure (SSL) server.
User key Holds sensitive cryptographic key information. This information is stored in a protected format to
prevent unauthorized access. A user key consists of both the user's private key and the public key certificate that
corresponds to the private key.
The process of using the pkgadd or patchadd command to add a signed package or patch to your system
involves three basic steps:
1. Adding the certificates to your system's package keystore by using the pkgadm command
2. (Optional) Listing the certificates by using the pkgadm command
3. Adding the package with the pkgadd command or applying the patch by using the patchadd
command
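A hedged sketch of the three steps as a command sequence. The certificate file, package file, and package name (SUNWxyz) are hypothetical placeholders:

```shell
# pkgadm addcert -t /tmp/certfile        (step 1: import the trusted certificate)
# pkgadm listcert                        (step 2: optionally list the certificates)
# pkgadd -k /var/sadm/security -d /tmp/signed.pkg SUNWxyz   (step 3: add the package)
```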
If you use Patch Manager to apply patches to your system, you do not need to manually set up the keystore and
certificates, as it is automatically set up.
Using Sun's Certificates to Verify Signed Packages and Patches
Digital certificates, issued and authenticated by Sun Microsystems, are used to verify that the downloaded
package or patch with the digital signature has not been compromised. These certificates are imported into your
system's package keystore. A stream-formatted SVR4-signed package or patch contains an embedded
PEM-encoded PKCS7 signature. This signature contains, at a minimum, the encrypted digest of the package or
patch, along with the signer's X.509 public key certificate. The package or patch can also contain a certificate
chain that is used to form a chain of trust from the signer's certificate to a locally stored trusted certificate.
The PEM-encoded PKCS7 signature is used to verify the following information:
1. The package came from the entity that signed it.
2. The entity indeed signed it.
3. The package hasn't been modified since the entity signed it.
4. The entity that signed it is a trusted entity.
All Sun certificates are issued by Baltimore Technologies, which recently bought GTE CyberTrust.
Access to a package keystore is protected by a special password that you specify when you import the Sun
certificates into your system's package keystore. If you use the pkgadm listcert command, you can view
information about your locally stored certificates in the package keystore. For example:
# pkgadm listcert -P pass:store-pass
Keystore Alias: GTE CyberTrust Root
Common Name: GTE CyberTrust Root
Certificate Type: Trusted Certificate
Issuer Common Name: GTE CyberTrust Root
Validity Dates: <Feb 23 23:01:00 1996 GMT> - <Feb 23 23:59:00 2006 GMT>
MD5 Fingerprint: C4:D7:F0:B2:A3:C5:7D:61:67:F0:04:CD:43:D3:BA:58
SHA1 Fingerprint: 90:DE:DE:9E:4C:4E:9F:6F:D8:86:17:57:9D:D3:91:BC:65:A6...
The following describes the output of the pkgadm listcert command.
Keystore Alias When you retrieve certificates for printing, signing, or removing, this name must be used to
reference the certificate.
Common Name The common name of the certificate. For trusted certificates, this name is the same as the
keystore alias.
Certificate Type Can be one of two types:
1. Trusted certificate A certificate that can be used as a trust anchor when verifying other certificates. No
private key is associated with a trusted certificate.
2. Signing certificate A certificate that can be used when signing a package or patch. A private key is
associated with a signing certificate.
Issuer Common Name The name of the entity that issued, and therefore signed, this certificate. For trusted
certificate authority (CA) certificates, the issuer common name and common name are the same.
Validity Dates A date range that identifies when the certificate is valid.
MD5 Fingerprint An MD5 digest of the certificate. This digest can be used to verify that the certificate has not
been altered during transmission from the source of the certificate.
SHA1 Fingerprint Similar to an MD5 fingerprint, except that it is calculated using a different algorithm.
Each certificate is authenticated by comparing its MD5 and SHA1 hashes, also called fingerprints, against the
known correct fingerprints published by the issuer.
Importing Sun's Trusted Certificates
You can obtain Sun's trusted certificates for adding signed packages and patches in the following ways:
Java keystore Import Sun's Root CA certificate that is included by default in the Java keystore when you install
the Solaris release.
Sun's Public Key Infrastructure (PKI) site If you do not have a Java keystore available on your system, you can
import the certificates from this site: https://ra.sun.com:11005/
PatchPro's keystore If you have installed PatchPro for applying signed patches with the smpatch command, you
can import Sun's Root CA certificate from the Java keystore.
Setting Up a Package Keystore
In previous Solaris releases, you could download the patch management tools and create a Java keystore, for
use by PatchPro, by importing the certificates with the keytool command.
If your system already has a populated Java keystore, you can now export the Sun Microsystems root CA
certificate from the Java keystore with the keytool command. Then, use the pkgadm command to import this
certificate into the package keystore. After the Root CA certificate is imported into the package keystore, you can
use the pkgadd and patchadd commands to add signed packages and patches to your system.
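As a sketch, the export-then-import sequence might look like the following. The certificate alias, keystore path, store password, and output file name are assumptions that depend on your Java installation:

```shell
# keytool -export -storepass changeit -alias gtecybertrustca \
#     -keystore /usr/j2se/jre/lib/security/cacerts -file /tmp/root.crt
# pkgadm addcert -t -f der /tmp/root.crt
```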
Note The Sun Microsystems root-level certificates are only required when adding Sun-signed patches and
packages.
Tools for Managing Software Packages
The following table describes the tools for adding and removing software packages from a system after the
Solaris release is installed on a system.
Although the pkgadd and pkgrm commands do not log their output to a standard location, they do keep track of
the package that is installed or removed. The pkgadd and pkgrm commands store information about a package
that has been installed or removed in a software product database.
By updating this database, the pkgadd and pkgrm commands keep a record of all software products installed
on the system.
Key Points for Adding Software Packages (pkgadd)
Keep the following key points in mind before you install or remove packages on your system:
1. Package naming conventions Sun packages always begin with the prefix SUNW, as in SUNWaccr,
SUNWadmap, and SUNWcsu. Third-party packages usually begin with a prefix that corresponds to the
company's stock symbol.
2. What software is already installed You can use the Solaris installation GUI, Solaris Product Registry
prodreg viewer (either GUI or CLI) or the pkginfo command to determine the software that is already
installed on a system.
3. How servers and clients share software Clients might have software that resides partially on a server
and partially on the client. In such cases, adding software for the client requires that you add packages to
both the server and the client.
Guidelines for Removing Packages (pkgrm)
You should use one of the tools listed in the above table to remove a package, even though you might be tempted
to use the rm command instead. For example, you could use the rm command to remove a binary executable
file. However, doing so is not the same as using the pkgrm command to remove the software package that
includes that binary executable. Using the rm command to remove a package's files will corrupt the software
product database. If you really only want to remove one file, you can use the removef command. This
command will update the software product database correctly so that the file is no longer a part of the package.
For more information, see the removef(1M) man page.
If you intend to keep multiple versions of a package, install new versions into a different directory than the already
installed package by using the pkgadd command.
For example, you might intend to keep multiple versions of a document processing application. The directory
where a package is installed is referred to as the base directory. You can manipulate the base directory by setting
the basedir keyword in a special file called an administration file. For more information on using an
administration file and on setting the base directory, see Avoiding User Interaction When Adding Packages
(pkgadd) on page 266 and the admin(4) man page.
Note If you use the upgrade option when installing Solaris software, the Solaris installation software checks the
software product database to determine the products that are already installed on the system.
Avoiding User Interaction When Adding Packages (pkgadd)
This section provides information about avoiding user interaction when adding packages with the pkgadd
command.
Using an Administration File
When you use the pkgadd -a command, the command consults a special administration file for information
about how the installation should proceed. Normally, the pkgadd command performs several checks and
prompts the user for confirmation before it actually adds the specified package. You can, however, create an
administration file that indicates to the pkgadd command that it should bypass these checks and install the
package without user confirmation.
The pkgadd command, by default, checks the current working directory for an administration file. If the pkgadd
command doesn't find an administration file in the current working directory, it checks the
/var/sadm/install/admin directory for the specified administration file. The pkgadd command also
accepts an absolute path to the administration file.
Note Use administration files judiciously. You should know where a package's files are installed and how a
package's installation scripts run before using an administration file to avoid the checks and prompts that the
pkgadd command normally provides.
The following example shows an administration file that prevents the pkgadd command from prompting the user
for confirmation before installing the package.
mail=
instance=overwrite
partial=nocheck
runlevel=nocheck
idepend=nocheck
rdepend=nocheck
space=nocheck
setuid=nocheck
conflict=nocheck
action=nocheck
networktimeout=60
networkretries=3
authentication=quit
keystore=/var/sadm/security
proxy=
basedir=default
Besides using administration files to avoid user interaction when you add packages, you can use them in several
other ways. For example, you can use an administration file to quit a package installation (without user
interaction) if there's an error, or to avoid interaction when you remove packages by using the pkgrm command.
You can also assign a special installation directory for a package, which you might do if you wanted to maintain
multiple versions of a package on a system. To do so, set an alternate base directory in the administration file by
using the basedir keyword. The keyword specifies where the package will be installed. For more information,
see the admin(4) man page.
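For example, a hypothetical invocation that uses the administration file shown above, assuming it was saved as /var/sadm/install/admin/noask (the package file and package name are placeholders):

```shell
# pkgadd -a noask -d /tmp/signed.pkg SUNWxyz
```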
Using a Response File (pkgadd)
A response file contains your answers to specific questions that are asked by an interactive package. An
interactive package includes a request script that asks you questions prior to package installation, such as
whether optional pieces of the package should be installed.
If you know prior to installation that the package is an interactive package, and you want to store your answers to
prevent user interaction during future installations, use the pkgask command to save your response. For more
information on this command, see pkgask(1M).
Once you have stored your responses to the questions asked by the request script, you can use the
pkgadd -r command to install the package without user interaction.
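A sketch of the two-step sequence; the response file path, device, and package name are hypothetical:

```shell
# pkgask -r /tmp/response.txt -d /cdrom/cdrom0 SUNWxyz   (record the answers once)
# pkgadd -r /tmp/response.txt -d /cdrom/cdrom0 SUNWxyz   (replay them on later installs)
```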
Patches are identified by unique patch IDs. A patch ID is an alphanumeric string that is a patch base code and a
number that represents the patch revision number joined with a hyphen. For example, patch 108528-10 is the
patch ID for the SunOS 5.8 kernel update patch.
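The two parts of a patch ID can be pulled apart with standard shell parameter expansion, using the example ID from the text:

```shell
# Split a patch ID into its base code and revision number, which are
# joined with a hyphen.
patch_id="108528-10"
base=${patch_id%-*}       # base code (everything before the hyphen)
rev=${patch_id#*-}        # revision number (everything after the hyphen)
echo "base=$base rev=$rev"   # prints base=108528 rev=10
```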
Managing Solaris Patches
When you apply a patch, the patch tools call the pkgadd command to apply the patch packages from the patch
directory to a local systems disk.
Caution Do not run the pkgadd command directly to apply patches.
More specifically, the patch tools do the following:
1. Determine the Solaris version number of the managing host and the target host
2. Update the patch package's pkginfo file with this information:
Patches that have been obsoleted by the patch being applied
Other patches that are required by this patch
Patches that are incompatible with this patch
While you apply patches, the patchadd command logs information in the /var/sadm/patch/patch-id/log
file.
3. The patchadd command cannot apply a patch under the following conditions:
The package is not fully installed on the system.
The patch package's architecture differs from the system's architecture.
The patch package's version does not match the installed package's version.
A patch with the same base code and a higher revision number has already been applied.
A patch that obsoletes this patch has already been applied.
The patch is incompatible with a patch that has already been applied to the system. Each patch that has
been applied keeps this information in its pkginfo file.
The patch being applied depends on another patch that has not yet been applied.
Note If an ASCII terminal is being used as the system console, use the Break sequence keys.
4. Manually synchronize the file systems by using the OpenBoot PROM sync command.
ok sync
This command causes the syncing of file systems, a crash dump of memory, and then a reboot of the system.
Makes it easy to debug and ask questions about services by providing an explanation of why a service
isn't running by using svcs -x. Also, this process is eased by individual and persistent log files for each
service.
Allows for services to be enabled and disabled using svcadm. These changes can persist through
upgrades and reboots. If the -t option is used, the changes are temporary.
Enhances the ability of administrators to securely delegate tasks to non-root users, including the ability
to modify properties and enable, disable, or restart services on the system.
Boots faster on large systems by starting services in parallel according to the dependencies of the
services. The opposite process occurs during shutdown.
Allows you to customize the boot console output to either be as quiet as possible, which is the default, or
to be verbose by using boot -m verbose.
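The commands mentioned in the list above can be combined as in the following sketch (network/telnet is used here as an example service):

```shell
# svcs -x                             (explain why a service isn't running)
# svcadm disable -t network/telnet    (temporary: does not persist across a reboot)
# svcadm enable network/telnet        (enable the service persistently)
```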
Dependency statements define the relationships between services. SMF defines a set of actions that can be
invoked on a service by an administrator. These actions include enable, disable, refresh, restart, and maintain.
Each service is managed by a service restarter that carries out the administrative actions. In general, the
restarters carry out actions by executing methods for a service. Methods for each service are defined in the
service configuration repository. These methods allow the restarter to move the service from one state to another
state. The service configuration repository provides a per-service snapshot at the time that each service is
successfully started so that fallback is possible. In addition, the repository provides a consistent and persistent
way to enable or disable a service, as well as a consistent view of service state. This capability helps you debug
service configuration problems.
Changes in Behavior When Using SMF
Most of the features that are provided by SMF happen behind the scenes, so users are not aware of them. Other
features are accessed by new commands. Here is a list of the behavior changes that are most visible.
1. The boot process creates many fewer messages now. Services do not display a message by default
when they are started. All of the information that was provided by the boot messages can now be found in
a log file for each service that is in /var/svc/log. You can use the svcs command to help diagnose
boot problems. In addition, you can use the -v option to the boot command, which generates a
message when each service is started during the boot process.
2. Since services are automatically restarted if possible, it may seem that a process refuses to die. If the
service is defective, the service will be placed in maintenance mode, but normally a service is restarted if
the process for the service is killed. The svcadm command should be used to disable any SMF service
that should not be running.
3. Many of the scripts in /etc/init.d and /etc/rc*.d have been removed. The scripts are no longer
needed to enable or disable a service. Entries from /etc/inittab have also been removed, so that the
services can be administered using SMF. Scripts and inittab entries that are locally developed will
continue to run. The services may not start at exactly the same point in the boot process, but they are not
started before the SMF services, so that any service dependencies should be OK.
SMF Concepts
SMF Service
The fundamental unit of administration in the SMF framework is the service instance. An instance is a specific
configuration of a service. A web server is a service. A specific web server daemon that is configured to listen on
port 80 is an instance. Multiple instances of a single service are managed as child objects of the service object.
Generically, a service is an entity that provides a list of capabilities to applications and other services, local and
remote. A service is dependent on an implicitly declared list of local services.
A milestone is a special type of service. Milestone services represent high-level attributes of the system. For
example, the services which constitute run levels S, 2, and 3 are each represented by milestone services.
Service Identifiers
Each service instance is named with a Fault Management Resource Identifier or FMRI. The FMRI includes the
service name and the instance name. For example, the FMRI for the rlogin service is
svc:/network/login:rlogin, where network/login identifies the service and rlogin identifies the
service instance.
Equivalent formats for an FMRI are as follows:
svc://localhost/system/system-log:default
svc:/system/system-log:default
system/system-log:default
The service names usually include a general functional category. The categories include the following:
application, device, milestone, network, platform, site, system.
Legacy init.d scripts are also represented with FMRIs that start with lrc instead of svc, for example:
lrc:/etc/rcS_d/S35cacheos_sh. The legacy services can be monitored using SMF. However, you cannot
administer these services. When booting a system for the first time with SMF, services listed in
/etc/inetd.conf are automatically converted into SMF services. The FMRIs for these services are slightly
different. The syntax for a converted inetd service is:
network/<service-name>/<protocol>
In addition, the syntax for a converted service that uses the RPC protocol is:
network/rpc-<service-name>/rpc_<protocol>
Where <service-name> is the name defined in /etc/inetd.conf and <protocol> is the protocol for the service.
For instance, the FMRI for the rpc.cmsd service is network/rpc-100068_2-/rpc_udp.
Service States
The svcs command displays the state, start time, and FMRI of service instances. The state of each service is one
of the following:
1. degraded The service instance is enabled, but is running at a limited capacity.
2. disabled The service instance is not enabled and is not running.
3. legacy_run The legacy service is not managed by SMF, but the service can be observed. This state is
only used by legacy services.
4. maintenance The service instance has encountered an error that must be resolved by the administrator.
5. offline The service instance is enabled, but the service is not yet running or available to run.
6. online The service instance is enabled and has successfully started.
7. uninitialized This state is the initial state for all services before their configuration has been read.
SMF Manifests
An SMF manifest is an XML file that contains a complete set of properties that are associated with a service or a
service instance. The files are stored in /var/svc/manifest. Manifests should not be used to modify the
properties of a service. The service configuration repository is the authoritative source of configuration
information. To incorporate information from the manifest into the repository, you must either run svccfg
import or allow the service to import the information during a system boot.
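For example, a hedged sketch of importing a site manifest into the repository; the file name myapp.xml is a hypothetical placeholder:

```shell
# svccfg import /var/svc/manifest/site/myapp.xml
```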
SMF Profiles
An SMF profile is an XML file that lists the set of service instances that are enabled when a system is booted. The
profiles are stored in /var/svc/profile. These are some of the profiles that are included:
1. generic_open.xml This profile enables most of the standard internet services that have been enabled
by default in earlier Solaris releases. This is the default profile.
2. generic_limited_net.xml This profile disables many of the standard internet services. The sshd service
and the NFS services are started, but most of the rest of the internet services are disabled.
Service Configuration Repository
The service configuration repository stores persistent configuration information as well as SMF runtime data for
services. The repository is distributed among local memory and local files. SMF is designed so that eventually,
service data can be represented in the network directory service. The network directory service is not yet
available. The data in the service configuration repository allows for the sharing of configuration information and
administrative simplicity across many Solaris instances. The service configuration repository can only be
manipulated or queried using SMF interfaces.
SMF Repository Backups
SMF automatically takes the following backups of the repository:
1. The boot backup is taken immediately before the first change to the repository is made during each
system startup.
2. The manifest_import backup occurs after svc:/system/manifest-import:default completes,
if it imported any new manifests or ran any upgrade scripts.
Four backups of each type are maintained by the system. The system deletes the oldest backup, when
necessary. The backups are stored as /etc/svc/repository-type-YYYYMMDD_HHMMSS, where
YYYYMMDD (year, month, day) and HHMMSS (hour, minute, second) are the date and time when the backup
was taken. Note that the hour format is based on a 24-hour clock. You can restore the repository from these
backups, if an error occurs. To do so, use the /lib/svc/bin/restore_repository command.
SMF Snapshots
The data in the service configuration repository includes snapshots, as well as a configuration that can be edited.
Data about each service instance is stored in the snapshots. The standard snapshots are as follows:
initial Taken on the first import of the manifest
running Used when the service methods are executed
start Taken at the last successful start
The SMF service always executes with the running snapshot. This snapshot is automatically created if it does not
exist. The svcadm refresh command, sometimes followed by the svcadm restart command, makes a snapshot
active. The svccfg command is used to view or revert to instance configurations in a previous snapshot.
SMF Components
SMF includes a master restarter daemon and delegated restarters.
SMF Master Restarter Daemon
The svc.startd daemon is the master process starter and restarter for the Solaris OS. The daemon is
responsible for managing service dependencies for the entire system. The daemon takes on the previous
responsibility that init held of starting the appropriate /etc/rc*.d scripts at the appropriate run levels. First,
svc.startd retrieves the information in the service configuration repository. Next, the daemon starts services when
their dependencies are met. The daemon is also responsible for restarting services that have failed and for
shutting down services whose dependencies are no longer satisfied. The daemon tracks service state through an
operating system view of availability, using events such as process death.
SMF Delegated Restarters
Some services have a set of common behaviors on startup. To provide commonality among these services, a
delegated restarter might take responsibility for these services. In addition, a delegated restarter can be used to
provide more complex or application-specific restarting behavior. The delegated restarter can support a different
set of methods, but exports the same service states as the master restarter. The restarter's name is stored with
the service. A current example of a delegated restarter is inetd, which can start Internet services on demand,
rather than having the services always running.
The init command can be used to change run levels. The svcadm command can also be used to change the
run level of a system, by selecting a milestone at which to run. The following table shows which run level
corresponds to each milestone.
[Figure: the four-step boot process. In step 4, the /sbin/init process starts /lib/svc/bin/svc.startd, which starts the
system services and runs the rc scripts.]
Step 1
The PROM monitor has several functions. It can be used to modify basic hardware parameters such as serial port
configurations or the amount of memory that should be tested upon system power-up. Another PROM
configurable parameter is the system boot-device specification. This parameter tells the PROM monitor where it
should look for the next stage of the boot process. Most important, the PROM monitor has routines to load the
next stage into memory and start it running.
Similar to the PROM monitor, the boot block gets its name from the location in which it is stored. Typically, the
boot block is stored in the first few blocks (sectors 1 to 15) on the hard disk attached to the workstation. The boot
block's job is to initialize some of the system's peripherals and memory, and to load the boot program, which in turn will
load the SunOS kernel. A boot block is placed on the disk as part of the Solaris installation process, or in some
circumstances, by the system administrator using the installboot program.
Step 2
Depending on the location of the boot block, its next action is to load a boot program such as ufsboot into memory
and execute it. The boot program includes a device driver as required for the device (such as a disk drive or
network adapter) that contains the SunOS kernel. Once started, the boot program loads the SunOS kernel into
memory, and then starts it running.
Step 3
The kernel is the heart of the operating system. Once loaded into memory by the boot program, the kernel has
several tasks to perform before the final stages of the boot process can proceed. First, the kernel initializes
memory and the hardware associated with the memory management. Next, the kernel performs a series of device
probes. These routines check for the presence of various devices such as graphics displays, Ethernet controllers,
disk controllers, disk drives, tape devices, and so on. This search for memory and devices is sometimes referred
to as auto-configuration.
With memory and devices identified and configured, the kernel finishes its start-up routine by creating init, the first
system process. The init process is given process ID number 1 and is the parent of all the processes on the
system. Process 1 (init) is also responsible for the remainder of the boot process.
Step 4
The init process and the files it reads and the shell scripts it executes are the most configurable part of the boot
process. Management of the processes that offer the login prompt to terminals, the start-up for daemons, network
configuration, disk checking, and more occur during this stage of the boot sequence. The init process starts the
svc.startd daemon, which starts all the services and also runs the rc scripts for backward compatibility.
System Booting
We have different boot options for booting a Solaris Operating Environment:
Interactive boot (We can customize the kernel and device path)
Reconfiguration boot (To support newly added hardware)
Recovery boot (If the system is hung or is not coming up due to invalid entries in system files)
When ufsboot loads the two kernel files, the platform-independent genunix and the platform-specific unix, into
memory, they are combined to form the running kernel.
On a system running in 32-bit mode, the two-part kernel is located in the /platform/`uname -m`/kernel directory.
On a system running in 64-bit mode, the two-part kernel is located in the
/platform/`uname -m`/kernel/sparcv9 directory.
Note To determine the platform name (for example, the system hardware class), type the uname -m command.
For example, when you type this command on an Ultra 10 workstation, the console displays sun4u.
The kernel Initialization Phase
The following describes the kernel initialization phase:
The kernel reads its configuration file, called /etc/system.
The kernel initializes itself and begins loading modules.
Modules can consist of device drivers, binary files to support file systems, and streams, as well as other module
types used for specific tasks within the system. The modules that make up the kernel typically reside in the
directories /kernel and /usr/kernel. Platform-dependent modules reside in the /platform/`uname -m`/kernel and
/platform/`uname -i`/kernel directories. Each subdirectory located under these directories is a collection of similar
modules.
The following describes the types of module subdirectories contained in the /kernel, /usr/kernel,
/platform/`uname -m`/kernel, or /platform/`uname -i`/kernel directories:
drv Device drivers
exec Executable file formats
fs File system types, for example, ufs, nfs, and proc
misc Miscellaneous modules (virtual swap)
sched Scheduling classes (process execution scheduling)
strmod Streams modules (generalized connection between users and device drivers)
sys System calls (defined interfaces for applications to use)
The /kernel/drv directory contains all of the device drivers that are used for system boot. The /usr/kernel/drv
directory is used for all other device drivers. Modules are loaded automatically as needed either at boot time or on
demand, if requested by an application. When a module is no longer in use, it might be unloaded on the basis that
the memory it uses is needed for another task. After the boot process is complete, device drivers are loaded when
devices, such as tape devices, are accessed.
The advantage of this dynamic kernel arrangement is that the overall size of the kernel is smaller, which makes
more efficient use of memory and allows for simpler modification and tuning.
The run control scripts in the /sbin directory are rc0, rc1, rc2, rc3, rc5, rc6, and rcS.
For each rc script in the /sbin directory, there is a corresponding directory named /etc/rcn.d that contains scripts to
perform various actions for that run level. For example, /etc/rc2.d contains files that are used to start and stop
processes for run level 2. The /etc/rcn.d scripts are always run in ASCII sort order. The scripts have names of the
form:
[KS][0-9][0-9]*
Files that begin with K are run to terminate (kill) a system service. Files that begin with S are run to start a system
service.
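The ASCII sort order that governs script execution can be sketched with a throwaway directory; the script names below are invented for illustration:

```shell
# Create a throwaway directory with hypothetical start (S) and kill (K) scripts
dir=$(mktemp -d)
touch "$dir/S20network" "$dir/S99dtlogin" "$dir/K07snmpdx" "$dir/S72inetsvc"
# ls lists the names in ASCII sort order: K scripts sort before S scripts,
# and within each group the two-digit sequence number decides the order
order=$(ls "$dir")
echo "$order"
rm -rf "$dir"
```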
Run control scripts are located in the /etc/init.d directory. These files are linked to corresponding run control
scripts in the /etc/rcn.d directories. The actions of each run control script are summarized in the following section.
The /sbin/rc0 Script
The /sbin/rc0 script runs the /etc/rc0.d scripts to perform the following tasks:
Stops system services and daemons
Terminates all running processes
Unmounts all file systems
The /sbin/rc1 Script
The /sbin/rc1 script runs the /etc/rc1.d scripts to perform the following tasks:
Stops system services and daemons
Terminates all running user processes
Unmounts all remote file systems
Mounts all local file systems if the previous run level was S
The /sbin/rc2 Script
The /sbin/rc2 script runs the /etc/rc2.d scripts to perform the following tasks, grouped by function:
Starts system accounting and system auditing, if configured
Configures serial device stream
Starts the Solaris PPP server or client daemons (pppoed or pppd), if configured
Configures the boot environment for the Live Upgrade software upon system startup or system shutdown
Checks for the presence of the /etc/.UNCONFIGURE file to see if the system should be reconfigured
Note Many of the system services and applications that are started at run level 2 depend on what software is
installed on the system.
The /sbin/rc3 Script
The /sbin/rc3 script runs the /etc/rc3.d scripts to perform the following tasks:
Starts the Apache HTTP server daemon (httpd), if configured
Starts the Samba daemons (smbd and nmbd), if configured
The /sbin/rc5 and /sbin/rc6 Scripts
The /sbin/rc5 and /sbin/rc6 scripts run the /etc/rc0.d/K* scripts to perform the following tasks:
Kills all active processes
Unmounts the file systems
The /sbin/rcS Script
The /sbin/rcS script runs the /etc/rcS.d scripts to bring the system up to run level S.
Do not assign UIDs 0 through 99, which are reserved for system use, to regular user accounts. By definition, root
always has UID 0, daemon has UID 1, and pseudo-user bin has UID 2. In addition, you should give uucp logins
and pseudo user logins, such as who, tty, and ttytype, low UIDs so that they fall at the beginning of the passwd
file.
As with user (login) names, you should adopt a scheme to assign unique UID numbers. Some companies assign
unique employee numbers. Then, administrators add a number to the employee number to create a unique UID
number for each employee.
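The employee-number scheme described above can be sketched as simple shell arithmetic; the base offset of 1000 and the employee number are arbitrary choices for this illustration:

```shell
# Derive a unique UID by adding a fixed base to the employee number.
# The base keeps every generated UID clear of the reserved 0-99 range.
BASE=1000
EMP_NO=234                      # hypothetical employee number
NEW_UID=$((BASE + EMP_NO))
echo "employee $EMP_NO gets UID $NEW_UID"
```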
To minimize security risks, you should avoid reusing the UIDs from deleted accounts. If you must reuse a UID,
wipe the slate clean so that the new user is not affected by attributes set for a former user. For example, a
former user might have been denied access to a printer by being included in a printer deny list. However, that
attribute might be inappropriate for the new user.
Using Large User IDs and Group IDs
UIDs and group IDs (GIDs) can be assigned up to the maximum value of a signed integer, or 2147483647.
However, UIDs and GIDs over 60000 do not have full functionality and are incompatible with many Solaris
features. So, avoid using UIDs or GIDs over 60000. The following table describes interoperability issues with
Solaris products and previous Solaris releases.
UNIX Groups
A group is a collection of users who can share files and other system resources. For example, users who work
on the same project could be formed into a group. A group is traditionally known as a UNIX group.
Each group must have a name, a group identification (GID) number, and a list of user names that belong to the
group. A GID number identifies the group internally to the system. The two types of groups that a user can belong
to are as follows:
Primary group Specifies a group that the operating system assigns to files that are created by the user. Each
user must belong to a primary group.
Secondary groups Specifies one or more groups to which a user also belongs. Users can belong to up to 15
secondary groups.
Sometimes, a user's secondary group is not important. For example, ownership of files reflects the primary group,
not any secondary groups. Other applications, however, might rely on a user's secondary group memberships.
For example, a user has to be a member of the sysadmin group (group 14) to use the Admintool software in
previous Solaris releases. However, it doesn't matter if group 14 is his or her current primary group.
The groups command lists the groups that a user belongs to. A user can have only one primary group at a time.
However, a user can temporarily change his or her primary group, with the newgrp command, to any other
group in which the user is a member.
When adding a user account, you must assign a primary group for a user or accept the default group, staff
(group 10). The primary group should already exist. If the primary group does not exist, specify the group by a
GID number. User names are not added to primary groups. If user names were added to primary groups, the list
might become too long. Before you can assign users to a new secondary group, you must create the group and
assign it a GID number.
Groups can be local to a system or managed through a name service. To simplify group administration, you
should use a name service such as NIS or a directory service such as LDAP. These services enable you to
centrally manage group memberships.
User Passwords
You can specify a password for a user when you add the user. Or, you can force the user to specify a password
when the user first logs in. User passwords must comply with the following syntax:
1. Password length must at least match the value identified by the PASSLENGTH variable in the
/etc/default/passwd file. By default, PASSLENGTH is set to 6.
2. The first 6 characters of the password must contain at least two alphabetic characters and have at least
one numeric or special character.
3. You can increase the maximum password length to more than eight characters by configuring the
/etc/policy.conf file with an algorithm that supports greater than eight characters.
Although user names are publicly known, passwords must be kept secret and known only to users. Each user
account should be assigned a password. The password can be a combination of six to eight letters, numbers, or
special characters.
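A minimal sketch of these syntax rules, assuming the default PASSLENGTH of 6; the real checks are performed by the passwd command, which may enforce additional rules:

```shell
# Returns success when the password meets the rules stated above:
# length >= PASSLENGTH, and the first six characters contain at least
# two alphabetic characters and at least one numeric or special character.
PASSLENGTH=6
check_password() {
    pw=$1
    [ "${#pw}" -ge "$PASSLENGTH" ] || return 1
    first6=$(printf '%s' "$pw" | cut -c1-6)
    alphas=$(printf '%s' "$first6" | tr -cd 'A-Za-z' | wc -c)
    others=$(printf '%s' "$first6" | tr -d 'A-Za-z' | wc -c)
    [ "$alphas" -ge 2 ] && [ "$others" -ge 1 ]
}
check_password 'sn00py' && echo "sn00py: acceptable"
```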
To make your computer systems more secure, users should change their passwords periodically. For a high level
of security, you should require users to change their passwords every six weeks. Once every three months is
adequate for lower levels of security. System administration logins (such as root and sys) should be changed
monthly, or whenever a person who knows the root password leaves the company or is reassigned.
Many breaches of computer security involve guessing a legitimate users password. You should make sure that
users avoid using proper nouns, names, login names, and other passwords that a person might guess just by
knowing something about the user.
Good choices for passwords include the following:
1. Phrases (beammeup).
2. Nonsense words made up of the first letters of every word in a phrase. For example, swotrb for
SomeWhere over the RainBow.
3. Words with numbers or symbols substituted for letters. For example, sn00py for snoopy.
Do not use these choices for passwords:
1. Your name (spelled forwards, backwards, or jumbled)
2. Names of family members or pets
3. Car license numbers
4. Telephone numbers
5. Social Security numbers
6. Employee numbers
7. Words related to a hobby or interest
8. Seasonal themes, such as Santa in December
9. Any word in the dictionary
Home Directories
The home directory is the portion of a file system allocated to a user for storing private files. The amount of space
you allocate for a home directory depends on the kinds of files the user creates, their size, and the number of files
that are created.
A home directory can be located either on the users local system or on a remote file server. In either case, by
convention the home directory should be created as /export/home/username. For a large site, you should
store home directories on a server. Use a separate file system for each /export/homen directory (where n is a
number) to facilitate backing up and restoring home directories. For example, /export/home1, /export/home2.
Regardless of where their home directory is located, users usually access their home directories through a mount
point named /home/username. When AutoFS is used to mount home directories, you are not permitted to create
any directories under the /home mount point on any system. The system recognizes the special status of /home
when AutoFS is active.
To use the home directory anywhere on the network, you should always refer to the home directory as $HOME, not
as /export/home/username. The latter is machine-specific. In addition, any symbolic links created in a users
home directory should use relative paths (for example, ../../../x/y/x) so that the links are valid no matter
where the home directory is mounted.
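The value of relative symbolic links can be demonstrated with a small sketch; the directory names are hypothetical and stand in for a home directory moving between mount points:

```shell
# A relative link inside the home directory keeps working no matter where
# that home directory is mounted; the paths below are throwaway examples.
top=$(mktemp -d)
mkdir -p "$top/home1/alice/docs" "$top/home1/alice/bin"
echo hello > "$top/home1/alice/docs/note.txt"
# Relative target: resolved against the link's own location, not against
# an absolute, machine-specific path
ln -s ../docs/note.txt "$top/home1/alice/bin/note-link"
# Simulate the home directory appearing under a different mount point
mv "$top/home1" "$top/home2"
contents=$(cat "$top/home2/alice/bin/note-link")
echo "$contents"
rm -rf "$top"
```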
Name Services
If you are managing user accounts for a large site, you might want to consider using a name or directory service
such as LDAP, NIS, or NIS+. A name or directory service enables you to store user account information in a
centralized manner instead of storing user account information in every systems /etc files. When you use a
name or directory service for user accounts, users can move from system to system using the same user account
without having site-wide user account information duplicated on every system. Using a name or directory service
also promotes centralized and consistent user account information.
User's Work Environment
Besides having a home directory to create and store files, users need an environment that gives them access to
the tools and resources they need to do their work. When a user logs in to a system, the user's work environment
is determined by initialization files. These files are defined by the user's startup shell, such as the C, Korn, or
Bourne shell.
A good strategy for managing the user's work environment is to provide customized user initialization files, such
as .login, .cshrc, and .profile, in the user's home directory.
Note Do not use system initialization files, such as /etc/profile or /etc/.login, to manage a user's work
environment. These files reside locally on systems and are not centrally administered. For example, if AutoFS is
used to mount the user's home directory from any system on the network, you would have to modify the system
initialization files on each system to ensure a consistent environment whenever a user moved from system to
system.
Guidelines for Using User Names, User IDs, and Group IDs
User names, UIDs, and GIDs should be unique within your organization, which might span multiple domains.
Keep the following guidelines in mind when creating user or role names, UIDs, and GIDs:
User names They should contain from two to eight letters and numerals. The first character should be a letter.
At least one character should be a lowercase letter.
Note Even though user names can include a period (.), underscore (_), or hyphen (-), using these characters is
not recommended because they can cause problems with some software products.
System accounts Do not use any of the user names, UIDs, or GIDs that are contained in the default
/etc/passwd and /etc/group files. UIDs and GIDs 0-99 are reserved for system use and should not be
used by anyone. This restriction includes numbers not currently in use.
For example, gdm is the reserved user name and group name for the GNOME Display Manager daemon and
should not be used for another user. For a complete listing of the default /etc/passwd and /etc/group
entries, see Table 46 and Table 49.
The nobody and nobody4 accounts should never be used for running processes. These two accounts are
reserved for use by NFS. Use of these accounts for running processes could lead to unexpected security risks.
Processes that need to run as a non-root user should use the daemon or noaccess accounts.
System account configuration The configuration of the default system accounts should never be changed.
This includes changing the login shell of a system account that is currently locked. The only exception to this rule
is the setting of a password and password aging parameters for the root account.
Password aging is available when you are using NIS+ or LDAP, but not NIS. Group information is stored in the
group file for NIS, NIS+ and files. For LDAP, group information is stored in the group container.
Fields in the passwd File
The fields in the passwd file are separated by colons and contain the following information:
username:password:uid:gid:comment:home-directory:login-shell
For example:
kryten:x:101:100:Kryten Series 4000 Mechanoid:/export/home/kryten:/bin/csh
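The example entry can be split into its seven colon-separated fields with awk, which is a convenient way to inspect passwd data from scripts:

```shell
# Split the example passwd entry on colons; field 1 is the login name,
# field 3 is the UID, and field 7 is the login shell.
entry='kryten:x:101:100:Kryten Series 4000 Mechanoid:/export/home/kryten:/bin/csh'
login=$(printf '%s\n' "$entry" | awk -F: '{print $1}')
uid=$(printf '%s\n' "$entry" | awk -F: '{print $3}')
shell=$(printf '%s\n' "$entry" | awk -F: '{print $7}')
echo "login=$login uid=$uid shell=$shell"
```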
The following table describes the passwd file fields.
Use the Solaris Management Console online help for information on performing these tasks.
For information on the Solaris commands that can be used to manage user accounts and groups, see Table 16.
These commands provide the same functionality as the Solaris management tools, including authentication and
name service support.
Tasks for Solaris User and Group Management Tools
The Solaris user management tools enable you to manage user accounts and groups on a local system or in a
name service environment. This table describes the tasks you can do with the Users tools User Accounts feature.
The Solaris environment provides default user initialization files for each shell in the /etc/skel directory on
each system, as shown in the following table.
You can use these files as a starting point and modify them to create a standard set of files that provide the work
environment common to all users. Or, you can modify these files to provide the working environment for different
types of users. Although you cannot create customized user initialization files with the Users tool, you can
populate a users home directory with user initialization files located in a specified skeleton directory. You can do
this by creating a user template with the User Templates tool and specifying a skeleton directory from which to
copy user initialization files.
For step-by-step instructions on how to create sets of user initialization files for different types of users, see How
to Customize User Initialization Files on page 103. When you use the Users tool to create a new user account
and select the create home directory option, the following files are created, depending on which login shell is
selected.
TABLE 419 Files Created by Users Tool When Adding a User
If you use the useradd command to add a new user account and specify the /etc/skel directory by using the
-k and -m options, all three /etc/skel/local* files and the /etc/skel/.profile file are copied into the
user's home directory. At this point, you need to rename them to whatever is appropriate for the user's login shell.
Using Site Initialization Files
The user initialization files can be customized by both the administrator and the user. This important feature can
be accomplished with centrally located and globally distributed user initialization files, called site initialization files.
Site initialization files enable you to continually introduce new functionality to the user's work environment, while
enabling the user to customize his or her own initialization files.
When you reference a site initialization file in a user initialization file, all updates to the site initialization file are
automatically reflected when the user logs in to the system or when a user starts a new shell. Site initialization
files are designed for you to distribute site-wide changes to users' work environments that you did not anticipate
when you added the users.
You can customize a site initialization file the same way that you customize a user initialization file. These files
typically reside on a server, or set of servers, and appear as the first statement in a user initialization file. Also,
each site initialization file must be the same type of shell script as the user initialization file that references it.
To reference a site initialization file in a C-shell user initialization file, place a line similar to the following at the
beginning of the user initialization file:
source /net/machine-name/export/site-files/site-init-file
To reference a site initialization file in a Bourne-shell or Korn-shell user initialization file, place a line similar to the
following at the beginning of the user initialization file:
. /net/machine-name/export/site-files/site-init-file
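A minimal sketch of the mechanism, using a temporary directory in place of the /net/machine-name path above: the site file sets a shared default, and the user's initialization file sources it with the "." command.

```shell
# A site initialization file sets defaults shared by every user; each
# user's .profile then sources it. The directory stands in for a
# hypothetical /net/machine-name/export/site-files location.
sitedir=$(mktemp -d)
cat > "$sitedir/site-init" <<'EOF'
EDITOR=vi
export EDITOR
EOF
# The line a Bourne-shell or Korn-shell user initialization file would carry:
. "$sitedir/site-init"
echo "EDITOR is $EDITOR"
rm -rf "$sitedir"
```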
Avoiding Local System References
You should not add specific references to the local system in the user initialization file. You want the instructions
in a user initialization file to be valid regardless of which system the user logs into. For example:
1. To make a users home directory available anywhere on the network, always refer to the home directory
with the variable $HOME. For example, use $HOME/bin instead of /export/home/username/bin. The
$HOME variable works when the user logs in to another system and the home directories are
automounted.
2. To access files on a local disk, use global path names, such as /net/system-name/directory-name. Any
directory referenced by /net/system-name can be mounted automatically on any system on which the
user logs in, assuming the system is running AutoFS.
Shell Features
The following table lists basic shell features that each shell provides, which can help you determine what you can
and can't do when creating user initialization files for each shell.
Shell Environment
A shell maintains an environment that includes a set of variables defined by the login program, the system
initialization file, and the user initialization files. In addition, some variables are defined by default. A shell can
have two types of variables:
Environment variables Variables that are exported to all processes spawned by the shell. Their settings can be
seen with the env command. A subset of environment variables, such as PATH, affects the behavior of the shell
itself.
Shell (local) variables Variables that affect only the current shell. In the C shell, a set of these shell variables
has a special relationship to a corresponding set of environment variables. These shell variables are user, term,
home, and path. The value of the environment variable counterpart is initially used to set the shell variable.
In the C shell, you use the lowercase names with the set command to set shell variables. You use uppercase
names with the setenv command to set environment variables. If you set a shell variable, the shell sets the
corresponding environment variable and vice versa. For example, if you update the path shell variable with a
new path, the shell also updates the PATH environment variable with the new path.
In the Bourne and Korn shells, you set both shell and environment variables by assigning a value to the
uppercase variable name (for example, VARIABLE=value). You must also use the export command to make the
variables available to any subsequently executed commands.
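A short sketch of the difference: a plain assignment is visible only to the current shell, while an exported variable is inherited by child processes. The variable name GREETING is made up for illustration.

```shell
# A plain assignment creates a shell variable; export promotes it to an
# environment variable inherited by child processes.
GREETING=hello
before=$(sh -c 'echo "${GREETING:-unset}"')   # child does not see it yet
export GREETING
after=$(sh -c 'echo "${GREETING:-unset}"')    # child inherits it now
echo "before export: $before, after export: $after"
```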
For all shells, you generally refer to shell and environment variables by their uppercase names.
In a user initialization file, you can customize a users shell environment by changing the values of the predefined
variables or by specifying additional variables. The following table shows how to set environment variables in a
user initialization file.
The PATH Variable
When the user executes a command by using the full path, the shell uses that path to find the command.
However, when users specify only a command name, the shell searches the directories for the command in the
order specified by the PATH variable.
If the command is found in one of the directories, the shell executes the command. A default path is set by the
system. However, most users modify it to add other command directories. Many user problems related to setting
up the environment and accessing the correct version of a command or a tool can be traced to incorrectly defined
paths.
Setting Path Guidelines
Here are some guidelines for setting up efficient PATH variables:
1. If security is not a concern, put the current working directory (.) first in the path. However, including the
current working directory in the path poses a security risk that you might want to avoid, especially for
superuser.
2. Keep the search path as short as possible. The shell searches each directory in the path. If a command is
not found, long searches can slow down system performance.
3. The search path is read from left to right, so you should put directories for commonly used commands at
the beginning of the path.
4. Make sure that directories are not duplicated in the path.
5. Avoid searching large directories, if possible. Put large directories at the end of the path.
6. Put local directories before NFS mounted directories to lessen the chance of hanging when the NFS
server does not respond. This strategy also reduces unnecessary network traffic.
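Guideline 3 (left-to-right search) can be demonstrated with two throwaway directories that both contain a command of the same name; the shell runs the copy from the directory listed first:

```shell
# Two hypothetical directories each hold a command named "greet"; the
# shell uses the first match found while scanning PATH left to right.
top=$(mktemp -d)
mkdir "$top/first" "$top/second"
printf '#!/bin/sh\necho from-first\n'  > "$top/first/greet"
printf '#!/bin/sh\necho from-second\n' > "$top/second/greet"
chmod +x "$top/first/greet" "$top/second/greet"
PATH="$top/first:$top/second:$PATH"
found=$(greet)        # resolves to the copy in $top/first
echo "$found"
rm -rf "$top"
```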
You can also determine the umask value you want to set by using the following table. This table shows the file
and directory permissions that are created for each of the octal values of umask.
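A quick sketch of how umask shapes the permissions of newly created files and directories; with the common value 022, files come out 644 and directories 755:

```shell
# umask masks off permission bits at creation time: files start from 666
# and directories from 777, so umask 022 yields 644 and 755.
work=$(mktemp -d)
(
  cd "$work"
  umask 022
  touch f
  mkdir d
)
fperm=$(ls -ld "$work/f" | cut -c1-10)
dperm=$(ls -ld "$work/d" | cut -c1-10)
echo "file: $fperm  directory: $dperm"
rm -rf "$work"
```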
5.1.4. Quotas
What Are Quotas?
Quotas enable system administrators to control the size of UFS file systems. Quotas limit the amount of disk
space and the number of inodes, which roughly corresponds to the number of files that individual users can
acquire. For this reason, quotas are especially useful on the file systems where user home directories reside.
Using Quotas
Once quotas are in place, they can be changed to adjust the amount of disk space or the number of inodes that
users can consume. Additionally, quotas can be added or removed, as system needs change.
In addition, quota status can be monitored. Quota commands enable administrators to display information about
quotas on a file system, or search for users who have exceeded their quotas.
Setting Soft Limits and Hard Limits for Quotas
You can set both soft limits and hard limits. The system does not allow a user to exceed his or her hard limit.
However, a system administrator might set a soft limit, which the user can temporarily exceed. The soft limit must
be less than the hard limit. Once the user exceeds the soft limit, a quota timer begins. While the quota timer is
ticking, the user is allowed to operate above the soft limit but cannot exceed the hard limit. Once the user goes
below the soft limit, the timer is reset. However, if the user's usage remains above the soft limit when the timer
expires, the soft limit is enforced as a hard limit. By default, the soft limit timer is set to seven days.
The timeleft field in the repquota and quota commands shows the value of the timer.
For example, let's say a user has a soft limit of 10,000 blocks and a hard limit of 12,000 blocks. If the user's block
usage exceeds 10,000 blocks and the seven-day timer is also exceeded, the user cannot allocate more disk
blocks on that file system until his or her usage drops below the soft limit.
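The decision logic described above can be sketched as a small shell function; the block counts and timer state are illustrative inputs, not values read from a real quota system:

```shell
# Decide whether a new block allocation is allowed, following the rules
# above: never beyond the hard limit, and beyond the soft limit only
# while the seven-day timer has not expired.
quota_check() {
    usage=$1; soft=$2; hard=$3; timer_expired=$4
    if [ "$usage" -gt "$hard" ]; then
        echo denied
    elif [ "$usage" -gt "$soft" ] && [ "$timer_expired" = yes ]; then
        echo denied                  # soft limit now enforced as hard
    elif [ "$usage" -gt "$soft" ]; then
        echo allowed-timer-running
    else
        echo allowed
    fi
}
quota_check 11000 10000 12000 yes    # the example case: timer expired
```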
The Difference between Disk Block and File Limits
A file system provides two resources to the user: blocks for data and inodes for files. Each file consumes one
inode. File data is stored in data blocks, which are usually allocated in 1-Kbyte units.
The who command displays a list of users currently logged in to the local system. It displays each user's login
name, the login device (TTY port), and the login date and time. The command reads the binary file
/var/adm/utmpx to obtain this information, including where the users logged in from.
If a user is logged in remotely, the who command displays the remote host name, or Internet Protocol (IP)
address in the last column of the output.
# who
root console Oct 29 16:09 (:0)
root pts/4 Oct 29 16:09 (:0.0)
root pts/5 Oct 29 16:10 (192.168.0.234)
alice pts/7 Oct 29 16:48 (192.168.0.234)
The second field displayed by the who command defines the users login device, which is one of the following:
console The device used to display system boot and error messages
pts The pseudo device that represents a login or window session without a physical device
term The device physically connected to a serial port, such as a terminal or a modem
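Output in the format shown above can be picked apart with awk; the sample line is copied from the example output, and the parentheses are stripped from the origin field:

```shell
# Parse a captured who(1) line into name, device, and origin host.
line='alice      pts/7        Oct 29 16:48    (192.168.0.234)'
name=$(printf '%s\n' "$line" | awk '{print $1}')
device=$(printf '%s\n' "$line" | awk '{print $2}')
origin=$(printf '%s\n' "$line" | awk '{print $NF}' | tr -d '()')
echo "$name logged in on $device from $origin"
```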
Displaying Users on Remote Systems
The rusers command produces output similar to that of the who command, but it displays a list of the users logged
in on local and remote hosts. The list displays the user's name and the host's name in the order in which the
responses are received from the hosts.
A remote host responds only to the rusers command if its rpc.rusersd daemon is enabled. The rpc.rusersd
daemon is the network server daemon that returns the list of users on the remote hosts.
Note The full path to this network server daemon is /usr/lib/netsvc/rusers/rpc.rusersd.
Displaying User Information
To display detailed information about user activity that is either local or remote, use the finger command.
The finger command displays:
1. The users login name
2. The home directory path
3. The login time
4. The login device name
5. The data contained in the comment field of the /etc/passwd file (usually the user's full name)
6. The login shell
7. The name of the host, if the user is logged in remotely, and any idle time
Note You get a response from the finger command only if the in.fingerd daemon is enabled.
Displaying a Record of Login Activity
Use the last command to display a record of all logins and logouts with the most recent activity at the top of the
output. The last command reads the binary file /var/adm/wtmpx, which records all logins, logouts, and reboots.
Each entry includes the user name, the login device, the host that the user is logged in from, the date and time
that the user logged in, the time of logout, and the total login time in hours and minutes, including entries for
system reboot times.
The output of the last command can be extremely long. Therefore, you might want to use it with the -n number
option to specify the number of lines to display.
Recording Failed Login Attempts
When a user logs in to a system either locally or remotely, the login program consults the /etc/passwd and the
/etc/shadow files to authenticate the user. It verifies the user name and password entered.
If the user provides a login name that is in the /etc/passwd file and the correct password for that login name,
the login program grants access to the system.
If the login name is not in the /etc/passwd file or the password is not correct for the login name, the login
program denies access to the system. You can log failed command-line login attempts in the
/var/adm/loginlog file. This is a useful tool if you want to determine if attempts are being made to break into
a system.
By default, the loginlog file does not exist. To enable logging, you should create this file with read and write
permissions for the root user only, and it should belong to the sys group.
# touch /var/adm/loginlog
# chown root:sys /var/adm/loginlog
# chmod 600 /var/adm/loginlog
All failed command-line login activity is written to this file automatically after five consecutive failed attempts.
The loginlog file contains one entry for each of the failed attempts. Each entry contains the user's login name,
login device (TTY port), and time of the failed attempt. If there are fewer than five consecutive failed attempts, no
activity is logged to this file.
Monitoring the su Command
Whenever a user switches identity with the su command, the attempt is recorded in the /var/adm/sulog file.
Each entry in the sulog file shows the following:
1. The date and time that the command was entered.
2. Whether the attempt was successful. A plus sign (+) indicates a successful attempt. A minus sign (-) indicates an
unsuccessful attempt.
3. The port from which the command was issued.
4. The name of the user and the name of the switched identity.
The su logging in this file is enabled by default through the following entry in the /etc/default/su file:
SULOG=/var/adm/sulog
setuid Permission
When setuid permission is set on an executable file, a process that runs this file is granted access on the basis of
the owner of the file. The access is not based on the user who is running the executable file. This special
permission allows a user to access files and directories that are normally available only to the owner.
For example, the setuid permission on the passwd command makes it possible for users to change passwords. A
passwd command with setuid permission would resemble the following:
-r-sr-sr-x 1 root sys 27220 Jan 23 2005 /usr/bin/passwd
This special permission presents a security risk. Some determined users can find a way to maintain the
permissions that are granted to them by the setuid process even after the process has finished executing.
setgid Permission
The setgid permission is similar to the setuid permission. The process's effective group ID (GID) is changed to the
group that owns the file, and a user is granted access based on the permissions that are granted to that group.
The /usr/bin/mail command has setgid permissions:
-r-x--s--x 1 root mail 67900 Jan 23 2005 /usr/bin/mail
When the setgid permission is applied to a directory, files that were created in this directory belong to the group to
which the directory belongs. The files do not belong to the group to which the creating process belongs. Any user
who has write and execute permissions in the directory can create a file there. However, the file belongs to the
group that owns the directory, not to the group that the user belongs to.
Sticky Bit
The sticky bit is a permission bit that protects the files within a directory. If the directory has the sticky bit set, a file
can be deleted only by the file owner, the directory owner, or by a privileged user. The root user and the Primary
Administrator role are examples of privileged users. The sticky bit prevents a user from deleting other users' files
from public directories such as /tmp:
drwxrwxrwt 7 root sys 528 Oct 29 14:49 /tmp
Be sure to set the sticky bit manually when you set up a public directory on a TMPFS file system.
File Permission Modes
The chmod command enables you to change the permissions on a file. You must be superuser or the owner of a
file or directory to change its permissions. You can use the chmod command to set permissions in either of two
modes:
Absolute Mode Use numbers to represent file permissions. When you change permissions by using the
absolute mode, you represent permissions for each triplet by an octal mode number. Absolute mode is the
method most commonly used to set permissions.
Symbolic Mode Use combinations of letters and symbols to add permissions or remove permissions.
You must use symbolic mode to set or remove setuid permissions on a directory. A symbolic-mode argument to
the chmod command has the form who operator permissions, where:
who Specifies whose permissions are to be changed (u for user, g for group, o for other, a for all).
operator Specifies the operation to be performed (+ adds permissions, - removes them, = assigns them).
permissions Specifies what permissions are to be changed (for example, r, w, x, s, or t).
In absolute mode, the octal values for file permissions are 4 (read), 2 (write), and 1 (execute), and the values are
summed within each triplet for the owner, the group, and others. You set special permissions by adding a new
octal value to the left of the permission triplets: 4000 sets setuid, 2000 sets setgid, and 1000 sets the sticky bit.
Special File Permissions in Absolute Mode
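The special permission values combine with the regular permission triplets. The following is a minimal sketch of both modes; the file and directory names under /tmp are examples only:

```shell
# Sketch: special permission bits in absolute mode.
# /tmp/permdemo and the names below are examples only.
mkdir -p /tmp/permdemo
cd /tmp/permdemo

touch prog
chmod 4755 prog      # 4000 (setuid) + 755 (rwxr-xr-x)
ls -l prog           # owner execute position shows s: -rwsr-xr-x

mkdir -p shared
chmod 2775 shared    # 2000 (setgid) + 775
ls -ld shared        # group execute position shows s: drwxrwsr-x

mkdir -p public
chmod 1777 public    # 1000 (sticky bit) + 777, as on /tmp
ls -ld public        # other execute position shows t: drwxrwxrwt

# The same bits set in symbolic mode:
chmod u+s prog
chmod g+s shared
chmod +t public
```

Absolute and symbolic mode produce identical results here; symbolic mode simply adds or removes individual bits without restating the whole mode.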
For example, if you want everyone in a group to be able to read a file, you can simply grant group read
permissions on that file. Now, assume that you want only one person in the group to be able to write to that file.
Standard UNIX does not provide that level of file security. However, an ACL provides this level of file security.
ACL entries define an ACL on a file. The entries are set through the setfacl command. ACL entries consist of the
following fields separated by colons:
entry-type:[uid|gid]:perms
entry-type Is the type of ACL entry on which to set file permissions. For example, entry-type can be user (the
owner of a file) or mask (the ACL mask).
uid Is the user name or user ID (UID).
gid Is the group name or group ID (GID).
perms Represents the permissions that are set on entry-type. perms can be indicated by the symbolic characters
rwx or an octal number. These are the same numbers that are used with the chmod command.
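For example, a session might add an ACL entry for a specific user as follows. This is Solaris setfacl syntax, and the file name and user name are examples only:

```shell
# Solaris setfacl sketch; memo.txt and alice are examples only.
# Give user alice read and write access in addition to the
# standard owner, group, and other permissions:
$ setfacl -m user:alice:rw- memo.txt

# The same permissions expressed as an octal number (6 = rw-):
$ setfacl -m user:alice:6 memo.txt

# Display the resulting ACL entries:
$ getfacl memo.txt
```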
Caution File system attributes such as ACLs are supported in UFS file systems only. Thus, if you restore or
copy files with ACL entries into the /tmp directory, which is usually mounted as a TMPFS file system, the ACL
entries will be lost. Use the /var/tmp directory for temporary storage of UFS files.
ACL Entries for Files
The following table lists the valid ACL entries that you might use when setting ACLs on files. The first three ACL
entries provide the basic UNIX file protection.
ACL Entries for Directories
You can set default ACL entries on a directory. Files or directories created in a directory that has default ACL
entries will have the same ACL entries as the default ACL entries.
When you set default ACL entries for specific users and groups on a directory for the first time, you must also set
default ACL entries for the file owner, file group, others, and the ACL mask. These entries are required. They are
the first four default ACL entries in the following table.
Default ACL Entry Description
d[efault]:u[ser]::perms Default file owner permissions.
d[efault]:g[roup]::perms Default file group permissions.
d[efault]:o[ther]:perms Default permissions for users other than the file owner or members of the file group.
d[efault]:m[ask]:perms Default ACL mask.
d[efault]:u[ser]:uid:perms Default permissions for a specific user. For uid, you can specify either a user name
or a numeric UID.
d[efault]:g[roup]:gid:perms Default permissions for a specific group. For gid, you can specify either a group
name or a numeric GID.
The /etc/ftpd/ftpusers File
The FTP server daemon in.ftpd reads the /etc/ftpd/ftpusers file when an FTP session is invoked. If the login name
of the user matches one of the listed entries, it rejects the login session and sends the Login failed error message.
The root entry is included in the ftpusers file as a security measure. The default security policy is to disallow
remote logins for the root user. The policy is also followed for the default value set as the CONSOLE entry in the
/etc/default/login file.
The /etc/hosts.equiv and $HOME/.rhosts Files
Typically, when a remote user requests login access to a local host, the first file read by the local host is its
/etc/passwd file. An entry for that particular user in this file enables that user to log in to the local host from a
remote system. If a password is associated with that account, then the remote user is required to supply this
password at login to gain system access. If there is no entry in the local host's /etc/passwd file for the remote
user, access is denied.
The /etc/hosts.equiv and $HOME/.rhosts files bypass this standard password-based authentication to determine if
a remote user is allowed to access the local host, with the identity of a local user. These files provide a remote
authentication procedure to make that determination.
This procedure first checks the /etc/hosts.equiv file and then checks the $HOME/.rhosts file in the home directory
of the local user who is requesting access. The information contained in these two files (if they exist) determines if
remote access is granted or denied. The information in the /etc/hosts.equiv file applies to the entire system, while
individual users can maintain their own $HOME/.rhosts files in their home directories.
Entries in the /etc/hosts.equiv and $HOME/.rhosts Files
While the /etc/hosts.equiv and $HOME/.rhosts files have the same format, the same entries in each file have
different effects. Both files are formatted as a list of one-line entries, which can contain the following types of
entries:
hostname (all users logging in from this host are trusted)
hostname username (the named user from the named host is trusted)
+ (a wildcard that trusts all hosts or, in the username field, all users; use it with extreme caution)
The host names in the /etc/hosts.equiv and $HOME/.rhosts files must be the official name of the host, not one of
its alias names.
If the local host's /etc/hosts.equiv file contains the host name of a remote host, then all regular users of that
remote host are trusted and do not need to supply a password to log in to the local host, provided that each
remote user is known to the local host by having an entry in the local /etc/passwd file; otherwise, access is
denied.
This functionality is particularly useful for sites where regular users commonly have accounts on many different
systems, eliminating the security risk of sending ASCII passwords over the network.
The /etc/hosts.equiv file does not exist by default. It must be created if trusted remote user access is required on
the local host.
The $HOME/.rhosts File Rules
While the /etc/hosts.equiv file applies system-wide access for nonroot users, the .rhosts file applies to a specific
user.
All users, including the root user, can create and maintain their own .rhosts files in their home directories. For
example, if you run an rlogin process from a remote host to gain root access to a local host, the /.rhosts file is
checked in the root home directory on the local host.
If the remote host name is listed in this file, it is a trusted host, and, in this case, root access is granted on the
local host. The CONSOLE variable in the /etc/default/login file must be commented out for remote root logins. The
$HOME/.rhosts file does not exist by default. You must create it in the user's home directory.
SSH (Solaris Secure Shell) A secure remote login and transfer protocol that encrypts communications over an
insecure network.
In Solaris Secure Shell, authentication is provided by the use of passwords, public keys, or both. All network
traffic is encrypted. Thus, Solaris Secure Shell prevents a would-be intruder from being able to read an
intercepted communication. Solaris Secure Shell also prevents an adversary from spoofing the system.
With Solaris Secure Shell, you can perform these actions:
Log in to another host securely over an unsecured network.
Copy files securely between two hosts.
Run commands securely on a remote host.
6. Printers
The Solaris printing service provides a complete network-printing environment. This environment allows sharing
of printers across machines, management of special printing situations such as forms, and filtering output to
match special printer types, such as those that use the popular PostScript page description language. This
release also supports IPP and expanded printer support: through the use of additional transformation software, a
raster image processor (RIP), and PostScript Printer Description (PPD) files, you can print to a wider range of
printers.
[Figure: a print client sends a print file to the print server, where it is held in a spool directory until it is sent to the
printer.]
The actual work of printing is executed by a printing daemon. A daemon is a process that runs in the background
on a UNIX system without direct user interaction. The main printing daemon is lpsched.
Most output bound for a printer will require filtering before it is printed. The term filter is used to describe a
program that transforms the content of one file into another format as the file passes through the program. For
instance, when an ASCII file is sent to the printer that accepts the PostScript page description language, the print
service first runs the file through a filter to transform the file from ASCII into PostScript.
The resulting PostScript file contains complete instructions in a form the printer can use to print the page, and is
somewhat larger than the original file due to the addition of this information. After filtering, the PostScript file is
sent to the printer which reads a description of the pages to be printed and prints them.
Printing can be set up as a network-wide service. There are three varieties of network printing.
1. Machines with directly connected printers that accept print requests from other machines are called
print servers.
2. Machines that submit print requests over the network to other machines are called print clients. A
machine can be both a print server and a print client.
3. Printers directly attached to the workstation are known as local printers, whereas printers attached to
other workstations reached via the network are known as remote printers.
Finally, it is increasingly common for printers to be directly attached to the network. These network printers can
either be served from a print server or act as their own print server, depending on the facilities provided in the
printer by the manufacturer.
For example, if your name service is NIS, printer configuration information on print clients is searched for in the
following sources in this order:
user Represents the users $HOME/.printers file
files Represents the /etc/printers.conf file
nis Represents the printers.conf.byname table
Most printing configuration tasks can be accomplished with Solaris Print Manager. However, if you need to write
interface scripts or add your own filters, you need to use the LP print service commands. These commands
underlie Solaris Print Manager.
Managing Network Printers
A network printer is a hardware device that is connected directly to the network. A network printer transfers data
directly over the network to the output device. The printer or network connection hardware has its own system
name and IP address. Network printers often have software support provided by the printer vendor. If your printer
has printer vendor-supplied software, then use the printer vendor software. If the network printer vendor does not
provide software support, Sun-supplied software is available. This software provides generic support for network-
attached printers. However, this software is not capable of providing full access to all possible printer capabilities.
Note You can check the contents of the configuration files, but you should not edit these files directly. Instead,
use the lpadmin command to make configuration changes. Your changes are written to the configuration files in
the /etc/lp directory. The lpsched daemon administers and updates the configuration files.
The terminfo Database
The /usr/share/lib directory contains the terminfo database directory. This directory contains definitions
for many types of terminals and printers. The LP print service uses information in the terminfo database to
perform the following tasks:
1. Initialize a printer
2. Establish a selected page size, character pitch, line pitch, and character set
3. Communicate the sequence of codes to a printer
Each printer is identified in the terminfo database with a short name. If necessary, you can add entries to the
terminfo database, but doing so is tedious and time-consuming.
Daemons and LP Internal Files
The /usr/lib/lp directory contains daemons and files used by the LP print service, as described in the
following table.
Spooling Directories
Files queued for printing are stored in the /var/spool/lp directory until they are printed, which might be only
seconds.
How LP Administers Files and Schedules Local Print Requests
The LP print service has a scheduler daemon called lpsched. The scheduler daemon updates the LP system
files with information about printer setup and configuration. The lpsched daemon schedules all local print
requests on a print server, as shown in the following figure. Users can issue the requests from an application or
from the command line. Also, the scheduler tracks the status of printers and filters on the print server. When a
printer finishes a request, the scheduler schedules the next request in the queue on the print server, if a next
request exists.
Without rebooting the system, you can stop the scheduler with the svcadm disable application/print/server
command. Then, restart the scheduler with the svcadm enable application/print/server command. The scheduler
for each system manages requests that are issued to the system by the lp command.
How Remote Printing Works
The following figure shows what happens when a user on a Solaris print client submits a print request to an lpd-
based print server. The command opens a connection and handles its own communications with the print server
directly.
lpstat Displays the status of the printing service and the status of individual jobs and printers.
cancel Stops individual print jobs.
enable Starts a printer printing.
Daily crontab system administration tasks might include the following:
Removing files more than a few days old from temporary directories
Executing accounting summary commands
Taking snapshots of the system by using the df and ps commands
Performing daily security monitoring
Running system backups
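A task such as removing files more than a few days old can be sketched with the find command; the directory, file names, and 7-day age below are examples only:

```shell
# Sketch: a daily cleanup of a temporary directory.
# /tmp/scratchdemo and the 7-day age are examples only.
mkdir -p /tmp/scratchdemo
touch /tmp/scratchdemo/new.dat
touch -t 202001010000 /tmp/scratchdemo/old.dat   # back-date one file

# Remove regular files not modified within the last 7 days:
find /tmp/scratchdemo -type f -mtime +7 -exec rm {} \;

ls /tmp/scratchdemo
```

Scheduled from a crontab file, a command line like the find invocation above runs unattended at the configured time.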
Weekly crontab system administration tasks might include the following:
Rebuilding the catman database for use by the man -k command
Running the fsck -n command to list any disk problems
Monthly crontab system administration tasks might include the following:
Listing files not used during a specific month
Producing monthly accounting reports
Additionally, users can schedule crontab commands to execute other routine system tasks, such as sending
reminders and removing backup files.
For Scheduling a Single Job: at
The at command allows you to schedule a job for execution at a later time. The job can consist of a single
command or a script. Similar to crontab, the at command allows you to schedule the automatic execution of
routine tasks. However, unlike crontab files, at files execute their tasks once. Then, they are removed from their
directory. Therefore, the at command is most useful for running simple commands or scripts that direct output into
separate files for later examination. Submitting an at job involves typing a command and following the at
command syntax to specify options to schedule the time your job will be executed. The at command stores the
command or script you ran, along with a copy of your current environment variables, in the /var/spool/cron/atjobs
directory. Your at job file name is given a long number that specifies its location in the at queue, followed by the .a
extension, such as 793962000.a. The cron daemon checks for at jobs at startup and listens for new jobs that are
submitted. After the cron daemon executes an at job, the at job's file is removed from the atjobs directory.
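An illustrative at session might look like the following; the time, the command, and the echoed job number (borrowed from the example above) are examples only:

```shell
# Illustrative at session; the time and command are examples only.
$ at 07:45am today
at> who > /tmp/who.out
at> <Press Control-D>
job 793962000.a
```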
Besides the default crontab files, users can create crontab files to schedule their own system tasks. Other crontab
files are named after the user accounts in which they are created, such as bob, mary, smith, or jones. To access
crontab files that belong to root or other users, superuser privileges are required. Procedures explaining how to
create, edit, display, and remove crontab files are described in subsequent sections.
How the cron Daemon Handles Scheduling
The cron daemon manages the automatic scheduling of crontab commands. The role of the cron daemon is to
check the /var/spool/cron/crontabs directory for the presence of crontab files. The cron daemon performs the
following tasks at startup:
Checks for new crontab files
Reads the execution times that are listed within the files
Submits the commands for execution at the proper times
Listens for notifications from the crontab commands regarding updated crontab files.
In much the same way, the cron daemon controls the scheduling of at files. These files are stored in the
/var/spool/cron/atjobs directory. The cron daemon also listens for notifications from the crontab
commands regarding submitted at jobs.
Syntax of crontab File Entries
A crontab file consists of commands, one command per line, that execute automatically at the time specified by
the first five fields of each command line. These five fields, described in the following table, are separated by
spaces.
Follow these guidelines for using special characters in crontab time fields:
Use a space to separate each field.
Use a comma to separate multiple values.
Use a hyphen to designate a range of values.
Use an asterisk as a wildcard to include all possible values.
Use a comment mark (#) at the beginning of a line to indicate a comment or a blank line.
For example, the following crontab command entry displays a reminder in the user's console window at 4 p.m. on
the first and fifteenth days of every month.
0 16 1,15 * * echo Timesheets Due > /dev/console
Each command within a crontab file must consist of one line, even if that line is very long. The crontab file does
not recognize extra carriage returns.
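As an illustration, a crontab file that schedules some of the routine tasks mentioned earlier might contain entries such as these; the paths and times are assumptions, not Solaris defaults:

```shell
# Illustrative crontab entries; the paths and times are assumptions,
# not Solaris defaults. Fields: minute hour day-of-month month weekday.

# Rotate system logs daily at 2:00 a.m.
0 2 * * * /usr/lib/newsyslog
# List disk problems every Saturday at 4:30 a.m.
30 4 * * 6 /usr/sbin/fsck -n /dev/rdsk/c0t0d0s7
# Run a (hypothetical) accounting report on the first of each month
15 3 1 * * /usr/local/bin/monthly_report
```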
Controlling Access to the crontab Command
You can control access to the crontab command by using two files in the /etc/cron.d directory: cron.deny and
cron.allow. These files permit only specified users to perform crontab command tasks such as creating, editing,
displaying, or removing their own crontab files. The cron.deny and cron.allow files consist of a list of user names,
one user name per line. These access control files work together as follows:
If cron.allow exists, only the users who are listed in this file can create, edit, display, or remove crontab
files.
If cron.allow does not exist, all users can submit crontab files, except for users who are listed in
cron.deny.
If neither cron.allow nor cron.deny exists, superuser privileges are required to run the crontab command.
Superuser privileges are required to edit or create the cron.deny and cron.allow files. The cron.deny file, which is
created during SunOS software installation, contains the following user names:
$ cat /etc/cron.d/cron.deny
daemon
bin
smtp
nuucp
listen
nobody
noaccess
None of the user names in the default cron.deny file can access the crontab command. You can edit this file to
add other user names that will be denied access to the crontab command. No default cron.allow file is supplied.
So, after Solaris software installation, all users (except users who are listed in the default cron.deny file) can
access the crontab command. If you create a cron.allow file, only these users can access the crontab command.
Frequency is another very important factor in successful file system backup strategies. The administrator must
determine how much data can be lost before the organization's ability to do business is seriously affected.
Corporations that use computers for financial transactions, stock and commodities trading, and insurance
activities are a few examples of organizations that cannot afford any data loss. In these cases, the loss of a single
transaction is not acceptable.
A reasonable backup schedule for a typical corporation would call for some form of backup to be performed every
day. In addition, the backups must follow a fixed routine so that the operator knows which tape to use for a reload,
and so that the administrator is held to a schedule. The fact that backups must be maintained on a rigid schedule
lends support to automated backup methods.
Backup Scheduling
After determining the files to back up and the frequency of backup, it's time to decide on the backup schedule
and also the devices to be used for backup.
GRANDFATHER/FATHER/SON BACKUP
In a grandfather/father/son rotation, daily (son) tapes are reused each week, weekly (father) tapes are reused
each month, and monthly (grandfather) tapes are retained for long-term, off-site storage.
A cartridge tape distribution of the OS is easily stored in a remote location. In the event of a failure or disaster at
the primary computer location, these distribution tapes could be used to rebuild a system to replace the damaged
host.
Most cartridge tape systems use SCSI interconnections to the host system. These devices support data transfer
rates up to 5 MB per second. This transfer rate may be a little misleading, however, as the information is typically
buffered in the memory of the tape drive. The actual transfer rate from the tape drive memory to the tape media is
typically about 500 KB per second.
8-mm Tape Drive
These tape drives are also small and fast, and use relatively inexpensive tape media. The 8-mm media can hold
between 2 and 40 GB of data. Because of high-density storage, 8-mm drives have become a standard backup
device on many systems. Several companies also use 8-mm tape as distribution media for software.
The 8-mm drives use the SCSI bus as the system interconnection. Low-density 8-mm drives can store 2.2 GB of
information on tape. These units transfer data to the tape at a rate of 250 KB per second. High-density 8-mm
drives can store between 5 and 40 GB of information on tape.
At the low end, the 8-mm drives don't use data compression techniques to store the information on tape. At the
high end, the drives incorporate data compression hardware used to increase the amount of information that can
be stored on the tape. Regardless of the use of data compression, high-density 8-mm drives transfer data to tape
at a rate of 500 KB per second. High-density drives also allow the user to read and write data at the lower
densities supported by the low-density drives. When using the high-density drives in low-density mode, storage
capacities and throughput numbers are identical to low-density drives.
Digital Audio Tape Drive
Digital audio tape (DAT) drives are small, fast, and use relatively inexpensive tape media. Typical DAT media can
hold between 2 and 40 GB of data. The DAT media is a relative newcomer to the digital data backup market. The
tape drive electronics and media are basically the same as the DAT tapes used in home audio systems.
The various densities available on DAT drives are due to data compression. A standard DAT drive can write 2 GB
of data to a tape. By using various data compression algorithms, manufacturers have produced drives that can
store between 2 and 40 GB of data on tape. DAT drives use SCSI bus interconnections to the host system.
Because DAT technology is relatively new, it offers performance and features not available with other tape system
technologies. For instance, the DAT drive offers superior file search capabilities as compared to the 8-mm helical
scan drives on the market.
Digital Linear Tape (DLT)
Digital linear tape backup devices are among the newest devices on the backup market. These tape devices offer
huge data storage capabilities, high transfer rates, and small (but somewhat costly) media. Digital linear tape
drives can store up to 70 GB of data on a single tape cartridge. Transfer rates of 5 MB per second are possible on
high-end DLT drives, making them very attractive at sites with large on-line storage systems.
Where 8-mm and DAT tape cost (roughly) $15 per tape, the DLT tapes can run as much as $60 each. However,
when the tape capacity is factored into the equation, the costs of DLT tapes become much more reasonable.
(Consider an 8-mm tape that holds 14 GB on average versus a DLT cartridge, which can hold 70 GB of data.)
Many operators elect not to enable the compression hardware on tape systems, opting instead for software
compression before the data are sent to the tape drive. In the event of a hardware failure in the tape drive's
compression circuitry, it is possible that the data written to tape would be scrambled. By using software
compression techniques, the operator can bypass such potential problems.
Jukebox System
Jukebox systems combine a jukebox mechanism with one or more tape drives to provide a tape system capable
of storing several hundred GB of data. Jukebox systems employ multiple tape drives and special robotic
hardware to load and unload the tapes.
Jukebox systems require special software to control the robotics. The software keeps track of the content of each
tape and builds an index to allow the user to quickly load the correct tape on demand. Many commercially
available backup software packages allow the use of jukebox systems to permit backup automation.
Disk Systems as backup devices
One problem involved in using tape devices for backups is the (relatively) low data throughput rate. If the operator
had to back up several gigabytes or terabytes of data daily, it would not take long to realize that tape drives are
not the best backup method.
One popular method of backing up large-scale systems is to make backup copies of the data on several disk
drives. Disk drives are orders of magnitude faster than tape devices, and therefore offer a solution to one of the
backup problems on large-scale systems. However, disk drives are much more expensive than tapes. Disk
backups also consume large amounts of system resources. For example, you would need 100 2-GB disks to back
up a hundred 2-GB disks. Fortunately, there are software applications and hardware systems available to
transparently perform this function.
RAID Disk Arrays
One operating mode of redundant arrays of inexpensive disks (RAID) enables the system to make mirror image
copies of all data on backup disk drives. RAID disk arrays also allow data striping for high-speed data access. Yet
another mode stores the original data, as well as parity information on the RAID disks. If a drive should fail, the
parity information may be used to re-create the data from the failed drive.
Problems with Disks as Backup Devices
Although backing up to disk devices is much faster than backing up to other devices, it should be noted that disk
devices present a potentially serious problem. One of the important considerations of backup planning is the
availability of the data to users. In the event of a natural disaster, it may be necessary to keep a copy of the
corporate data off-site.
When tape devices are employed as the backup platform, it is a simple matter to keep a copy of the backups off-
site. When disk drives are employed as a backup media, the process of keeping a copy of the backup media off-
site becomes a bit more complicated (not to mention much more expensive). In the case of a RAID disk array, the
primary copy of the data is stored on another disk. However, both disks are housed in a single box. This makes
the task of moving one drive off-site much more complicated.
RAID disk arrays have recently been equipped with Fibre Channel interfaces. Fibre Channel is a high-speed
interconnect that allows devices to be located several kilometers from the computer. By linking RAID disk arrays
to systems via optical fibers, it is possible to have an exact copy of the data at a great distance from the primary
computing site at all times.
In applications and businesses where data accessibility is of the utmost importance, the use of RAID disk arrays
and Fibre Channel interconnections could solve most of these problems.
Differences Between Different Types of Backups
To use the keyword=value pairs instead of the redirect symbols, you would type the following:
$ dd if=/floppy/floppy0 of=/tmp/output.file
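The same keyword=value form can be sketched with a scratch file in place of the floppy device; all paths below are examples only:

```shell
# Sketch: dd with keyword=value pairs, using a scratch file in place
# of the floppy device. All paths are examples only.
printf 'backup test data\n' > /tmp/input.file
dd if=/tmp/input.file of=/tmp/output.file bs=512 2>/dev/null
cmp /tmp/input.file /tmp/output.file && echo copies match
```

Here if= names the input file, of= names the output file, and bs= sets the block size used for the transfer.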
-c Specifies that the cpio command should read headers in ASCII character format.
-v Displays the files as they are retrieved in a format that is similar to the output from the ls command.
"*file" Specifies that all files that match the pattern are copied to the current directory. You can specify multiple
patterns, but each pattern must be enclosed in double quotation marks.
< /dev/rmt/n Specifies the input file.
Verify that the files were copied.
$ ls -l
Retrieving Specific Files from a Tape (cpio)
The following example shows how to retrieve all files with the chapter suffix from the tape in drive 0.
$ cd /home/smith/Book
$ cpio -icv "*chapter" < /dev/rmt/0
Boot.chapter
Directory.chapter
Install.chapter
Intro.chapter
31 blocks
$ ls -l
If you don't specify the density, a tape drive typically writes at its preferred density. The preferred density usually
means the highest density the tape drive supports. Most SCSI drives can automatically detect the density or
format on the tape and read it accordingly. To determine the different densities that are supported for a drive, look
at the /dev/rmt subdirectory. This subdirectory includes the set of tape device files that support different output
densities for each tape. Also, a SCSI controller can have a maximum of seven SCSI tape drives.
The following example shows the status for an Exabyte tape drive (/dev/rmt/1):
$ mt -f /dev/rmt/1 status
Exabyte EXB-8200 8mm tape drive:
sense key(0x0)= NO Additional Sense residual= 0 retries= 0
file no= 0 block no= 0
The following example shows a quick way to poll a system and locate all of its tape drives:
$ for drive in 0 1 2 3 4 5 6 7
> do
> mt -f /dev/rmt/$drive status
> done
Archive QIC-150 tape drive:
sense key(0x0)= No Additional Sense residual= 0 retries= 0
file no= 0 block no= 0
/dev/rmt/1: No such file or directory
/dev/rmt/2: No such file or directory
/dev/rmt/3: No such file or directory
/dev/rmt/4: No such file or directory
/dev/rmt/5: No such file or directory
/dev/rmt/6: No such file or directory
/dev/rmt/7: No such file or directory
$
Handling Magnetic Tape Cartridges
If errors occur when a tape is being read, you can retension the tape, clean the tape drive, and then try again.
3. Store your tapes in a dust-free safe location, away from magnetic equipment. Some sites store archived
tapes in fireproof cabinets at remote locations. You should create and maintain a log that tracks which
media (tape volume) stores each job (backup) and the location of each backed-up file.
9. Network Basics
We have seen how to manage peripherals and resources directly connected to local systems. Let us now study networks in detail and see how computer networks allow the use of resources that are connected only indirectly.
Overview of the Internet
The Internet consists of many dissimilar network technologies around the world that are interconnected. It originated in the late 1960s as the ARPANET. Primarily funded by the Defense Advanced Research Projects Agency (DARPA), ARPANET pioneered the use of packet-switched networking and of a network protocol suite called Transmission Control Protocol/Internet Protocol (TCP/IP). The beauty of TCP/IP is that it hides network hardware issues from end users, making it appear as though all connected computers are using the same network hardware.
The Internet currently consists of thousands of interconnected computer networks and millions of host computers.
Connecting to the Internet
If hosts on a network want to communicate, they need an addressing system that identifies the location of each
host on the network. In the case of hosts on the Internet, the governing body that grants Internet addresses is the
Network Information Center (NIC).
Technically, sites that do not want to connect to the Internet need not apply to the NIC for a network address. The
network/system administrator may assign network addresses at will. However, if the site decides to connect to the
Internet at some point in the future, it will need to re-address all hosts to a network address assigned by the NIC.
Although reassigning network addresses is not difficult, it is tedious and time consuming, especially on networks
of more than a few dozen hosts. It is therefore recommended that networked sites apply for an Internet address
as part of the initial system setup.
Beginning in 1995, management of the Internet became a commercial operation. The commercialization of
Internet management led to several changes in the way addresses are assigned. Prior to 1995, sites had to
contact the NIC to obtain an address. The new management determined that sites should contact their respective network service providers (NSPs) to obtain IP addresses. Alternatively, sites may contact the appropriate network
registry.
9.1. TCP/IP
TCP/IP is the networking protocol suite most commonly used with UNIX, MacOS, Windows, Windows NT, and
most other operating systems. It is also the native language of the Internet.
TCP/IP defines a uniform programming interface to different types of network hardware, guaranteeing that
systems can exchange data (interoperate) despite their many differences. IP, the suite's underlying delivery
protocol, is the workhorse of the Internet. TCP and UDP (the User Datagram Protocol) are transport protocols that
are built on top of IP to deliver packets to specific applications.
TCP is a connection-oriented protocol that facilitates a conversation between two programs. It works like a phone call: the words you speak are delivered to the person you called, and vice versa. The connection persists even when neither party is speaking. TCP provides reliable delivery, flow control, and congestion control.
UDP is a packet-oriented service. It's analogous to sending a letter through the post office: it doesn't provide two-way connections and doesn't have any form of congestion control.
TCP/IP is a protocol suite, a set of network protocols designed to work smoothly together. It includes several
components:
Internet Protocol (IP) --- this routes data packets from one machine to another.
Internet Control Message Protocol (ICMP) --- this provides several kinds of low-level support for IP,
including error messages, routing assistance, and debugging help.
Address Resolution Protocol (ARP) --- this translates IP addresses to hardware addresses.
User Datagram Protocol (UDP) --- this delivers data to specific applications on the destination machine.
UNIX can support a variety of physical networks, including Ethernet, FDDI, token ring, ATM (Asynchronous
Transfer Mode), wireless Ethernet and serial-line-based systems.
Internet Protocol
The process of connecting two computer networks is called inter-networking. The networks may or may not use the same network technology, such as Ethernet or token ring. For an internetwork connection to function, a transfer device that forwards datagrams from one network to another is required. This transfer device is called a router or, in some cases, a gateway.
In order for internetworked computers to communicate, they must speak the same language. The language
supported by most computers is the Transmission Control Protocol/Internet Protocol (TCP/IP). The TCP/IP
protocols are actually a collection of many protocols. This suite of protocols defines every aspect of network
communications, including the language spoken by the systems, the way the systems address each other, how
data are routed through the network, and how the data will be delivered. The Internet is currently using version 4
of the Internet protocol (IPv4).
Network Addresses
For internetworked computers to communicate, there must be some way to uniquely identify the address of the computer where data are to be delivered. This identification scheme works much like a postal mail address: it should provide enough information so that the systems can send information long distances through the network, yet have assurance that it will be delivered to the desired destination. As with postal addresses, there must be
some sanctioned authority to assign addresses and administer the network.
The TCP/IP protocol defines one portion of the addressing scheme used in most computer networks. The Internet
protocol defines the Internet protocol address (also known as an IP address, or Internet address) scheme. The IP
address is a unique number assigned to each host on the network.
The vendors of network hardware also provide a portion of the addressing information used on the network. Each
network technology defines a hardware (media) layer addressing scheme unique to that network technology.
These hardware-level addresses are referred to as media access controller (MAC) addresses.
Internet Protocol Addresses
Hosts connected to the Internet have a unique Internet Protocol (IP) address. IP addresses are 32-bit numbers, typically written as four decimal integers separated by periods (dotted-decimal notation). An example of an Internet address is 192.168.3.1. Each integer in the address must be in the range 0 to 255.
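As a quick illustration of the dotted-decimal notation, each of the four integers supplies 8 bits of the underlying 32-bit address. A minimal shell sketch of the conversion:

```shell
# Convert the dotted-decimal address 192.168.3.1 into the single
# 32-bit number that the network software actually works with.
# Each of the four integers contributes 8 bits of the address.
ip=192.168.3.1
IFS=. read -r o1 o2 o3 o4 <<EOF
$ip
EOF
echo $(( (o1 << 24) | (o2 << 16) | (o3 << 8) | o4 ))
```

Running this prints 3232236289, the 32-bit value behind 192.168.3.1.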
There are five classes of Internet addresses: Class A, Class B, Class C, Class D, Class E. Class A, B, and C
addresses are used for host addressing. Class D addresses are called multi-cast addresses, and Class E
addresses are experimental addresses.
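The class of an address can be read directly off its first integer. The following shell function is a hypothetical helper written for illustration, not a standard utility:

```shell
# Determine the address class of an IPv4 address from its first field:
# 1-127 = A, 128-191 = B, 192-223 = C, 224-239 = D, 240 and above = E
ip_class() {
  first=${1%%.*}                   # strip everything after the first dot
  if   [ "$first" -le 127 ]; then echo A
  elif [ "$first" -le 191 ]; then echo B
  elif [ "$first" -le 223 ]; then echo C
  elif [ "$first" -le 239 ]; then echo D
  else                            echo E
  fi
}
ip_class 192.168.3.1    # prints C
ip_class 10.0.0.1       # prints A
```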
Class A Addresses
If the number in the first field of a host's IP address is in the range 1 to 126, the host is on a Class A network. There are 126 usable Class A networks (network 127 is reserved). Each Class A network can have up to 16 million hosts. With Class A networks, the number in the first field identifies the network number, and the remaining three fields identify the host address on that network.
NOTE: 127.0.0.1 is a reserved IP address called the loopback address. All hosts on the Internet use this address for their own internal network testing and interprocess communications. Don't make address assignments of the form 127.x.x.x, and do not remove the loopback address unless instructed otherwise.
Class B addresses
If the integer in the first field of a host's IP address is in the range 128 to 191, the host is on a Class B network. There are 16,384 Class B networks with up to 65,534 hosts each. With Class B networks, the integers in the first two fields identify the network number, and the remaining two fields identify the host address on that network. The Internet protocol address space is summarized below:
Class  1st Byte   Format    # Nets      # Hosts per net
A      1-126      N.H.H.H   126         ~16 million
B      128-191    N.N.H.H   16,384      65,534
C      192-223    N.N.N.H   2,097,152   254
Address Translation
Address translation is a technique that allows a router to translate private addresses into public addresses. This
allows sites to assign private addresses internally, yet communicate with other hosts on the Internet using an
assigned public IP address.
Some typical applications where address translation is required include connections to cable-modems, DSL, and
other high-speed Internet Service Provider networks. The ISP assigns a single IP address to a customer that
signs up for cable-modem service. The customer has several computer systems that need to communicate over
the Internet. The customer installs Address Translation Software on their router. When an internal host wants to
communicate with an external host, the translation software goes to work.
The translation software traps the outbound packet, and records the destination IP address and service requested in a memory-resident table. It then changes the packet's source address to the address of the local
router and sends the packet out on the network. When the remote host replies, the packet is sent to the local
router.
The address translation software intercepts the reply packet, determines where it came from, and looks this
address up in the memory-resident table. When it finds a match, the software determines which internal host was
communicating with this external host, and modifies the packet destination address accordingly. The packet is
sent along to the internal host using the private address assigned to that host.
Media Access Controller (MAC) Addresses
In addition to IP addresses assigned by the NIC, most networks also employ another form of addressing known
as the hardware or media access controller (MAC) address. Each network interface board is assigned a unique
MAC address by the manufacturer. In the case of Ethernet interface boards, the MAC address is a 48-bit value.
The address is typically written as a series of six one-byte (two-digit) hexadecimal values separated by colons.
Each network interface manufacturer is assigned a range of addresses it may assign to interface cards. For
example, 08:00:20:3f:01:ee might be the address of a Sun Microsystems Ethernet interface. This address would become a permanent part of the hardware for a workstation manufactured by Sun Microsystems. (In a MAC address, the first 3 bytes are IEEE-assigned values and the last 3 bytes are vendor-specific values.)
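To make the 3-byte/3-byte split concrete, here is a small sketch that separates the IEEE-assigned portion of a MAC address from the vendor-assigned portion (the address value is illustrative):

```shell
# Split a MAC address into its IEEE-assigned prefix (first 3 bytes,
# identifying the manufacturer) and the vendor-assigned suffix (last 3 bytes)
mac=08:00:20:3f:01:ee
prefix=$(echo "$mac" | cut -d: -f1-3)    # 08:00:20 identifies the vendor
suffix=$(echo "$mac" | cut -d: -f4-6)    # 3f:01:ee is board-specific
echo "manufacturer prefix: $prefix, board id: $suffix"
```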
Ethernet interfaces know nothing of IP addresses. The IP address is used by a higher level of the
communications software. Instead, when two computers connected to an Ethernet communicate, they do so via
MAC addresses. Data transport over the network media is handled by each system's network hardware. If the
datagram is bound for a foreign network it is sent to the MAC address of the router, which will handle forwarding
to the remote network.
Before the datagram is sent to the network interface for delivery, the communications software embeds the IP
address within the datagram. The routers along the path to the final destination use the IP address to determine
the next hop along the path to the final destination. When the datagram arrives at the destination, the
communications software extracts the IP address to determine what to do with the data.
Internet Protocol Version 6 (IPv6)
IPv4 is running out of address space as a result of the enormous expansion of the Internet in recent years. In
order to ensure that address space is available in the future, the Internet Engineering Task Force is readying IPv6
for deployment. Beginning with Solaris 8, the Solaris OE supports IPv6, and Dual-stack (IPv4/IPv6) protocols.
Major differences between IPv4 and IPv6 follow:
IPv6 addresses are 128 bits long (as opposed to the 32-bit IPv4 addresses). A typical IPv6 address consists of a series of colon-separated hexadecimal digits. For example, FEDC:BA98:7654:3210:0123:4567:89AB:CDEF might be a valid IPv6 host address. Provisions have been made to allow IPv4 addresses on an IPv6 network. These addresses will have hexadecimal numbers separated by colons, followed by decimal numbers separated by periods. Such an address might look like the following:
0000:0000:0000:0000:0000:FFFF:192.168.33.44
IPv6 implements multicasting to replace the IPv4 broadcast-address scheme. The IPv6 multicast scheme allows
for several types of broadcast addresses (organizational, Internet-wide, domain-wide, and so forth).
IPv6 doesn't contain address classes. Some IPv6 address ranges will be reserved for specific services, but
otherwise the idea of address classes will vanish.
IPv6 uses classless inter-domain routing (CIDR) algorithms. This new routing algorithm allows for more
efficient router operation in large network environments. Addresses will be assigned regionally to minimize routing
tables, and to simplify packet routing. IPv6 can encrypt data on the transport media. IPv4 has no facility that
allows network data to be encrypted.
Ports
IP addresses identify machines, or more precisely, network interfaces on a machine. They are not specific
enough to address particular processes or services. TCP and UDP extend IP addresses with a concept known as
a port. A port is a 16-bit number that supplements an IP address to specify a particular communication channel.
Standard UNIX services such as e-mail, FTP, and the remote login server all associate themselves with well-
known ports defined in the /etc/services file. (UNIX systems restrict access to port numbers below 1024 to root.)
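The well-known port mapping is just a text lookup. The sketch below writes a few lines in /etc/services format to a temporary file, so the example is self-contained, and then looks up a service the way a resolver would:

```shell
# A few entries in /etc/services format: service name, port/protocol
cat > /tmp/services.sample <<'EOF'
ftp     21/tcp
ssh     22/tcp
telnet  23/tcp
smtp    25/tcp
EOF
# Look up the port registered for smtp
awk '$1 == "smtp" { print $2 }' /tmp/services.sample    # prints 25/tcp
```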
Address Types
Unicast: addresses that refer to a single host.
Multicast: addresses that identify a group of hosts.
Broadcast: addresses that include all hosts on the local network.
Name Services
Computers are very adept at dealing with numbers, but people generally cannot remember the many numbers associated with computer networking; people prefer to work with alphabetic names. IP provides a means of associating a host name with an IP address. The naming of computers makes it simple for humans to initiate contact with a remote computer simply by referring to it by name.
The Name Server:
Unlike humans, computers refer to remote systems by IP addresses. Name services provide mapping between
the hostname humans prefer to use, and the IP addresses computers prefer to use. These name services require
a host (or hosts) connected to the local network to run special name resolution software. These name server
hosts have a database containing mappings between hostnames and IP addresses.
Several name services are available under Solaris. Sun's Network Information Service (NIS) and the Domain
Name Service (DNS) are the two name services used most frequently in the Solaris environment. The NIS is
designed to provide local name service within an organization. The DNS name service is designed to provide
Internet-wide name services.
Address Resolution Protocol (ARP):
The address resolution protocol (ARP) provides a method for hosts to exchange MAC addresses, thereby facilitating communications.
The ARP software runs on every host on the network. This software examines every packet received by the host
and extracts the MAC address and IP address from each packet. This information is stored in a memory-resident
cache, commonly referred to as the ARP cache.
If the MAC address is not cached, a host (let's say host A) sends a broadcast ARP packet to the network. Every machine on the network receives the broadcast packet and examines its content. The packet contains a query: "Does anyone know the MAC address for the host C machine at IP address p.q.r.s?"
Every machine on the network (including host C) examines the broadcast ARP packet it just received. If the host C machine is functioning, it should reply with an ARP reply datagram saying, "I am host C, my IP address is p.q.r.s, and my MAC address is 8:0:20:0:11:8c." When machine A receives this datagram, it adds the MAC
address to its ARP cache, and then sends the pending datagram to host C.
Network Design Considerations
Very few system administrators have the opportunity to design a corporate network; this function is typically managed by a network administrator. Many corporations, however, require the sys-admin to perform both functions. In such cases the sys-admin may not know how to design a network, or how to go about managing the network that gets implemented. The following sections outline some of the topics the network or system administrator must be concerned with when designing a corporate network.
Computer networks may be classified by many methods, but geographic, technological, and infrastructural
references are three of the most common classifications.
Network geography refers to the physical span of the network, and this dictates the technologies used when the network is implemented.
Network technology refers to the type of hardware used to implement the network. The network
hardware will dictate the infrastructure required to implement the network.
Network infrastructure refers to the details of the corporate network wiring/interconnection scheme.
Network Geography
Based on the physical extent of a network, we have:
LAN (Local Area Network)
MAN (Metropolitan Area Network)
WAN (Wide Area Network)
NOTE: Network throughput measurements are generally listed in terms of bits per second instead of bytes per
second, as is customary with disk subsystems. The difference is attributed to the fact that many network
technologies are serial communications channels, whereas disk subsystems are parallel communications
channels. The abbreviations Kbps, Mbps, and Gbps refer to kilobits per second, megabits per second, and
gigabits per second.
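As a quick sanity check on these units, a link rated in megabits per second moves one-eighth that many megabytes per second:

```shell
# 100 Mbps = 100 / 8 = 12.5 megabytes per second
awk 'BEGIN { printf "%.1f MB/s\n", 100 / 8 }'    # prints 12.5 MB/s
```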
LANS
These are confined to a single building; the underlying cable plant is often owned by the corporation instead of by a public communications carrier.
MANS
These connect geographically dispersed offices within a city or state. They typically involve a single communications carrier, which makes them much simpler to implement than WAN networks.
WANS
These are used to interconnect MANs and LANs. Many corporations use WANs to connect worldwide offices into
a single corporate network.
Differences between different types of networks:
Type   Speed                Coverage
WAN    56 Kbps - 155 Mbps   Spans vast geographic areas, like continents
Network Technologies:
We have to study the advantages and disadvantages of all the available technologies (because we have a wide
variety of them available) before implementing the physical network.
Ethernet
It is one of the dominant LAN technologies in use today. Ethernet has no pre-arranged order in which hosts
transfer data. Any host can transmit onto the network whenever the network is idle. Ethernet is said to be multiple
access (MA). If two hosts transmit simultaneously, however, a data collision occurs. Both hosts will detect the
collision (CD) and stop transmitting. The hosts will then wait a random period before attempting to transmit again.
It is important to note that Ethernet is a best-effort delivery LAN. In other words, it doesn't perform error correction or retransmission of lost data when network errors occur due to events such as collisions. Any datagrams that result in a collision or arrive at the destination with errors must be retransmitted. On a highly loaded network, retransmission could result in a bottleneck. Ethernet is best suited for LANs whose traffic patterns are data bursts. NFS and NIS are examples of network applications that tend to generate bursty network traffic.
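The "wait a random period" step is classically implemented as truncated binary exponential backoff: after the Nth consecutive collision, a host waits a random number of slot times between 0 and 2^N - 1. A minimal sketch, purely for illustration:

```shell
# After the 3rd consecutive collision, choose a random backoff slot
# in the range 0 .. 2^3 - 1 (truncated binary exponential backoff)
collisions=3
range=$(( 1 << collisions ))                            # 2^3 = 8 slots
slot=$(awk -v r="$range" 'BEGIN { srand(); print int(rand() * r) }')
echo "waiting $slot slot times before retransmitting"
```

Real Ethernet hardware caps the exponent (at 10 in the standard) so the wait does not grow without bound.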
Integrated Services Digital Network (ISDN)
ISDN is a multiplexed digital networking scheme for existing telephone facilities (which are mostly analog). The
major advantage of ISDN is that it can be operated over most existing telephone lines.
Digital Subscriber Loop (DSL)
DSL is a relatively new form of long-haul, high-speed networking. DSL is available in many flavors. Each version
of DSL requires a special DSL modem. These modems typically have an Ethernet connection on the local side, and
a telephone line interface on the remote side.
Token Ring
Token Ring networks are another widely used local area network technology. A token ring network utilizes a
special data-structure called a token, which circulates around the ring of connected hosts. Unlike the MA access
scheme of Ethernet, a host on a token ring can transmit data only when it possesses the token. Token ring
networks operate in receive and transmit modes.
In transmit mode, the interface breaks the ring open. The host sends its data on the output side of the ring, and then waits until it receives the information back on its input. Once the system receives the information it just transmitted, the token is placed back on the ring to allow other systems permission to transmit information, and the ring is again closed.
In receive mode, a system copies the data from the input side of the ring to the output side of the ring. If a host is down and unable to forward the information to the next host on the ring, the network is down. For this reason, many token ring interfaces employ a drop-out mechanism that, when engaged, connects the ring input directly to the ring output. This dropout mechanism is disabled by the network driver software while the system is up and running. However, if the system is not running, the dropout engages and data can still pass through the interface to the next system on the ring.
Fiber Distributed Data Interconnect
FDDI is a token ring LAN based on fiber optics. FDDI networks are well suited for LANs whose traffic patterns
include sustained high loads, such as relational database transfers and network tape backups. The FDDI
standards define two types of stations: single attachment stations (SAS) and dual attachment stations (DAS).
Asynchronous Transfer Mode
ATM networks are rapidly becoming a popular networking technology. Because ATM networks are based on
public telephone carrier standards, the technology may be used for LANs, MANs, and WANs.
ATM, like today's telephone network, is a hierarchical standard that employs a connection-oriented protocol. For
two hosts to communicate, one must place a call to the other. When the conversation is over, the connection is
broken. Most ATM networks are implemented over fiber optic media, although recent standards also define
connections over twisted-pair copper cable plants.
All IP address classes allow more than 250 hosts to be connected to a network. In the real world, this is a
problem. Ethernet is the dominant network technology in use in private sector and campus networks. Due to the
shared bandwidth design of Ethernet, more hosts on a network means less bandwidth is available to each host.
How can this bandwidth problem be overcome?
One method of partitioning the network traffic is through a process called subnetting the network. Some
organizations place each department on a separate subnet. Others use the geographic location of a host (such as
the floor of an office building) as the subnet boundary. By segmenting the network media into logical entities, the
network traffic is also segmented over many networks, an example of which is shown:
192.168.3.0 (3rd floor subnet)
For example, the machines on the 3rd floor no longer have to contend with the machines on the 2nd floor for network access. Because the floor subnets are tied together by a router (or gateway), the machines may still communicate when required, but otherwise the third-floor machines don't see traffic bound for the second-floor machines. But when should subnets be incorporated into a network, and how are subnets implemented?
When to Subnet?
There are no absolute formulas for determining when or how to subnet a network. The network topology, the LAN technology being implemented, network bandwidth, and host applications all affect a network's performance. However, subnetting should be considered if one or more of the following conditions exist.
There are more than 20 hosts on the network.
Network applications slow down as users begin accessing the network.
A high percentage of packets on the network are involved in collisions.
Obtaining accurate network load statistics requires sophisticated (and expensive) network analysis equipment. However, the network load can be estimated by calculating the network interface collision rate on each host. This can be done using the netstat -i -n command.
Using the information from the netstat command's output, divide the total collisions by the output packets, and then multiply the result by 100. To obtain a rough idea of the network collision rate, collect netstat -i statistics for all hosts on the network and average them. Collision rates under 2% are generally considered acceptable. Rates of 2% to 8% indicate a loaded network. Rates over 8% indicate a heavily loaded network that should be considered for subnetting.
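For example, with hypothetical counters of the kind netstat -i reports (the values below are invented for illustration), the calculation looks like this:

```shell
# Collision rate = collisions / output packets * 100
# opkts and colls are hypothetical values read from netstat -i output
opkts=482731
colls=14210
awk -v c="$colls" -v o="$opkts" \
    'BEGIN { printf "collision rate: %.1f%%\n", c / o * 100 }'
# prints "collision rate: 2.9%" -- between 2% and 8%, a loaded network
```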
From a troubleshooting perspective, subnetting allows administrators to isolate (disconnect) pieces (subnets) of a
network when trying to resolve network problems. By isolating a subnet, it is often easier to determine the source
of a network problem. Most routers, gateways, hubs, and multi-port transceivers have a switch allowing them to
operate in standalone mode. If the switch is set to local mode, the subnet is disconnected from the rest of the
network. With the switch set to remote mode, the subnet is connected to the rest of the network.
Routing Concerns
All networks require some form of packet routing to ensure that packets get from one host to another. On a simple
(single wire) network, the routing is also simple: a source host sends a packet out onto the media, and the
destination host picks it up. On a multi-tiered or subnetted network, routing becomes much more difficult. On the
internet, routing can become a nightmare.
Routing Overview
Network routing is an iterative process. Consider the problem a visitor would have trying to locate an office in a building in a large, unfamiliar city. Generally this problem is broken down into a sequence of smaller problems, as summarized in the following:
Determine how to get to the destination city.
Determine which quadrant of the city contains the desired street.
The external routers know to forward all datagrams bound for hosts on the corporation's two networks to the ISP's router based on this information. The ISP's router forwards the datagrams to the corporate router for final delivery.
An example of supernet routing is shown in the following illustration:
192.168.7.0 = 11000000 10101000 00000111 00000000
192.168.6.0 = 11000000 10101000 00000110 00000000
First 23 bits of both networks are the same
Netmask = 11111111 11111111 11111100 00000000
Advertise = 11000000 10101000 00000110 00000000
192.168.6.0/23
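The /23 advertisement can be checked mechanically: applying the 23-bit netmask to both network numbers must yield the same value.

```shell
# Both 192.168.6.0 and 192.168.7.0 fall inside the 192.168.6.0/23 supernet:
# ANDing each with the /23 netmask (255.255.254.0) gives the same result
mask=$(( 0xFFFFFE00 ))                      # 23 one-bits, then 9 zero-bits
net6=$(( (192 << 24) | (168 << 16) | (6 << 8) ))
net7=$(( (192 << 24) | (168 << 16) | (7 << 8) ))
if [ $(( net6 & mask )) -eq $(( net7 & mask )) ]; then
  echo "both networks share the 192.168.6.0/23 advertisement"
fi
```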
If you use the /tmp directory heavily and do not monitor swap space usage, your system could run out of swap space.
Use the following if you want to use TMPFS but your swap resources are limited:
1. Mount the TMPFS file system with the size option (-o size) to control how much swap resources
TMPFS can use.
2. Use your compiler's TMPDIR environment variable to point to another, larger directory.
1. A per-process core file path, which defaults to core and is enabled by default. The per-process core file path is inherited by a new process from its parent process. When generated, a per-process core file is owned by the owner of the process, with read/write permissions for the owner. Only the owning user can view this file.
2. A global core file path, which defaults to core and is disabled by default. If enabled, an additional
core file with the same content as the per-process core file is produced by using the global core file path.
When generated, a global core file is owned by superuser with read/write permissions for superuser only.
Non-privileged users cannot view this file.
$ /usr/proc/bin/pstack ./core
core './core' of 19305: ./a.out
000108c4 main (1, ffbef5cc, ffbef5d4, 20800, 0, 0) + 1c
00010880 _start (0, 0, 0, 0, 0, 0) + b8
NFS FILES
clear_locks
Use this command to remove locks held on behalf of an NFS client. For example:
# clear_locks -s bee
Caution This command should only be run when a client crashes and cannot clear its locks. To avoid data
corruption problems, do not clear locks for an active client.
mount
With this command, you can attach a named file system, either local or remote, to a specified mount point. Used
without arguments, mount displays a list of file systems that are currently mounted on your computer.
mountall
Use this command to mount all file systems or a specific group of file systems that are listed in a file-system table.
The command provides a way of doing the following:
1. Selecting the file-system type to be accessed with the -F FSType option
2. Selecting all the remote file systems that are listed in a file-system table with the -r option
3. Selecting all the local file systems with the -l option
setmnt
This command creates the /etc/mnttab table. The mount and umount commands consult the table. Generally,
you do not have to run this command manually, as this command runs automatically when a system is booted.
share
With this command, you can make a local file system on an NFS server available for mounting. You can also use
the share command to display a list of the file systems on your system that are currently shared. The NFS server
must be running for the share command to work. The NFS server software is started automatically during boot if
an entry is in /etc/dfs/dfstab. The command does not report an error if the NFS server software is not
running, so you must verify that the software is running.
shareall
This command allows for multiple file systems to be shared. When used with no options, the command shares all
entries in /etc/dfs/dfstab. You can include a file name to specify the name of a file that lists share command
lines. If you do not include a file name, /etc/dfs/dfstab is checked. If you use a - to replace the file name,
you can type share commands from standard input.
showmount
This command displays one of the following:
All clients that have remotely mounted file systems that are shared from an NFS server
Only the file systems that are mounted by clients
The shared file systems with the client access information
Note The showmount command only shows NFS version 2 and version 3 exports. This command does not
show NFS version 4 exports.
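The three displays correspond to the -a, -d, and -e options of showmount. A hypothetical session follows; bee is a placeholder server name, and no output is shown because it depends entirely on the server's configuration:

```shell
# Hypothetical session; 'bee' is a placeholder NFS server name.
showmount -a bee   # all clients and the directories they have mounted
showmount -d bee   # only the directories mounted by clients
showmount -e bee   # the shared file systems with client access info
```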
umount
This command enables you to remove a remote file system that is currently mounted. The umount command
supports the -V option to allow for testing. You might also use the -a option to umount several file systems at one
time. If mount_points are included with the -a option, those file systems are unmounted. If no mount points are
included, an attempt is made to unmount all file systems that are listed in /etc/mnttab except for the required
file systems, such as /, /usr, /var, /proc, /dev/fd, and /tmp. Because the file system is already
mounted and should have an entry in /etc/mnttab, you do not need to include a flag for the file-system type.
The -f option forces a busy file system to be unmounted. You can use this option to unhang a client that is hung
while trying to mount an unmountable file system.
Caution By forcing an unmount of a file system, you can cause data loss if files are being written to.
umountall
Use this command to unmount a group of file systems. The -k option runs the fuser -k mount_point
command to kill any processes that are associated with the mount_point. The -s option indicates that unmount is
not to be performed in parallel. -l specifies that only local file systems are to be used, and -r specifies that only
remote file systems are to be used. The -h host option indicates that all file systems from the named host should
be unmounted. You cannot combine the -h option with -l or -r.
unshare
This command allows you to make a previously available file system unavailable for mounting by clients. You can
use the unshare command to unshare any file system, whether the file system was shared explicitly with the
share command or automatically through /etc/dfs/dfstab. If you use the unshare command to unshare a file
system that you shared through the dfstab file, be careful. Remember that the file system is shared again when
you exit and reenter run level 3. You must remove the entry for this file system from the dfstab file if the change is
to continue. When you unshare an NFS file system, access from clients with existing mounts is inhibited. The file
system might still be mounted on the client, but the files are not accessible.
unshareall
This command makes all currently shared resources unavailable. The -F FSType option selects a list of file-
system types that are defined in /etc/dfs/fstypes. This flag enables you to choose only certain types of file
systems to be unshared. The default file-system type is defined in /etc/dfs/fstypes. To choose specific file
systems, use the unshare command.
nfs4cbd, which is for the exclusive use of the NFS version 4 client, manages the communication endpoints for the
NFS version 4 callback program. The daemon has no user-accessible interface.
nfsd
This daemon handles other client file-system requests. You can use several options with this command. See the
nfsd(1M) man page for a complete listing. These options can either be used from the command line or by editing
the appropriate string in /etc/default/nfs.
The NFSD_LISTEN_BACKLOG=length parameter in /etc/default/nfs sets the length of the connection queue over
connection-oriented transports for NFS and TCP. The default value is 32 entries. The same selection can be
made from the command line by starting nfsd with the -l option.
The NFSD_MAX_CONNECTIONS=#_conn parameter in /etc/default/nfs selects the maximum number of
connections per connection-oriented transport. The default value for #_conn is unlimited. The same parameter
can be used from the command line by starting the daemon with the -c #_conn option.
The NFSD_SERVER=nservers parameter in /etc/default/nfs selects the maximum number of concurrent requests
that a server can handle. The default value for nservers is 16. The same selection can be made from the
command line by starting nfsd with the nservers option. Unlike older versions of this daemon, nfsd does not
spawn multiple copies to handle concurrent requests. Checking the process table with ps only shows one copy of
the daemon running.
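As a sketch, the three parameters above might appear in /etc/default/nfs as follows. The values are illustrative only, and the comments show the command-line invocation that each parameter corresponds to, as described in the text:

```shell
# Illustrative /etc/default/nfs excerpt; values are examples only.
NFSD_LISTEN_BACKLOG=32     # same as: nfsd -l 32
NFSD_MAX_CONNECTIONS=100   # same as: nfsd -c 100 (default is unlimited)
NFSD_SERVER=16             # same as: nfsd 16
```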
nfslogd
This daemon provides operational logging. NFS operations that are logged against a server are based on the
configuration options that are defined in /etc/default/nfslogd. When NFS server logging is enabled, records of all
RPC operations on a selected file system are written to a buffer file by the kernel. Then nfslogd postprocesses
these requests. The name service switch is used to help map UIDs to logins and IP addresses to host names.
The number is recorded if no match can be found through the identified name services. Mapping of file handles to
path names is also handled by nfslogd. The daemon tracks these mappings in a file-handle-to-path mapping
table. One mapping table exists for each tag that is identified in /etc/nfs/nfslogd. After post-processing, the
records are written to ASCII log files.
Note NFS version 4 does not use this daemon.
nfsmapid
In NFS version 4, the nfsmapid(1M) daemon provides a mapping from a numeric user identification (UID) or a
numeric group identification (GID) to a string representation, as well as the reverse. The string representation is
used by the NFS version 4 protocol to represent owner or owner_group. For example, the UID 123456 for the
user, known_user, that is operating on a client that is named system.anydomain.com, would be mapped to
known_user@anydomain.com. The NFS client sends the string representation, known_user@anydomain.com, to
the NFS server. The NFS server maps the string representation, known_user@anydomain.com, to the unique
UID 123456.
Note If the server does not recognize the given user name or group name (even if the domain is correct), the
server cannot map the user or group to its integer ID. More specifically, the server maps unrecognized strings
from the client to nobody. Administrators should avoid making special accounts that exist only on a client.
Although the server and the client do perform both integer-to-string conversions and string-to-integer conversions,
a difference does exist. The server and the client respond differently to unrecognized strings. If the user does not
exist on the server, the server rejects the remote procedure call (RPC). Under these circumstances, the user is
unable to perform any operations on the client or on the server. However, if the user exists on both the client and
the server, but the domain names are mismatched, the server rejects only a subset of the RPC. This behavior
enables the client to perform many operations on both the client and the server, even though the server is
mapping the user to nobody. If the NFS client does not recognize the string, the NFS client maps the string to
nobody.
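The forward half of this mapping can be sketched with standard tools: look up the login name for a numeric UID, then append the domain. The domain below is a stand-in for the configured NFS version 4 domain, and UID 0 is used only because it exists on virtually every system:

```shell
# Sketch of nfsmapid's UID -> owner-string mapping; 'anydomain.com'
# is a placeholder for the configured NFS version 4 domain.
uid=0
name=$(getent passwd "$uid" | cut -d: -f1)   # numeric ID -> login
echo "${name}@anydomain.com"                 # string sent on the wire
```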
statd
This daemon works with lockd to provide crash and recovery functions for the lock manager. The statd daemon
tracks the clients that hold locks on an NFS server. If a server crashes, on rebooting statd on the server contacts
statd on the client. The client statd can then attempt to reclaim any locks on the server. The client statd also
informs the server statd when a client has crashed so that the client's locks on the server can be cleared. You
have no options to select with this daemon. For more information, see the statd (1M) man page.
In the Solaris 7 release, the way that statd tracks the clients has been improved. In all earlier Solaris releases,
statd created files in /var/statmon/sm for each client by using the client's unqualified host name. This file naming
caused problems if you had two clients in different domains that shared a host name, or if clients were not
resident in the same domain as the NFS server. Because the unqualified host name only lists the host name,
without any domain or IP-address information, the older version of statd had no way to differentiate between
these types of clients. To fix this problem, the Solaris 7 statd creates a symbolic link in /var/statmon/sm to
the unqualified host name by using the IP address of the client.
Note NFS version 4 does not use this daemon.
truss
You can use this command to check if a process is hung. The truss command must be run by the owner of the
process or by root. You can use many options with this command. A shortened syntax of the command follows.
truss [ -t syscall ] -p pid
-t syscall Selects system calls to trace
-p pid Indicates the PID of the process to be traced
The syscall can be a comma-separated list of system calls to be traced. Also, starting syscall with an ! selects to
exclude the listed system calls from the trace. This example shows that the process is waiting for another
connection request from a new client.
# /usr/bin/truss -p 243
poll(0x00024D50, 2, -1) (sleeping...)
The previous example shows a normal response. If the response does not change after a new connection request
has been made, the process could be hung.
Example commands from the client:
% /usr/sbin/ping bee
% nfsstat -m
% /usr/lib/nis/nisping -u
% /usr/bin/getent hosts bee
Checking the server from a remote client:
% rpcinfo -s bee | egrep 'nfs|mountd'
% /usr/bin/rpcinfo -u bee nfs
% /usr/bin/rpcinfo -u bee mountd
Commands on the server:
# /usr/bin/rpcinfo -u localhost rpcbind
# rpcinfo -u localhost nfs
# ps -ef | grep nfsd
# /usr/bin/rpcinfo -u localhost mountd
# ps -ef | grep mountd
11.6. Autofs
File systems that are shared through the NFS service can be mounted by using automatic mounting. Autofs, a
client-side service, is a file-system structure that provides automatic mounting. The autofs file system is initialized
by automount, which is run automatically when a system is booted. The automount daemon, automountd, runs
continuously, mounting and unmounting remote directories as necessary.
Whenever a client computer that is running automountd tries to access a remote file or remote directory, the
daemon mounts the remote file system. This remote file system remains mounted for as long as needed. If the
remote file system is not accessed for a certain period of time, the file system is automatically unmounted.
Mounting need not be done at boot time, and the user no longer has to know the superuser password to mount a
directory. Users do not need to use the mount and umount commands. The autofs service mounts and unmounts
file systems as required without any intervention by the user. Mounting some file hierarchies with automountd
does not exclude the possibility of mounting other hierarchies with mount. A diskless computer must mount /
(root), /usr, and /usr/kvm through the mount command and the /etc/vfstab file.
The new automountd also provides better on-demand mounting. Previous releases would mount an entire set of
file systems if the file systems were hierarchically related. Now, only the top file system is mounted. Other file
systems that are related to this mount point are mounted when needed.
The autofs service supports browsability of indirect maps. This support enables a user to see which directories
could be mounted, without having to actually mount each file system. A -nobrowse option has been added to the
autofs maps so that large file systems, such as /net and /home, are not automatically browsable. Also, you can
turn off autofs browsability on each client by using the -n option with automount.
Note Autofs runs on all computers and supports /net and /home (automounted home directories) by default.
These defaults can be overridden by entries in the NIS auto.master map or NIS+ auto_master table, or by local
editing of the /etc/auto_master file.
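A typical /etc/auto_master reflecting these defaults might resemble the following sketch. The options shown are common defaults and can vary by release:

```shell
# Typical /etc/auto_master; +auto_master pulls in the name-service map.
+auto_master
/net   -hosts     -nosuid,nobrowse
/home  auto_home  -nobrowse
```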
Mount Point /net
Autofs mounts under the directory /net all the entries in the special map -hosts. The map is a built-in map that
uses only the hosts database. Suppose that the computer gumbo is in the hosts database and it exports any of
its file systems. The following command changes the current directory to the root directory of the computer
gumbo.
% cd /net/gumbo
Autofs can mount only the exported file systems of host gumbo, that is, those file systems on a server that are
available to network users instead of those file systems on a local disk. Therefore, all the files and directories on
gumbo might not be available through /net/gumbo.
With the /net method of access, the server name is in the path and is location dependent. If you want to move
an exported file system from one server to another, the path might no longer work. Instead, you should set up an
entry in a map specifically for the file system you want rather than use /net.
Note Autofs checks the server's export list only at mount time. After a server's file systems are mounted, autofs
does not check with the server again until the server's file systems are automatically unmounted. Therefore,
newly exported file systems are not seen until the file systems on the client are unmounted and then remounted.
Direct Autofs Maps
A direct map is an automount point. With a direct map, a direct association exists between a mount point on the
client and a directory on the server. Direct maps have a full path name and indicate the relationship explicitly. The
following is a typical
/etc/auto_direct map:
/usr/local -ro \
/bin ivy:/export/local/sun4 \
/share ivy:/export/local/share \
/src ivy:/export/local/src
/usr/man -ro oak:/usr/man \
rose:/usr/man \
willow:/usr/man
/usr/games -ro peach:/usr/games
/usr/spool/news -ro pine:/usr/spool/news \
willow:/var/spool/news
Lines in direct maps have the following syntax:
key [ mount-options ] location
key key is the path name of the mount point in a direct map.
mount-options mount-options is the options that you want to apply to this particular mount. These options are
required only if the options differ from the map default. Options for each specific type of file system are listed in
the mount man page for that file system.
location location is the location of the file system. One or more file systems are specified as server:pathname for
NFS file systems or :devicename for High Sierra file systems (HSFS).
Note The pathname should not include an automounted mount point. The pathname should be the actual
absolute path to the file system. For instance, the location of a home directory should be listed as
server:/export/home/username, not as server:/home/username.
As in the master map, a line that begins with # is a comment. All the text that follows until the end of the line is
ignored. Put a backslash at the end of the line to split long lines into shorter ones. Of all the maps, the entries in a
direct map most closely resemble the corresponding entries in /etc/vfstab. An entry might appear in
/etc/vfstab as follows:
dancer:/usr/local - /usr/local/tmp nfs - yes ro
Note No concatenation of options occurs between the automounter maps. Any options that are added to an
automounter map override all options that are listed in maps that are searched earlier. For instance, options that
are included in the auto_master map are overridden by corresponding entries in any other map.
On a network without a name service, you have to change all the relevant files (such as /etc/passwd) on all
systems on the network to give a user access to her files. With NIS, make the changes on the NIS master server
and propagate the relevant databases to the slave servers. On a network that is running NIS+, propagating the
relevant databases to the slave servers is done automatically after the changes are made.
Autofs is a kernel file system that supports automatic mounting and unmounting. When a request is made to
access a file system at an autofs mount point, the following occurs:
1. Autofs intercepts the request.
2. Autofs sends a message to the automountd for the requested file system to be mounted.
3. automountd locates the file system information in a map, creates the trigger nodes, and performs the
mount.
4. Autofs allows the intercepted request to proceed.
5. Autofs unmounts the file system after a period of inactivity.
Note Mounts that are managed through the autofs service should not be manually mounted or unmounted.
Even if the operation is successful, the autofs service does not check that the object has been unmounted,
resulting in possible inconsistencies. A reboot clears all the autofs mount points.
Default Autofs Behavior with Name Services
At boot time autofs is invoked by the service svc:/system/filesystem/autofs and autofs checks for the
master auto_master map. Autofs is subject to the rules that are discussed subsequently. Autofs uses the name
service that is specified in the automount entry of the /etc/nsswitch.conf file. If NIS+ is specified, as
opposed to local files or NIS, all map names are used as is. If NIS is selected and autofs cannot find a map that
autofs needs, but finds a map name that contains one or more underscores, the underscores are changed to
dots. This change allows the old NIS file names to work. Then autofs checks the map again, as shown below.
The screen activity for this session would resemble the following example.
$ grep /home /etc/auto_master
/home auto_home
$ ypmatch brent auto_home
Can't match key brent in map auto_home. Reason: no such map in server's domain.
$ ypmatch brent auto.home
diskus:/export/home/diskus1/&
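The underscore-to-dot fallback is a literal character substitution, as this small sketch shows:

```shell
# NIS fallback: if auto_home is not found, retry with underscores
# replaced by dots (auto_home -> auto.home).
mapname=auto_home
echo "$mapname" | tr '_' '.'
```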
If files is selected as the name service, all maps are assumed to be local files in the /etc directory. Autofs
interprets a map name that begins with a slash (/) as local regardless of which name service autofs uses.
Volumes
A volume is a group of physical slices that appears to the system as a single, logical device. Volumes are
actually pseudo, or virtual, devices in standard UNIX terms. Historically, the Solstice DiskSuite product referred to
these logical devices as metadevices. For standardization, these devices are referred to as volumes.
Classes of Volumes
Volumes behave the same way as slices. Because volumes look like slices, the volumes are transparent to end
users, applications, and file systems. As with physical devices, volumes are accessed through block or raw device
names. Solaris Volume Manager enables you to expand a volume by adding additional slices and also expand
(using growfs) a UFS filesystem on-line.
Note After a file system has been expanded, the file system cannot be reduced in size. The inability to reduce
the size of a file system is a UFS limitation. Similarly, after a Solaris Volume Manager partition has been
increased in size, it cannot be reduced.
Volume Name Requirements
Volume names must begin with the letter d followed by a number (for example, d0). Solaris Volume Manager
has 128 default volume names, d0 through d127. The following shows some example volume names.
/dev/md/dsk/d0 Block volume d0
/dev/md/dsk/d1 Block volume d1
/dev/md/rdsk/d126 Raw volume d126
/dev/md/rdsk/d127 Raw volume d127
State Database and State Database Replicas
The state database is a database that stores information about the state of your Solaris Volume Manager
configuration. The state database records and tracks changes made to your configuration. Solaris Volume
Manager automatically updates the state database when a configuration or state change occurs. Creating a new
volume is an example of a configuration change. A submirror failure is an example of a state change.
The state database is actually a collection of multiple, replicated database copies. Each copy, referred to as a
state database replica, ensures that the data in the database is always valid. Multiple copies of the state
database protect against data loss from single points-of-failure. The state database tracks the location and status
of all known state database replicas. Solaris Volume Manager cannot operate until you have created the state
database and its state database replicas. A Solaris Volume Manager configuration must have an operating state
database.
When you set up your configuration, you can locate the state database replicas on either of the following:
On dedicated slices
On slices that will later become part of volumes
Solaris Volume Manager recognizes when a slice contains a state database replica, and automatically skips over
the replica if the slice is used in a volume. The part of a slice reserved for the state database replica should not be
used for any other purpose. You can keep more than one copy of a state database on one slice. However, you
might make the system more vulnerable to a single point-of-failure by doing so. The Solaris operating system
continues to function correctly if all state database replicas are deleted. However, the system loses all Solaris
Volume Manager configuration data if a reboot occurs with no existing state database replicas on disk.
Hot Spare Pools
A hot spare pool is a collection of slices (hot spares) reserved by Solaris Volume Manager to be automatically
substituted for failed components. These hot spares can be used in either a submirror or RAID-5 volume. Hot
spares provide increased data availability for RAID-1 and RAID-5 volumes.
When component errors occur, Solaris Volume Manager checks for the first available hot spare whose size is
equal to or greater than the size of the failed component. If found, Solaris Volume Manager automatically replaces
the component and resynchronizes the data. If a slice of adequate size is not found in the list of hot spares, the
submirror or RAID-5 volume is considered to have failed.
Disk Sets
A disk set is a set of physical storage volumes that contain logical volumes and hot spares. Volumes and hot
spare pools must be built on drives from within that disk set. Once you have created a volume within the disk set,
you can use the volume just as you would a physical slice. A disk set provides data availability in a clustered
environment. If one host fails, another host can take over the failed host's disk set. (This type of configuration is
known as a failover configuration.) Additionally, disk sets can be used to help manage the Solaris Volume
Manager namespace, and to provide ready access to network-attached storage devices.
Note Use isainfo -v to determine if your system is running a 64-bit kernel. If the string 64-bit appears, you are
running a 64-bit kernel.
Solaris Volume Manager allows you to do the following:
Create, modify, and delete logical volumes built on or from logical storage units (LUNs) greater than 1
Tbyte in size.
Create, modify, and delete logical volumes that exceed 1 Tbyte in size. Support for large volumes is
automatic. If a device greater than 1 Tbyte is created, Solaris Volume Manager configures it
appropriately and without user intervention.
Note Do not create large volumes if you expect to run the Solaris software with a 32-bit kernel or if you expect
to use a version of the Solaris OS prior to the Solaris 9 4/03 release.
Volume Manager State Database and Replicas
The Solaris Volume Manager state database contains configuration and status information for all volumes, hot
spares, and disk sets. Solaris Volume Manager maintains multiple copies (replicas) of the state database to
provide redundancy and to prevent the database from being corrupted during a system crash (at most, only one
database copy will be corrupted). The state database replicas ensure that the data in the state database is always
valid. When the state database is updated, each state database replica is also updated. The updates occur one at
a time (to protect against corrupting all updates if the system crashes). If your system loses a state database
replica, Solaris Volume Manager must figure out which state database replicas still contain valid data. Solaris
Volume Manager determines this information by using a majority consensus algorithm. This algorithm requires
that a majority (half + 1) of the state database replicas be available and in agreement before any of them are
considered valid. Because of the requirements of the majority consensus algorithm, you must create at least three
state database replicas when you set up your disk configuration. A consensus can be reached as long as at least
two of the three state database replicas are available. During booting, Solaris Volume Manager ignores corrupted
state database replicas. In some cases, Solaris Volume Manager tries to rewrite state database replicas that are
corrupted. Otherwise, they are ignored until you repair them. If a state database replica becomes corrupted
because its underlying slice encountered an error, you need to repair or replace the slice and then enable the
replica.
If all state database replicas are lost, you could, in theory, lose all data that is stored on your Solaris Volume
Manager volumes. For this reason, it is good practice to create enough state database replicas on separate drives
and across controllers to prevent catastrophic failure. It is also wise to save your initial Solaris Volume Manager
configuration information, as well as your disk partition information.
State database replicas are also used for RAID-1 volume resynchronization regions. Too few state database
replicas relative to the number of mirrors might cause replica I/O to impact RAID-1 volume performance. That is, if
you have a large number of mirrors, make sure that you have at least two state database replicas per RAID-1
volume, up to the maximum of 50 replicas per disk set. By default each state database replica occupies 4 Mbytes
(8192 disk sectors) of disk storage. Replicas can be stored on the following devices:
A dedicated local disk partition
A local partition that will be part of a volume
A local partition that will be part of a UFS logging device
Note Replicas cannot be stored on the root (/), swap, or /usr slices. Nor can replicas be stored on slices that
contain existing file systems or data. After the replicas have been stored, volumes or file systems can be placed
on the same slice.
Understanding the Majority Consensus Algorithm
An inherent problem with replicated databases is that it can be difficult to determine which database has valid and
correct data. To solve this problem, Solaris Volume Manager uses a majority consensus algorithm. This algorithm
requires that a majority of the database replicas agree with each other before any of them are declared valid. This
algorithm requires the presence of at least three initial replicas, which you create. A consensus can then be
reached as long as at least two of the three replicas are available. If only one replica exists and the system
crashes, it is possible that all volume configuration data will be lost.
To protect data, Solaris Volume Manager does not function unless half of all state database replicas are available.
The algorithm, therefore, ensures against corrupt data. The majority consensus algorithm provides the following:
1. The system continues to run if at least half of the state database replicas are available.
2. The system panics if fewer than half of the state database replicas are available.
3. The system cannot reboot into multiuser mode unless a majority (half + 1) of the total number of state
database replicas is available.
4. If insufficient state database replicas are available, you must boot into single-user mode and delete
enough of the corrupted or missing replicas to achieve a quorum.
Note When the total number of state database replicas is an odd number, Solaris Volume Manager computes
the majority by dividing the number in half, rounding down to the nearest integer, then adding 1 (one). For
example, on a system with seven replicas, the majority would be four (seven divided by two is three and one-half,
rounded down is three, plus one is four).
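The rule in this note reduces to integer division. A sketch in shell arithmetic, where integer division already rounds down:

```shell
# Majority = floor(n / 2) + 1; shell integer division rounds down.
majority() {
  echo $(( $1 / 2 + 1 ))
}
majority 7    # seven replicas -> majority of 4
majority 3    # the minimum three replicas -> majority of 2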
Administering State Database Replicas
1. By default, the size of a state database replica is 4 Mbytes or 8192 blocks. You should create state
database replicas on a dedicated slice with at least 4 Mbytes per replica. Because your disk slices might
not be that small, you might want to resize a slice to hold the state database replica. To avoid single
points-of-failure, distribute state database replicas across slices, drives, and controllers. You want a
majority of replicas to survive a single component failure. If you lose a replica (for example, due to a
device failure), problems might occur with running Solaris Volume Manager or when rebooting the
system. Solaris Volume Manager requires at least half of the replicas to be available to run, but a majority
(half + 1) to reboot into multiuser mode. A minimum of 3 state database replicas are recommended, up to
a maximum of 50 replicas per Solaris Volume Manager disk set. The following guidelines are
recommended:
a. For a system with only a single drive: put all three replicas on one slice.
b. For a system with two to four drives: put two replicas on each drive.
c. For a system with five or more drives: put one replica on each drive.
2. If multiple controllers exist, replicas should be distributed as evenly as possible across all controllers. This
strategy provides redundancy in case a controller fails and also helps balance the load. If multiple disks
exist on a controller, at least two of the disks on each controller should store a replica.
3. If necessary, you could create state database replicas on a slice that will be used as part of a RAID-0,
RAID-1, or RAID-5 volume, or soft partitions. You must create the replicas before you add the slice to the
volume. Solaris Volume Manager reserves the beginning of the slice for the state database replica. When
a state database replica is placed on a slice that becomes part of a volume, the capacity of the volume is
reduced by the space that is occupied by the replica. The space used by a replica is rounded up to the
next cylinder boundary. This space is skipped by the volume.
4. RAID-1 volumes are used for small-sized random I/O (as in a database). For best performance, have
at least two extra replicas per RAID-1 volume on slices (and preferably on separate disks and controllers)
that are unconnected to the RAID-1 volume.
5. You cannot create state database replicas on existing file systems, or the root (/), /usr, and swap file
systems. If necessary, you can create a new slice (provided a slice name is available) by allocating space
from swap. Then, put the state database replicas on that new slice.
6. You can create state database replicas on slices that are not in use.
7. You can add additional state database replicas to the system at any time. The additional state database
replicas help ensure Solaris Volume Manager availability.
8. You can repair or enable state database replicas manually, provided that at least half + 1 valid state
database replicas are available. When you manually repair or enable state database replicas, Solaris Volume
Manager updates them with valid data.
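The per-drive guideline in item 1 can be expressed as a small helper. This is only a sketch of the rule of thumb stated above, not a Solaris Volume Manager interface:

```shell
# Rule of thumb from the guidelines: replicas to place on each drive,
# given the total number of drives in the system.
replicas_per_drive() {
  if [ "$1" -eq 1 ]; then
    echo 3        # single drive: all three replicas on one slice
  elif [ "$1" -le 4 ]; then
    echo 2        # two to four drives: two replicas per drive
  else
    echo 1        # five or more drives: one replica per drive
  fi
}
replicas_per_drive 1
replicas_per_drive 3
replicas_per_drive 6
```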
# metadb -a -c number -l length-of-replica -f ctds-of-slice
-a Specifies to add or create a state database replica.
-f Specifies to force the operation, even if no replicas exist. Use the -f to force the creation of the initial replicas.
-c number Specifies the number of replicas to add to the specified slice.
-l length-of-replica Specifies the size of the new replicas, in blocks. The default size is 8192. This size should be
appropriate for virtually all configurations, including those configurations with thousands of logical volumes.
ctds-of-slice Specifies the name of the component that will hold the replica.
Note The metadb command entered on the command line without options reports the status of all state
database replicas.
Creating the First State Database Replica
# metadb -a -f c0t0d0s7
# metadb
flags first blk block count
...
a u 16 8192 /dev/dsk/c0t0d0s7
You must use the -f option along with the -a option to create the first state database replica. The -a option
adds state database replicas to the system. The -f option forces the creation of the first replica (and may be
omitted when you add supplemental replicas to the system).
Adding Two State Database Replicas to the Same Slice
# metadb -a -c 2 c1t3d0s1
# metadb
flags first blk block count
...
a u 16 8192 /dev/dsk/c1t3d0s1
a u 8208 8192 /dev/dsk/c1t3d0s1
The -a option adds state database replicas to the system. The -c 2 option places two replicas on the specified
slice. The metadb command checks that the replicas are active, as indicated by the a flag in the metadb
command output.
Adding State Database Replicas of a Specific Size
If you are replacing existing state database replicas, you might need to specify a replica size. In particular, if
existing replicas (for example, on a system upgraded from the Solstice DiskSuite product) share a slice with a
file system, you must replace them with replicas of the same size or add new replicas in a different location.
# metadb -a -c 3 -l 1034 c0t0d0s7
# metadb
flags first blk block count
...
a u 16 1034 /dev/dsk/c0t0d0s7
a u 1050 1034 /dev/dsk/c0t0d0s7
a u 2084 1034 /dev/dsk/c0t0d0s7
The -a option adds state database replicas to the system. The -l option specifies the length in blocks of the
replica to add.
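The first blk values in the output above follow a simple rule: replicas on a slice are laid out back to back after a 16-block offset, so replica n starts at block 16 + (n - 1) * length. A quick sketch of that arithmetic in plain shell (ordinary arithmetic, not a Solaris-specific command):

```shell
# Replica layout arithmetic for the example above: three replicas of
# length 1034 blocks, laid out back to back after a 16-block offset.
length=1034
for n in 1 2 3; do
    echo "replica $n first blk: $(( 16 + (n - 1) * length ))"
done
```

This prints first blocks 16, 1050, and 2084, matching the metadb output in the example.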
Maintaining State Database Replicas
How to Check the Status of State Database Replicas
1. Become superuser.
2. To check the status of state database replicas, use one of the following methods:
From the Enhanced Storage tool within the Solaris Management Console, open the State Database Replicas
node to view all existing state database replicas.
Use the metadb command to view the status of state database replicas. Add the -i option to display an
explanation of the status flags, as shown in the following example
Checking the Status of All State Database Replicas
# metadb -i
flags first blk block count
a m p luo 16 8192 /dev/dsk/c0t0d0s7
a p luo 8208 8192 /dev/dsk/c0t0d0s7
a p luo 16400 8192 /dev/dsk/c0t0d0s7
a p luo 16 8192 /dev/dsk/c1t3d0s1
W p l 16 8192 /dev/dsk/c2t3d0s1
a p luo 16 8192 /dev/dsk/c1t1d0s3
a p luo 8208 8192 /dev/dsk/c1t1d0s3
a p luo 16400 8192 /dev/dsk/c1t1d0s3
r - replica does not have device relocation information
o - replica active prior to last mddb configuration change
u - replica is up to date
l - locator for this replica was read successfully
c - replica's location was in /etc/lvm/mddb.cf
p - replica's location was patched in kernel
m - replica is master, this is replica selected as input
W - replica has device write errors
a - replica is active, commits are occurring to this replica
M - replica had problem with master blocks
D - replica had problem with data blocks
F - replica had format problems
S - replica is too small to hold current data base
R - replica had device read errors
A legend of all the flags follows the status. The characters in front of the device name represent the status.
Uppercase letters indicate a problem status. Lowercase letters indicate an Okay status.
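Because uppercase letters indicate problems, saved metadb -i output can be scanned mechanically. A hypothetical filter in plain shell and awk (the field layout is an assumption taken from the example above: the last three fields are first blk, block count, and device, and everything before them is status flags):

```shell
# Hypothetical filter: list devices whose replica status contains an
# uppercase (problem) flag. The sample lines are copied from the
# example output above.
cat > /tmp/metadb.out <<'EOF'
a m p luo 16 8192 /dev/dsk/c0t0d0s7
a p luo 8208 8192 /dev/dsk/c0t0d0s7
W p l 16 8192 /dev/dsk/c2t3d0s1
EOF
awk '{ flags = ""
       for (i = 1; i <= NF - 3; i++) flags = flags $i
       if (flags ~ /[A-Z]/) print $NF }' /tmp/metadb.out
```

On this sample, only /dev/dsk/c2t3d0s1 (flag W, write errors) is reported.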
How to Delete State Database Replicas
You might need to delete state database replicas to maintain your Solaris Volume Manager configuration. For
example, if you will be replacing disk drives, you want to delete the state database replicas before you remove the
drives. Otherwise Solaris Volume Manager will report them as having errors.
1. Become superuser.
2. To remove state database replicas, use one of the following methods:
From the Enhanced Storage tool within the Solaris Management Console, open the State Database Replicas
node to view all existing state database replicas. Select replicas to delete, then choose Edit > Delete to remove
them. Alternatively, use the following form of the metadb command:
# metadb -d -f ctds-of-slice
-d Specifies to delete a state database replica.
-f Specifies to force the operation, even if no replicas exist.
ctds-of-slice Specifies the name of the component that contains the replica. Note that you need to specify each
slice from which you want to remove the state database replica.
Deleting State Database Replicas
# metadb -d -f c0t0d0s7
This example shows the last replica being deleted from a slice. You must add the -f option to force the deletion
of the last replica on the system.
Mirroring increases the time it takes for write requests to be written to disk. After you configure a mirror, the
mirror can be used just like a physical slice. You can mirror any file system, including existing file systems
such as root (/), swap, and /usr. You can also use a mirror for any application, such as a database.
Tip Use Solaris Volume Manager's hot spare feature with mirrors to keep data safe and available.
Overview of Submirrors
A mirror is composed of one or more RAID-0 volumes (stripes or concatenations) called submirrors. A mirror can
consist of up to four submirrors. However, two-way mirrors usually provide sufficient data redundancy for most
applications and are less expensive in terms of disk drive costs. A third submirror enables you to make online
backups without losing data redundancy while one submirror is offline for the backup. If you take a submirror
offline, the mirror stops reading and writing to the submirror. At this point, you could access the submirror itself,
for example, to perform a backup. However, the submirror is in a read-only state. While a submirror is offline,
Solaris Volume Manager keeps track of all writes to the mirror. When the submirror is brought back online, only
the portions of the mirror that were written while the submirror was offline (the resynchronization regions) are
resynchronized. Submirrors can also be taken offline to troubleshoot or repair physical devices that have errors.
Submirrors can be attached or be detached from a mirror at any time, though at least one submirror must remain
attached at all times. Normally, you create a mirror with only a single submirror. Then, you attach a second
submirror after you create the mirror.
A RAID-5 volume uses storage capacity equivalent to one component in the volume to store parity information.
If you have five components, then the equivalent of one component is used for parity information. The parity
information is distributed across all components in the volume. Similar to a mirror, a RAID-5 volume increases
data availability, but with a minimum of cost in terms of hardware and only a moderate penalty for write
operations. However, you cannot use a RAID-5 volume for the root (/), /usr, and swap file systems, or for other
existing file systems. Solaris Volume Manager automatically resynchronizes a RAID-5 volume when you replace
an existing component. Solaris Volume Manager also resynchronizes RAID-5 volumes during rebooting if a
system failure or panic took place.
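The parity cost is easy to quantify: usable capacity is (n - 1)/n of the raw total. A small sketch with assumed component sizes (the five 10-GB components are purely for illustration):

```shell
# RAID-5 capacity sketch: the equivalent of one component holds parity.
# Five components of 10 GB each are assumed purely for illustration.
components=5
size_gb=10
total=$(( components * size_gb ))
usable=$(( (components - 1) * size_gb ))
echo "raw: ${total} GB, usable: ${usable} GB"
```

With five 10-GB components, 40 of the 50 GB remain available for data.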
When a hot spare replaces a failed slice, it is resynchronized: in the case of a submirror, the hot spare is
resynchronized with data from a functional submirror; in the case of a RAID-5 volume, the hot spare is
resynchronized with the other slices in the volume. If a slice of adequate size is not found in the list of hot
spares, the submirror or RAID-5 volume that failed goes into a failed state and the hot spares remain unused. In
the case of the submirror, the submirror no longer replicates the data completely. In the case of the RAID-5
volume, data redundancy is no longer available.
Tip When you add hot spares to a hot spare pool, add them from smallest to largest in size. This strategy avoids
potentially wasting large hot spares as replacements for small slices.
Hot Spare Pools
A hot spare pool is an ordered list (collection) of hot spares. You can place hot spares into one or more hot spare
pools to get the most flexibility and protection from the fewest slices. You could put a single slice designated for
use as a hot spare into multiple hot spare pools, with each hot spare pool having different slices and
characteristics. Then, you could assign a hot spare pool to any number of submirror volumes or RAID-5 volumes.
A file system that resides on a volume in a disk set normally cannot be mounted automatically at boot time with
the /etc/vfstab file, because the required Solaris Volume Manager RPC daemons (for example, rpc.metad and
rpc.metamhd) do not start early enough in the boot process to permit this. Additionally, the ownership of a disk
set is lost during a reboot. Do not disable the Solaris Volume Manager RPC daemons in the /etc/inetd.conf
file. They are configured to start by default. These daemons must remain enabled to allow Solaris Volume
Manager to use its full functionality. When the autotake feature is enabled using the -A option of the metaset
command, the disk set is automatically taken at boot time. Under these circumstances, a file system that resides
on a volume in a disk set can be automatically mounted with the /etc/vfstab file. To enable an automatic take
during the boot process, the disk set must be associated with only a single host, and must have the autotake
feature enabled. A disk set can be enabled either during or after disk set creation.
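As a sketch, the autotake feature is toggled with the -A option of the metaset command (the disk set name blue is an assumption for illustration, and the set must already exist and be associated with a single host):

```
# metaset -s blue -A enable
```

With autotake enabled, /etc/vfstab entries for volumes in the set can be mounted automatically at boot.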
[Figure: RBAC components (roles, rights profiles, authorizations)]
In conventional UNIX systems, the root user, also referred to as superuser, is all-powerful. Programs that run as
root, or setuid programs, are all-powerful. The root user has the ability to read and write to any file, run all
programs, and send kill signals to any process. Effectively, anyone who can become superuser can modify a
site's firewall, alter the audit trail, read confidential records, and shut down the entire network. A setuid program
that is hijacked can do anything on the system.
Role-based access control (RBAC) provides a more secure alternative to the all-or-nothing superuser model. With
RBAC, you can enforce security policy at a more fine-grained level. RBAC uses the security principle of least
privilege. Least privilege means that a user has precisely the amount of privilege that is necessary to perform a
job. Ordinary users have enough privilege to use their applications, check the status of their jobs, print files,
create new files, and so on. Capabilities beyond ordinary user capabilities are grouped into rights profiles. Users
who are expected to do jobs that require some of the capabilities of superuser assume a role that includes the
appropriate rights profile. A set of superuser capabilities grouped together is called a rights profile. These rights
profiles are assigned to special user accounts that are called roles. A user to whom some work is to be delegated
is assigned that role. Predefined rights profiles are supplied with Solaris software. You create the roles and assign
the profiles.
Examples of rights profiles:
The Primary Administrator rights profile, equivalent to superuser, is a broad capability profile.
The Cron Management rights profile, which manages at and cron jobs, is a narrow capability profile.
There is no hard-and-fast rule about roles, and no default roles ship with the Solaris OE, but three
recommended roles are:
Primary Administrator A powerful role that is equivalent to the root user, or superuser.
System Administrator A less powerful role for administration that is not related to security. This role can
manage file systems, mail, and software installation. However, this role cannot set passwords.
Operator A junior administrator role for operations such as backups and printer management.
These are just recommended roles. You can create your own roles according to the needs of your organization.
The root user can also be converted into a role so as to minimize security risk.
Rights Profiles: A right, also known as a profile or a rights profile, is a collection of privileges that can be assigned
to a role or user. A rights profile can consist of authorizations, commands with setuid or setgid permissions
(referred to as security attributes), and other rights profiles.
Authorizations: An authorization is a discrete right that can be granted to a role or to a user. Authorizations
enforce policy at the user application level. Authorizations can be assigned directly to a role or to a user.
Typically, authorizations are included in a rights profile.
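A hedged sketch of how these pieces fit together, using the predefined System Administrator rights profile (the role name sysadm, its home directory, and the user name user1 are assumptions for illustration):

```
# roleadd -m -d /export/home/sysadm -P "System Administrator" sysadm
# passwd sysadm
# usermod -R sysadm user1
```

After this, user1 can assume the role with su sysadm and run the commands granted by the profile.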
The above figure illustrates the relationship between the various RBAC elements
[Figure: RBAC elements and the user_attr security database]
NOTE:
Commands that are designated with euid run with the supplied UID, which is similar to setting the setuid
bit on an executable file. Commands that are designated with uid run with both the real UID and the
effective UID.
Commands that are designated with egid run with the supplied GID, which is similar to setting the setgid
bit on an executable file. Commands that are designated with gid run with both the real GID and the
effective GID.
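These keywords appear in command entries in the /etc/security/exec_attr database. A representative entry (typical of Solaris releases; exact profile contents vary):

```
Printer Management:suser:cmd:::/usr/sbin/lpadmin:euid=lp
```

A command run from this profile's shell executes /usr/sbin/lpadmin with the effective UID of lp rather than requiring full root.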
NIS maps were designed to replace UNIX /etc files, as well as other configuration files. NIS maps store much more
than names and addresses.
NIS uses a client-server arrangement which is similar to DNS. Replicated NIS servers provide services to NIS
clients. The principal servers are called master servers, and for reliability, the servers have backup, or slave
servers. Both master and slave servers use the NIS retrieval software and both store NIS maps.
NIS+ Naming Service
The Network Information Service Plus (NIS+) is similar to NIS but with more features. However, NIS+ is not an
extension of NIS. The NIS+ naming service is designed to conform to the shape of the organization. Unlike NIS,
the NIS+ namespace is dynamic because updates can occur and be put into effect at any time by any authorized
user. NIS+ enables you to store information about machine addresses, security information, mail information,
Ethernet interfaces, and network services in one central location. This configuration of network information is
referred to as the NIS+ namespace. The NIS+ namespace is hierarchical. The NIS+ namespace is similar in
structure to the UNIX directory file system. The hierarchical structure allows an NIS+ namespace to be configured
to conform to the logical hierarchy of an organization. The namespace's layout of information is unrelated to its
physical arrangement. Thus, an NIS+ namespace can be divided into multiple domains that can be administered
autonomously. Clients might have access to information in domains other than their own if the clients have the
appropriate permissions.
NIS+ uses a client-server model to store and have access to the information contained in an NIS+ namespace.
Each domain is supported by a set of servers. The principal server is called the primary server. The backup
servers are called secondary servers. The network information is stored in 16 standard NIS+ tables in an internal
NIS+ database. Both primary and secondary servers run NIS+ server software and both maintain copies of NIS+
tables. Changes made to the NIS+ data on the master server are incrementally propagated automatically to the
secondary servers.
NIS+ includes a sophisticated security system to protect the structure of the namespace and its information. NIS+
uses authentication and authorization to verify whether a client's request for information should be fulfilled.
Authentication determines whether the information requester is a valid user on the network.
Authorization determines whether a particular user is allowed to have or modify the information requested.
LDAP Naming Services
Solaris 10 supports LDAP (Lightweight Directory Access Protocol) in conjunction with the Sun Java System
Directory Server (formerly Sun ONE Directory Server), as well as other LDAP directory servers.
The table above gives a comparative study of the various name services.
[Figure: the DNS namespace hierarchy, starting at the root level]
Each domain can also define a number of secondary name servers. The secondary name servers obtain their
databases from the primary name servers through a process called zone transfer. These secondary name
servers are queried in the event the primary name server(s) do not respond to a query.
Caching Name Server
Caching name servers have no direct access to any authoritative information about the domain. These name
servers query primary and secondary name servers with DNS requests, and store the results away in a memory
cache for future reference. When these servers spot a DNS look-up request on the network, they reply with the
information they have stored in their respective caches. Caching name servers usually contain valid data, but
because they don't load information directly from primary name servers, the data can become stale. Caching
name servers are not considered authoritative name servers.
Root Name Servers
In order to provide a master list of name servers available on the Internet, the Network Information Center (NIC)
maintains a group of root name servers. These name servers provide authoritative information about specific top
level domains. When a local name server cannot resolve an address, it queries the root name server for the
appropriate domain for information which may allow the local name server to resolve the address.
DNS Clients
DNS clients comprise a series of library calls, which issue RPC requests to a DNS server. These library routines
are referred to as resolver code. Whenever a system requires a name to IP address mapping, the system
software executes a gethostbyname() library call. The resolver code contacts the appropriate name service
daemon to resolve the query.
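The same resolution path can be exercised from the command line: getent consults the configured name services just as the resolver library calls do (localhost is used here only as a safe example):

```shell
# Look up a host through the name-service switch, as gethostbyname()
# would inside an application.
getent hosts localhost
```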
/etc/resolv.conf File
In order to make DNS operate, the administrator must configure the /etc/resolv.conf file on each host. This
file tells the name service resolution routines where to find DNS information. The format of the entries in the
/etc/resolv.conf file is keyword value.
nameserver: Valid values for name servers are IP addresses in standard dot notation.
domain: The domain value will be appended to any hostname which doesn't end in a dot. If a user types telnet
alice, the resolver will automatically append .wonderland.carroll.org to the request such that the telnet session will
actually be issued as telnet alice.wonderland.carroll.org.
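Following the keyword value format, a minimal /etc/resolv.conf for the example domain might read (the server addresses are assumptions for illustration):

```
domain wonderland.carroll.org
nameserver 200.200.0.1
nameserver 200.200.0.2
```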
DNS Database Files
The DNS service consults a series of database files in order to resolve name service queries. Such database files
are often called db files. In order to provide scalability and modularity, the database information is split into
several files as described below:
1. named.hosts This provides the hostname to IP address mapping for the hosts within the domain name.
2. named.rev This database provides the IP address to hostname mapping (reverse mapping) for the
hosts on a network address.
DNS provides a method for embedding comments in the db files: a comment begins with a semicolon (;).
Comments help explain the local site setup.
SOA Records
This provides information about management of data within the domain. The SOA record must start in column 1.
The format is shown below:
domain class SOA primary_server rp (
serial_number
refresh_value
retry_value
expiration_value
TTL)
The domain field is the domain name.
The class field allows the administrator to define the class of data. Currently, only one class is used. The IN class
defines Internet data.
The SOA field tells the resolver that this is a start of authority record for the domain.
The primary server field is the fully qualified host name of the primary name server for this domain.
The rp field gives the fully qualified email address of the person responsible for this domain.
The serial number field is used by the secondary name servers. This field is a counter which gives the version
number for the file. This number should be incremented every time the file is altered. When the information stored
on the secondary name servers expires, the servers contact the primary name server to obtain new information. If
the serial number of the information obtained from the primary server is larger than the serial number of the
current information, a full zone transfer is performed. If the serial number from the primary name server is smaller
than the current information, the secondary name server discards the information from the primary name server.
The refresh_value field is used by the secondary name servers. The value in this field tells the secondary servers
how long (in seconds) to keep the data before they obtain a new copy from the primary name server. If the value
is too small, the name server may overload the network with zone transfers. If the value is too large, information
may become stagnant and changes may not propagate efficiently from the primary to the secondary name
servers.
The retry_value field is used by the secondary name servers. The value in this field tells the secondary servers
how long (in seconds) to wait before attempting to contact a non-responsive primary name server.
The expiration_value field is used by the secondary name servers. The value in this field tells the secondary
servers to expire their information after that many seconds. Once a secondary server expires its data, it stops
responding to name service requests.
The TTL field tells name servers how long they can cache the response from the name server before they purge
(discard) the information obtained in response to a query. Example:
wonder.com. IN SOA alice.wonder.com. root.alice.wonder.com. (
2005111401 ; serial format YYYYMMDD##
10800 ; Refresh every 3 hours
3600 ; Retry after an hour
604800 ; Expire data after 1 week
86400 ) ; TTL for cached data is 1 day
NS Records
The NS records define the name servers within a domain. The NS record starts in column 1 of the file. The format
of an NS record follows: domain class NS fully_qualified_hostname. An example appears below.
wonder.com. IN NS alice.wonder.com.
A Records
The A (Address) records provide the information used to map host names to IP addresses. The A record must
begin in column 1 of the file. The format of an A record is fully_qualified_hostname class A address. Example:
alice.wonder.com. IN A 200.200.0.1
PTR Records
The pointer records provide the information for reverse mapping (looking up a host name from an IP address).
The PTR record must begin in column1 of the file. The format of the PTR record is address class PTR
fully_qualified_hostname. The PTR record does have one quirk: the address portion appears to be written
backwards, and contains information the user typically doesn't see. The information is presented in this format in
order to simplify the look-up procedure, and to maintain the premise that the information at the left of the record is
furthest from the root of the domain.
Ex: 1.0.200.200.in-addr.arpa. IN PTR alice.wonder.com.
MX Records
The mail exchanger records provide a way for remote hosts to determine where e-mail should be delivered for a
domain. The MX records must begin in column 1 of the db files. The format of MX records is
fully_qualified_hostname class MX preference fully_qualified_mail_hostname.
There should be an MX record for each host within a domain. The preference field tells the remote host the
preferred place to deliver the mail for the domain. This allows e-mail to be delivered in the event that the primary
mail server is down. The remote host will try to deliver e-mail to the host with the lowest preference value first. If
that fails, the remote host will attempt to deliver the e-mail to the host with the next lowest preference value.
Examples appear below:
alice.wonder.com. IN MX 10 mail.wonder.com.
CNAME Records
The CNAME records provide a way to alias host names.
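The format follows the other records: alias class CNAME canonical_name. A hypothetical example:

```
www.wonder.com. IN CNAME alice.wonder.com.
```

A query for www.wonder.com is then answered with the records of alice.wonder.com.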
TXT Records
The TXT records allow the administrator to add text information to the db files, giving more information about the
domain. The TXT records must start in column 1 of the db file, and the format of TXT records is hostname class
TXT "text about the host". Example:
alice IN TXT "Machine Alice is in Wonderland"
RP Records
These records define the person responsible for the domain. The RP records must start in column 1 of the db
files. The format of an RP record is hostname class RP fully_qualified_email_address fully_qualified_hostname.
alice IN RP user1.alice.wonder.com. queen.wonder.com.
queen IN TXT "The Responsible Person"
Configuring DNS
The Solaris 10 operating system ships with the BIND 9.x DNS name server. The DNS/BIND named service can
be managed by using the Service Management Facility (SMF).
Tip Temporarily disabling a service by using the -t option provides some protection for the service configuration.
If the service is disabled with the -t option, the original settings are restored for the service after a reboot. If
the service is disabled without -t, the service remains disabled after reboot.
The Fault Managed Resource Identifiers (FMRIs) for the DNS service are
svc:/network/dns/server:<instance> and
svc:/network/dns/client:<instance>.
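Using these FMRIs, a typical administration sequence can be sketched as follows (the instance name default is an assumption; yours may differ):

```
# svcadm enable svc:/network/dns/server:default
# svcadm disable -t svc:/network/dns/server:default
# svcs -l svc:/network/dns/server:default
```

The -t flag makes the disable temporary, so the service is enabled again after the next reboot.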
If you need to start the DNS service with different options (for example with a configuration file other than
/etc/named.conf), change the start method property of the DNS server manifest by using the svccfg command.
Multiple SMF service instances are only needed if you want to run multiple copies of BIND 9 name service. Each
additional instance can be specified in the DNS server manifest with a different start method. While it is
recommended that you use svcadm to administer the server, you can use rndc as well. SMF is aware of the state
change of the BIND 9 named service, whether administered by using svcadm or rndc.
Note SMF will not be aware of the BIND 9 named service if the service is manually executed from the command
line.
# vi /etc/resolv.conf
domain wil.com
nameserver 200.200.0.1
Save and exit
# svcadm restart \*dns\*
# nslookup
>
NIS Utilities
NIS MAPS
NIS maps were designed to replace UNIX /etc files, as well as other configuration files, so they store much more
than names and addresses. On a network running NIS, the NIS master server for each NIS domain maintains a
set of NIS maps for other machines in the domain to query. NIS slave servers also maintain duplicates of the
master server's maps. NIS client machines can obtain namespace information from either master or slave
servers. NIS maps are essentially two-column tables. One column is the key and the other column is information
related to the key. NIS finds information for a client by searching through the keys.
NIS and the Service Management Facility:
The NIS Fault Managed Resource Identifiers (FMRIs) are
svc:/network/nis/server:<instance> for the NIS server and
svc:/network/nis/client:<instance> for the NIS client.
3. Copy all of these source files, except passwd, to the DIR directory that you have selected.
4. Copy the passwd file to the PWDIR directory that you have selected.
5. Preparing the Makefile
NOTE
Where master is the machine name of the existing NIS master server.
Repeat the procedures described in this section for each machine you want configured as an NIS slave server.
The following procedure shows how to start NIS on a slave server.
Stop the client service and start all NIS server processes.
# svcadm disable network/nis/client
# svcadm enable network/nis/server
A set of administrative tools have been developed to manage zones, allowing them to be configured, installed,
patched, upgraded, booted, rebooted, and halted. As a result, zones can be administered in a manner very similar
to separate machines. In fact, some types of administration are significantly easier; for example, an administrator
can apply a patch to every zone on a system with a single command.
A zone can either be bound to a dedicated pool of resources (such as a number of CPUs or a quantity of physical
memory), or can share resources with other zones according to defined proportions. This allows the use of zones
both on large systems (where dedicated resources may be most appropriate) and smaller ones (where a greater
degree of sharing is necessary). It also allows administrators to make appropriate tradeoffs depending on the
relative importance of resource isolation versus utilization.
Zones provide for the delegation of many of the expected administrative controls for the virtual operating system
environment. Since each zone has its own name service identity, it also has its own notion of a password file and
its own root user. The proportion of CPU resources that a zone can consume can be defined by an administrator,
and then that share can be further divided among workloads running in the zone by the (potentially different) zone
administrator. In addition, the privileges available within a zone (even to the root user) are restricted to those that
can only affect the zone itself. As a result, even if a zone is compromised by an intruder, the compromise will not
affect other zones in the system or the system as a whole.
Zones also allow sharing of file system data, particularly read-only data such as executables and libraries.
Portions of the file system can be shared between all zones in the system through use of the read-only loopback
file system (or lofs), which allows a directory and its contents to be spliced into another part of the file system.
This not only substantially reduces the amount of disk space used by each zone, but reduces the time to install
zones and apply patches, and allows for greater sharing of text pages in the virtual memory system.
NOTE: Zones are part of the N1 Grid Containers feature in Solaris 10 and the container technology is still under
development.
The Solaris Zones partitioning technology is used to virtualize operating system services and provide an isolated
and secure environment for running applications. A zone is a virtualized operating system environment created
within a single instance of the Solaris Operating System. When you create a zone, you produce an application
execution environment in which processes are isolated from the rest of the system. This isolation prevents
processes that are running in one zone from monitoring or affecting processes that are running in other zones.
Even a process running with superuser credentials cannot view or affect activity in other zones.
A zone also provides an abstract layer that separates applications from the physical attributes of the machine on
which they are deployed. Examples of these attributes include physical device paths. Zones can be used on any
machine that is running the Solaris 10 release. The upper limit for the number of zones on a system is 8192. The
number of zones that can be effectively hosted on a single system is determined by the total resource
requirements of the application software running in all of the zones.
There are two types of non-global zone root file system models: sparse and whole root. The sparse root zone
model optimizes the sharing of objects. The whole root zone model provides the maximum configurability.
Every Solaris system contains a global zone. The global zone has a dual function. The global zone is both the
default zone for the system and the zone used for system-wide administrative control. All processes run in the
global zone if no non-global zones, referred to simply as zones, are created by the global administrator.
The global zone is the only zone from which a non-global zone can be configured, installed, managed, or
uninstalled. Only the global zone is bootable from the system hardware. Administration of the system
infrastructure, such as physical devices, routing, or dynamic reconfiguration (DR), is only possible in the global
zone. Appropriately privileged processes running in the global zone can access objects associated with other
zones.
Unprivileged processes in the global zone might be able to perform operations not allowed to privileged
processes in a non-global zone. For example, users in the global zone can view information about every process
in the system. If this capability presents a problem for your site, you can restrict access to the global zone.
Each zone, including the global zone, is assigned a zone name. The global zone always has the name global.
Each zone is also given a unique numeric identifier, which is assigned by the system when the zone is booted.
The global zone is always mapped to ID 0. Each zone also has a node name that is completely independent of
the zone name. The node name is assigned by the administrator of the zone. Each zone has a path to its root
directory that is relative to the global zone's root directory. The scheduling class for a non-global zone is set to the
scheduling class for the system.
You can also set the scheduling class for a zone through the dynamic resource pools facility. If the zone is
associated with a pool that has its pool.scheduler property set to a valid scheduling class, then processes running
in the zone run in that scheduling class by default.
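Zone administration is performed from the global zone with the zonecfg and zoneadm commands. The session below is only a sketch; the zone name webzone and the zonepath shown are illustrative assumptions, not values from the text:

```
# zonecfg -z webzone
zonecfg:webzone> create
zonecfg:webzone> set zonepath=/zones/webzone
zonecfg:webzone> commit
zonecfg:webzone> exit
# zoneadm -z webzone install
# zoneadm -z webzone boot
# zoneadm list -cv
```

The final command lists all configured zones with their numeric ID, name, status, and path; the global zone always appears with ID 0.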
Features Summary:
Fig 15-1
4. Install the Solaris Flash archive on clone systems. The master system and the clone system must have
the same kernel architecture. When you install the Solaris Flash archive on a system, all of the files in the
archive are copied to that system. The newly installed system now has the same installation configuration
as the original master system; thus the system is called a clone system. Some customization is possible through
the use of scripts.
You can install extra packages with a Solaris Flash archive by using the custom JumpStart installation
method. The packages must be from outside the software group being installed or a third-party package.
5. (Optional) Save a copy of the master image. If you plan to create a differential archive, the master image
must be available and identical to the image installed on the clone systems.
The figure below shows an installation of clone systems with an initial installation. All files are overwritten.
Central processing unit (CPU): The CPU processes instructions, fetching them from memory and executing them.
Input/Output (I/O) devices: I/O devices transfer information into and out of the computer. Such a device could be
a terminal and keyboard, a disk drive, or a printer.
Memory: Physical (or main) memory is the amount of memory (RAM) on the system.
Monitoring Performance (Tasks) describes the tools that display statistics about the activity and the
performance of the computer system.
Processes and System Performance
Terms related to processes are described below.
Process Term   Description
proc           Contains information that pertains to the whole process and must be in main memory at all times.
kthread        Contains information that pertains to one LWP and must be in main memory at all times.
user           Contains the per-process information that is swappable.
klwp           Contains the per-LWP information that is swappable.
The illustration below shows the relationship among these structures.
Main memory (non-swappable):
    process (proc structure)            kernel thread (kthread structure)
Swappable:
    per-process user (user structure)   per-LWP (klwp structure)
Commands for Managing Processes
ps          Checks the status of active processes on a system and displays detailed information about them
dispadmin   Lists default scheduling policies
priocntl    Assigns processes to a priority class and manages process priorities
nice        Changes the priority of a timesharing process
In addition, process tools are available in /usr/proc/bin that display highly detailed information about the processes
listed in /proc, also known as the process file system (PROCFS). Images of active processes are stored here by
their process ID numbers.
The process tools are similar to some options of the ps command, except that the output provided by the tools is
more detailed. In general, the process tools:
Display more details about processes, such as fstat and fcntl information, working directories, and trees of parent
and child processes. Provide control over processes, allowing users to stop or resume them.
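A few of the /usr/proc/bin tools can be sketched as follows; the PID 1234 is a placeholder for illustration, not a process from the text:

```
# ptree 1234        Print the tree of parent and child processes for PID 1234
# pfiles 1234       Report fstat and fcntl information for its open files
# pwdx 1234         Print its current working directory
# pstop 1234        Stop (suspend) the process
# prun 1234         Resume the stopped process
```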
Performance suffers when the programs running on the system require more physical memory than is available.
When this happens, the operating system begins paging and swapping, which is costly in both disk and CPU
overhead. Paging involves moving pages that have not been recently referenced to a free list of available
memory pages. Most of the kernel resides in main memory and is not pageable. Swapping occurs if the page
daemon cannot keep up with the demand for memory. The swapper attempts to swap out sleeping or stopped
lightweight processes (LWPs), and swaps LWPs back in based on their priority, attempting first to swap in
processes that are runnable.
Swap Space
Swap areas are disk slices or files used for swapping. Swap areas should be sized based on the requirements of
your applications. Check with your vendor to identify application requirements.
Buffer Resources
The buffer cache for read and write system calls uses a range of virtual addresses in the kernel address space. A
page of data is mapped into the kernel address space, and the amount of data requested by the process is then
physically copied to the process address space. The page is then unmapped in the kernel. The physical page
will remain in memory until the page is freed up by the page daemon.
This means a few I/O-intensive processes can monopolize main memory or force other processes out of it. To
prevent monopolization of main memory, run I/O-intensive processes serially in a script or schedule them
with the at command. Programmers can use mmap and madvise to ensure that their programs free memory
when they are not using it.
Kernel Parameters and System Performance
Many basic parameters (or tables) within the kernel are calculated from the value of the maxusers parameter.
Tables are allocated space dynamically. However, you can set maximums for these tables to ensure that
applications won't take up large amounts of memory.
By default, maxusers is set approximately to the number of Mbytes of physical memory on the system. However,
the system never sets maxusers higher than 1024. The maximum value of maxusers is 2048, which can be set
by modifying the /etc/system file. In addition to maxusers, a number of kernel parameters are allocated
dynamically based on the amount of physical memory on the system, as shown in the table of kernel parameters below.
Kernel Parameter   Description
ufs_ninode         The maximum size of the inode table
ncsize             The size of the directory name lookup cache
max_nprocs         The maximum size of the process table
ndquot             The number of disk quota structures
maxuprc            The maximum number of user processes per user ID
Example
# nice find . -name '*.c' -print &
The UNIX buffer cache is used to cache file data and file-related information such as file headers, inodes, and
indirect block addresses. You can use sar -b to monitor the hit ratio of the buffer cache.
Virtual memory
The virtual memory model allows a process or a system to address more memory than physically exists. The
UNIX operating system uses a swap device in addition to physical memory to manage the allocation and
deallocation of memory. For example, a machine with 64 MB of physical memory and a 192 MB swap device
supports a virtual memory size of 256 MB.
The physmem kernel parameter sets the total amount of physical memory in the system. This value is
automatically set when the system boots. For benchmarking purposes this parameter can be set to study
system behavior; for example, if you have 1 GB of physical memory, you may want to set physmem to the
equivalent of 128 MB and run your applications.
The physmem parameter can be set in the /etc/system file:
set physmem = 260000
Remember that the size is set in pages, not in bytes; in the case above it is 260000 * 4K.
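The pages-to-bytes conversion can be checked with a short shell calculation. The 260000-page figure follows the example above; the script itself is only an illustrative sketch:

```shell
# Convert a physmem page count to bytes using the system page size.
PAGESIZE=$(getconf PAGESIZE)   # typically 4096 (4K) or 8192 (8K), by architecture
PAGES=260000                   # the physmem value from the example above
echo "physmem = $PAGES pages = $((PAGES * PAGESIZE)) bytes"
```

With a 4K page size this works out to 260000 * 4096 = 1064960000 bytes, roughly 1 GB.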
The kernel memory allocator (KMA) is responsible for serving all memory allocation and deallocation requests
and for maintaining the memory free list.
On Solaris, you can monitor the workload of the KMA using sar -k.
Note: The Sun4u architecture (ultra series) uses an 8K page size.
Swapping
Swapping occurs when the system no longer has enough free physical memory, and the memory pages of a
process are written in their entirety to the swap device. The minfree kernel parameter sets the
absolute lower limit of free available memory. If free memory drops below minfree, swapping activity begins.
To list the swap devices configured on your system use
# swap -l
Paging
Current versions of UNIX provide a more granular approach to swapping known as paging. Paging moves
individual pages of memory to the swap device rather than the entire process, as in swapping. Typical memory
pages are 4K in size. Several paging parameters control both the action and the frequency of the
page daemon: lotsfree, desfree, minfree, slowscan, fastscan, and
handspreadpages. These parameters can be set in the /etc/system file.
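These parameters could be set with entries such as the following in /etc/system. The values shown (in pages) are illustrative assumptions only, not recommendations; lines beginning with * are comments:

```
* Example paging parameter entries (hypothetical values, in pages)
set lotsfree = 512
set desfree = 256
set minfree = 128
set slowscan = 100
set fastscan = 4096
```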
Because the UNIX operating system is based on the virtual memory model, a translation layer is needed between
virtual memory addresses and physical memory addresses. This translation layer is part of the kernel and is
usually written in machine-level language (assembly) to achieve optimal performance when translating and
mapping addresses.
For example, declaring a large SGA when only a small section of it is actively used could cause the unused
portions of the SGA to be paged out to the free list, allowing other processes access to the physical memory
pages. The UNIX kernel maintains a free list of memory pages and uses the page daemon to periodically scan
memory for active and idle pages.
The lotsfree kernel parameter sets the upper bound of free memory that the system tries to maintain. If the
available free memory stays above lotsfree, the page daemon remains idle. On a low-memory system, setting
the lotsfree parameter to a higher value keeps the page daemon active so that processes do not starve for
memory.
The maxpgio parameter regulates the number of page-out I/O operations per second that are scheduled by the
system. On Solaris, the default is 40 pages per second, a value based on 3,600 rpm disk drives. However,
most newer disk drives are either 5,400 or 7,200 rpm. On Solaris, the formula for maxpgio is (rpm / 60) * 2/3.
When tuning kernel paging parameters, you must maintain the following range check:
LOTSFREE > DESFREE > MINFREE
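The maxpgio formula above can be verified with shell arithmetic; the drive speeds in the loop are just sample values:

```shell
# maxpgio = (rpm / 60) * 2/3; the default of 40 comes from a 3,600 rpm drive.
for RPM in 3600 5400 7200; do
    echo "$RPM rpm -> maxpgio $((RPM / 60 * 2 / 3))"
done
```

This yields 40, 60, and 80 respectively, so a 7,200 rpm drive doubles the 3,600 rpm default.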
Thrashing occurs when swapping fails to free enough memory, and memory is needed to serve critical
processes. This usually results in a panic.
Memory Leaks
Programs sometimes cause memory leaks due to improper cleanup during the life of a process. Memory
leaks are generally difficult to detect, especially in daemons that run continuously and fork children to parallelize
requests. Numerous tools are available that help detect memory leaks and access
violations.
Shared memory
The UNIX System V Release 4 operating system standard provides many different mechanisms for inter-process
communication. Inter-process communication is required when multiple distinct processes need to communicate
with each other by sharing data, resources, or both.
Shared memory, as the term suggests, enables multiple processes to "share" the same memory. In other words,
multiple processes map the same physical memory into their virtual address spaces.
Oracle uses shared memory to hold the System Global Area (SGA).
The kernel parameters associated with the shared memory are:
SHMMAX, SHMMIN, SHMNI, and SHMSEG.
These parameters can be set in /etc/system file.
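In /etc/system, these parameters take the shmsys module prefix. The fragment below is a sketch with hypothetical values of the kind often used for an Oracle SGA, not recommendations for any particular system:

```
* Example shared memory entries (hypothetical values)
set shmsys:shminfo_shmmax = 4294967295
set shmsys:shminfo_shmmin = 1
set shmsys:shminfo_shmmni = 100
set shmsys:shminfo_shmseg = 10
```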
Process model
The UNIX operating system uses a priority-based round-robin process scheduling architecture. Processes are
dispatched to the run queue based on priority, and are placed on the sleep queue either upon completion of
their time slice or while waiting on an event or resource. Processes can sometimes deadlock while waiting on a
resource. The kernel provides deadlock detection for its own resources; best practice is for user applications to
provide their own deadlock detection.
For example, suppose three processes on the dispatch queue begin execution on three different CPUs while three
more processes are sleeping. Once the time slice of a running process is exceeded, that process is migrated to
the sleep queue in favor of the sleeping processes.
Then the process with the highest priority (which is in the ready-to-run queue) is scheduled to run. The kernel
may also decide to preempt processes by reducing their priority to make way for other processes.
Process states
The lifetime of a process can be conceptually divided into a set of states that describe the process. At any time, a
process can be in one of the following states.
1. The process is executing in user mode.
2. The process is executing in kernel mode.
3. The process is not executing but ready to run as soon as the kernel schedules it.
4. The process is sleeping and resides in main memory.
5. The process is ready to run, but the swapper (process 0) must swap the process into main memory
before the kernel can schedule it to execute.
6. The process is sleeping, and the swapper has swapped the process to secondary storage to make room
for other processes in main memory.
7. The process is returning from kernel to user mode, but the kernel preempts it and does a context switch to
schedule another process.
8. The process is newly created and is in a transition state; the process exists, but it is not ready to run,
nor is it sleeping. This state is the start state for all processes except process 0.
9. The process executed the exit system call and is in the zombie state. The process no longer exists, but it
leaves a record containing the exit code and some timing statistics for its parent process to collect. The zombie
state is the final state of a process.
The zombie process has two meanings:
The first is that the process (a child process) has terminated and still has a parent process.
The second is that the process is marked to be killed by the UNIX kernel. During the zombie state, the process
structure and address space are removed and freed back to the system. The only information remaining in the
kernel for a zombie process is an entry in the process table.
The process structure contains many different fields, including process information and process statistics in case
the process is swapped out. Each time a process is created, a separate process structure for that process is
created. The complete listing of the process structure is available in the proc.h file located in the /usr/include/sys
directory.
The nice command can be used to change the priority level of a process; any user can lower a process's priority,
but only the superuser can raise it.
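For example, a long-running search can be started at reduced priority. This is only a sketch; the -n 10 increment lowers the priority, and a negative increment (which raises priority) would require superuser privileges:

```shell
# Run find in the background at a lower timesharing priority.
nice -n 10 find . -name '*.c' -print &
wait    # wait for the background job to finish
```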
Job Scheduler
The UNIX operating system job scheduler provides three different job classes, as shown below:
Job Class Priority Range Description
Time-share 0 through 59 Default job class
System 60 through 99 Reserved for system daemon processes
Real-Time 100 through 159 Highest priority job class
In the time-share job class, each process is assigned a time slice (or quantum). The time slice specifies the
number of CPU clock ticks for which a particular process can occupy the CPU. Once the process finishes its time
slice, the process priority is usually decreased and the process is placed on the sleep queue. Other processes
waiting for CPU time may have their priority increased, making them more likely to run.
The system job class is reserved for system daemon processes such as the pageout daemon or the file system
daemon.
The real-time job class is the highest priority job class and has no time slice; a runaway real-time process can
therefore starve processes in the lower classes.
Use the ps -efc command to list the job class and priority of a process.
Threads
A thread is an independent path (unit of control) of execution within a process.
A thread can be thought of as a sub-task or sub-process. Unlike the process model, which creates a separate
process structure for each process, the threads model has substantially less overhead. A thread is part of the
process address space; therefore it is not necessary to duplicate the address space of a process when a new
thread is created. There is an order-of-magnitude difference between creating a process and a thread: it takes
approximately 1,000 times longer to create a full process than it does to create a thread. Sun Solaris 2.x
provides a Solaris threads library as well as the POSIX thread library.
Most of the system daemons use threads to serve requests.
17. Miscellaneous
17.1. Dynamic Host Configuration Protocol
Not very long ago, networks used to be small and static in nature and easy to manage on a per-host basis. Plenty
of IP addresses were available, and these IP addresses were assigned statically to all hosts connected to a
network. An IP address was reserved for each host in this scheme even if the host was not turned on. This
scheme worked fine until networks grew larger and mobile hosts, such as laptops and PDAs, started creeping in.
Mobile hosts that moved frequently from one network to another needed special attention, because of the
difficulty of reconfiguring a laptop computer whenever it was connected to a different network. With very large
networks, the need for centralized configuration management also became an issue. The Dynamic Host
Configuration Protocol (DHCP) solves these problems by dynamically assigning network configuration to hosts at
boot time. A DHCP server keeps information about IP addresses that can be allocated to hosts. It also keeps
other configuration data, such as the default gateway address, Domain Name Server (DNS) address, the NIS
server, and so on. A DHCP client broadcasts a message that locates a DHCP server. If one or more DHCP
servers are present, they offer an IP address and other network configuration data to the client. If the client
receives a response from multiple servers, it accepts the offer from one of the servers. The server then leases
one IP address to the client for a certain period of time and the client configures itself with network configuration
parameters provided by the server. If the client host needs an IP address for a longer period, it can renew the
lease time. If a client host goes down before the lease time is over, it sends a message to the DHCP server to
release the IP address so that it can be assigned to another host. DHCP is usually not used for hosts that need
static IP addresses, although it has the provision to assign static IP addresses to clients. These hosts include
different types of servers and routers. Servers need static IP addresses so that clients of that server always
connect to the right host. Similarly, routers need a static IP address to have a consistent and reliable routing table.
Other than that, user PCs, workstations, and laptop computers may be assigned dynamic addresses.
The Dynamic Host Configuration Protocol is based on the Bootstrap Protocol (BOOTP). BOOTP was used to boot
diskless workstations. BOOTP has many limitations, however, including the manual configuration on the BOOTP
server. DHCP also can be used to configure several more network parameters compared to BOOTP, and it is
more flexible.
DHCP Lease Time
When a client connects to a DHCP server, the server offers an IP address to the client for a certain period of time.
This time is called the lease time. If the client does not renew the lease, the IP address is revoked after the
designated time. If configured as such, the client can renew its lease as many times as it likes.
DHCP Scope
Scope is the range of IP addresses from a network that a DHCP server can assign to clients. A server may have
multiple scopes. However, a server must have at least one scope to be able to assign IP addresses to DHCP
clients. DHCP scope is defined at the time of configuring the DHCP server.
Booting a Workstation Using DHCP
DHCP uses BOOTP port numbers 67 (for servers) and 68 (for clients). The process of booting a host using DHCP
consists of a number of steps. First, a DHCP client locates a DHCP server using a broadcast packet on the
network. All the DHCP servers listen to this request and send a lease offer. The client accepts one of the lease
offers and requests the offering server to assign an IP address and other network parameters. The following
sections describe these steps in more detail.
Discovering the DHCP Server
First, the DHCP client sends a DHCPDISCOVER type of broadcast message to find the available DHCP servers
on the network. The source address in this message is 0.0.0.0 because the DHCP client does not know its own IP
address at this time. If no DHCP server responds to this message, the message send attempt is retried. The
number of retries depends on the client.
Lease Offer
When a DHCP server receives the DHCPDISCOVER message, it responds with a
DHCPOFFER message. A client may receive multiple offers depending on how many DHCP servers are present
on the network. The DHCPOFFER message contains the offered IP address and other network configuration
information.
Lease Acknowledgment
The selected DHCP server then sends back a DHCP acknowledgment message (DHCPACK) to the DHCP client.
If the client's response arrives too late and the server is no longer able to fulfill the offered IP address, a negative
acknowledgment (DHCPNAK) is sent back to the client.
Client Configuration
When a client receives an acknowledgment message with configuration parameters, it verifies that no other host
on the network is using the offered IP address. If the IP address is free, the client starts using it. The snoop
command can be used to see how data packets are exchanged between a client and a server during the time that
the client verifies that no other host is using the offered IP address.
DHCP Lease Renewal
A DHCP client requests the DHCP server to renew the IP lease time when 50% of the lease time has passed. It
does so by sending a DHCPREQUEST message to the server. If the server responds with a DHCPACK
message, the lease is renewed and a time counter is reset. If the client does not receive an acknowledgment from
the server, it again tries to renew the lease when 87.5% of the lease time has passed. If it does not receive a
message for this request, it restarts the DHCP configuration process at the end of the lease time.
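The renewal schedule can be illustrated with a quick calculation; the 24-hour lease used here is a made-up example, not a value from the text:

```shell
# For an 86,400-second (24-hour) lease, compute the renewal points.
LEASE=86400
RENEW=$((LEASE * 50 / 100))     # first DHCPREQUEST at 50% of the lease
REBIND=$((LEASE * 875 / 1000))  # retry at 87.5% if no DHCPACK arrives
echo "renew after ${RENEW}s, rebind after ${REBIND}s"
```

For this lease the client first asks to renew at 43,200 seconds (12 hours) and retries at 75,600 seconds (21 hours).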
Lease Release
If a client shuts down before the lease time is expired, it sends a DHCPRELEASE message to the DHCP server
telling it that it is going down and the IP address is going to be free. A server can then reuse this IP address
immediately. If the client does not send this message before shutting down, the IP address may still be marked as
being in use.
DHCP IP Address Allocation Types
Basically, the following three types of IP address allocations are used by DHCP when assigning IP addresses to
DHCP clients:
1. Automatic: The automatic lease is used to assign permanent IP addresses to hosts. No lease
expiration time applies to automatic IP addresses.
2. Dynamic: The dynamic lease is the most commonly used type. Leased IP addresses expire after lease
time is over, and the lease must be renewed if the DHCP client wants to continue to use the IP address.
3. Manual: A manual allocation is used by system administrators to allocate fixed IP addresses to certain
hosts.
To serve DHCP clients on network 192.168.2.0, for example, you can have two DHCP servers. One of these
DHCP servers has the scope of IP addresses from 192.168.2.51 to 192.168.2.150, and the other one has a scope
of IP addresses from 192.168.2.151 to 192.168.2.250.
When planning to deploy DHCP, follow these steps:
1. Collect information about your network topology to find out how many DHCP servers or relay agents are
required. DHCP cannot be used where multiple IP networks share the same physical network media.
2. Select the best available servers for DHCP.
3. Determine which data storage method should be used. The data storage method tells the DHCP server
how to keep the DHCP database. Two methods are available on Solaris systems: the files method
and the nisplus (NIS+) method. In the files method, DHCP database files are stored in a local directory
on the DHCP server. In the NIS+ method, the DHCP data is stored in a NIS+ database on the NIS+ server.
I recommend using the files method; in fact, NIS+ is seldom used.
4. Determine a lease policy for your network. Keep in mind the factors mentioned earlier while deciding on a
lease policy. Also decide whether you'll use a dynamic or permanent lease type.
5. Determine which router addresses you need for DHCP clients. You have to assign these addresses to
clients when offering a lease.
6. Determine the IP addresses to be managed by each server. After going through each of these steps, you
should have a fairly good idea about how to proceed with the installation of DHCP on your network.
17.2. Samba
UNIX has brought TCP/IP and the Internet to the table, while Windows has brought millions of users. We can't
survive with only one operating system; rather, we have to take advantage of the variety and make the most of
the available OSs by having them cooperate.
The most powerful level of PC/UNIX integration is achieved by sharing directories that live on a UNIX host with
desktop PCs that run Windows. The shared directories can be made to appear transparently under Windows, as
an extension to the regular Windows network file tree. Either NFS or CIFS can be used to implement this
functionality.
NFS was designed to share files among UNIX hosts, on which the file locking and security paradigms are
significantly different from those of Windows. Although a variety of products (e.g., PC-NFS) that mount
NFS-shared directories on Windows clients are available, their use should be aggressively avoided, both because of
the paradigm mismatch and because CIFS just works better.
CIFS: the Common Internet File System
CIFS is based on protocols that were formerly referred to as Server Message Block, or SMB. SMB was an
extension that Microsoft added to DOS in its early days to allow disk I/O to be redirected to a system known as
NetBIOS (Network Basic Input/Output System). Designed by IBM and Sytek, NetBIOS was a crude interface
between the network and applications.
In the modern world, SMB packets are carried in an extension of NetBIOS known as NBT, NetBIOS over TCP.
While this sounds very convoluted, the result is that these protocols have become widespread and are available
on platforms ranging from MVS and VMS to our friends UNIX and Windows.
Samba is an enormously popular software package, available under the GNU General Public License, that
implements CIFS on UNIX hosts. It was originally created by Andrew Tridgell, an Australian who reverse-engineered
the SMB protocol from another system and published the resulting code in 1992.
Today, Samba is well supported and actively under development to expand its functionality. It provides a stable,
industrial-strength mechanism for integrating Windows machines into a UNIX network. The real beauty of it is that
you only need to install one package on the UNIX machine; no additional software is needed on the Windows
side.
CIFS provides five basic services:
File Sharing
Network Printing
Authentication and authorization
Name Resolution
Service announcement (file server and printer browsing)
Most of Samba's functionality is implemented by two daemons: smbd and nmbd. smbd implements the first three
services listed above, and nmbd provides the remaining two.
Unlike NFS, which is deeply intertwined with the kernel, Samba requires no kernel modifications and runs entirely
as a user process. It binds to the sockets used for NBT requests and waits for a client to request access to a
resource. Once the request has been made and authenticated, smbd forks an instance of itself that runs as the
user who is making the requests. As a result, all normal UNIX file access permissions (including group
permissions) are obeyed. The only special functionality that smbd adds on top of this is a file locking service that
provides client PCs with the locking semantics they are accustomed to.
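Both daemons are driven by a single configuration file, smb.conf. The fragment below is a minimal hypothetical share definition, shown only to illustrate the file's section-based layout; the share name and path are assumptions:

```
[global]
   workgroup = WORKGROUP
   security = user

[public]
   comment = Shared UNIX directory visible to Windows clients
   path = /export/public
   read only = no
```

With a definition like this, the UNIX directory /export/public appears to Windows clients as a network share named public, and normal UNIX file permissions still apply to every access.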
17.3. Apache
In the 1980s, UNIX established a reputation for providing a high-performance, production-quality networked
environment on a variety of hardware platforms. When the World Wide Web appeared on the scene as the
ultimate distributed client/server application in the early 1990s, UNIX was there as its ready-made platform, and a
new era was born.
Web Hosting
In the early 1990s, UNIX was the only choice for serving content on the web. As the web's popularity grew, an
increasing number of parties developed an interest in having their own presence on the net.
Seizing the opportunity, companies large and small jumped into the ring with their own server solutions. A new
industry segment known as web hosting or Internet hosting was born around the task of serving content to the
web. These days we have a variety of web hosting platforms to choose from, and a number of specialized web
servers have been developed to meet the needs of specific market channels. For reliability, maintainability,
security, and performance, UNIX is a better choice.
The foremost advantages of UNIX are its maintainability and performance. UNIX was designed from the start as a
multi-user, interactive operating system. On a UNIX box, one administrator can maintain a database, while
another looks after I/O performance and a third maintains the web server.
Web Hosting Basics
Hosting a web site isn't substantially different from providing any other network service. The foundation of the
World Wide Web is the Hypertext Transfer Protocol (HTTP), a simple TCP-based protocol that's used to format,
transmit, and link documents containing a variety of media types, including text, pictures, sound, animation, and
video. HTTP behaves much like the other client/server protocols used on the Internet, for example, SMTP (for
email) and FTP (for file transfer).
A web server is simply a system that's configured to answer HTTP requests. To convert your UNIX system into a
web hosting platform, you need to install a daemon that listens for connections on TCP port 80 (the HTTP
standard), accepts requests for documents, and transmits them to the requesting user.
Web browsers such as Netscape and IE contact remote web-servers and make requests on behalf of users. The
documents thus obtained can contain hypertext pointers to other documents, which may or may not live on the
server that the user originally contacted. Since the HTTP protocol standard is well defined, clients running on any
OS can connect to any HTTP server.
How HTTP works
HTTP is the protocol that makes the WWW work, and it is an extremely basic, stateless, client/server
protocol. In the HTTP model, the initiator of a connection is always a client (usually a browser). The client asks
the server for contents of a specific URL. The server responds with either a spurt of data or with some type of
error message.
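The request half of such an exchange is short enough to show in full; the URL path and Host header below are placeholder values, not from the text:

```shell
# Emit a minimal HTTP/1.0 GET request, as a browser would send it.
# Piping this to a server listening on TCP port 80 would return either
# the document or an error message, per the model described above.
printf 'GET /index.html HTTP/1.0\r\nHost: www.example.com\r\n\r\n'
```

Note that each line ends in a carriage return and line feed, and a blank line terminates the request.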
Virtual Interfaces
In the olden days, a UNIX machine typically acted as a server for a single web site. As the web's popularity grew,
everybody wanted their own web site, and overnight, thousands of companies became web hosting
providers.
Providers quickly realized that they could achieve significant economies of scale if they were able to host more
than one site on a single server. In response to this business need, virtual interfaces were born (in fact, much of
this UNIX research work was funded by the large companies of the day to support their businesses).
The idea is simple: a single UNIX machine responds on the network to more IP addresses than it has physical
network interfaces. Each of the resulting virtual network interfaces can be associated with a corresponding
domain name that users on the Internet might want to connect to. This feature allows a single UNIX machine to
serve literally hundreds of web sites.
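On Solaris, a virtual (logical) interface is configured by plumbing hme0:1, hme0:2, and so on, on top of a physical interface. The session below is a sketch only; the interface name and addresses are assumptions for illustration:

```
# ifconfig hme0:1 plumb
# ifconfig hme0:1 192.168.1.10 netmask 255.255.255.0 up
# ifconfig -a
```

The final command verifies that the logical interface is up; a web server can then be configured to answer for the domain name mapped to that address.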
WWW Servers
Installing and configuring a Web server is a more involved process than for most network services. A Web server
is a complex daemon with numerous features controlled by a couple of configuration files. Web servers not only
access files containing Web pages, graphics, and other media types for distribution to clients, they can also
assemble pages from more than one file, run CGI applications, and negotiate secure communications. Basic
server configuration issues are discussed in the following sections.
Apache is a free Web server developed by a community of Internet programmers; it is available in source code
form for Solaris and many other UNIX systems. It includes the latest features and provides for a broad range of
customization and configuration. It is a good choice for sites that require the latest web server features and
possess the software development tools required to compile and maintain the program.