
Unix System Kernel

Instructors:
Fu-Chiung Cheng
( 鄭福炯 )
Associate Professor
Computer Science & Engineering
Tatung Institute of Technology
Unix: Introduction
• Operating System: a system that manages the resources of a computer.
• Resources: CPUs, memory, I/O devices, network
• Kernel: the memory-resident portion of the Unix system
• The file subsystem and the process control subsystem are the two major components of the Unix kernel.

Architecture of Unix System

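• The OS interacts directly with the hardware.
• Such an OS is called the system kernel.
[Figure: layered view of the system. The hardware is at the center, surrounded by the kernel. Around the kernel are programs such as emacs, sh, who, date, cpp, ed, cc, as, ld, wc, grep, and nroff, plus other applications.]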
Unix System Kernel
• Three major tasks of the kernel:
  - Process management
  - Device management
  - File management
• Three additional services of the kernel:
  - Virtual memory
  - Networking
  - Network file systems
• Experimental kernel features:
  - Multiprocessor support
  - Lightweight process (thread) support
Block Diagram of System Kernel
[Figure: block diagram of the system kernel]
User level: user programs, libraries
Kernel level:
  - system call interface
  - file subsystem and device drivers
  - process control subsystem: inter-process communication, scheduler, memory management
  - hardware control
Hardware level: hardware
Process Control Subsystem
• Process synchronization
• Interprocess communication
• Memory management
• Scheduler: process scheduling (allocates the CPU to processes)

File subsystem
• A file system is a collection of files and directories on a disk or tape in standard UNIX file system format.
• The kernel's file subsystem regulates data flow between the kernel and secondary storage devices.

Hardware Control

• Hardware control is responsible for handling interrupts and for communicating with the machine.
• Devices such as disks or terminals may interrupt the CPU while a process is executing.
• The kernel may resume execution of the interrupted process after servicing the interrupt.

Processes
• A program is an executable file.
• A process is an instance of the program in execution.
• For example, the following commands create two active processes:
$ emacs &
$ emacs &
$ ps
PID TTY TIME CMD
12893 pts/4 0:00 tcsh
12581 pts/4 0:01 emacs
12582 pts/4 0:01 emacs
$
Processes
• A process has
  - text: machine instructions (may be shared by other processes)
  - data
  - stack
• A process may execute either in user mode or in kernel mode.
• Process information is stored in two places:
  - Process table
  - User table
User mode and Kernel mode
• At any given instant, a computer running the Unix system is executing either a process or the kernel itself.
• The computer is in user mode when it is executing instructions in a user process, and in kernel mode when it is executing instructions in the kernel.
• The switch from user mode to kernel mode happens when
  - a process executes a system call (for example, to perform I/O operations)
  - an interrupt arrives (for example, the system clock interrupt)

Process Table
• There is a single process table for the entire system.
• Each entry in the process table holds the following information:
  - process state:
    A. running in user mode or kernel mode
    B. ready in memory, or ready but swapped
    C. asleep in memory, or asleep and swapped
  - PID: process id
  - UID: user id
  - scheduling information
  - signals that have been sent to the process but not yet handled
  - a pointer to the per-process region table
User Table (u area)
• Each process has exactly one private user table.
• The user table contains information that must be accessible while the process is executing:
  - a pointer to the process table slot
  - parameters of the current system call, return values, and error codes
  - file descriptors for all open files
  - current directory and current root
  - process and file size limits
• The user table is an extension of the process table.
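[Figure: an active process. Its process table entry (resident, in kernel address space) points to the u area (swappable) and to the per-process region table. The per-process region table points through the region table to the text, data, and stack regions in user address space.]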
Shared Program Text and
Software Libraries
• Many programs, such as the shell, are often executed by several users simultaneously.
• The text (program) part can be shared.
• To be shared, a program must be compiled with a special option that arranges the process image so that the variable part (data and stack) and the fixed part (text) are cleanly separated.
• An extension of the idea of sharing text is sharing libraries.
• Without shared libraries, every executing program contains its own copy of the library code.
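[Figure: two active processes sharing program text. Each process table entry points to its own per-process region table (text, data, stack). The data and stack entries point to private regions, while both text entries point to the same region table entry, whose reference count = 2.]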
System Call
• A process accesses system resources through system calls.
• System calls for
  - Process control:
    fork: create a new process
    wait: allow a parent process to synchronize its execution with the exit of a child process
    exec: invoke a new program
    exit: terminate process execution
  - File system:
    file: open, read, write, lseek, close
    inode: chdir, chown, chmod, stat, fstat
    others: pipe, dup, mount, umount, link, unlink
System call: fork()
• fork: the only way for a user to create a process in the Unix operating system.
• The process that invokes fork is called the parent process, and the newly created process is called the child process.
• The syntax of the fork system call:
newpid = fork();
• On return from the fork system call, the two processes have identical copies of their user-level context except for the return value newpid.
• In the parent process, newpid = the child's process id.
• In the child process, newpid = 0.
/* forkEx1.c */
#include <stdio.h>
#include <unistd.h>     /* fork() */

int main(void)
{
    int fpid;

    printf("Before forking ...\n");
    fpid = fork();
    if (fpid == 0) {
        printf("Child Process fpid=%d\n", fpid);
    } else {
        printf("Parent Process fpid=%d\n", fpid);
    }
    /* Both the parent and the child reach this line. */
    printf("After forking fpid=%d\n", fpid);
    return 0;
}

$ cc forkEx1.c -o forkEx1
$ forkEx1
Before forking ...
Child Process fpid=0
After forking fpid=0
Parent Process fpid=14707
After forking fpid=14707
$
/* forkEx2.c */
#include <stdio.h>
#include <stdlib.h>     /* system() */
#include <unistd.h>     /* fork() */

int main(void)
{
    int fpid;

    printf("Before forking ...\n");
    system("ps");
    fpid = fork();
    system("ps");       /* executed by both the parent and the child */
    printf("After forking fpid=%d\n", fpid);
    return 0;
}

$ forkEx2
Before forking ...
PID TTY TIME CMD
14759 pts/9 0:00 tcsh
14778 pts/9 0:00 sh
14777 pts/9 0:00 forkEx2
PID TTY TIME CMD
14781 pts/9 0:00 sh
14759 pts/9 0:00 tcsh
14782 pts/9 0:00 sh
14780 pts/9 0:00 forkEx2
14777 pts/9 0:00 forkEx2
After forking fpid=14780
$ PID TTY TIME CMD
14781 pts/9 0:00 sh
14759 pts/9 0:00 tcsh
14780 pts/9 0:00 forkEx2
After forking fpid=0

$ ps
PID TTY TIME CMD
14759 pts/9 0:00 tcsh
$
System Call: getpid() getppid()
• Each process has a unique process id (PID).
• PID is an integer, typically in the range 0 through 30000.
• Kernel assigns the PID when a new process is created.
• Processes can obtain their PID by calling getpid().
• Each process has a parent process and a corresponding
parent process ID.
• Processes can obtain their parent’s PID by calling
getppid().

/* pid.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>     /* getpid(), getppid() */

int main(void)
{
    printf("pid=%d ppid=%d\n", (int)getpid(), (int)getppid());
    return 0;
}
$ cc pid.c -o pid
$ pid
pid=14935 ppid=14759
$
/* forkEx3.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    int fpid;

    printf("Before forking ...\n");
    fpid = fork();
    if (fpid == 0) {
        printf("Child Process fpid=%d pid=%d ppid=%d\n",
               fpid, (int)getpid(), (int)getppid());
    } else {
        printf("Parent Process fpid=%d pid=%d ppid=%d\n",
               fpid, (int)getpid(), (int)getppid());
    }
    printf("After forking fpid=%d pid=%d ppid=%d\n",
           fpid, (int)getpid(), (int)getppid());
    return 0;
}
$ cc forkEx3.c -o forkEx3
$ forkEx3
Before forking ...
Parent Process fpid=14942 pid=14941 ppid=14759
After forking fpid=14942 pid=14941 ppid=14759
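$ Child Process fpid=0 pid=14942 ppid=1
After forking fpid=0 pid=14942 ppid=1

$ ps
PID TTY TIME CMD
14759 pts/9 0:00 tcsh

• The child reports ppid=1 because its parent had already exited; the orphaned child was adopted by init (process 1).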
System Call: wait()
• wait system call allows a parent process to wait
for the demise of a child process.
• See forkEx4.c

/* forkEx4.c */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>   /* wait() */
#include <unistd.h>

int main(void)
{
    int fpid, status;

    printf("Before forking ...\n");
    fpid = fork();
    if (fpid == 0) {
        printf("Child Process fpid=%d pid=%d ppid=%d\n",
               fpid, (int)getpid(), (int)getppid());
    } else {
        printf("Parent Process fpid=%d pid=%d ppid=%d\n",
               fpid, (int)getpid(), (int)getppid());
    }
    /* Parent: blocks here until the child exits.
       Child: has no children, so wait() returns -1 at once. */
    wait(&status);
    printf("After forking fpid=%d pid=%d ppid=%d\n",
           fpid, (int)getpid(), (int)getppid());
    return 0;
}
$ cc forkEx4.c -o forkEx4
$ forkEx4
Before forking ...
Parent Process fpid=14980 pid=14979 ppid=14759
Child Process fpid=0 pid=14980 ppid=14979
After forking fpid=0 pid=14980 ppid=14979
After forking fpid=14980 pid=14979 ppid=14759
$

System Call: exec()
• The exec() system call invokes another program by replacing the image of the current process.
• No new process table entry is created for the exec'ed program, so the total number of processes in the system does not change.
• Six different exec functions: execlp, execvp, execl, execv, execle, execve (see the man page for more detail).
• The exec system call allows a process to choose its successor.
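/* execEx1.c */
#include <stdio.h>
#include <unistd.h>     /* execl() */

int main(void)
{
    printf("Before execing ...\n");
    /* The argument list must end with a null pointer. */
    execl("/bin/date", "date", (char *)NULL);
    /* Reached only if execl() fails. */
    printf("After exec\n");
    return 1;
}

$ execEx1
Before execing ...
Sun May 9 16:39:17 CST 1999
$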
/* execEx2.c */
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    int fpid;

    printf("Before execing ...\n");
    fpid = fork();
    if (fpid == 0) {
        /* child: replace this process image with /bin/date */
        execl("/bin/date", "date", (char *)NULL);
    }
    /* parent (or the child, if execl() failed) */
    printf("After exec and fpid=%d\n", fpid);
    return 0;
}

$ execEx2
Before execing ...
After exec and fpid=14903
$ Sun May 9 16:47:08 CST 1999
$
Handling Signal
• A signal is a short message delivered to a process.
• Signals are sometimes called "software interrupts".
• Signals usually occur asynchronously.
• Signals can be sent
A. by one process to another (or to itself)
B. by the kernel to a process.
• Unix signals are content-free: the only thing that can be said about a signal is whether it has arrived or not.

Handling Signal
• Most signals have predefined meanings:
A. SIGHUP (hangup): when a terminal is closed, the hangup signal is sent to every process that has it as its controlling terminal.
B. SIGINT (interrupt): politely asks a process to terminate.
C. SIGQUIT (quit): asks a process to terminate and produce a core dump.
D. SIGKILL (kill): forces a process to terminate.
• See sigEx1.c

/* sigEx1.c */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>   /* wait() */
#include <unistd.h>

int main(void)
{
    int fpid, status;   /* an int, not an uninitialized int pointer */

    printf("Before forking ...\n");
    fpid = fork();
    if (fpid == 0) {
        printf("Child Process fpid=%d pid=%d ppid=%d\n",
               fpid, (int)getpid(), (int)getppid());
        for (;;)
            ;           /* loop forever */
    } else {
        printf("Parent Process fpid=%d pid=%d ppid=%d\n",
               fpid, (int)getpid(), (int)getppid());
    }
    wait(&status);      /* wait for the child process */
    printf("After forking fpid=%d pid=%d ppid=%d\n",
           fpid, (int)getpid(), (int)getppid());
    return 0;
}
$ cc sigEx1.c -o sigEx1
$ sigEx1 &
Before forking ...
Parent Process fpid=14989 pid=14988 ppid=14759
Child Process fpid=0 pid=14989 ppid=14988
$ ps
PID TTY TIME CMD
14988 pts/9 0:00 sigEx1
14759 pts/9 0:01 tcsh
14989 pts/9 0:09 sigEx1
$ kill -9 14989
$ ps
...
Scheduling Processes
• On a time-sharing system, the kernel allocates the CPU to a process for a period of time (a time slice or time quantum), preempts the process and schedules another one when the time slice expires, and reschedules the process to continue execution at a later time.
• The scheduler uses round-robin with multilevel feedback to choose which process to execute:
A. The kernel allocates the CPU to a process for a time slice.
B. It preempts a process that exceeds its time slice.
C. It feeds the process back into one of several priority queues.
Process Priority
Kernel mode priority levels (processes):
  - swapper
  - wait for disk I/O
  - wait for buffer
  - wait for inode
  - ...
  - wait for child exit
User mode priority levels:
  - user level 0
  - user level 1
  - ...
  - user level n
Process Scheduling
(Unix System V)
• There are 3 processes A, B, C under the following assumptions:
A. They are created simultaneously with initial priority 60.
B. The clock interrupts the system 60 times per second.
C. These processes make no system calls.
D. No other processes are ready to run.
E. CPU usage calculation: CPU = decay(CPU) = CPU/2
F. Process priority calculation: priority = CPU/2 + 60
G. The rescheduling calculation is done once per second.
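Time   Process A          Process B          Process C
(sec)  Priority  CPU      Priority  CPU      Priority  CPU
0      60        0        60        0        60        0
1      75        30       60        0        60        0
2      67        15       75        30       60        0
3      63        7        67        15       75        30
4      76        33       63        7        67        15
...
(Within each second the CPU count of the running process climbs by 60, reaching 60 in the first three seconds and 67 for process A in the fourth, before the decay is applied.)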
Booting
• When the computer is powered on or rebooted, a short built-in program (possibly stored in ROM) reads the first block or two of the disk into memory. These blocks contain a loader program, which was placed on the disk when the disk was formatted.
• The loader is started. It searches the root directory for /unix or /root/unix and loads the file into memory.
• The kernel starts to execute.
The first processes
• The kernel initializes its internal data structures: it constructs linked lists of free inodes, regions, and page tables.
• The kernel creates the u area and initializes slot 0 of the process table.
• Process 0 is created.
• Process 0 forks, invoking the fork algorithm directly from the kernel. Process 1 is created.
• In kernel mode, Process 1 creates its user-level context (regions) and copies code (/etc/init) into the new region.
• Process 1 calls exec (executes init).
init process
• The init process is a process dispatcher: it spawns the processes that allow users to log in.
• Init reads /etc/inittab and spawns getty.
• When a user logs in successfully, getty goes through a login procedure and execs a login shell.
• Init executes the wait system call, monitoring the death of its child processes and the death of processes orphaned by an exiting parent.

[Figure: the login cycle]
1. Init forks and execs a getty program to manage the line.
2. Getty prints the "login:" message and waits for someone to log in.
3. The login process prints the password message, reads the password, then checks the password.
4. The shell runs programs for the user until the user logs off.
5. When the shell dies, init wakes up and forks/execs a getty for the line.
File Subsystem
• A file system is a collection of files and directories on a disk or tape in standard UNIX file system format.
• Each UNIX file system contains four major parts:
A. boot block
B. superblock
C. i-node table
D. data blocks: file storage

File System Layout

Block 0: bootstrap
Block 1: superblock
Blocks 2 to n: i-nodes
Blocks n+1 to last: files
Boot Block
• A boot block may contain several physical blocks.
• Note that a physical block contains 512 bytes (or 1 KB or 2 KB).
• A boot block contains a short loader program for booting.
• On file systems that are not used for booting, the boot block is blank.

Superblock
• The superblock contains key information about a file system.
• Superblock information:
A. Size of the file system and status:
label: name of this file system
size: the number of logical blocks
date: the last modification date of the superblock
B. Information about i-nodes:
the number of i-nodes
the number of free i-nodes
C. Information about data blocks: free data blocks
• The information in the superblock is loaded into memory.
I-nodes
• i-node: index node (information node)
• i-list: the list of i-nodes
• i-number: the index of an i-node in the i-list
• The size of an i-node: 64 bytes
• i-node 0 is reserved.
• i-node 1 is the root directory.
• i-node structure: next page

I-node structure
[Figure: fields of an i-node: mode, owner, timestamp, size, reference count, block count, direct blocks 0-9, single indirect, double indirect, triple indirect. Each direct block entry points to a data block. The single indirect entry points to an indirect block that points to data blocks; the double and triple indirect entries reach data blocks through two and three levels of indirect blocks.]
I-node structure
• mode:
A. type: file, directory, pipe, symbolic link
B. access: read/write/execute (owner, group, others)
• owner: who owns this i-node (file, directory, ...)
• timestamp: creation, modification, access time
• size: the number of bytes
• block count: the number of data blocks
• direct blocks: pointers to the data blocks
• single indirect: pointer to a block that holds pointers to data blocks (128 data blocks)
• double indirect: (128*128 = 16384 data blocks)
• triple indirect: (128*128*128 data blocks)
Data Block
• A data block has 512 bytes.
A. Some file systems have 1K or 2K bytes per block.
B. See block size effect (next page)
• A data block may contain the data of a file or the data of a directory.
• File: a stream of bytes.
• Directory format:
i-# | Next | size | File name | pad

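[Figure: directory tree. home contains alex, jenny, and john; alex contains Report.txt, bin, and notes; bin contains grep and find.]
Directory entries shown in the figure:
i-# | Next | 10 | Report.txt | pad
i-# | Next | 3 | bin | pad
i-# | Next | 5 | notes | pad
0 | Next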


[Figure: a directory tree (home; alex, kc; Report.txt, source, notes; grep, find) shown alongside the disk layout (boot block, superblock, i-nodes, and data blocks holding notes, source, Report.txt, and the current directory), the in-core inodes, the u area with its current directory inode, and the device driver and hardware control layer.]
In-core inode table
• The UNIX system keeps regular files and directories on block devices such as disks or tapes.
• Such disk space is called the physical device address space.
• The kernel deals with the file system on a logical level (logical device address space) rather than with disks.
• The disk driver translates logical addresses into physical device addresses.
• The in-core (memory-resident) inode table stores inode information in kernel space.

In-core inode table
• An in-core inode contains
A. all the information of the inode on disk
B. the status of the in-core inode:
inode is locked
inode data changed
file data changed
C. the logical device number of the file system
D. the inode number
E. a reference count

File table
• The kernel has a global data structure, called the file table, to store information about file access.
• Each entry in the file table contains:
A. a pointer to an in-core inode table entry
B. the offset of the next read or write in the file
C. the access rights (r/w) allowed to the opening process
D. a reference count

User File Descriptor table
• Each process has a user file descriptor table to identify all of its open files.
• An entry in the user file descriptor table points to an entry of the kernel's global file table.
• Entry 0: standard input
• Entry 1: standard output
• Entry 2: standard error

System Call: open
• open: a process may open an existing file for reading or writing.
• syntax:
fd = open(pathname, mode);
A. pathname: the name of the file to be opened
B. mode: read/write (for example O_RDONLY, O_WRONLY, O_RDWR)
• Example

/* openEx1.c */
#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>      /* open(), O_RDONLY, O_WRONLY */

int main(void)
{
    int fd1, fd2, fd3;

    printf("Before open ...\n");
    fd1 = open("/etc/passwd", O_RDONLY);
    fd2 = open("./openEx1.c", O_WRONLY);
    fd3 = open("/etc/passwd", O_RDONLY);
    /* Descriptors 0, 1 and 2 are already in use, so the first free slot is 3. */
    printf("fd1=%d fd2=%d fd3=%d \n", fd1, fd2, fd3);
    return 0;
}

$ cc openEx1.c -o openEx1
$ openEx1
Before open ...
fd1=3 fd2=4 fd3=5
$

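[Figure: data structures after the three opens. The u area holds a pointer to the user file descriptor table (entries 0-7, ...). Descriptors 3 and 5 each point to their own file table entry (CNT=1, R), and both entries point to the same in-core inode for /etc/passwd (CNT=2). Descriptor 4 points to a file table entry (CNT=1, W) whose in-core inode (CNT=1) is labeled ./openEx2.c in the figure.]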
System Call: read
• read: a process may read from an opened file.
• syntax:
n = read(fd, buffer, count);
A. fd: file descriptor
B. buffer: where the data read is stored
C. count: the number of bytes to read
D. n: the number of bytes actually read
• Example

/* openEx2.c */
#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>     /* read() */

int main(void)
{
    int fd1;
    char buf1[20], buf2[20];

    buf1[19] = '\0';
    buf2[19] = '\0';
    printf("=======\n");
    fd1 = open("/etc/passwd", O_RDONLY);
    read(fd1, buf1, 19);
    printf("fd1=%d buf1=%s \n", fd1, buf1);
    read(fd1, buf2, 19);    /* continues where the first read stopped */
    printf("fd1=%d buf2=%s \n", fd1, buf2);
    printf("=======\n");
    return 0;
}

$ cc openEx2.c -o openEx2
$ openEx2
=======
fd1=3 buf1=root:x:0:1:Super-Us
fd1=3 buf2=er:/:/sbin/sh
daemo
=======
$
/* openEx3.c */
#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>     /* read() */

int main(void)
{
    int fd1, fd2;
    char buf1[20], buf2[20];

    buf1[19] = '\0';
    buf2[19] = '\0';
    printf("======\n");
    fd1 = open("/etc/passwd", O_RDONLY);
    fd2 = open("/etc/passwd", O_RDONLY);
    read(fd1, buf1, 19);
    printf("fd1=%d buf1=%s \n", fd1, buf1);
    read(fd2, buf2, 19);    /* separate open: its own offset, starts at 0 */
    printf("fd2=%d buf2=%s \n", fd2, buf2);
    printf("======\n");
    return 0;
}

$ cc openEx3.c -o openEx3
$ openEx3
======
fd1=3 buf1=root:x:0:1:Super-Us
fd2=4 buf2=root:x:0:1:Super-Us
======
$
[Figure: data structures for openEx3. Descriptors 3 and 4 in the user file descriptor table each point to their own file table entry (CNT=1, R); both file table entries point to the same in-core inode for /etc/passwd (CNT=2).]
System Call: dup
• dup: copies a file descriptor into the first free slot of the user file descriptor table.
• syntax:
newfd = dup(fd);
A. fd: file descriptor
• Example

/* openEx4.c */
#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>     /* read(), dup() */

int main(void)
{
    int fd1, fd2;
    char buf1[20], buf2[20];

    buf1[19] = '\0';
    buf2[19] = '\0';
    printf("======\n");
    fd1 = open("/etc/passwd", O_RDONLY);
    fd2 = dup(fd1);         /* fd2 shares fd1's file table entry */
    read(fd1, buf1, 19);
    printf("fd1=%d buf1=%s \n", fd1, buf1);
    read(fd2, buf2, 19);    /* shared offset: continues after the first read */
    printf("fd2=%d buf2=%s \n", fd2, buf2);
    printf("======\n");
    return 0;
}

$ cc openEx4.c -o openEx4
$ openEx4
======
fd1=3 buf1=root:x:0:1:Super-Us
fd2=4 buf2=er:/:/sbin/sh
daemo
======
$
[Figure: data structures for openEx4. Descriptors 3 and 4 in the user file descriptor table point to the same file table entry (CNT=2, R), which points to the in-core inode for /etc/passwd (CNT=1).]
System Call: creat
• creat: a process may create a new file with the creat system call.
• syntax:
fd = creat(pathname, mode);
A. pathname: file name
B. mode: access permissions of the new file (for example 0644)
• Example: see creatEx1.c

System Call: close
• close: a process may close a file with the close system call.
• syntax:
close(fd);
A. fd: file descriptor
• Example: see creatEx1.c

System Call: write
• write: a process may write data to an opened file.
• syntax:
n = write(fd, buffer, count);
A. fd: file descriptor
B. buffer: the data to be written
C. count: the number of bytes to write
D. n: the number of bytes actually written
• Example

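/* creatEx1.c */
#include <stdio.h>
#include <string.h>     /* strlen() */
#include <sys/types.h>
#include <sys/stat.h>   /* chmod() */
#include <fcntl.h>      /* creat() */
#include <unistd.h>     /* write(), close() */

int main(void)
{
    int fd1;
    char *buf1 = "I am a string\n";
    char *buf2 = "second line\n";

    printf("======\n");
    /* The second argument of creat() is a permission mode, not an open flag. */
    fd1 = creat("./testCreat.txt", 0666);
    /* Write exactly the string lengths; larger counts would read past the strings. */
    write(fd1, buf1, strlen(buf1));
    write(fd1, buf2, strlen(buf2));
    printf("fd1=%d buf1=%s \n", fd1, buf1);
    close(fd1);
    chmod("./testCreat.txt", 0666);
    printf("======\n");
    return 0;
}
$ cc creatEx1.c -o creatEx1
$ creatEx1
======
fd1=3 buf1=I am a string

======
$ ls -l testCreat.txt
-rw-rw-rw- 1 cheng staff 26 May 10 20:37 testCreat.txt
$ more testCreat.txt
...
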
System Call: stat/fstat
• stat/fstat: a process may query the status of a file: file type, file owner, access permissions, file size, number of links, inode number, access time.
• syntax:
stat(pathname, statbuffer); fstat(fd, statbuffer);
A. pathname: file name
B. statbuffer: the buffer that receives the status data
C. fd: file descriptor
• Example

/* statEx1.c */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>   /* fstat(), struct stat */
#include <fcntl.h>      /* open() */

int main(void)
{
    int fd1, fd2;
    struct stat bufStat1, bufStat2;

    printf("======\n");
    fd1 = open("/etc/passwd", O_RDONLY);
    fd2 = open("./statEx1", O_RDONLY);
    fstat(fd1, &bufStat1);
    fstat(fd2, &bufStat2);
    /* Cast to long: the exact integer types of these fields vary by system. */
    printf("fd1=%d inode no=%ld block size=%ld blocks=%ld\n",
           fd1, (long)bufStat1.st_ino, (long)bufStat1.st_blksize,
           (long)bufStat1.st_blocks);
    printf("fd2=%d inode no=%ld block size=%ld blocks=%ld\n",
           fd2, (long)bufStat2.st_ino, (long)bufStat2.st_blksize,
           (long)bufStat2.st_blocks);
    printf("======\n");
    return 0;
}
$ cc statEx1.c -o statEx1
$ statEx1
======
fd1=3 inode no=21954 block size=8192 blocks=6
fd2=4 inode no=190611 block size=8192 blocks=
======
...

System Call: link/unlink
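• link: creates a hard link, that is, another name for an existing file.
• unlink: removes a name; the file itself is removed once no names (and no open descriptors) refer to it.
• syntax:
link(sourceFile, targetFile); unlink(file);
A. sourceFile, targetFile, file: file names
• Lab exercise: write a C program that uses the link/unlink system calls. Use ls -l to see the reference count.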

System Call: chdir
• chdir: a process may change its current directory.
• syntax:
chdir(pathname);
A. pathname: directory name
• Example

#include <stdio.h>
#include <stdlib.h>     /* system() */
#include <unistd.h>     /* chdir() */

int main(void)
{
    chdir("/usr/bin");
    system("ls -l");    /* lists /usr/bin, the new current directory */
    return 0;
}

$ ls -l /usr/bin
$

End of
System Kernel Lecture

