1.0
INTRODUCTION
In computing, a process is an instance of a computer program, consisting of one or more threads, that is
being sequentially executed by a computer system that has the ability to run several computer programs
concurrently. Process migration, on the other hand, is the act of transferring a process between two
machines; it enables dynamic load distribution, fault resilience, eased system administration, and data
access locality. A thread is a single sequential stream of execution within a process, and because threads
have some of the properties of processes, they are sometimes called lightweight processes. In the course
of this term paper, we shall study in detail what each of these terms means and how they inter-relate.
We shall also study the management of processes, their representation, and inter-process communication.
While we examine process migration, we shall take a look at the goals of process migration and the process
migration algorithm. Under threading we will look at the advantages and uses of threading, the types of
threads, and the reasons for threading. Since a thread is a single sequential stream within a process and has
some similarities with processes, we shall compare threads with processes: examine their differences,
scrutinize their similarities, and weigh the advantages and disadvantages of threads over multiple processes.
In wrapping up the term paper, we shall enumerate the implementations of threads and draw our
conclusion based on our research.
2.0
PROCESS
In computing, a process is an instance of a computer program, consisting of one or more threads, that
is being sequentially executed by a computer system that has the ability to run several computer
programs concurrently. A computer program itself is just a passive collection of instructions, while a
process is the actual execution of those instructions. Several processes may be associated with the same
program; for example, opening up several instances of the same program often means more than one
process is being executed. In the computing world, processes are formally defined by the operating
system (OS) running them and so may differ in detail from one OS to another. A single computer
processor executes one or more instructions at a time (per clock cycle), one after the other
(this is a simplification; for the full story, see superscalar CPU architecture). To allow users to run
several programs at once (e.g., so that processor time is not wasted waiting for input from a resource),
single-processor computer systems can perform time-sharing. Time-sharing allows processes to switch
between being executed and waiting (to continue) to be executed. In most cases this is done very rapidly,
providing the illusion that several processes are executing 'at once'. This is known as concurrency or
multiprogramming.
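As a simple illustration of several processes arising from one program, the sketch below (in C, on a POSIX system; it is our own illustration and not drawn from any of the cited sources) duplicates the running process with fork(), after which the operating system schedules the parent and the child independently.

/* Minimal sketch: one program giving rise to two processes via fork().
 * Each process runs the same code but is scheduled and accounted for
 * separately by the operating system. */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();              /* duplicate the calling process */
    if (pid == 0) {
        printf("child  process, pid=%d\n", (int)getpid());
    } else if (pid > 0) {
        printf("parent process, pid=%d\n", (int)getpid());
        wait(NULL);                  /* reap the child */
    } else {
        perror("fork");
        return 1;
    }
    return 0;
}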
2.3
Process management
Multiprogramming systems explicitly allow multiple processes to exist at any given time, where
only one is using the CPU at any given moment, while the remaining processes are performing I/O or
are waiting.
The process manager is one of the four major parts of the operating system. It implements the process
abstraction. It does this by creating a model for the way the process uses CPU and any system resources.
Much of the complexity of the operating system stems from the need for multiple processes to share the
hardware at the same time. As a consequence of this goal, the process manager implements CPU sharing
(called scheduling), process synchronization mechanisms, and a deadlock strategy. In addition,
the process manager implements part of the operating system's protection and security.
2.4
Inter-process communication
When processes communicate with each other it is called "Inter-process communication" (IPC).
Processes frequently need to communicate; for instance, in a shell pipeline the output of the first process
must be passed to the second one, and so on down the line. It is preferable that this communication happen
in a well-structured way, without using interrupts.
It is even possible for the two processes to be running on different machines. The operating system (OS)
may even differ from one process to the other, so some mediators (called protocols) are needed.
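As a hedged sketch of such communication, the following C fragment uses a POSIX pipe, roughly what a shell sets up for a pipeline: the parent writes into the pipe and the child reads from it. The message text and buffer sizes are illustrative only.

/* Sketch of IPC through a pipe, as in "producer | consumer". */
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int fd[2];
    if (pipe(fd) == -1) { perror("pipe"); return 1; }

    if (fork() == 0) {               /* child: the "second" process */
        close(fd[1]);                /* keep only the read end */
        char buf[64];
        ssize_t n = read(fd[0], buf, sizeof buf - 1);
        if (n > 0) { buf[n] = '\0'; printf("received: %s\n", buf); }
        close(fd[0]);
        return 0;
    }
    close(fd[0]);                    /* parent: keep only the write end */
    const char *msg = "output of the first process";
    write(fd[1], msg, strlen(msg));
    close(fd[1]);
    wait(NULL);
    return 0;
}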
2.5
Process Control Block
If the OS supports multiprogramming, then it needs to keep track of all the processes. For each process, a
process control block (PCB) is used to track the process's execution status, including the following:
The current contents of its processor registers
Its processor state (whether it is blocked or ready)
Its memory state
A pointer to its stack
Which resources have been allocated to it
Which resources it needs
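The fields listed above can be pictured as a record kept by the kernel for every process. The C structure below is purely illustrative: the field names are invented for this paper, and a real PCB (for example Linux's task_struct) is far more elaborate.

/* Illustrative only: a toy process control block mirroring the fields
 * listed above.  Field names are invented for this sketch. */
enum proc_state { READY, RUNNING, BLOCKED };

struct pcb {
    int              pid;            /* process identifier             */
    enum proc_state  state;          /* blocked, ready or running      */
    unsigned long    registers[16];  /* saved processor registers      */
    void            *stack_ptr;      /* pointer to the process stack   */
    void            *page_table;     /* memory state                   */
    int              allocated[16];  /* resources already allocated    */
    int              requested[16];  /* resources it is waiting for    */
};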
3.1
Goals of Process Migration
The goals of process migration are closely tied to the type of applications that use migration,
as described in the next section. The goals of process migration include:
Accessing more processing power is a goal of migration when it is used for load
distribution. Migration is particularly important in receiver-initiated distributed
scheduling algorithms, where a lightly loaded node announces its availability and initiates
process migration from an overloaded node. This was the goal of many systems described
in the literature, such as Locus [Walker et al., 1983], MOSIX [Barak and Shiloh, 1985], and
Mach [Milojicic et al., 1993a]. A variation of this goal is harnessing the computing power
of temporarily free workstations in large clusters. In this case, process migration is used
to evict processes upon the owner's return, as in the case of Sprite.
Exploitation of resource locality is a goal of migration in cases when it is more efficient to
access resources locally than remotely. Moving a process to the other end of a
communication channel transforms remote communication into local communication and thereby
significantly improves performance. It is also possible that the resource is not remotely
accessible, as in the case when there are different semantics for local and remote
accesses. Examples include work by Jul [1989], Milojicic et al. [1993], and Miller and
Presotto [1981].
Resource sharing is enabled by migration to a specific node with a special hardware
device, large amounts of free memory, or some other unique resource. Examples include
NOW [Anderson et al., 1995] for utilizing memory of remote nodes, and the use of
parallel make in Sprite [Douglis and Ousterhout, 1991] and work by Skordos [1995] for
utilizing unused workstations.
Fault resilience is improved by migration from a partially failed node, or in the case of
long-running applications when failures of different kinds (network, devices) are
probable [Chu et al., 1980]. In this context, migration can be used in combination with
checkpointing, such as in Condor [Litzkow and Solomon, 1992] or Utopia [Zhou et al.,
1994]. Large-scale systems where there is a likelihood that some of the systems can fail
can also benefit from migration, such as in Hive [Chapin95] and OSF/1 AD TNC
[Zajc93].
3.2
Process Migration Algorithm
6. State is transferred and imported into a new instance on the remote node. Not all of the
state needs to be transferred; some of the state could be lazily brought over after migration is
completed.
7. Some means of forwarding references to the migrated process must be maintained. This is
required in order to communicate with the process or to control it. It can be achieved by
registering the current location at the home node (e.g. in Sprite), by searching for the migrated
process (e.g. in the V Kernel, at the communication protocol level), or by forwarding messages
across all visited nodes (e.g. in Charlotte). This step also enables migrated communication
channels at the destination and it ends step 3 as communication is permanently redirected.
8. The new instance is resumed when sufficient state has been transferred and imported. With
this step, process migration completes. Once all of the state has been transferred from the
original instance, it may be deleted on the source node.
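As a drastically simplified sketch of step 6, the C fragment below checkpoints a toy process state to a file on the source node and imports it again on the destination node. Everything in it (the struct, file name, and functions) is invented for illustration; real migration systems such as Sprite or Condor transfer registers, the address space, and open communication channels.

/* Toy sketch of "state is transferred and imported" (step 6): here the
 * state is a single struct written out by the source instance and read
 * back by the new instance, which then resumes (step 8). */
#include <stdio.h>

struct toy_state { long counter; double partial_result; };

int export_state(const char *path, const struct toy_state *s) {
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t ok = fwrite(s, sizeof *s, 1, f);
    fclose(f);
    return ok == 1 ? 0 : -1;
}

int import_state(const char *path, struct toy_state *s) {
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    size_t ok = fread(s, sizeof *s, 1, f);
    fclose(f);
    return ok == 1 ? 0 : -1;
}

int main(void) {
    struct toy_state s = { 1000, 3.14 }, restored;
    export_state("ckpt.bin", &s);          /* done on the source node     */
    import_state("ckpt.bin", &restored);   /* done on the destination node */
    printf("restored counter = %ld\n", restored.counter);
    return 0;
}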
4.0 Threading
A thread of execution results from a fork of a computer program into two or more concurrently
running tasks. The implementation of threads and processes differs from one operating system to
another, but in most cases, a thread is contained inside a process. Multiple threads can exist within the
same process and share resources such as memory, while different processes do not share these
resources.
4.1
Multithreading
Advantages of User-Level Threads
Each thread is represented simply by a program counter, registers, a stack and a small control block, all stored
in the user process's address space.
Simple Management:
This simply means that creating a thread, switching between threads and synchronization
between threads can all be done without intervention of the kernel.
Fast and Efficient:
Thread switching is not much more expensive than a procedure call.
Disadvantages of User-Level Threads
There is a lack of coordination between threads and the operating system kernel. Therefore,
the process as a whole gets one time slice irrespective of whether it contains one thread or
1000 threads. It is up to each thread to relinquish control to the other threads.
User-level threads require non-blocking system calls, i.e., a multithreaded kernel.
Otherwise, the entire process will be blocked in the kernel, even if there are runnable threads left
in the process. For example, if one thread causes a page fault, the whole process blocks.
User threads: Threads are sometimes implemented in userspace libraries and are thus called user
threads. The kernel is not aware of them; they are managed and scheduled in userspace. Some
implementations base their user threads on top of several kernel threads to benefit from
multiprocessor machines (the N:M model). In this paper the term "thread" (without a kernel or user
qualifier) defaults to referring to kernel threads. User threads as implemented by virtual
machines are also called green threads. User threads are generally fast to create and manage.
Advantages of Kernel-Level Threads
Because the kernel has full knowledge of all threads, the scheduler may decide to give more time
to a process having a large number of threads than to a process having a small number of threads.
Kernel-level threads are especially good for applications that frequently block.
Disadvantages of Kernel-Level Threads
Kernel-level threads are slow and inefficient. For instance, thread operations are
hundreds of times slower than those of user-level threads.
Since the kernel must manage and schedule threads as well as processes, it requires a full thread
control block (TCB) for each thread to maintain information about the thread. As a result
there is significant overhead and increased kernel complexity.
Fibers are an even lighter unit of scheduling and are cooperatively scheduled: a running fiber
must explicitly "yield" to allow another fiber to run, which makes their implementation much
easier than kernel or user threads. A fiber can be scheduled to run in any thread in the same
process. This permits applications to gain performance improvements by managing scheduling
themselves, instead of relying on the kernel scheduler (which may not be tuned for the
application). Parallel programming environments such as OpenMP typically implement their
tasks through fibers.
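The cooperative, explicit-yield behaviour described above can be sketched with the POSIX ucontext(3) interface, as in the minimal C example below. This is only an illustration of the idea; it is not how OpenMP tasks or Win32 fibers are actually implemented.

/* Minimal cooperative-scheduling sketch using ucontext(3): a "fiber"
 * that must explicitly hand control back to the main context. */
#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, fiber_ctx;

static void fiber_body(void) {
    printf("fiber: step 1\n");
    swapcontext(&fiber_ctx, &main_ctx);   /* explicit yield back to main */
    printf("fiber: step 2\n");
}                                          /* returning resumes uc_link (main) */

int main(void) {
    char stack[64 * 1024];                 /* the fiber's private stack */

    getcontext(&fiber_ctx);
    fiber_ctx.uc_stack.ss_sp   = stack;
    fiber_ctx.uc_stack.ss_size = sizeof stack;
    fiber_ctx.uc_link          = &main_ctx;
    makecontext(&fiber_ctx, fiber_body, 0);

    printf("main : starting fiber\n");
    swapcontext(&main_ctx, &fiber_ctx);    /* run the fiber until it yields */
    printf("main : fiber yielded, resuming it\n");
    swapcontext(&main_ctx, &fiber_ctx);    /* run it to completion */
    printf("main : done\n");
    return 0;
}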
4.2
Threads in the same process share the same address space. This allows concurrently-running
code to couple tightly and conveniently exchange data without the overhead or complexity of an
IPC. When shared between threads, however, even simple data structures become prone to race
hazards if they require more than one CPU instruction to update: two threads may end up
attempting to update the data structure at the same time and find it unexpectedly changing
underfoot. Bugs caused by race hazards can be very difficult to reproduce and isolate.
To prevent this, threading APIs offer synchronization primitives such as mutexes to lock data
structures against concurrent access. On uniprocessor systems, a thread running into a locked
mutex must sleep and hence trigger a context switch. On multi-processor systems, the thread
may instead poll the mutex in a spinlock. Both of these may sap performance and force
processors in SMP systems to contend for the memory bus, especially if the granularity of the
locking is fine.
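The following short C program sketches the synchronization just described: two POSIX threads increment a shared counter, and a mutex serializes the updates so they cannot race. It is a minimal illustration; compile with -pthread.

/* Two threads updating a shared counter, protected by a mutex. */
#include <stdio.h>
#include <pthread.h>

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    /* only one thread updates at a time */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);   /* always 200000 with the mutex */
    return 0;
}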
4.3
User thread or fiber implementations are typically entirely in userspace. As a result, context
switching between user threads or fibers within the same process is extremely efficient because it
does not require any interaction with the kernel at all: a context switch can be performed by
locally saving the CPU registers used by the currently executing user thread or fiber and then
loading the registers required by the user thread or fiber to be executed. Since scheduling occurs
in userspace, the scheduling policy can be more easily tailored to the requirements of the
program's workload.
However, the use of blocking system calls in user threads or fibers can be problematic. If a user
thread or a fiber performs a system call that blocks, the other user threads and fibers in the
process are unable to run until the system call returns. A typical example of this problem is when
performing I/O: most programs are written to perform I/O synchronously. When an I/O operation
is initiated, a system call is made, and does not return until the I/O operation has been completed.
In the intervening period, the entire process is "blocked" by the kernel and cannot run, which
starves other user threads and fibers in the same process from executing.
A common solution to this problem is providing an I/O API that implements a synchronous
interface by using non-blocking I/O internally, and scheduling another user thread or fiber while
the I/O operation is in progress. Similar solutions can be provided for other blocking system
calls. Alternatively, the program can be written to avoid the use of synchronous I/O or other
blocking system calls.
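A hedged sketch of the building block behind such an I/O layer is given below in C: the file descriptor is switched to non-blocking mode, and whenever a read would block, control is handed to the userspace scheduler. The run_another_fiber() routine is a hypothetical placeholder for whatever scheduling hook a real user-thread library would provide.

/* Fragment only: a "synchronous-looking" read built on non-blocking I/O. */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Placeholder: a real userspace scheduler would switch to another
 * runnable user thread or fiber here. */
static void run_another_fiber(void) { }

ssize_t fiber_read(int fd, void *buf, size_t len) {
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0) return n;             /* data arrived (or end of file)  */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                    /* a real error                    */
        run_another_fiber();              /* would block: let someone else run */
    }
}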
4.4
Reasons for Using Threads
The following are some reasons why we use threads in designing operating systems.
A process with multiple threads makes a great server, for example a printer server.
Because threads can share common data, they do not need to use inter-process
communication.
Because of their very nature, threads can take advantage of multiprocessors.
Threads are cheap in the sense that:
They only need a stack and storage for registers; therefore, threads are cheap to create.
Threads use very few resources of the operating system in which they are working. That is,
threads do not need a new address space, global data, program code or operating system
resources.
Context switches are fast when working with threads, because only the PC, SP and
registers have to be saved and/or restored.
But this cheapness does not come free: the biggest drawback is that there is no protection
between threads.
Because a thread is a single sequential stream within a process and has some similarities with
processes, we shall here examine the similarities and also see how they differ.
Sharing: Threads allow the sharing of many resources that cannot be shared between processes,
for example the code section, the data section, and operating system resources such as open
files.
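The sharing point above can be illustrated with the short C program below (our own sketch, compile with -pthread): a global variable written by one thread is immediately visible to the rest of the process, because all threads share one address space, whereas a child created with fork() would only modify its own private copy.

/* A thread publishing data through the shared data section, no IPC needed. */
#include <stdio.h>
#include <pthread.h>

static int shared_value = 0;              /* lives in the shared data section */

static void *setter(void *arg) {
    shared_value = 42;                    /* visible to every thread in the process */
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, setter, NULL);
    pthread_join(t, NULL);                /* join also orders the memory accesses */
    printf("main sees shared_value = %d\n", shared_value);
    return 0;
}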
6.1
Implementation of Threads
There are many different and incompatible implementations of threading. These include both
kernel-level and user-level implementations. They often follow, more or less closely, the
POSIX Threads interface.
Kernel-level implementation examples
Light Weight Kernel Threads in various BSDs
M:N threading
Native POSIX Thread Library for Linux, an implementation of the POSIX Threads (pthreads)
standard
Apple Multiprocessing Services version 2.0 and later, which uses the built-in nanokernel in Mac OS
8.6 and later, modified to support it
User-level implementation examples
GNU Portable Threads
FSU Pthreads
Apple Inc.'s Thread Manager
REALbasic (includes an API for cooperative threading)
Netscape Portable Runtime (includes a user-space fibers implementation)
Hybrid implementation examples
Scheduler activations used by the NetBSD native POSIX threads library implementation (an
N:M model as opposed to a 1:1 kernel or userspace implementation model)
Marcel from the PM2 project.
The OS for the Tera/Cray MTA
Microsoft Windows 7
Fiber implementation examples
Fibers can be implemented without operating system support, although some operating systems or
libraries provide explicit support for them.
Win32 supplies a fiber API (Windows NT 3.51 SP3 and later)
Ruby
REFERENCES
Charles J. Northrup: Programming with UNIX Threads, John Wiley & Sons, ISBN 0-471-13751-0.
Bill Lewis: Threads Primer: A Guide to Multithreaded Programming, Prentice Hall, ISBN 0-13-443698-9.
Steve Kleiman, Devang Shah, Bart Smaalders: Programming With Threads, SunSoft Press, ISBN 0-13-172389-8.
Pfister, Gregory F. (1998): In Search of Clusters, Upper Saddle River, NJ: Prentice Hall PTR, ISBN 978-0138997090, OCLC 38300954.
Buyya, Rajkumar; Cortes, Toni; Jin, Hai (2001): "Single System Image", International Journal of High Performance Computing Applications 15 (2): 124, doi:10.1177/109434200101500205.
Smith, Jonathan M. (1988): "A Survey of Process Migration Mechanisms", ACM SIGOPS Operating Systems Review 22: 28, doi:10.1145/47671.47673.
CONTENTS
1.0 Introduction
2.0 Process
2.0.1 History of Processes
2.1 Process Representation
2.2 Process State
2.3 Process Management
2.4 Inter-process Communication
2.5 Process Control Block
3.0 Process Migration
3.0.1 Process Checkpointing
3.1 Goals of Process Migration
3.2 Process Migration Algorithm
4.0 Threading
4.1 Multithreading
4.2
4.3
4.4 Reasons for Using Threads
5.0
5.1 Similarities
5.2 Differences
5.3
5.4
6.0
6.1 Implementation of Threads
A
TERM PAPER
ON
PROCESS, PROCESS MIGRATION AND THREADING IN OPERATING SYSTEM
COMPLETED
BY
SUBMITTED TO
MRS.
DARAMOLA
IN
COMPUTER SCIENCE
IN
THE FEDERAL UNIVERSITY OF TECHNOLOGY AKURE,
ONDO STATE
November 2009