
UNIT : IV

SYSTEM
PROGRAMMING
II SEMESTER (MCSE 201)
PREPARED BY ARUN PRATAP SINGH


DISTRIBUTED OPERATING SYSTEM AND ITS DESIGN ISSUES:















































NETWORKING ISSUES :

NETWORK TOPOLOGY :
Network topology is the arrangement of the various elements (links, nodes, etc.) of a computer
network. Essentially, it is the topological structure of a network, and may be depicted physically or
logically. Physical topology refers to the placement of the network's various components,
including device location and cable installation, while logical topology shows how data flows within
a network, regardless of its physical design. Distances between nodes, physical interconnections,
transmission rates, and/or signal types may differ between two networks, yet their topologies may
be identical.
The study of network topology recognizes seven basic topologies:
Point-to-point
Bus
Star
Ring or circular
Mesh
Tree
Hybrid





COMMUNICATION OVER THE NETWORK :








SWITCHING TECHNIQUES IN NETWORK :
In large networks there might be multiple paths linking sender and receiver. Information may be
switched as it travels through various communication channels. There are three typical switching
techniques available for digital traffic.
Circuit Switching
Message Switching
Packet Switching

Circuit Switching :
Circuit switching is a technique that directly connects the sender and the receiver in an
unbroken path.
Telephone switching equipment, for example, establishes a path that connects the caller's
telephone to the receiver's telephone by making a physical connection.
With this type of switching technique, once a connection is established, a dedicated path
exists between both ends until the connection is terminated.
Routing decisions must be made when the circuit is first established, but there are no
decisions made after that time.
Circuit switching in a network operates almost the same way as the telephone system
works.
A complete end-to-end path must exist before communication can take place.
The computer initiating the data transfer must ask for a connection to the destination.
Once the connection has been initiated and completed to the destination device, the
destination device must acknowledge that it is ready and willing to carry on a transfer.

Message Switching :
With message switching there is no need to establish a dedicated path between two
stations.
When a station sends a message, the destination address is appended to the message.
The message is then transmitted through the network, in its entirety, from node to node.
Each node receives the entire message, stores it in its entirety on disk, and then transmits
the message to the next node.
This type of network is called a store-and-forward network.


A message-switching node is typically a general-purpose computer. The device needs sufficient
secondary-storage capacity to store the incoming messages, which could be long. A time delay
is introduced using this type of scheme due to store-and-forward time, plus the time required to
find the next node in the transmission path.
Packet Switching :
Packet switching can be seen as a solution that tries to combine the advantages of
message and circuit switching and to minimize the disadvantages of both.
There are two methods of packet switching: Datagram and virtual circuit.

In both packet switching methods, a message is broken into small parts, called
packets.
Each packet is tagged with appropriate source and destination addresses.
Since packets have a strictly defined maximum length, they can be stored in main
memory instead of disk, therefore access delay and cost are minimized.
Also, the transmission speed between nodes is optimized.
With current technology, packets are generally accepted onto the network on a first-come,
first-served basis. If the network becomes overloaded, packets are delayed or
discarded ("dropped").
Packet Switching: Datagram
Datagram packet switching is similar to message switching in that each packet is a self-
contained unit with complete addressing information attached.
This fact allows packets to take a variety of possible paths through the network.
So the packets, each with the same destination address, do not follow the same route,
and they may arrive out of sequence at the exit point node (or the destination).

Reordering is done at the destination point based on the sequence number of the
packets.
It is possible for a packet to be destroyed if one of the nodes on its way crashes
momentarily; in that case all the packets queued at that node may be lost.

Packet Switching: Virtual Circuit-
In the virtual circuit approach, a preplanned route is established
before any data packets are sent.
A logical connection is established when
a sender sends a "call request packet" to the receiver and
the receiver sends back an acknowledgement packet ("call accepted packet") to the sender if
the receiver agrees on the conversational parameters.
The conversational parameters can be maximum packet sizes, path to be taken, and
other variables necessary to establish and maintain the conversation.
Virtual circuits imply acknowledgements, flow control, and error control, so virtual circuits
are reliable.
That is, they have the capability to inform upper-protocol layers if a transmission problem
occurs.

ROUTING OF NETWORK TRAFFIC :
Routing is the process of selecting the best paths in a network. In the past, the term routing was also
used to mean forwarding network traffic among networks. However, this latter function is much
better described as simply forwarding. Routing is performed for many kinds of networks, including
the telephone network (circuit switching), electronic data networks (such as the Internet),
and transportation networks. This section is concerned primarily with routing in electronic data
networks using packet switching technology.
In packet switching networks, routing directs packet forwarding (the transit of logically
addressed network packets from their source toward their ultimate destination) through
intermediate nodes. Intermediate nodes are typically network hardware devices such
as routers, bridges, gateways, firewalls, or switches. General-purpose computers can also
forward packets and perform routing, though they are not specialized hardware and may suffer
from limited performance. The routing process usually directs forwarding on the basis of routing
tables which maintain a record of the routes to various network destinations. Thus, constructing
routing tables, which are held in the router's memory, is very important for efficient routing. Most

routing algorithms use only one network path at a time. Multipath routing techniques enable the
use of multiple alternative paths.
In case of overlapping/equal routes, the following elements are considered in order to decide
which routes get installed into the routing table (sorted by priority; a sketch follows the list):
1. Prefix-Length: where longer subnet masks are preferred (independent of whether it is
within one routing protocol or across different routing protocols)
2. Metric: where a lower metric/cost is preferred (only valid within one and the same routing
protocol)
3. Administrative distance: where a lower distance is preferred (only valid between different
routing protocols)
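As an illustration, here is a minimal sketch (in Python) of how these criteria can be applied in order. The route table, field names, and sample prefixes are invented for the example and do not come from any particular router's configuration:

import ipaddress

routes = [
    {"prefix": "10.0.0.0/8",  "admin_distance": 120, "metric": 2},   # e.g. learned via RIP
    {"prefix": "10.1.0.0/16", "admin_distance": 110, "metric": 20},  # e.g. learned via OSPF
    {"prefix": "10.1.0.0/16", "admin_distance": 110, "metric": 10},  # better metric, same protocol
]

def best_route(destination, candidates):
    # Longest prefix wins; ties are broken by administrative distance
    # (between protocols) and then by metric (within one protocol).
    dest = ipaddress.ip_address(destination)
    matching = [r for r in candidates
                if dest in ipaddress.ip_network(r["prefix"])]
    if not matching:
        return None
    return min(matching,
               key=lambda r: (-ipaddress.ip_network(r["prefix"]).prefixlen,
                              r["admin_distance"],
                              r["metric"]))

print(best_route("10.1.2.3", routes))   # picks the /16 route with metric 10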
Routing, in a narrower sense of the term, is often contrasted with bridging in its assumption
that network addresses are structured and that similar addresses imply proximity within the
network. Structured addresses allow a single routing table entry to represent the route to a group
of devices. In large networks, structured addressing (routing, in the narrow sense) outperforms
unstructured addressing (bridging). Routing has become the dominant form of addressing on the
Internet. Bridging is still widely used within localized environments.










COMMUNICATION PROTOCOLS :
In telecommunications, a communications protocol is a system of digital rules for data exchange
within or between computers.
Communicating systems use well-defined formats for exchanging messages. Each message has
an exact meaning intended to elicit a response from a range of possible responses pre-
determined for that particular situation. Thus, a protocol must define the syntax, semantics, and
synchronization of communication; the specified behavior is typically independent of how it is to
be implemented. A protocol can therefore be implemented as hardware, software, or both.
Communications protocols have to be agreed upon by the parties involved. To reach agreement
a protocol may be developed into a technical standard. A programming language describes the
same for computations, so there is a close analogy between protocols and programming
languages: protocols are to communications as programming languages are to computations.

Basic requirements of protocols-
Messages are sent and received on communicating systems to establish communications.
Protocols should therefore specify rules governing the transmission. In general, much of the
following should be addressed:
Data formats for data exchange. Digital message bitstrings are exchanged. The bitstrings are
divided into fields, and each field carries information relevant to the protocol. Conceptually the
bitstring is divided into two parts called the header area and the data area. The actual
message is stored in the data area, so the header area contains the fields with more relevance
to the protocol. Bitstrings longer than the maximum transmission unit (MTU) are divided into
pieces of appropriate size. (A toy sketch of such a frame layout follows this list.)
Address formats for data exchange. Addresses are used to identify both the sender and the
intended receiver(s). The addresses are stored in the header area of the bitstrings, allowing
the receivers to determine whether the bitstrings are intended for themselves and should be
processed or should be ignored. A connection between a sender and a receiver can be
identified using an address pair (sender address, receiver address). Usually some address
values have special meanings. An all-1s address could be taken to mean an addressing of
all stations on the network, so sending to this address would result in a broadcast on the local
network. The rules describing the meanings of the address value are collectively called
an addressing scheme.
Address mapping. Sometimes protocols need to map addresses of one scheme onto addresses
of another scheme. For instance, a logical IP address specified by the application may be
translated to an Ethernet hardware address. This is referred to as address mapping.
Routing. When systems are not directly connected, intermediary systems along the route to
the intended receiver(s) need to forward messages on behalf of the sender. On the Internet,
the networks are connected using routers. This way of connecting networks is
called internetworking.
Detection of transmission errors is necessary on networks which cannot guarantee error-free
operation. In a common approach, CRCs of the data area are added to the end of packets,
making it possible for the receiver to detect differences caused by errors. The receiver rejects
the packets on CRC differences and arranges somehow for retransmission.
Acknowledgements of correct reception of packets are required for connection-oriented
communication. Acknowledgements are sent from receivers back to their respective senders.
Loss of information - timeouts and retries. Packets may be lost on the network or suffer from
long delays. To cope with this, under some protocols, a sender may expect an
acknowledgement of correct reception from the receiver within a certain amount of time. On
timeouts, the sender must assume the packet was not received and retransmit it. In case of
a permanently broken link, the retransmission has no effect so the number of retransmissions
is limited. Exceeding the retry limit is considered an error.
Direction of information flow needs to be addressed if transmissions can only occur in one
direction at a time as on half-duplex links. This is known as Media Access Control.
Arrangements have to be made to accommodate the case when two parties want to gain
control at the same time.
Sequence control. We have seen that long bitstrings are divided into pieces, and then sent on
the network individually. The pieces may get lost or delayed or take different routes to their
destination on some types of networks. As a result, pieces may arrive out of sequence.
Retransmissions can result in duplicate pieces. By marking the pieces with sequence
information at the sender, the receiver can determine what was lost or duplicated, ask for the
necessary retransmissions, and reassemble the original message.
Flow control is needed when the sender transmits faster than the receiver or intermediate
network equipment can process the transmissions. Flow control can be implemented by
messaging from receiver to sender.
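The following toy sketch (Python) pulls several of these requirements together: a header area carrying sender and receiver addresses plus a sequence number, fragmentation of long bitstrings to an MTU, and a CRC trailer that lets the receiver reject corrupted frames. The frame layout, field widths, and MTU value are invented for the illustration; a real protocol would also wrap the send in the timeout-and-retry loop described above.

import struct
import zlib

MTU = 16                         # maximum payload bytes per frame (assumed)
HEADER = struct.Struct("!BBH")   # sender address, receiver address, sequence number

def make_frames(src, dst, message):
    # Split the message into MTU-sized pieces; each frame gets a header
    # and a CRC-32 of everything that precedes the trailer.
    frames = []
    for seq, offset in enumerate(range(0, len(message), MTU)):
        body = HEADER.pack(src, dst, seq) + message[offset:offset + MTU]
        frames.append(body + struct.pack("!I", zlib.crc32(body)))
    return frames

def check_frame(frame):
    # Return (sequence, payload) if the CRC matches, else None: the
    # receiver rejects the frame and the sender would retransmit it.
    body, (crc,) = frame[:-4], struct.unpack("!I", frame[-4:])
    if zlib.crc32(body) != crc:
        return None
    src, dst, seq = HEADER.unpack(body[:HEADER.size])
    return seq, body[HEADER.size:]

frames = make_frames(1, 2, b"a message longer than one MTU")
received = sorted(filter(None, map(check_frame, frames)))   # reorder by sequence
print(b"".join(payload for seq, payload in received))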
Getting the data across a network is only part of the problem for a protocol. The data received
has to be evaluated in the context of the progress of the conversation, so a protocol has to specify
rules describing the context. These kinds of rules are said to express the syntax of the
communications. Other rules determine whether the data is meaningful for the context in which
the exchange takes place. These kinds of rules are said to express the semantics of the
communications.




























Common types of protocols :
The Internet Protocol is used in concert with other protocols within the Internet Protocol Suite,
prominent members of which include:
Transmission Control Protocol (TCP)
User Datagram Protocol (UDP)
Internet Control Message Protocol (ICMP)
Hypertext Transfer Protocol (HTTP)
Post Office Protocol (POP3)
File Transfer Protocol (FTP)
Internet Message Access Protocol (IMAP)
Other instances of high level interaction protocols are:
IIOP
RMI
DCOM
DDE
SOAP


MESSAGE PASSING :
Message passing is a concept from computer science that is used extensively in the design and
implementation of modern software applications; it is key to some models of
concurrency and object-oriented programming. Message passing is a way of invoking behavior
through some intermediary service or infrastructure. Rather than directly invoke a process,
subroutine, or function by name as in conventional programming, message passing sends a
message to a process (which may be an actor or object) and relies on the process and the
supporting infrastructure to select and invoke the actual code to run.
Message passing is used ubiquitously in modern computer software. It is used as a way for the
objects that make up a program to work with each other and as a way for objects and systems
running on different computers (e.g., the Internet) to interact. Message passing may be
implemented by various mechanisms, including channels.
Messages are collections of data objects and their structures.
Messages have a header containing system-dependent control information and a
message body that can be of fixed or variable size.

When a process interacts with another, two requirements have to be satisfied:
synchronization and communication.
Fixed-length messages
Easy to implement
Minimize processing and storage overhead
Variable-length messages
Require dynamic memory allocation, so fragmentation could occur

Message passing is a technique for invoking behavior (i.e., running a program) on a computer. In
contrast to the traditional technique of calling a program by name, message passing uses
an object model to distinguish the general function from the specific implementations. The
invoking program sends a message and relies on the object to select and execute the appropriate
code. The justifications for using an intermediate layer essentially fall into two categories:
encapsulation and distribution.
Encapsulation is the idea that software objects should be able to invoke services on other objects
without knowing or caring about how those services are implemented. Encapsulation can reduce
the amount of coding logic and make systems more maintainable. E.g., rather than having IF-THEN
statements that determine which subroutine or function to call, a developer can just send a
message to the object and the object will select the appropriate code based on its type.
Distributed message passing provides developers with a layer of the architecture that provides
common services to build systems made up of sub-systems that run on disparate computers in
different locations and at different times. When sending a message to a distributed object, the
messaging layer can take care of issues such as:
Finding the appropriate object, including objects running on different computers, using
different operating systems and programming languages, at different locations from where
the message originated.
Saving the message on a queue if the appropriate object to handle the message is not
currently running and then invoking the message when the object is available. Also, storing
the result if needed until the sending object is ready to receive it.
Controlling various transactional requirements for distributed transactions, e.g.
ensuring ACID properties on data.

Synchronous versus asynchronous message passing
One of the most important distinctions among message passing systems is whether they use
synchronous or asynchronous message passing. Synchronous message passing occurs between
objects that are running at the same time. With asynchronous message passing it is possible for
the receiving object to be busy or not running when the requesting object sends the message.
Synchronous message passing is what typical object-oriented programming languages such as
Java and Smalltalk use. Asynchronous message passing requires additional capabilities for
storing and retransmitting data for systems that may not run concurrently.
The advantage to synchronous message passing is that it is conceptually less complex.
Synchronous message passing is analogous to a function call in which the message sender is
the function caller and the message receiver is the called function. Function calling is easy and
familiar. Just as the function caller stops until the called function completes, the sending process
stops until the receiving process completes. This alone makes synchronous message passing unworkable
for some applications. For example, if synchronous message passing would be used exclusively,
large, distributed systems generally would not perform well enough to be usable. Such large,
distributed systems may need to continue to operate while some of their subsystems are down;
subsystems may need to go offline for some kind of maintenance, or have times when subsystems
are not open to receiving input from other systems.
Imagine a busy business office having 100 desktop computers that send emails to each other
using synchronous message passing exclusively. Because the office system does not use
asynchronous message passing, one worker turning off their computer can cause the other 99
computers to freeze until the worker turns their computer back on to process a single email.
Asynchronous message passing is generally implemented so that all the complexities that
naturally occur when trying to synchronize systems and data are handled by an intermediary level
of software. Commercial vendors who develop software products to support these intermediate
levels usually call their software "middleware". One of the most common types of middleware to
support asynchronous messaging is called Message-Oriented Middleware (MOM).
With asynchronous message passing, the sending system does not wait for a response.
Continuing the function call analogy, asynchronous message passing would be a function call
that returns immediately, without waiting for the called function to execute. Such an asynchronous
function call would merely deliver the arguments, if any, to the called function, and tell the called
function to execute, and then return to continue its own execution. Asynchronous message
passing simply sends the message to the message bus. The bus stores the message until the
receiving process requests messages sent to it. When the receiving process arrives at the result,
it sends the result to the message bus. And the message bus holds the message until the original
process (or some designated next process) picks up its messages from the message bus.
Synchronous communication can be built on top of asynchronous communication by using
a synchronizer. For example, the α-Synchronizer works by ensuring that the sender always waits
for an acknowledgement message from the receiver. The sender only sends the next message
after the acknowledgement has been received.
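A minimal sketch of that idea in Python: the "message bus" is just a queue, and a synchronous send is built by blocking on a private reply queue until the acknowledgement arrives. The queue-and-thread setup is an assumption of the example, standing in for real messaging middleware.

import queue
import threading

bus = queue.Queue()                     # stands in for the message bus

def receiver():
    while True:
        payload, reply_queue = bus.get()
        reply_queue.put("ack: " + payload)   # acknowledge every message

threading.Thread(target=receiver, daemon=True).start()

def synchronous_send(payload):
    reply_queue = queue.Queue()         # private channel for this call's ack
    bus.put((payload, reply_queue))     # the asynchronous send itself
    return reply_queue.get(timeout=5)   # block until acknowledged

print(synchronous_send("hello"))        # behaves like an ordinary function call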

In addition to the distinction between synchronous and asynchronous message passing the other
primary distinguishing feature of message passing systems is whether they use distributed or
local objects. With distributed objects the sender and receiver may exist on different computers,
running different operating systems, using different programming languages, etc. In this case the
bus layer takes care of details about converting data from one system to another, sending and
receiving data across the network, etc. The Remote Procedure Call (RPC) protocol in Unix was
an early example of this. Note that with this type of message passing it is not a requirement that
either the sender or receiver are implemented using object-oriented programming. It is possible
to wrap systems developed using procedural languages and treat them as large grained objects
capable of sending and receiving messages.
Distributed or asynchronous message passing has some overhead compared to simply
calling a procedure. In a traditional procedure call, arguments are
passed to the receiver typically by one or more general purpose registers or in a parameter
list containing the addresses of each of the arguments. This form of communication differs from
message passing in at least three crucial areas:
total memory usage
transfer time
locality
In message passing, each argument has to be copied into a portion of the new message. This
applies regardless of the size of the argument, and in some cases an argument can be as large
as a document, which can be megabytes worth of data. The argument has to be copied in its
entirety and transmitted to the receiving object.
By contrast, for a standard procedure call, only an address (a few bits) needs to be passed for
each argument and may even be passed in a general purpose register requiring zero additional
storage and zero transfer time.

RPC IN HETEROGENEOUS ENVIRONMENT :
The IPC part of a distributed system can often be conveniently handled by the message-passing
model.
Message passing does not, however, offer a uniform panacea for all needs.
RPC emerged as a result of this.
It can be seen as a special case of the message-passing model.
It has become widely accepted because of the following features:
Simple call syntax and similarity to local procedure calls.
It specifies a well-defined interface, and this property supports compile-time type
checking and automated interface generation.
Its ease of use, efficiency and generality.
It can be used as an IPC mechanism between
processes on different machines and
also between different processes on the same machine.

RPC MODEL
It is similar to commonly used procedure call model. It works in the following manner:
1. For making a procedure call, the caller places arguments to the procedure in some
well specified location.
2. Control is then transferred to the sequence of instructions that constitutes the body
of the procedure.
3. The procedure body is executed in a newly created execution environment that
includes copies of the arguments given in the calling instruction.

4. After the procedure execution is over, control returns to the calling point, returning
a result.
The RPC enables a call to be made to a procedure that does not reside in the address
space of the calling process.
Since the caller and the callee processes have disjoint address spaces, the remote
procedure has no access to data and variables of the caller's environment.
RPC facility uses a message-passing scheme for information exchange between the caller
and the callee processes.
On arrival of request message, the server process
1. extracts the procedure's parameters,
2. computes the result,
3. sends a reply message, and
4. then awaits the next call message.

Only one of the two processes is active at any given time.
It is not always necessary that the caller gets blocked.

There can be different RPC implementations depending on the parallelism of the caller's
and the callee's environments.
The RPC could be asynchronous, so that the client may do useful work while waiting for
the reply from the server.
Server could create a thread to process an incoming request so that server is free to
receive other requests.

Transparency of RPC-
A transparent RPC is one in which the local and remote procedure calls are
indistinguishable.
Types of transparencies:
Syntactic transparency
A remote procedure call should have exactly the same syntax as a local
procedure call.
Semantic transparency
The semantics of a remote procedure call are identical to those of a local
procedure call.
Syntactic transparency is not an issue but semantic transparency is difficult.

Difference between remote procedure calls and local procedure calls:
1. Unlike local procedure calls, with remote procedure calls,
Disjoint address spaces
Absence of shared memory
Call by reference is meaningless; passing addresses in arguments and
pointers is not possible.
2. RPCs are more vulnerable to failure because of:
Possibility of processor crashes or
communication problems of a network.
3. RPCs are much more time consuming than LPCs due to the involvement of the
communication network.
Due to these reasons, total semantic transparency is impossible.



Implementing RPC Mechanism-
To achieve semantic transparency, implementation of RPC mechanism is based on the
concepts of stubs.
Stubs
Provide a normal / local procedure call abstraction by concealing the underlying
RPC mechanism.
A separate stub procedure is associated with both the client and server processes.
To hide the underlying communication network, an RPC communication package
known as RPCRuntime is used on both sides.
Thus implementation of RPC involves the five elements of program:
1. Client
2. Client Stub
3. RPC Runtime
4. Server stub
5. Server
The client, the client stub, and one instance of RPCRuntime execute on the client machine.
The server, the server stub, and one instance of RPCRuntime execute on the server
machine.
As far as the client is concerned, remote services are accessed by the user by making
ordinary LPCs.
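As a concrete (if simplified) illustration, Python's standard xmlrpc module plays the middle roles at once: ServerProxy acts as the client stub that marshals the call into a message, and SimpleXMLRPCServer combines the server stub and RPCRuntime. The host and port are arbitrary choices for the example, not part of the mechanism described above.

import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def add(a, b):                          # the remote procedure on the server
    return a + b

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add)
threading.Thread(target=server.serve_forever, daemon=True).start()

client = ServerProxy("http://localhost:8000")
print(client.add(2, 3))                 # looks like an ordinary local call; prints 5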


Client Stub-
It is responsible for the following two tasks:
On receipt of a call request from the client,
it packs a specification of the target procedure and the arguments into a
message and
asks the local RPC Runtime to send it to the server stub.
On receipt of the result of procedure execution, it unpacks the result and passes it
to the client.
RPCRuntime
It handles transmission of messages across the network between Client and the server
machine.
It is responsible for
Retransmission,
Acknowledgement,

Routing and
Encryption.

Server Stub-
It is responsible for the following two tasks:
On receipt of a call request message from the local RPCRuntime, it unpacks it and
makes a perfectly normal call to invoke the appropriate procedure in the server.
On receipt of the result of procedure execution from the server, it packs the
result into a message and then asks the local RPCRuntime to send it to the client
stub.
Stub Generation-
Stubs can be generated in one of the following two ways:
Manually
Automatically

Automatic Stub Generation-
Interface Definition Language (IDL) is used here, to define the interface between a client
and the server.
Interface definition:
It is a list of procedure names supported by the interface together with the types of
their arguments and results.
It also plays a role in reducing data storage and controlling the amount of data
transferred over the network.
It has information about type definitions, enumerated types, and defined constants.
Export the interface
A server program that implements procedures in the interface.
Import the interface
A client program that calls procedures from an interface.
The interface definition is compiled by the IDL compiler.
IDL compiler generates
components that can be combined with client and server programs, without making
any changes to the existing compilers;

Client stub and server stub procedures;
The appropriate marshaling and unmarshaling operations;
A header file that supports the data types.

RPC Messages-
The RPC system is independent of transport protocols and is not concerned with how a
message is passed from one process to another.
Types of messages involved in the implementation of RPC system:
Call messages
Reply messages
Call Messages
Components necessary in a call message are
1. The identification information of the remote procedure to be executed.
2. The arguments necessary for the execution.
3. Message Identification field consists of a sequence number for identifying lost and
duplicate messages.
4. Message Type field: whether call or reply messages.
5. Client Identification Field allows the server to:
Identify the client to whom the reply message has to be returned and
To allow server to authenticate the client process.
Call Message Format
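Since the original figure is not reproduced here, the sketch below packs the fields just listed into a byte string; the field widths and their on-wire order are assumptions made for the illustration.

import struct

CALL, REPLY = 0, 1
CALL_HEADER = struct.Struct("!IBIII")   # message id, message type, client id,
                                        # remote procedure id, argument length

def pack_call(message_id, client_id, procedure_id, args):
    # The message id (sequence number) lets the server detect lost and
    # duplicate messages; the client id tells it whom to reply to and
    # whom to authenticate.
    return CALL_HEADER.pack(message_id, CALL, client_id,
                            procedure_id, len(args)) + args

msg = pack_call(message_id=7, client_id=42, procedure_id=3, args=b"\x00\x05")
print(len(msg))                         # 17 header bytes + 2 argument bytes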

Reply Messages
Sent by the server to the client for returning the result of remote procedure execution.
Conditions under which the server sends an unsuccessful reply message:

The server finds that the call message is not intelligible to it.
Client is not authorized to use the service.
Remote procedure identifier is missing.
The remote procedure is not able to decode the supplied arguments.
Occurrence of exception condition.
Reply message formats





RESOURCE ALLOCATION :





Most real-time systems are distributed across multiple processors. Different techniques are
used to manage resource allocation in such distributed systems. Some of these techniques
are discussed below. They are:

Centralized Resource Allocation
Hierarchical Resource Allocation
Bi-Directional Resource Allocation
Random Access
Centralized Resource Allocation-

In this technique, a centralized allocator keeps track of all the available resources. All entities
send messages requesting resources and the allocator responds with the allocated
resources. The main advantage of this scheme is simplicity. However, the disadvantage is
that as the system size increases, the centralized allocator gets heavily loaded and becomes
a bottleneck.
For example, in Xenon, the space slot allocation strategy uses this scheme. Space slot free-
busy status is maintained at the CAS. XEN cards needing space slots request the CAS for a
space slot via a message. CAS allocates the space slot and informs the requesting XEN by
a message.

Hierarchical Resource Allocation-

In this technique, the resource allocation is done in multiple steps. First, the centralized
allocator takes the high level resource allocation decision. Then, it passes on the allocation
request to the secondary allocator which takes the detailed resource allocation decision. This
technique is explained by an example given below.
In Xenon, trunk allocation is carried out in the following steps:
1. The centralized allocator at CAS determines which trunk group should be used to route
outgoing calls.
2. The call request is then forwarded to the XEN handling the trunk group.
3. The XEN level allocator selects the actual trunk from the trunk group.
The advantage of this scheme is that the centralized allocator is not burdened with the detailed
resource allocation decisions. In this design, the detailed allocation decisions are taken by the
XEN. Thus this design scales very well when the number of XENs in the switch is increased.
Note that this example shows only a two step resource allocation. This technique could be
implemented in multiple levels of hierarchical resource allocations.

Bi-Directional Resource Allocation-


This scheme allows two independent allocators to allocate the same set of resources. It is
used in situations like bi-directional trunk groups. The switch at each end of the trunk group
allocates the trunks from one specific end. This logic avoids both ends allocating the same
resource under light and medium load. However, under heavy load, there is a possibility of
both ends allocating the same resource. This situation is resolved by a fixed arbitration
algorithm leading to one of the switches withdrawing its claim on the resource.
Consider the example of two switches A and B sharing a bi-directional trunk group with 32
trunks. Switch A searches for a free resource starting the search from trunk number 1.
Switch B searches for a free resource starting from trunk number 32. Most of the time there
will be no clash in resource allocation. However, under heavy utilization of the trunk group, on
occurrence of a clash for a trunk, the incoming call takes precedence over the outgoing call.
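A sketch of this two-ended search; the busy/free array is a stand-in for real trunk state.

free = [True] * 33                      # trunks 1..32; index 0 unused

def allocate_from_a():
    for trunk in range(1, 33):          # switch A searches upward from 1
        if free[trunk]:
            free[trunk] = False
            return trunk
    return None

def allocate_from_b():
    for trunk in range(32, 0, -1):      # switch B searches downward from 32
        if free[trunk]:
            free[trunk] = False
            return trunk
    return None

# Under light or medium load the two ends pick from opposite ends and never
# clash; only when almost all trunks are busy can both claim the same trunk,
# which the fixed arbitration rule (incoming call wins) then resolves.
print(allocate_from_a(), allocate_from_b())   # 1 32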

Random Access-

Whenever a resource needs to be shared between multiple entities which cannot synchronize
with each other and do not have access to a centralized allocator, designers have to resort
to random access to the resource. Here all the entities needing the resource just assume that
they have the resource and use it. If two or more entities use the resource at the same time,
none of them gets service and all the entities retry again after a random backoff timer.
Consider the case of a mobile system, where subscribers that need to originate a call need
to send a message across to the base station. This is achieved by using a random access
channel. If two or more mobile terminals use the random access channel at the same time,
the message is corrupted and all requesting terminals will timeout for a response from the
base station and try again after a random backoff.
The main disadvantage of this technique is that the random access channel works well only
when the overall contention for the random access channel is very small. As contention
increases, hardly any resource requests are serviced.
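A sketch of the retry-with-random-backoff loop; the coin-flip collision model is, of course, an invented stand-in for a real shared channel.

import random
import time

def contended_send(message):
    return random.random() > 0.5        # True = no collision (assumed model)

def send_with_backoff(message, max_retries=8):
    for attempt in range(max_retries):
        if contended_send(message):
            return True                 # the request was serviced
        # Collision: wait a random time before retrying, doubling the
        # backoff window each attempt so retries spread out under load.
        time.sleep(random.uniform(0, 0.01 * 2 ** attempt))
    return False                        # channel too contended; give up

print(send_with_backoff("call setup request"))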

DISTRIBUTED DEADLOCK DETECTION :
Distributed deadlocks can occur in distributed systems when distributed
transactions or concurrency control is being used. Distributed deadlocks can be detected either
by constructing a global wait-for graph from local wait-for graphs at a deadlock detector or by
a distributed algorithm like edge chasing.
Phantom deadlocks are deadlocks that are falsely detected in a distributed system due to
system internal delays but don't actually exist. For example, if a process releases a resource R1
and issues a request for R2, and the first message is lost or delayed, a coordinator (detector of
deadlocks) could falsely conclude a deadlock (if the request for R2 while having R1 would cause
a deadlock).






WAIT-FOR-GRAPH (WFG):












ISSUES IN DEADLOCK DETECTION:











RESOLUTION OF A DETECTED DEADLOCK :


MODELS OF DEADLOCKS :














THE AND MODEL :






Deadlocks in distributed systems are similar to deadlocks in single processor systems,
only worse.
They are harder to avoid, prevent or even detect.
They are hard to cure when tracked down because all relevant information is
scattered over many machines.
Deadlocks are sometimes classified into the following types:
Communication deadlocks -- competing for buffers for send/receive
Resource deadlocks -- exclusive access to I/O devices, files, locks, and other
resources.
Here we treat everything as a resource, so we only have resource deadlocks.
Four best-known strategies to handle deadlocks:
The ostrich algorithm (ignore the problem)
Detection (let deadlocks occur, detect them, and try to recover)
Prevention (statically make deadlocks structurally impossible)
Avoidance (avoid deadlocks by allocating resources carefully)

The FOUR Strategies for handling deadlocks-
The ostrich algorithm
Not dealing with the problem at all is as good and as popular in distributed systems
as it is in single-processor systems.
In distributed systems used for programming, office automation, process control,
no system-wide deadlock mechanism is present -- distributed databases will
implement their own if they need one.
Deadlock detection and recovery is popular because prevention and avoidance are so
difficult to implement.
Deadlock prevention is possible because of the presence of atomic transactions. We will
have two algorithms for this.
Deadlock avoidance is never used in distributed systems; in fact, it is not even used in
single-processor systems.
The problem is that the banker's algorithm needs to know (in advance) how much
of each resource every process will eventually need. This information is rarely, if
ever, available.
Hence, we will just talk about deadlock detection and deadlock prevention.


Distributed Deadlock Detection
Since preventing and avoiding deadlocks is difficult, researchers work on
detecting the occurrence of deadlocks in distributed systems.
The presence of atomic transaction in some distributed systems makes a major
conceptual difference.
When a deadlock is detected in a conventional system, we kill one or more
processes to break the deadlock --- one or more unhappy users.
When deadlock is detected in a system based on atomic transaction, it is resolved
by aborting one or more transactions.
But transactions have been designed to withstand being aborted.
When a transaction is aborted, the system is first restored to the state it had before
the transaction began, at which point the transaction can start again.
With a bit of luck, it will succeed the second time.
Thus the difference is that the consequences of killing off a process are much less
severe when transactions are used.
The Chandy-Misra-Haas algorithm:
Processes are allowed to request multiple resources at once -- the growing phase
of a transaction can be speeded up.
The consequence of this change is that a process may now wait on two or more
resources at the same time.
When a process has to wait for some resources, a probe message is generated
and sent to the process holding the resources. The message consists of three
numbers: the process being blocked, the process sending the message, and the
process receiving the message.
When the message arrives, the recipient checks to see if it itself is waiting for any
processes. If so, the message is updated, keeping the first number unchanged, and
replacing the second and third fields by the corresponding process numbers.
The message is then sent to the process holding the needed resources.
If a message goes all the way around and comes back to the original sender -- the
process that initiated the probe -- a cycle exists and the system is deadlocked.
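The sketch below simulates the probe messages of this algorithm in a single process; the wait-for relation is an invented example graph.

waits_for = {1: [2], 2: [3], 3: [1], 4: []}    # 1 -> 2 -> 3 -> 1 is a cycle

def detect_deadlock(initiator):
    # A probe is (blocked process, sender, receiver); start one probe per
    # process that the initiator is waiting on.
    probes = [(initiator, initiator, dep) for dep in waits_for[initiator]]
    forwarded = set()
    while probes:
        blocked, sender, receiver = probes.pop()
        if receiver == initiator:
            return True                 # the probe came back: deadlock
        if receiver in forwarded:
            continue                    # do not forward through a node twice
        forwarded.add(receiver)
        # The receiver is itself waiting: keep the first field, replace the
        # second and third, and forward to the holders of its resources.
        probes.extend((blocked, receiver, nxt) for nxt in waits_for[receiver])
    return False

print(detect_deadlock(1))               # True: process 1 is deadlocked
print(detect_deadlock(4))               # False: process 4 waits on nothing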




A method that might work is to order the resources and require processes to acquire them
in strictly increasing order. This approach means that a process can never hold a high
resource and ask for a low one, thus making cycles impossible.
With global timing and transactions in distributed systems, two other methods are possible
-- both based on the idea of assigning each transaction a global timestamp at the moment
it starts.
When one process is about to block waiting for a resource that another process is using,
a check is made to see which has a larger timestamp.
We can then allow the wait only if the waiting process has a lower timestamp.
The timestamp is always increasing if we follow any chain of waiting processes, so cycles
are impossible --- we could use decreasing order if we liked.
It is wiser to give priority to old processes because
they have run longer, so the system has a larger investment in these processes.
they are likely to hold more resources.
A young process that is killed off will eventually age until it is the oldest one in the
system, and that eliminates starvation.

Wait-die Vs. Wound-wait-
As we have pointed out before, killing a transaction is relatively harmless, since by
definition it can be restarted safely later.
Wait-die:
If an old process wants a resource held by a young process, the old one will wait.
If a young process wants a resource held by an old process, the young process
will be killed.
Observation: The young process, after being killed, will then start up again, and be
killed again. This cycle may go on many times before the old one releases the
resource.
Once we are assuming the existence of transactions, we can do something that had
previously been forbidden: take resources away from running processes.
When a conflict arises, instead of killing the process making the request, we can kill the
resource owner. Without transactions, killing a process might have severe consequences.
With transactions, these effects will vanish magically when the transaction dies.
Wound-wait: (we allow preemption & ancestor worship)
If an old process wants a resource held by a young process, the old one will
preempt the young process -- the young one is wounded and killed, then restarts and waits.

If a young process wants a resource held by an old process, the young process
will wait.
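The two rules side by side, as a sketch; lower timestamp means older, and the return value says what happens to the requesting transaction.

def wait_die(requester_ts, holder_ts):
    # Old requester waits; young requester is killed ("dies") and later
    # restarts with its original timestamp, so it eventually ages.
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    # Old requester preempts ("wounds") the young holder; young requester
    # simply waits for the old holder.
    return "preempt holder" if requester_ts < holder_ts else "wait"

old, young = 10, 99                     # timestamps assigned at start
print(wait_die(old, young))             # old wants young's resource: wait
print(wait_die(young, old))             # young wants old's resource: die
print(wound_wait(old, young))           # preempt holder (the young one)
print(wound_wait(young, old))           # wait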

Centralized Deadlock Detection-
We use a centralized deadlock detection algorithm and try to imitate the non-distributed
algorithm.
Each machine maintains the resource graph for its own processes and resources.
A centralized coordinator maintains the resource graph for the entire system.
When the coordinator detects a cycle, it kills off one process to break the deadlock.
In updating the coordinator's graph, messages have to be passed.
Method 1) Whenever an arc is added or deleted from the resource graph,
a message has to be sent to the coordinator.
Method 2) Periodically, every process can send a list of arcs added and
deleted since the previous update.
Method 3) The coordinator asks for information when it needs it.



MECHANISM FOR BUILDING DISTRIBUTED FILE SYSTEM :
A distributed file system is a client/server-based application that allows clients to access and
process data stored on the server as if it were on their own computer. When a user accesses a
file on the server, the server sends the user a copy of the file, which is cached on the user's
computer while the data is being processed and is then returned to the server.
Ideally, a distributed file system organizes file and directory services of individual servers into a
global directory in such a way that remote data access is not location-specific but is identical from
any client. All files are accessible to all users of the global file system and organization is
hierarchical and directory-based.
Since more than one client may access the same data simultaneously, the server must have a
mechanism in place (such as maintaining information about the times of access) to organize
updates so that the client always receives the most current version of data and that data conflicts
do not arise. Distributed file systems typically use file or database replication (distributing copies
of data on multiple servers) to protect against data access failures.
A distributed file system is a resource management component of a distributed operating
system. It implements a common file system that can be shared by all the autonomous
computers in the system.
Two important goals:
1. Network transparency - to access files distributed over a network. Ideally, users
do not have to be aware of the location of files to access them.
2. High availability - users should have the same easy access to files, irrespective
of their physical location.

ARCHITECTURE
In a distributed file system, files can be stored at any machine and the computation
can be performed at any machine.
The two most important services present in a distributed file system are the name server and
cache manager. A name server is a process that maps names specified by clients to stored
objects such as files and directories. The mapping (also referred to as name resolution) occur
when a process references a file or directory for the first time. A cache manager is a process that
implements file caching. In file caching, a copy of data stored at a remote file server is brought
to the client's machine when referenced by the client. Subsequent accesses to the data are
performed locally at the client, thereby reducing the access delays due to network latency. Cache
managers can be present at both clients and file servers. Cache managers at the servers cache
files in the main memory to reduce delays due to disk latency. If multiple clients are allowed to
cache a file and modify it, the copies can become inconsistent. To avoid this inconsistency
problem, cache managers at both servers and clients coordinate to perform data storage and
retrieval operations.


Architecture of a Distributed File System
A request by a process to access a data block is presented to the local cache (client cache) of
the machine (client) on which the process is running (see Figure 5.1). If the block is not in the
cache, then the local disk, if present, is checked for the presence of the data block. If the block
is present, then the request is satisfied and the block is loaded into the client cache. If the block
is not stored locally, then the request is passed on to the appropriate file server (as determined
by the name server). The server checks its own cache for the presence of the data block before
issuing a disk I/O request. The data block is transferred to the client cache in any case and loaded
to the server cache if it was missing in the server cache.
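The access path just described, as a short sketch; plain dictionaries stand in for the caches and disks of Figure 5.1.

def access_block(block, client_cache, local_disk, server_cache, server_disk):
    if block in client_cache:
        return client_cache[block]              # hit in the client cache
    if block in local_disk:
        client_cache[block] = local_disk[block] # load from the local disk
        return client_cache[block]
    # Not stored locally: forward to the file server chosen by the name
    # server; the server checks its own cache before doing disk I/O.
    if block not in server_cache:
        server_cache[block] = server_disk[block]
    client_cache[block] = server_cache[block]   # block always sent to client
    return client_cache[block]

print(access_block("b1", {}, {}, {}, {"b1": b"block contents"}))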
MECHANISMS FOR BUILDING DISTRIBUTED FILE SYSTEMS
1. Mounting
This mechanism provides the binding together of different filename spaces to form a single
hierarchically structured name space. It is UNIX specific, and most existing DFSs (Distributed
File Systems) are based on UNIX. A filename space can be bound to or mounted at an internal
node or a leaf node of a namespace tree. A node onto which a name space is mounted is called a
mount point. The kernel maintains a mount table, which maps mount points to appropriate
storage devices.
Uses of Mounting in DFS

File systems maintained by remote servers are mounted at clients so that each client has
information regarding file servers. Two approaches are used to maintain mount information.
Approach 1: Mount information is maintained at clients; that is, each client has to individually
mount every required file system. When files are moved to a different server, mount
information must be updated in the mount table of every client.
Approach 2: Mount information is maintained at servers. If files are moved to different servers,
then mount information need only be updated at the servers.
2. Caching
This mechanism is used in a DFS to reduce delays in accessing data. In file caching, a copy of
data stored at a remote file server is brought to the client when referenced by the client, so
subsequent access to the data is performed locally at the client, thus reducing access delays due
to network latency. Data can be cached in main memory or on the local disk of the clients. Data
is cached in main memory at servers to reduce disk access latency.
Need of Caching in DFS:
File system performance improves because accessing remote disks is much slower than accessing
local memory or local disks. Caching also reduces the frequency of access to file servers and the
communication network, so scalability is increased.
3. Hints
Caching results in the cache consistency problem when multiple clients cache and modify shared
data. This problem can be avoided by a high level of co-operation between file servers and clients,
which is very expensive. An alternative method is to treat cached data as hints, that is, cached
data are not expected to be completely accurate. Only those classes of applications which can
recover after discovering that cached data are invalid can use this approach.
Example: After the name of a file or directory is mapped to a physical object, the address of the
object can be stored as a hint in the cache. If the address is incorrect, that is, it fails to map to the
object, the cached address is deleted from the cache, and the client consults the name server to
obtain the actual location of the file or directory and updates the cache.
4. Bulk Data Transfer
In this mechanism, multiple consecutive data blocks are transferred from the server to the client.
This reduces file access overhead by obtaining multiple blocks with a single seek, by formatting
and transmitting multiple large packets in a single context switch, and by reducing the number of
acknowledgements that need to be sent. This mechanism is useful because many files are
accessed in their entirety.


5. Encryption
This mechanism is used for security in distributed systems. The method developed by Needham
and Schroeder is used in DFS security. In this scheme, two entities which want to communicate
establish a key for the conversation with the help of an authentication server.

DISTRIBUTED SHARED MEMORY (DSM) :
A distributed shared memory is a mechanism allowing end-users' processes to access shared
data without using inter-process communications. In other words, the goal of a DSM system is to
make inter-process communications transparent to end-users. Both hardware and software
implementations have been proposed in the literature. From a programming point of view, two
approaches have been studied:
Shared virtual memory: This notion is very similar to the well-known concept of paged
virtual memory implemented in mono-processor systems. The basic idea is to group all
distributed memories together into a single wide address space. Drawbacks: such
systems do not allow the semantics of shared data to be taken into account: the data
granularity is arbitrarily fixed to some page size, whatever the type and the actual size of
the shared data might be. The programmer has no means to provide information about
these data.
Object DSM: in this class of approaches, shared data are objects, i.e., variables with
access functions. In their applications, users only have to define which data (objects) are
shared. The whole management of the shared objects (creation, access, modification) is
handled by the DSM system. In contrast to SVM systems, which work at the operating system
layer, object DSM systems actually propose a programming model alternative to classical
message passing.
In any case, implementing a DSM system implies addressing problems of data location, data
access, sharing and locking of data, and data coherence. Such problems are not specific to
parallelism but have connections with distributed or replicated database management systems
(transactional model), networks (data migration), uniprocessor operating systems (concurrent
programming), and distributed systems.


What:
- The distributed shared memory (DSM) implements the shared memory
model in distributed systems, which have no physical shared memory
- The shared memory model provides a virtual address space shared
between all nodes
- To overcome the high cost of communication in distributed systems,
DSM systems move data to the location of access
How:
- Data moves between main memory and secondary memory (within a
node) and between main memories of different nodes
- Each data object is owned by a node
- Initial owner is the node that created the object
- Ownership can change as the object moves from node to node
- When a process accesses data in the shared address space, the
mapping manager maps the shared memory address to physical memory
(local or remote)


Advantages of distributed shared memory (DSM) :
Data sharing is implicit, hiding data movement (as opposed to Send/Receive in
message passing model)
Passing data structures containing pointers is easier (in message passing model data
moves between different address spaces)
Moving the entire object to the user takes advantage of locality of reference
Less expensive to build than tightly coupled multiprocessor system: off-the-shelf
hardware, no expensive interface to shared physical memory
Very large total physical memory for all nodes: Large programs can run more efficiently
No serial access to common bus for shared physical memory like in multiprocessor
systems
Programs written for shared memory multiprocessors can be run on DSM systems with
minimum changes
Algorithms for implementing DSM
Issues
- How to keep track of the location of remote data
- How to minimize communication overhead when accessing remote data
- How to access remote data concurrently at several nodes
1. The Central Server Algorithm
- Central server maintains all shared data
Read request: returns data item
Write request: updates data and returns acknowledgement message
- Implementation
A timeout is used to resend a request if acknowledgment fails
Associated sequence numbers can be used to detect duplicate write
requests
If an application's request to access shared data fails repeatedly, a failure
condition is sent to the application
- Issues: performance and reliability
- Possible solutions
Partition shared data between several servers
Use a mapping function to distribute/locate data
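A minimal sketch of the central server algorithm, including the sequence numbers used to detect duplicate write requests; the class and message shapes are invented for the example.

class CentralServer:
    def __init__(self):
        self.data = {}                  # all shared data lives at the server
        self.last_seq = {}              # highest write sequence seen per client

    def read(self, name):
        return self.data.get(name)      # reply with the data item

    def write(self, client, seq, name, value):
        if self.last_seq.get(client, -1) >= seq:
            return "ack"                # duplicate (retransmitted) write: ignore
        self.last_seq[client] = seq
        self.data[name] = value
        return "ack"                    # acknowledgement message

server = CentralServer()
server.write(client="A", seq=0, name="x", value=1)
server.write(client="A", seq=0, name="x", value=99)   # duplicate is dropped
print(server.read("x"))                                # prints 1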
2. The Migration Algorithm
- Operation
Ship (migrate) entire data object (page, block) containing data item to
requesting location
Allow only one node to access a shared data at a time
- Advantages
Takes advantage of the locality of reference
DSM can be integrated with VM at each node
- Make DSM page multiple of VM page size
- A locally held shared memory can be mapped into the VM page
address space

- If page not local, fault-handler migrates page and removes it from
address space at remote node
- To locate a remote data object:
Use a location server
Maintain hints at each node
Broadcast query
- Issues
Only one node can access a data object at a time
Thrashing can occur: to minimize it, set minimum time data object resides
at a node
3. The Read-Replication Algorithm
Replicates data objects to multiple nodes
DSM keeps track of location of data objects
Multiple nodes can have read access or one node write access (multiple readers-
one writer protocol)
After a write, all copies are invalidated or updated
DSM has to keep track of locations of all copies of data objects. Examples of
implementations:
IVY: owner node of data object knows all nodes that have copies
PLUS: distributed linked-list tracks all nodes that have copies
Advantage
The read-replication can lead to substantial performance improvements if
the ratio of reads to writes is large
4. The Full-Replication Algorithm
- Extension of read-replication algorithm: multiple nodes can read and
multiple nodes can write (multiple-readers, multiple-writers protocol)
- Issue: consistency of data for multiple writers
- Solution: use of gap-free sequencer
All writes sent to sequencer
Sequencer assigns sequence number and sends write request to
all sites that have copies
Each node performs writes according to sequence numbers
A gap in sequence numbers indicates a missing write request:
node asks for retransmission of missing write requests
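A sketch of the gap-free sequencer; the classes are invented, and a real system would also implement the retransmission path that a replica requests when it sees a gap.

class Sequencer:
    def __init__(self, replicas):
        self.next_seq = 0
        self.replicas = replicas

    def write(self, name, value):
        msg = (self.next_seq, name, value)   # stamp with the next sequence number
        self.next_seq += 1
        for replica in self.replicas:
            replica.deliver(msg)             # send to all sites holding copies

class Replica:
    def __init__(self):
        self.expected = 0
        self.data = {}

    def deliver(self, msg):
        seq, name, value = msg
        if seq != self.expected:
            return "retransmit"              # gap detected: a write was missed
        self.expected += 1
        self.data[name] = value              # apply writes in sequence order

r1, r2 = Replica(), Replica()
sequencer = Sequencer([r1, r2])
sequencer.write("x", 1)
sequencer.write("y", 2)
print(r1.data == r2.data == {"x": 1, "y": 2})   # True: all copies agree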







DISTRIBUTED SCHEDULING :
Scheduling refers to assigning a resource and a start time to a task.
Much of scheduling research has been done in the Operations Research community,
e.g., job shop and flow shop scheduling.
Scheduling is often an overloaded term in Grids.
A related term is mapping, which assigns a resource to a task but not the start time.
Systems taxonomy :
Parallel Systems
Distributed Systems
Dedicated Systems
Shared Systems
Time Shared e.g. aludra
Space Shared e.g. HPCC cluster
Homogeneous Systems
Heterogeneous Systems

Scheduling Regimens :
Online/Dynamic Scheduling
Offline/Static Scheduling
Resource level Scheduling
Application level Scheduling

Applications taxonomy :
Bag of tasks - independent tasks
Workflows - dependent tasks
Generally Directed Acyclic Graphs (DAGs)
Performance criteria
Completion time (makespan), reliability etc.



Scheduling Bag of Tasks on Dedicated Systems:
Min-Min
Max-Min
Sufferage
Min-Min Heuristic :
For each task determine its minimum completion time over all machines
Over all tasks find the minimum completion time
Assign the task to the machine that gives this completion time
Iterate till all the tasks are scheduled

Max-Min Heuristic :
For each task determine its minimum completion time over all machines
Over all tasks find the maximum completion time
Assign the task to the machine that gives this completion time
Iterate till all the tasks are scheduled



Sufferage Heuristic :
For each task determine the difference between its minimum and second minimum
completion time over all machines (the sufferage)
Over all tasks find the maximum sufferage
Assign that task to the machine that gives its minimum completion time
Iterate till all the tasks are scheduled
(A sketch of all three heuristics follows.)
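The three heuristics differ only in how they pick the next task, so a single sketch covers them; exec_time[t][m] is the (invented) execution time of task t on machine m, and ready[m] is when machine m becomes free.

def schedule(exec_time, n_machines, pick):
    ready = [0.0] * n_machines
    unscheduled = set(range(len(exec_time)))
    plan = []
    while unscheduled:
        # For every task, its completion times over all machines, sorted so
        # entry 0 is the minimum and entry 1 the second minimum.
        best = {t: sorted((ready[m] + exec_time[t][m], m)
                          for m in range(n_machines))
                for t in unscheduled}
        task = pick(best)                       # heuristic-specific choice
        completion, machine = best[task][0]     # machine giving min completion
        ready[machine] = completion
        plan.append((task, machine, completion))
        unscheduled.remove(task)
    return plan

min_min   = lambda best: min(best, key=lambda t: best[t][0][0])
max_min   = lambda best: max(best, key=lambda t: best[t][0][0])
sufferage = lambda best: max(best, key=lambda t: best[t][1][0] - best[t][0][0])

exec_time = [[4, 6], [3, 8], [7, 2]]            # 3 tasks, 2 machines (assumed)
print(schedule(exec_time, 2, min_min))
print(schedule(exec_time, 2, sufferage))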



Grid Environments :
Time-shared resources
Heterogeneous resources
Tasks require input files that might be shared
Data transfer times are important






Scheduling Heuristic :
Schedule()
1. Compute the next Scheduling event
2. Create a Gantt Chart G
3. For each computation and file transfer underway
1. Compute an estimate of its completion time
2. Update the Gantt Chart G
4. Select a subset of tasks that have not started execution: T
5. Until each host has been assigned enough work
1. Heuristically assign tasks to hosts
6. Convert G into a plan
Sample Gantt Chart-



Scheduling Task Graphs
Task graphs have dependencies between the tasks in the application.
Scheduling methods for bag-of-tasks applications cannot be directly applied.








ALGORITHM FOR DISTRIBUTED CONTROL :
In a distributed operating system using a symmetric design for its kernel, each node in the system
has identical capabilities with respect to each OS control function. This provides availability of the
function in the presence of failures, and reduces delays involved in performing a control function.
However, a symmetric design requires synchronization of activities in the various nodes to
ensure the absence of race conditions over OS data structures. For example, even if each node is
capable of querying the resource status and performing allocation, only one node is permitted to
do so at any time. Multiprocessor operating systems use this approach. However, in a distributed
OS, data structures containing resource status information may be distributed over different nodes.
The presence of local memories and communication links creates difficulties in obtaining global
state information. This is the fundamental difference between distributed and non-distributed
algorithms.
Two groups of distributed control algorithms are discussed here. The first group of algorithms
implements mutual exclusion in the OS. Algorithms of the second group are used for distributed
deadlock handling.

Understanding distributed control algorithms :-
A distributed control algorithm (DCA) implements control activities in different nodes of the
system. Each DCA provides a service of some kind, e.g., providing mutual exclusion, and has
clients which can be user or OS processes. A DCA performs some action when a client needs
its service, or when an event related to its service occurs in the system.




In token-based algorithms, an entity called a privilege token exists in the system. Only the process
possessing this token (called the token holder) can enter a critical section. When a new process
wishes to enter, it sends a request to the token holder and waits till it receives the token.
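A sketch of the token idea, using a ring-passing variant rather than the request-to-holder scheme described above: nodes are threads, the links are queues, and only the thread holding the token touches the shared counter.

import queue
import threading

N = 3
links = [queue.Queue() for _ in range(N)]   # incoming link of each node
counter = 0                                 # the shared resource

def node(i, rounds=2):
    global counter
    for _ in range(rounds):
        links[i].get()                      # block until the token arrives
        counter += 1                        # critical section: we hold the token
        links[(i + 1) % N].put("token")     # pass the privilege token on

threads = [threading.Thread(target=node, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
links[0].put("token")                       # inject the token at node 0
for t in threads:
    t.join()
print(counter)                              # 6: every entry was mutually exclusive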
