
Contents

An Introduction to Distributed Systems
    What is a distributed system?
    Why distributed systems?
    Design Goals & Issues
        Common Characteristics
Distributed System Models
    Architectural Models
        Client-Server
        Peer-to-Peer (P2P) Model
        Software Architecture and layers
        Software and hardware service layers in distributed systems
        System Architecture
Communication in distributed systems
    Message Passing
        Desirable Features of a Good Message-Passing System
            Uniform Semantics
            Correctness
        Issues in IPC by Message Passing
            Synchronization
            Buffering
                Unbounded-Capacity Buffer
                Finite-Bound Buffer
            Multidatagram Messages
            Encoding and Decoding of Messages
            Process Addressing
    Remote Procedure Calls (RPC)
        What Is RPC
        How RPC Works
        RPC Application Development

TKSawe
Distributed Naming
    Name
    Name Spaces
    DNS Domain Names
        Understanding the DNS Domain Namespace
        How the DNS Domain Namespace Is Organized
        Types of DNS Domain Names
        DNS and Internet Domains
Distributed Transactions
    Transaction
        Commit and rollback of transactions
    Desirable Properties of Transactions
    ACID properties of transactions
    What Are Distributed Transactions?
Distributed System synchronization
    Computer clocks
        Problems with physical clocks
        Coordinated Universal Time (UTC)
        Physical Clock Synchronization

An Introduction to Distributed Systems

Distributed systems appeared relatively recently in the brief history of computer


systems. Several factors contributed to this. Computers got smaller and cheaper: we can fit more
of them in a given space and we can afford to do so. Tens to thousands can fit in a box whereas
in the past only one would fit in a good-sized room. Their price often ranges from less than ten to
a few thousand dollars instead of several million dollars.

More importantly, computers are faster. Network communication takes computational effort. A
slower computer would spend a greater fraction of its time working on communicating rather
than working on the user's program. Given past CPU performance and cost, networking simply
wasn't viable. Finally, interconnect technologies have advanced to the point
where it is very easy and inexpensive to connect computers together. Over local area networks,
we can expect connectivity in the range of tens of Mbits/sec to a Gbit/sec. Tanenbaum defines a
distributed system as a collection of independent computers that appear to the users of the
system as a single computer. There are two essential points in this definition. The first is the use
of the word independent. This means that, architecturally, the machines are capable of operating
independently. The second point is that the software enables this set of connected machines to
appear as a single computer to the users of the system. This is known as the single system image
and is a major goal in designing distributed systems that are easy to maintain and operate.

What is a distributed system?


You know you have one when the crash of a computer you've never heard of stops you from
getting any work done. (Leslie Lamport)
A collection of (perhaps) heterogeneous nodes connected by one or more interconnection
networks which provides access to system-wide shared resources and services.
A collection of independent computers that appears to its users as a single coherent
system.
1. It consists of multiple computers that do not share memory.
2. Each Computer has its own memory and runs its own operating system.
3. The computers can communicate with each other through a
communication network.

Why distributed systems?
Just because it is easy and inexpensive to connect multiple computers together does not
necessarily mean that it is a good idea to do so. There are genuine benefits in building distributed
systems: Price/performance ratio. You don't get twice the performance for twice the price in
buying computers. Processors are only so fast and the price/performance curve becomes
nonlinear and steep very quickly. With multiple CPUs, we can get (almost) double the
performance for double the money (as long as we can figure out how to keep the processors busy
and the overhead negligible).

Distributing machines may make sense. It makes sense to put the CPUs for ATM cash machines
at the source, each networked with the bank. Each bank can have one or more computers
networked with each other and with other banks. For computer graphics, it makes sense to put the
graphics processing at the user's terminal to maximize the bandwidth between the device and
processor.

Computer supported cooperative networking. Users that are geographically separated can now
work and play together. Examples of this are electronic whiteboards, distributed document
systems, audio/video teleconferencing, email, file transfer, and games such as Doom, Quake, Age
of Empires, Duke Nukem, StarCraft, and scores of others. Increased reliability. If a small
percentage of machines break, the rest of the system remains intact and can do useful work.
Incremental growth. A company may buy a computer. Eventually the workload is too great for
the machine. Without a network, the only option is to replace the computer with a faster one. Networking allows
you to add on to an existing infrastructure. Remote services. Users may need to access
information held by others at their systems.

Examples of this include web browsing, remote file access, and programs such as Napster and
Gnutella to access MP3 music. Mobility. Users move around with their laptop computers, Palm
Pilots, and WAP phones. It is not feasible for them to carry all the information they need with
them.

A distributed system has distinct advantages over a set of non-networked smaller
computers. Data can be shared dynamically; distributing private copies (via floppy disk, for
example) does not work if the data is changing. Peripherals can also be shared. Some
peripherals are expensive and/or infrequently used so it is not justifiable to give each PC
a peripheral. These peripherals include optical and tape jukeboxes, typesetters, large
format color printers and expensive drum scanners. Machines themselves can be shared
and workload can be distributed amongst idle machines. Finally, networked machines are
useful for supporting person-to-person networking: exchanging email, file transfer, and
information access (e.g., the web).

Advantages of distributed systems over traditional time-sharing systems

1. Much better price/performance ratio


2. Resource sharing
3. Enhanced performance -- tasks can be executed concurrently; load distribution to
reduce response time
4. Higher reliability -- data replication
5. Easier modular expansion -- hardware and software resources can be easily added
without replacing existing resources

Disadvantages of distributed systems include

1. Designing, implementing and using distributed software may be difficult. Issues of


creating operating systems and/or languages that support distributed systems arise.

2. The network may lose messages and/or become overloaded. Rewiring the network can be
costly and difficult.

3. Security becomes a far greater concern. Easy and convenient data access from anywhere
creates security problems.

Examples of Distributed Systems


The Internet: a network of networks
- global access for everybody (data, services, other actors; open ended)
- enormous size (open ended)
- no single authority
- communication types: interrogation, announcement, stream (data, audio, video)
Intranets: an Internet within an organization
- a single authority
- protected access
  - a firewall
  - total isolation
- may be worldwide

Typical services:
- infrastructure services: file service, name service
- application services
Mobile and ubiquitous computing
- portable devices: laptops, handheld devices, wearable devices, devices embedded in appliances
- mobile computing
- location-aware computing
- ubiquitous computing, pervasive computing

Design Goals & Issues


Common Characteristics
What are we trying to achieve when we construct a distributed system?
Certain common characteristics can be used to assess distributed systems
Resource Sharing
Openness
Concurrency
Scalability
Fault Tolerance
Transparency

Resource Sharing
With Distributed Systems, it is easier for users to access remote resources and to share
resources with other users.
Examples: printers, files, Web pages, etc
A distributed system should also make it easier for users to exchange information.
Easier resource and data exchange can also cause security problems; a distributed system
should deal with this.

Ability to use any hardware, software or data anywhere in the system.


- Resource manager: controls access, provides a naming scheme, and controls concurrency.
- Resource sharing model (e.g. client/server or object-based): describes how resources are
provided, how they are used, and how provider and user interact with each other.

Openness
The openness of DS is determined primarily by the degree to which new resource-sharing
services can be added and be made available for use by a variety of client programs.
Openness: offering services according to rules and interfaces that describe the syntax and
semantics of those services
-- Interoperability and portability
-- Separating policy from mechanism

Openness is concerned with extensions and improvements of distributed systems.


Detailed interfaces of components need to be published.
New components have to be integrated with existing components.
Differences in data representation of interface types on different processors (of different
vendors) have to be resolved.
Concurrency
There is a possibility that several clients will attempt to access a shared resource at the
same time.
Any object that represents a shared resource in a distributed system must be
responsible for ensuring that it operates correctly in a concurrent environment.

Components in distributed systems are executed in concurrent processes.


Components access and update shared resources (e.g. variables, databases, device
drivers).
Integrity of the system may be violated if concurrent updates are not coordinated.
Lost updates
Inconsistent analysis
Scalability

A system is described as scalable if it remains effective when there is a significant


increase in the number of resources and the number of users.
The challenge is to build distributed systems that scale with the increase in the number of
CPUs, users, and processes, larger databases, etc.
Scalability along several dimensions: size, geography, administrative domains

Challenges:
Controlling the cost of resources or money.
Controlling the performance loss.

Adaption of distributed systems to


accommodate more users
respond faster (this is the hard one)
Usually done by adding more and/or faster processors.
Components should not need to be changed when scale of a system increases.
Design components to be scalable!

Fault Tolerance
Hardware, software and networks fail!
Distributed systems must maintain availability even at low levels of
hardware/software/network reliability.
Fault tolerance is achieved by
recovery
redundancy
Transparency

It hides the fact that the processes and resources are physically distributed across
multiple computers.
How to achieve single-system image? How to hide distribution from users or
programs?
Is it a good idea? Sometimes transparency must be traded off for performance

Distributed systems should be perceived by users and application programmers as a


whole rather than as a collection of cooperating components.
Transparency has different dimensions that were identified by ANSA.
These represent various properties that distributed systems should have.

Access Transparency
Enables local and remote information objects to be accessed using identical
operations.
Example: File system operations in NFS, Navigation in the Web and SQL Queries
Location Transparency
Enables information objects to be accessed without knowledge of their location.
Example: File system operations in NFS, Pages in the Web, Tables in distributed
databases.
Concurrency Transparency
Enables several processes to operate concurrently using shared information
objects without interference between them.
Example: NFS, Automatic teller machine network, Database management
system.
Replication Transparency
Enables multiple instances of information objects to be used to increase reliability
and performance without knowledge of the replicas by users or application
programs
Example: Distributed DBMS, Mirroring Web Pages.
Failure Transparency
Enables the concealment of faults
Allows users and applications to complete their tasks despite the failure of other
components.

Example: Database Management System
Migration Transparency
Allows the movement of information objects within a system without affecting
the operations of users or application programs.
Example: NFS, Web Pages
Performance Transparency
Allows the system to be reconfigured to improve performance as loads vary.
Example: Distributed make.
Scaling Transparency
Allows the system and applications to expand in scale without change to the
system structure or the application algorithms.
Example: World-Wide-Web, Distributed Database
Distributed System Models

System architecture depends on the specific application


What is the best way to separate and place the parts of a distributed system at the software and
system level?
Many models are driven by the unique characteristics of participants, such as mobile, remote, and ad-hoc users.
The design of interaction, failure handling, and security will depend on the application domain.

Characterization

The structure and the organization of systems and the relationship among their
components should be designed with the following goals in mind:

To cover the widest possible range of circumstances.

To face the possible difficulties and threats.

To meet the current and possibly the future demands.

Architectural models provide both:

a pragmatic starting point

a conceptual view

to address these challenges.

Widely varying models of use

High variation of workload, partial disconnection of components,
or poor connection.

Wide range of system environments

Heterogeneous hardware, operating systems, network, and


performance.

Internal problems

Non synchronized clocks, conflicting updates, various hardware


and software failures.

External threats

Attacks on data integrity, secrecy, and denial of service.

Architectural Models
Architectural models provide a high-level view of the distribution of functionality between
system components and the interaction relationships between them.
Architectural models define
components (logical components deployed at physical nodes)
communication
Criteria
performance
reliability
scalability, ..

Client-Server
Clients send requests to servers
A server is a system that runs a service
The server is always on and processes requests from
clients
Clients do not communicate with other clients
Client-server model:

Service provided by multiple servers:

Needed:
name service
trading/broker service
browsing service

Tiered architectures
Tiered (multi-tier) architectures
distributed systems analogy to a layered architecture
Each tier (layer)
Runs as a network service
Is accessed by surrounding layers
The classic client-server architecture is a two-tier model
Clients: typically responsible for user interaction
Servers: responsible for back-end services (data access, printing, ...)

Layered architectures
Break functionality into multiple layers
Each layer handles a specific abstraction
Hides implementation details and specifics of hardware, OS,
network abstractions, data encoding, ...

Peer-to-Peer (P2P) Model


No reliance on servers
Machines (peers) communicate with each other

Goals
Robustness
Expect that some systems may be down
Self-scalability: the system can handle greater workloads as more peers are added
Examples
BitTorrent, Skype

At a high level we can look at the application and user requirements, and classify the individual
processes and logic as server, client, or peer processes. Take a dictionary server as an
example. We could have had one monolithic application, with all words and the interface available
locally.
However, a client-server approach may be more efficient. Previous queries could be cached
locally. Word additions, removals, and modifications can be done centrally. Client-server is a valid approach!
We could instead use a peer model where each peer holds a segment of the dictionary.
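The client-server split for the dictionary example can be sketched with Python sockets. This is a minimal illustration, not a production design: the word list, host address, and message sizes are made-up assumptions, and a single `recv` is only adequate for the tiny messages used here.

```python
import socket
import threading

# Toy dictionary held centrally on the server; clients send a word and
# receive its definition. Entries are illustrative.
DEFINITIONS = {"node": "a computer in a network",
               "peer": "an equal participant"}

def serve(sock):
    """Server loop: one short request/response per connection."""
    while True:
        conn, _ = sock.accept()
        with conn:
            word = conn.recv(1024).decode()       # tiny message: one recv is enough
            conn.sendall(DEFINITIONS.get(word, "unknown").encode())

def lookup(word, address):
    """Client side: connect, send the word, read the definition."""
    with socket.create_connection(address) as conn:
        conn.sendall(word.encode())
        return conn.recv(1024).decode()

server = socket.socket()
server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
server.listen()
threading.Thread(target=serve, args=(server,), daemon=True).start()

print(lookup("node", server.getsockname()))  # a computer in a network
```

Note how the design matches the discussion above: definitions are maintained centrally, and a client could easily add a local cache of previous lookups.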

Software Architecture and layers


The term software architecture referred:

Originally to the structure of software as layers or modules in a single computer.

More recently in terms of services offered and requested between processes in the
same or different computers.

Breaking up the complexity of systems by designing them through layers and services

Layer: a group of related functional components

Service: functionality provided to the next layer


Layers exist in the stack at both a single machine and between multiple machines
Operating systems typically split functionality into layers, and hide the complexity of many
common operations from users and programmers.
Layers access levels above and below them as services

Software and hardware service layers in distributed systems

The service layers, from top to bottom:

- applications, services
- middleware
- operating system
- computer and network hardware

The operating system and hardware together form the platform.

A networked computer operating system is a perfect example of this layering


Users interact with applications and services using the GUI or CLI
Applications and services depend on middleware and OS functions to do anything useful
(RPC/RMI/TCP+IP)
Middleware/OS access the hardware and network links directly to send and receive information

Platform

The lowest hardware and software layers are often referred to as a platform for
distributed systems and applications.

These low-level layers provide services to the layers above them, which are implemented
independently in each computer.

Major Examples

Intel x86/Windows

Intel x86/Linux

Intel x86/Solaris

SPARC/SunOS

PowerPC/MacOS

Even though these platforms are reasonably disparate (Unix, Windows) & (x86, PPC, SPARC)
they still have a core set of services and middleware in common. This is essential to enable these
platforms to communicate, interoperate and collaborate together with minimal effort.
We don't want to be converting between little-endian/big-endian representations or to have to care
about the specifics of each platform at the higher levels.
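The byte-order problem that middleware hides can be shown in a few lines with Python's `struct` module; the integer value used is arbitrary.

```python
import struct

value = 0x01020304

# The same 32-bit integer encoded in big-endian ("network byte order")
# and in little-endian form.
big = struct.pack(">I", value)     # b'\x01\x02\x03\x04'
little = struct.pack("<I", value)  # b'\x04\x03\x02\x01'

# A receiver that assumes the wrong byte order decodes a different number.
(wrong,) = struct.unpack("<I", big)
print(hex(wrong))  # 0x4030201

# Middleware avoids this by fixing one wire format, e.g. network byte order.
(right,) = struct.unpack(">I", big)
assert right == value
```

This is exactly the kind of per-platform detail (along with word sizes, alignment, and character encodings) that a middleware layer resolves once so applications never see it.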

Middleware

A layer of software whose purpose is to mask heterogeneity present in distributed


systems and to provide a convenient programming model to application developers.

Major Examples:

Sun RPC (Remote Procedure Calls)

OMG CORBA (Common Object Request Broker Architecture)

Microsoft D-COM (Distributed Components Object Model)

Sun Java RMI

Modern Middleware:

Manjrasoft Aneka for Cloud computing

IBM WebSphere

Microsoft .NET

Sun J2EE

Google AppEngine

To enable us to be truly platform agnostic, we need a core set of middleware and functions that
we can use on all platforms
Sun RPC is available on most Unix platforms
CORBA is a distributed middleware that is available for all major platforms
Java RMI is available on any platform where a compliant JVM is installed
.NET is becoming a standard, and is available for Windows/Linux/MacOS
Higher level middleware (Gridbus/Globus) can hide even more complexity

System Architecture

The most evident aspect of DS design is the division of responsibilities between system
components (applications, servers, and other processes) and the placement of the
components on computers in the network.
It has major implication for:
-Performance, reliability, and security of the resulting system.

The decisions that you make as system designers are crucial [tradeoffs]:

Moving some load to the client (e.g. applet execution) can improve responsiveness and allow some
disconnected access, but can be a security risk.
Moving complexity onto the server can reduce client requirements and lower the entry bar
(e.g. server-based CGI/PHP, where the client only needs a browser).
TRADEOFFS!!!

Client-Server Basic Model: clients invoke individual servers

[Figure: client processes send invocations to server processes and receive results back. Key: processes are placed on computers.]

Client processes interact with individual server processes on separate computers in order
to access data or resources. The server in turn may use services of other servers.

Example:

A web server is often a client of a file server.

A search engine's crawlers act as clients of other web servers.

Querying a web server, which could then query a MySQL or Oracle database before returning the
content of a page: the web server is a client of the database server.
A search engine, as well as serving search requests from clients, crawls other websites to keep its
information current: the search engine is thus both a server and a client of other web servers.

A service provided by multiple servers

[Figure: a service implemented by several server processes; each client can interact with any of them.]

Services may be implemented as several server processes in separate host computers.


Example: cluster-based Web servers and applications such as Google, and parallel databases such as Oracle

This topology is extremely common. A web site like Google serves approximately 100M searches
a day.
It is obviously simply not feasible to serve them from a single server.
Google uses clusters containing tens of thousands of machines offering equivalent services, and
you are redirected (via DNS and other means) to one of them. Clients can also be redirected at the
protocol or application level.
Similar techniques are used for Oracle databases, which are replicated over many servers to
offer redundancy and performance.
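The redirection idea can be sketched as a simple round-robin selection over equivalent replicas, as a DNS rotation or load balancer might do; the server names are invented for illustration.

```python
import itertools

# Equivalent replicas of one service; any of them can answer a request.
# Hostnames are illustrative placeholders.
replicas = ["server-a.example.org",
            "server-b.example.org",
            "server-c.example.org"]
next_replica = itertools.cycle(replicas)  # endless round-robin iterator

def pick_server():
    """Return the replica the next request should be directed to."""
    return next(next_replica)

assigned = [pick_server() for _ in range(5)]
print(assigned)
# ['server-a.example.org', 'server-b.example.org', 'server-c.example.org',
#  'server-a.example.org', 'server-b.example.org']
```

Real deployments use more sophisticated policies (least-loaded, geographically nearest), but the principle is the same: the client sees one service, the system picks one of many servers.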

Proxy servers (replication transparency) and caches: Web proxy server

[Figure: clients access web servers through a shared proxy server, which caches recently fetched pages.]

A cache is a store of recently used data.

Web proxy servers can operate at client level, at ISP level and at edge/gateway levels to improve
performance and reduce communication costs for frequently accessed data.
Caching can even be used for dynamic data (such as a google search). This reduces the load on
the web servers and improves the performance for end users by reducing the time taken for a
dynamic request. Google uses this technique extensively! When serving 100M requests/day this
saves considerable resources.
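The proxy-cache behaviour described above can be sketched as follows. This is a minimal illustration: `fetch_from_origin` is a stand-in for a real HTTP request, and the TTL policy is an assumption (real proxies honour cache-control headers).

```python
import time

def fetch_from_origin(url):
    """Stand-in for a real HTTP request to the origin server."""
    time.sleep(0.01)  # simulate network latency
    return f"<html>content of {url}</html>"

class ProxyCache:
    """Serve recently fetched pages locally instead of re-contacting the origin."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self.store = {}  # url -> (expiry_time, body)

    def get(self, url):
        entry = self.store.get(url)
        if entry and entry[0] > time.time():
            return entry[1]                        # cache hit: no origin contact
        body = fetch_from_origin(url)              # cache miss: go to origin
        self.store[url] = (time.time() + self.ttl, body)
        return body

proxy = ProxyCache()
a = proxy.get("http://example.org/")   # miss: fetched from the origin
b = proxy.get("http://example.org/")   # hit: served from the local cache
assert a == b
```

The second request never touches the origin server, which is precisely how proxies at the client, ISP, and gateway levels reduce load and latency for frequently accessed data.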

Peer Processes: A distributed application based on peer processes

TKSawe
Peer 2

Peer 1
Applic ation

Applic ation

S harable Peer 3
objec ts
Applic ation

Peer 4

Applic ation

Peers 5 .... N

All of the processes play similar roles, interacting cooperatively as peers to perform
distributed activities or computations without distinction between clients and servers.
E.g., music-sharing systems such as Gnutella, Napster, Kazaa, etc.

A distributed whiteboard allows users on several computers to view and interactively
modify a shared picture.

The peer model suits ad-hoc groupings of participants. It can be used very effectively for
BitTorrent-style swarm downloads.
- No central point of failure (reliable)
- No central point of control (difficult for adversaries to deny service)
- Some peers will typically contribute more than others (i.e. a seed or super-peer)

Variants of the Client-Server Model: Mobile code and Web applets
a) a client request results in the downloading of applet code
b) the client then interacts with the applet

[Figure: (a) the client downloads applet code from the web server; (b) the applet then runs on the client, which interacts with it and with the web server.]

Similar to client-server model but client takes on more responsibility


As code is executed locally the application has good responsiveness.
Need to be careful running such code as client can be exploited. From a server perspective, they
cannot control the client environment so there are integrity issues there as well (e.g. client applet
not suitable for online banking!)

Variants of the Client-Server Model: Mobile Agents


A mobile agent is a running program (code and data) that travels from one computer to another in a
network, carrying out an autonomous task, usually on behalf of some other process.
- Advantages: flexibility, savings in communication cost.
- Applications: virtual markets, software maintenance on the computers within an organisation.
- Agents are a potential security threat to the resources in the computers they visit. The
environment receiving an agent should decide which local resources to allow it to use (e.g.,
crawlers and web servers).
- Agents themselves can be vulnerable: they may not be able to complete their task if they are
refused access.

Communication in distributed systems

The most important difference between a distributed system and a single-processor system is the
communication between processes. In a single-processor system, communication implicitly assumes
the existence of shared memory:
Ex: the producer-consumer problem, where one process writes to a shared buffer and another
process reads from it.
In a distributed system there is no shared memory, and thus the whole nature of communication
between processes must be rethought. For processes to communicate, they must adhere to rules
known as protocols. For distributed systems over a wide area, these protocols often take the form
of several layers, and each layer has its own goals and rules. Messages can be exchanged in various
ways; there are many design options in this regard, and an important one is the "remote procedure
call." It is also important to consider the possibilities of communication between groups of
processes, not only between two processes. In distributed systems, given the absence of a physical
connection between the memories of the different machines, communication is performed by
message transfer.

Message Passing

Interprocess communication (IPC) basically requires information sharing among two or


more processes. Two basic methods for information sharing are as follows:

original sharing, or shared-data approach;


copy sharing, or message-passing approach.

Two basic interprocess communication paradigms: the shared data approach and
message passing approach.

In the shared-data approach, the information to be shared is placed in a common memory
area that is accessible to all processes involved in an IPC.

In the message-passing approach, the information to be shared is physically copied from


the sender process's address space to the address space of all the receiver processes, and this is done by
transmitting the data to be copied in the form of messages (a message is a block of information).

A message-passing system is a subsystem of a distributed operating system that provides a


set of message-based IPC protocols, and does so by shielding the details of complex network
protocols and multiple heterogeneous platforms from programmers. It enables processes to
communicate by exchanging messages and allows programs to be written by using simple
communication primitives, such as send and receive.

Desirable Features of a Good Message-Passing System

Simplicity

A message-passing system should be simple and easy to use. It should be possible to
communicate with both old and new applications and with different modules without having
to worry about system and network details.

Uniform Semantics

In a distributed system, a message-passing system may be used for the following two types of
interprocess communication:

local communication, in which the communicating processes are on the same node;

remote communication, in which the communicating processes are on different nodes.

Semantics of remote communication should be as close as possible to those of local


communications. This is an important requirement for ensuring that the message passing is easy
to use.

Efficiency

An IPC protocol of a message-passing system can be made efficient by reducing the number of
message exchanges, as far as practicable, during the communication process. Some optimizations
normally adopted for efficiency include the following:

avoiding the costs of establishing and terminating connections between the same pair of
processes for each and every message exchange between them;
minimizing the costs of maintaining the connections;
piggybacking of acknowledgement of previous messages with the next message during a
connection between a sender and a receiver that involves several message exchanges.

Correctness

Correctness is a feature related to IPC protocols for group communication. Issues related
to correctness are as follows:

atomicity;
ordered delivery;
survivability.

Atomicity ensures that every message sent to a group of receivers will be delivered to
either all of them or none of them. Ordered delivery ensures that messages arrive at all receivers
in an order acceptable to the application. Survivability guarantees that messages will be correctly
delivered despite partial failures of processes, machines, or communication links.

Other features:

reliability;
flexibility;
security;
portability.

Issues in IPC by Message Passing

A message is a block of information formatted by a sending process in such a manner that
it is meaningful to the receiving process. It consists of a fixed-length header and a variable-size
collection of typed data objects. The header usually consists of the following elements:

Address. It contains characters that uniquely identify the sending and receiving processes in
the network.
Sequence number. This is the message identifier (ID), which is very useful for identifying lost
messages and duplicate messages in case of system failures.
Structural information. This element has two parts. The type part specifies whether the
data to be passed on to the receiver is included within the message or the message only
contains a pointer to the data, which is stored somewhere outside the contiguous portion of
the message. The second part specifies the length of the variable-size message data.

A typical message structure
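The header layout above can be sketched in code. The following is a minimal illustration using Python's struct module; the field names and sizes are assumptions chosen for this example, not taken from any particular message-passing system:

```python
import struct

# Hypothetical fixed-length header layout (sizes are illustrative):
#   sender id (4 bytes), receiver id (4 bytes), sequence number (4 bytes),
#   type flag (1 byte: 0 = data inline, 1 = pointer to out-of-line data),
#   data length (4 bytes).
HEADER_FMT = "!IIIBI"  # network byte order, no padding -> 17 bytes

def pack_message(sender, receiver, seq, inline, payload):
    """Build a message: fixed-length header + variable-size data part."""
    header = struct.pack(HEADER_FMT, sender, receiver, seq,
                         0 if inline else 1, len(payload))
    return header + payload

def unpack_header(message):
    """Recover the header fields from the front of a received message."""
    size = struct.calcsize(HEADER_FMT)
    sender, receiver, seq, type_flag, length = struct.unpack(HEADER_FMT,
                                                             message[:size])
    return {"sender": sender, "receiver": receiver, "seq": seq,
            "inline": type_flag == 0, "length": length}

msg = pack_message(1, 2, 42, True, b"hello")
hdr = unpack_header(msg)
```

A receiver would use the length field of the header to know where the variable-size data part ends.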

Synchronization

A central issue in the communication structure is the synchronization imposed on the
communicating processes by the communication primitives. The semantics used for
synchronization may be broadly classified as blocking and nonblocking types. A primitive is said
to have nonblocking semantics if its invocation does not block the execution of its invoker (the
control returns almost immediately to the invoker); otherwise, a primitive is said to be of the
blocking type.

In the case of a blocking send primitive, after execution of the send statement the sending
process is blocked until it receives an acknowledgement from the receiver that the message has
been received. On the other hand, for a nonblocking send primitive, after execution of the send
statement the sending process is allowed to proceed with its execution as soon as the message
has been copied to a buffer.

In the case of a blocking receive primitive, after execution of the receive statement the
receiving process is blocked until it receives a message. On the other hand, for a nonblocking
receive primitive, the receiving process proceeds with its execution after execution of the receive
statement, which returns control almost immediately, just after telling the kernel where the
message buffer is.

An important issue in a nonblocking receive primitive is how the receiving process
knows that the message has arrived in the message buffer. One of the following two methods is
commonly used for this purpose:

Polling. In this method, a test primitive is provided to allow the receiver to check the buffer
status. The receiver uses this primitive to periodically poll the kernel to check if the message
is already available in the buffer.

Interrupt. In this method, when the message has been filled in the buffer and is ready for use
by the receiver, a software interrupt is used to notify the receiving process.

A variant of the nonblocking receive primitive is the conditional receive primitive, which
also returns control to the invoking process almost immediately, either with a message or with
an indicator that no message is available.
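The polling and conditional-receive styles can be illustrated with Python's standard queue module standing in for the kernel's message buffer; this is a sketch of the semantics, not a real kernel interface:

```python
import queue

buf = queue.Queue()  # stands in for the kernel's message buffer

def conditional_receive(q):
    """Return control immediately, either with a message or with an
    indicator that no message is available (never blocks)."""
    try:
        return True, q.get_nowait()
    except queue.Empty:
        return False, None

ok, msg = conditional_receive(buf)    # nothing has been sent yet
buf.put("hello")                      # a sender fills the buffer
ok2, msg2 = conditional_receive(buf)  # polling again now succeeds
```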

When both the send and receive primitives of a communication between two processes
use blocking semantics, the communication is said to be synchronous, otherwise it is
asynchronous. The main drawback of synchronous communication is that it limits concurrency
and is subject to communication deadlocks.
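A synchronous, rendezvous-style exchange can be sketched with two threads and an explicit acknowledgement event; again this illustrates the semantics only, not a kernel implementation:

```python
import threading
import queue

channel = queue.Queue()
ack = threading.Event()
received = []

def blocking_send(msg):
    """Blocking send: wait for the receiver's acknowledgement."""
    channel.put(msg)
    ack.wait()             # the sender is blocked here until acknowledged

def receiver():
    msg = channel.get()    # blocking receive: wait for a message
    received.append(msg)
    ack.set()              # acknowledge, unblocking the sender

t = threading.Thread(target=receiver)
t.start()
blocking_send("ping")      # returns only after the receiver has the message
t.join()
```

Note how the sender cannot overlap any work with the transfer; this is the loss of concurrency mentioned above.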

Synchronous mode of communication with both send and receive primitives having blocking-
type semantics

Buffering

In the standard message passing model, messages can be copied many times: from the user
buffer to the kernel buffer (the output buffer of a channel), from the kernel buffer of the sending
computer (process) to the kernel buffer in the receiving computer (the input buffer of a channel),
and finally from the kernel buffer of the receiving computer (process) to a user buffer.

Null Buffer (No Buffering)

In this case, there is no place to temporarily store the message. Hence one of the following
implementation strategies may be used:

The message remains in the sender process's address space and the execution of the send is
delayed until the receiver executes the corresponding receive.

The message is simply discarded and the time-out mechanism is used to resend the message
after a timeout period. The sender may have to try several times before succeeding.

The three types of buffering strategies used in interprocess communication

Single-Message Buffer

In the single-message buffer strategy, a buffer having the capacity to store a single message is
used on the receiver's node. This strategy is usually used for synchronous communication; an
application module may have at most one message outstanding at a time.

Unbounded-Capacity Buffer

In the asynchronous mode of communication, since a sender does not wait for the receiver to be
ready, there may be several pending messages that have not yet been accepted by the receiver.
Therefore, an unbounded-capacity message-buffer that can store all unreceived messages is
needed to support asynchronous communication with the assurance that all the messages sent to
the receiver will be delivered.

Finite-Bound Buffer

Unbounded capacity of a buffer is practically impossible. Therefore, in practice, systems using
the asynchronous mode of communication use finite-bound buffers, also known as multiple-
message buffers. In this case, a message is first copied from the sending process's memory into
the receiving process's mailbox and then copied from the mailbox to the receiver's memory when
the receiver calls for the message.

When the buffer has finite bounds, a strategy is also needed for handling the problem of a
possible buffer overflow. The buffer overflow problem can be dealt with in one of the following
two ways:

Unsuccessful communication. In this method, message transfers simply fail whenever there
is no more buffer space, and an error is returned.

Flow-controlled communication. The second method is to use flow control, which means that
the sender is blocked until the receiver accepts some messages, thus creating space in the
buffer for new messages. This method introduces a synchronization between the sender and
the receiver and may result in unexpected deadlocks. Moreover, due to the synchronization
imposed, the asynchronous send does not operate in the truly asynchronous mode for all send
commands.
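Both overflow-handling strategies can be seen with Python's bounded queue.Queue playing the role of the finite-bound buffer (an illustration only):

```python
import queue

buf = queue.Queue(maxsize=2)   # a finite-bound (multiple-message) buffer
buf.put("m1")
buf.put("m2")                  # the buffer is now full

# Unsuccessful communication: the send simply fails and an error
# (here, the queue.Full exception) is returned to the sender.
try:
    buf.put("m3", block=False)
    overflow = False
except queue.Full:
    overflow = True

# Flow-controlled communication: a blocking put() would suspend the
# sender until the receiver frees a slot; here the receiver accepts
# one message first, so the retried send succeeds.
buf.get()
buf.put("m3", block=False)
```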

Multidatagram Messages
Almost all networks have an upper bound on the amount of data that can be transmitted at a
time. This size is known as the maximum transfer unit (MTU). A message whose size is greater
than the MTU has to be fragmented into pieces no larger than the MTU, and each fragment then
has to be sent separately. Each fragment is known as a datagram. Messages larger than the MTU
are thus sent as multiple packets and are known as multidatagram messages.
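Fragmentation and reassembly of a multidatagram message can be sketched as follows; the MTU value here is deliberately tiny for illustration:

```python
MTU = 4  # real MTUs are much larger (e.g., 1500 bytes for Ethernet)

def fragment(message, mtu=MTU):
    """Split a message into datagrams of at most mtu bytes each."""
    return [message[i:i + mtu] for i in range(0, len(message), mtu)]

def reassemble(datagrams):
    """Rebuild the original message from its fragments (in order)."""
    return b"".join(datagrams)

parts = fragment(b"multidatagram")   # a 13-byte message -> 4 datagrams
```

A real protocol would also number the fragments so the receiver can detect loss and reorder them.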

Encoding and Decoding of Messages

Message data should be meaningful to the receiving process. This implies that, ideally, the
structure of program objects should be preserved while they are being transmitted from the
address space of the sending process to the address space of the receiving process. However,
even in homogeneous systems, it is very difficult to achieve this goal, mainly for two
reasons:

An absolute pointer value loses its meaning when transferred from one process's address space
to another.

Different program objects occupy varying amounts of storage space. To be meaningful, a
message must normally contain several types of program objects, such as long integers, short
integers, variable-length character strings, and so on.

To transfer program objects, they are first converted from their original form to a stream
form that is suitable for transmission and placed into a message buffer; this is known as
encoding of message data. The process of reconstructing the program objects from the message
data on the receiver side is known as decoding. One of the following two representations may
be used for the encoding and decoding of message data:

In tagged representation the type of each program object along with its value is encoded in
the message.

In untagged representation the message data contains only the program objects. No information
is included in the message data to specify the type of each program object.
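The two representations can be sketched with Python's struct module; the tag values and object layout are assumptions invented for this example:

```python
import struct

TAG_INT, TAG_STR = 1, 2   # tag values invented for this sketch

def encode_tagged(objs):
    """Tagged representation: each object carries a type tag (and length)."""
    out = b""
    for obj in objs:
        if isinstance(obj, int):
            out += struct.pack("!Bi", TAG_INT, obj)
        else:
            data = obj.encode()
            out += struct.pack("!BI", TAG_STR, len(data)) + data
    return out

def decode_tagged(buf):
    """Rebuild the program objects using the embedded tags."""
    objs, i = [], 0
    while i < len(buf):
        tag = buf[i]
        i += 1
        if tag == TAG_INT:
            (value,) = struct.unpack_from("!i", buf, i)
            i += 4
        else:
            (n,) = struct.unpack_from("!I", buf, i)
            i += 4
            value = buf[i:i + n].decode()
            i += n
        objs.append(value)
    return objs

# Untagged representation: only the raw values go on the wire; the
# sender and receiver must agree on the layout out of band.
untagged = struct.pack("!i", 7) + "hi".encode()

roundtrip = decode_tagged(encode_tagged([7, "hi"]))
```

The untagged form is more compact, but the tagged form lets the receiver decode without prior knowledge of the message layout.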

Process Addressing

Another important issue in message-based communication is addressing (or naming) of the
parties involved in an interaction. For greater flexibility a message-passing system usually
supports two types of process addressing:

Explicit addressing. The process with which communication is desired is explicitly named as
a parameter in the communication primitive used.

Implicit addressing. The process willing to communicate does not explicitly name a process
for communication (the sender names a server instead of a process). This type of process
addressing is also known as functional addressing.

Remote Procedure Calls (RPC)


This section provides an overview of Remote Procedure Calls (RPC).

What Is RPC

RPC is a powerful technique for constructing distributed, client-server based applications. It is
based on extending the notion of conventional, or local, procedure calling, so that the called
procedure need not exist in the same address space as the calling procedure. The two processes
may be on the same system, or they may be on different systems with a network connecting
them. By using RPC, programmers of distributed applications avoid the details of the interface
with the network. The transport independence of RPC isolates the application from the physical
and logical elements of the data communications mechanism and allows the application to use a
variety of transports.

RPC makes the client/server model of computing more powerful and easier to program. When
combined with the ONC RPCGEN protocol compiler (Chapter 33) clients transparently make
remote calls through a local procedure interface.

How RPC Works

An RPC is analogous to a function call. Like a function call, when an RPC is made, the calling
arguments are passed to the remote procedure and the caller waits for a response to be returned
from the remote procedure. Figure 32.1 shows the flow of activity that takes place during an
RPC call between two networked systems. The client makes a procedure call that sends a request
to the server and waits. The thread is blocked from processing until either a reply is received, or
it times out. When the request arrives, the server calls a dispatch routine that performs the
requested service, and sends the reply to the client. After the RPC call is completed, the client
program continues. RPC specifically supports network applications.

Fig: Remote Procedure Calling Mechanism

A remote procedure is uniquely identified by the triple (program number, version number,
procedure number). The program number identifies a group of related remote procedures. A
program may consist of one or more versions, and version numbers enable multiple versions of
an RPC protocol to be available simultaneously. Each version consists of a collection of
procedures that are available to be called remotely, and each procedure has a unique procedure
number.
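The identification triple and the stub mechanism can be sketched without a network: a client stub marshals the call, and a dispatch table keyed by (program, version, procedure) plays the role of the server. All numbers and names below are invented for this example:

```python
import pickle

# Dispatch table keyed by the identifying triple
# (program number, version number, procedure number).
procedures = {}

def register(prog, vers, proc):
    """Register a function under a (program, version, procedure) triple."""
    def wrap(fn):
        procedures[(prog, vers, proc)] = fn
        return fn
    return wrap

@register(prog=0x20000001, vers=1, proc=1)
def add(a, b):
    return a + b

def server_dispatch(request_bytes):
    """Server side: unmarshal the request, run the procedure, marshal the reply."""
    prog, vers, proc, args = pickle.loads(request_bytes)
    return pickle.dumps(procedures[(prog, vers, proc)](*args))

def rpc_call(prog, vers, proc, *args):
    """Client stub: marshal the call and wait for the reply.
    A real stub would send the request over the network here."""
    request = pickle.dumps((prog, vers, proc, args))
    reply = server_dispatch(request)
    return pickle.loads(reply)

result = rpc_call(0x20000001, 1, 1, 2, 3)
```

Generated stubs (such as those produced by RPCGEN) automate exactly this marshalling and dispatch, hiding it behind a local procedure interface.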

RPC Application Development

Consider an example:

A client/server lookup in a personal database on a remote machine, assuming that we cannot
access the database from the local machine (for example, via NFS).

On UNIX we could run a remote shell and execute the command that way. There are some
problems with this method:

the command may be slow to execute;

you require a login account on the remote machine.

The RPC alternative is to

establish a server on the remote machine that can respond to queries;
retrieve information by calling a query, which will be quicker than the previous approach.

To develop an RPC application the following steps are needed:

Specify the protocol for client-server communication

Develop the client program

Develop the server program

The programs will be compiled separately. The communication protocol is realized by generated
stubs, and these stubs and the RPC library (and other libraries) will need to be linked in.

Distributed Naming
Name
A name in a distributed system is a string of bits or characters used to refer to an entity.

Entity is something that is operated on using some access point.

Entities: hosts, printers, disks, files, processes, users, mailboxes, web pages, graphical
windows, messages, network connections, etc.

The name of the access point is called an address.

An entity may change its access point over the course of time.

Thus the address cannot be treated as the name of the entity. Moreover, an entity may have
more than one access point. A location-independent name is separate from the address of the
access point.

An identifier is an unambiguous reference to an entity.

human-readable text: .login, caesar

system/low-level names: bit pattern (Amoeba)

multi-level name: caesar.cs.umn.edu

Entity

Entity has an address

serves as the access point for the entity

entity Jon_Server; address 192.44.33.64, 333

can change over time

Entity may have attributes

Jon_Server

Owner: jon

Lifetime: 1 hour

Identifier

- An identifier refers to at most one entity

- Each entity is referred to by at most one identifier

- An identifier always refers to the same entity

Binding and Resolution

To use an entity, need to find an access point

Binding: associate {name, address}

usually maintained by a Name Server

Resolution: name -> address

sometimes this is called navigation

name -> Name Server1 -> Name Server2 -> ... -> Address
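Such multi-step resolution can be sketched with a chain of toy name servers. The name and address reuse the examples above (caesar.cs.umn.edu, 192.44.33.64); the server names and tables are invented:

```python
# Each toy "name server" maps a name suffix either to the next server
# to ask or, at the final hop, to an address.
name_servers = {
    "root":   {"edu": ("SERVER", "edu-ns")},
    "edu-ns": {"umn.edu": ("SERVER", "umn-ns")},
    "umn-ns": {"caesar.cs.umn.edu": ("ADDRESS", "192.44.33.64")},
}

def resolve(name, server="root"):
    """Follow the chain: name -> Name Server1 -> Name Server2 -> ... -> address."""
    for suffix, (kind, value) in name_servers[server].items():
        if name.endswith(suffix):
            if kind == "ADDRESS":
                return value
            return resolve(name, server=value)   # referred to the next server
    raise KeyError(name)

addr = resolve("caesar.cs.umn.edu")
```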

Name Spaces
Set of all valid names to be used in a certain context, e.g., all valid URLs in the WWW.

Can be described using a generative grammar (e.g., BNF for URLs).

Internal structure:

- Flat set of numeric or symbolic identifiers

- Hierarchy representing position (e.g., the UNIX file system)

- Hierarchy representing organizational structure (e.g., Internet domains)

Potentially infinite:

- Holds only for hierarchic name spaces

- Flat name spaces have a finite size, induced by the maximum name length

Aliases

- In general, allows a convenient name to be substituted for a more complicated one

Naming domain

- Name space for which there exists a single administrative authority for assigning names
within it

DNS Domain Names


The Domain Name System is implemented as a hierarchical and distributed database containing
various types of data, including host names and domain names. The names in a DNS database
form a hierarchical tree structure called the domain namespace. Domain names consist of
individual labels separated by dots, for example: mydomain.microsoft.com.

Understanding the DNS Domain Namespace


The DNS domain namespace is based on the concept of a tree of named domains. Each level of
the tree can represent either a branch or a leaf of the tree. A branch is a level where more than
one name is used to identify a collection of named resources. A leaf represents a single name
used once at that level to indicate a specific resource.

How the DNS Domain Namespace Is Organized


Any DNS domain name used in the tree is technically a domain. Most DNS discussions,
however, identify names in one of five ways, based on the level and the way a name is
commonly used. For example, the DNS domain name registered to Microsoft (microsoft.com.) is
known as a second-level domain. This is because the name has two parts (known as labels) that
indicate it is located two levels below the root or top of the tree. Most DNS domain names have
two or more labels, each of which indicates a new level in the tree. Periods are used in names to
separate labels.
The five categories used to describe DNS domain names by their function in the namespace are
described in the following table, along with an example of each name type.

Types of DNS Domain Names

Root domain. This is the top of the tree, representing an unnamed level; it is sometimes shown
as two empty quotation marks (""), indicating a null value. When used in a DNS domain name,
it is stated by a trailing period (.) to designate that the name is located at the root, or highest,
level of the domain hierarchy. In this instance, the DNS domain name is considered to be
complete and points to an exact location in the tree of names. Names stated this way are called
fully qualified domain names (FQDNs). Example: a single period (.), or a period used at the end
of a name, such as example.microsoft.com.

Top-level domain. A name used to indicate a country/region or the type of organization using a
name. Example: .com, which indicates a name registered to a business for commercial use on
the Internet.

Second-level domain. Variable-length names registered to an individual or organization for use
on the Internet. These names are always based upon an appropriate top-level domain, depending
on the type of organization or the geographic location where a name is used. Example:
microsoft.com., the second-level domain name registered to Microsoft by the Internet DNS
domain name registrar.

Subdomain. Additional names that an organization can create that are derived from the
registered second-level domain name. These include names added to grow the DNS tree of
names in an organization and divide it into departments or geographic locations. Example:
example.microsoft.com., a fictitious subdomain assigned by Microsoft for use in documentation
example names.

Host or resource name. Names that represent a leaf in the DNS tree of names and identify a
specific resource. Typically, the leftmost label of a DNS domain name identifies a specific
computer on the network. For example, if a name at this level is used in a host (A) resource
record, it is used to look up the IP address of a computer based on its host name. Example:
host-a.example.microsoft.com., where the first label (host-a) is the DNS host name for a
specific computer on the network.

DNS and Internet Domains
The Internet Domain Name System is managed by a Name Registration Authority on the
Internet, responsible for maintaining the top-level domains that are assigned by organization and
by country/region. The two-letter country/region codes follow International Standard ISO 3166.
Some of the many existing abbreviations reserved for use by organizations, as well as the
two-letter country/region codes, are shown in the following table:
Some DNS Top-level Domain Names (TLDs)

DNS Domain Name Type of Organization

com Commercial organizations

edu Educational institutions

org Non-profit organizations

net Networks (the backbone of the Internet)

gov Non-military government organizations

mil Military government organizations

arpa Reverse DNS

xx Two-letter country code (e.g. us, au, ca, fr)

Distributed Transactions
Transaction
In computer programming, a transaction usually means a sequence of information exchange and
related work (such as database updating) that is treated as a unit for the purposes of satisfying a
request and for ensuring database integrity. For a transaction to be completed and the database
changes to be made permanent, the transaction has to be completed in its entirety. A typical
transaction is a catalog merchandise order phoned in by a customer and entered into a computer
by a customer representative. The order transaction involves checking an inventory database,
confirming that the item is available, placing the order, and confirming that the order has been
placed and the expected time of shipment. If we view this as a single transaction, then all of the
steps must be completed before the transaction is successful and the database is actually changed
to reflect the new order. If something happens before the transaction is successfully completed,
any changes to the database must be kept track of so that they can be undone.

A program that manages or oversees the sequence of events that are part of a transaction is
sometimes called a transaction monitor. Transactions are supported by Structured Query
Language, the standard database user and programming interface. When a transaction completes
successfully, database changes are said to be committed; when a transaction does not complete,
changes are rolled back. In IBM's Customer Information Control System product, a transaction is
a unit of application data processing that results from a particular type of transaction request. In
CICS, an instance of a particular transaction request by a computer operator or user is called a
task.

Less frequently and in other computer contexts, a transaction may have a different meaning. For
example, in IBM mainframe operating system batch processing, a transaction is a job or a job
step.

Commit and rollback of transactions

At any time, an application process might consist of a single transaction. However the life of an
application process can involve many transactions as a result of commit or rollback operations.

A transaction begins when data is read or written. A transaction ends with a COMMIT or
ROLLBACK statement or with the end of an application process.

The COMMIT statement commits the database changes that were made during the
current transaction, making the changes permanent.

DB2 holds or releases locks that are acquired on behalf of an application process,
depending on the isolation level in use and the cause of the lock.

The ROLLBACK statement backs out, or cancels, the database changes that are made by
the current transaction and restores changed data to the state before the transaction began.

The initiation and termination of a transaction define points of consistency within an application
process. A point of consistency is a time when all recoverable data that an application program
accesses is consistent with other data. The following figure illustrates these concepts.

Figure 1. A transaction with a commit operation

When a rollback operation is successful, DB2 backs out uncommitted changes to restore the data
consistency that existed when the unit of work was initiated. That is, DB2 undoes the work, as
shown in the following figure. If the transaction fails, the rollback operation begins.

Figure 2. Rolling back changes from a transaction

An alternative to cancelling a transaction is to roll back changes to a savepoint. A savepoint is a
named entity that represents the state of data at a particular point in time during a transaction.

You can use the ROLLBACK statement to back out changes only to a savepoint within the
transaction without ending the transaction.

Savepoint support simplifies the coding of application logic to control the treatment of a
collection of SQL statements within a transaction. Your application can set a savepoint within a
transaction. Without affecting the overall outcome of the transaction, application logic can undo
the data changes that were made since the application set the savepoint. The use of savepoints
makes coding applications more efficient because you don't need to include contingency and
what-if logic in your applications.
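SQLite also supports savepoints, so the idea can be made concrete with Python's sqlite3 module; the table, names, and amounts below are invented for the example:

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode, so we can issue
# BEGIN / SAVEPOINT / COMMIT statements ourselves.
con = sqlite3.connect(":memory:", isolation_level=None)
cur = con.cursor()
cur.execute("CREATE TABLE account (name TEXT PRIMARY KEY, bal INTEGER)")
cur.execute("INSERT INTO account VALUES ('A', 100)")

cur.execute("BEGIN")
cur.execute("UPDATE account SET bal = bal - 40 WHERE name = 'A'")  # the transfer
cur.execute("SAVEPOINT before_fee")                # mark a point mid-transaction
cur.execute("UPDATE account SET bal = bal - 10 WHERE name = 'A'")  # a tentative fee
cur.execute("ROLLBACK TO SAVEPOINT before_fee")    # undo only the fee...
cur.execute("COMMIT")                              # ...the transfer still commits

bal = cur.execute("SELECT bal FROM account WHERE name = 'A'").fetchone()[0]
```

Only the change made after the savepoint is undone; the transaction as a whole still commits.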

To assure the ACID properties of a transaction, any changes made to data in the course of a
transaction must be committed or rolled back.

When a transaction completes normally, a transaction processing system commits the changes
made to the data; that is, it makes them permanent and visible to other transactions.

When a transaction does not complete normally, the system rolls back (or backs out) the changes;
that is, it restores the data to its last consistent state.

Resources that can be rolled back to their state at the start of a transaction are known as
recoverable resources; resources that cannot be rolled back are non-recoverable.

Desirable Properties of Transactions


Transactions should possess several properties. These are often called the ACID properties, and
they should be enforced by the concurrency control and recovery methods of the DBMS. The
following are the ACID properties:

1. Atomicity: A transaction is an atomic unit of processing; it is either performed in its entirety
or not performed at all.

The atomicity property requires that we execute a transaction to completion. It is the
responsibility of the transaction recovery subsystem of a DBMS to ensure atomicity. If a
transaction fails to complete for some reason, such as a system crash in the midst of transaction
execution, the recovery technique must undo any effects of the transaction on the database.

E.g.:

Consider the case of a funds transfer from account A to account B. The intended transaction is:

A.bal -= amount;
B.bal += amount;

Suppose the system crashes after the first update:

A.bal -= amount;
CRASH

On recovery, the partial effect must be undone:

A.bal += amount; -- rollback
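The same scenario can be made concrete with Python's sqlite3 module, simulating the crash with an exception and the recovery with a rollback; the table and amounts are invented for the example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE account (name TEXT PRIMARY KEY, bal INTEGER)")
cur.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
con.commit()

amount = 30
try:
    cur.execute("UPDATE account SET bal = bal - ? WHERE name = 'A'", (amount,))
    raise RuntimeError("simulated crash between the two updates")
    cur.execute("UPDATE account SET bal = bal + ? WHERE name = 'B'", (amount,))  # never reached
    con.commit()
except RuntimeError:
    con.rollback()   # recovery: the partial debit is undone

balances = dict(cur.execute("SELECT name, bal FROM account"))
```

After the rollback, neither account reflects the half-finished transfer, which is exactly what atomicity demands.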

2. Consistency preservation: A transaction is consistency preserving if its complete execution
takes the database from one consistent state to another.

The preservation of consistency is generally considered to be the responsibility of the
programmers who write the database programs or of the DBMS module that enforces integrity
constraints. Recall that a database state is a collection of all the stored data items (values) in the
database at a given point in time. A consistent state of the database satisfies the constraints
specified in the schema as well as any other constraints that should hold on the database. A
database program should be written in a way that guarantees that, if the database is in a
consistent state before executing the transaction, it will be in a consistent state after the complete
execution of the transaction, assuming that no interference with other transactions occurs.

E.g.:
Consider the case of a funds transfer from account A to account B. The intended transaction is:

A.bal -= amount;
B.bal += amount;

Suppose the credit is applied first and the debit then fails:

B.bal += amount;
A.bal -= amount; (FAILS, as A's balance is 0)
B.bal -= amount; -- rollback

Without the rollback, the database would be left in an inconsistent state in which the total
funds across the two accounts have increased.

3. Isolation: A transaction should appear as though it is being executed in isolation from other
transactions. That is, the execution of a transaction should not be interfered with by any other
transactions executing concurrently.

Isolation is enforced by the concurrency control subsystem of the DBMS. If every transaction
does not make its updates visible to other transactions until it is committed, one form of isolation
is enforced that solves the temporary update problem and eliminates cascading rollbacks. There
have been attempts to define the level of isolation of a transaction. A transaction is said to have
level 0 (zero) isolation if it does not overwrite the dirty reads of higher-level transactions. A level
1 (one) isolation transaction has no lost updates; and level 2 isolation has no lost updates and no
dirty reads. Finally, level 3 isolation (also called true isolation) has, in addition to degree 2
properties, repeatable reads.
E.g.:
Consider the case of a funds transfer from account A to account B.

Transaction T1:
A.bal -= amount; (let A's balance become 0 after this)
B.bal += amount;

Transaction T2:
A.bal -= amount2;

The net effect should be equivalent to either T1 then T2 (in which case T2 fails) or
T2 then T1 (in which case T1 fails).

4. Durability or permanency: The changes applied to the database by a committed transaction
must persist in the database. These changes must not be lost because of any failure.

Finally, the durability property is the responsibility of the recovery subsystem of the DBMS.
E.g.:
Consider the case of a funds transfer from account A to account B. Suppose account A initially
has a balance of amount.

Transaction T1:
A.bal -= amount;
B.bal += amount;
Commit

After the commit, account A has a balance of 0, and this change must persist even if the system
subsequently fails.

ACID properties of transactions

In the context of transaction processing, the acronym ACID refers to the four key properties of a
transaction: atomicity, consistency, isolation, and durability.

Atomicity
All changes to data are performed as if they are a single operation. That is, all the changes
are performed, or none of them are.
For example, in an application that transfers funds from one account to another, the
atomicity property ensures that, if a debit is made successfully from one account, the
corresponding credit is made to the other account.
Consistency
Data is in a consistent state when a transaction starts and when it ends.
For example, in an application that transfers funds from one account to another, the
consistency property ensures that the total value of funds in both the accounts is the same
at the start and end of each transaction.
Isolation

The intermediate state of a transaction is invisible to other transactions. As a result,
transactions that run concurrently appear to be serialized.
For example, in an application that transfers funds from one account to another, the
isolation property ensures that another transaction sees the transferred funds in one
account or the other, but not in both, nor in neither.
Durability
After a transaction successfully completes, changes to data persist and are not undone,
even in the event of a system failure.
For example, in an application that transfers funds from one account to another, the
durability property ensures that the changes made to each account will not be reversed.

What Are Distributed Transactions?


A distributed transaction is a transaction that updates data on two or more networked computer
systems. Distributed transactions extend the benefits of transactions to applications that must
update distributed data. Implementing robust distributed applications is difficult because these
applications are subject to multiple failures, including failure of the client, the server, and the
network connection between the client and server. In the absence of distributed transactions, the
application program itself must detect and recover from these failures.

A distributed transaction includes one or more statements that, individually or as a group,
update data on two or more distinct nodes of a distributed database. For example, assume the
database configuration depicted in


For distributed transactions, each computer has a local transaction manager. When a transaction
does work at multiple computers, the transaction managers interact with other transaction
managers via either a superior or subordinate relationship. These relationships are relevant only
for a particular transaction.

Each transaction manager performs all the enlistment, prepare, commit, and abort calls for its
enlisted resource managers (usually those that reside on that particular computer). Resource
managers manage persistent or durable data and work in cooperation with the Distributed
Transaction Coordinator (DTC) to guarantee atomicity and isolation to an application.

TKSawe
In a distributed transaction, each participating component must agree to commit a change action
(such as a database update) before the transaction can occur. The DTC performs the transaction
coordination role for the components involved and acts as a transaction manager for each
computer that manages transactions. When committing a transaction that is distributed among
several computers, the transaction manager sends prepare, commit, and abort messages to all its
subordinate transaction managers. In the two-phase commit algorithm for the DTC, phase one
involves the transaction manager requesting each enlisted component to prepare to commit; in
phase two, if all successfully prepare, the transaction manager broadcasts the commit decision.

In general, transactions involve the following steps:

1. Applications call the transaction manager to begin a transaction.

2. When the application has prepared its changes, it asks the transaction manager to commit
the transaction. The transaction manager keeps a sequential transaction log so that its
commit or abort decisions will be durable.
o If all components are prepared, the transaction manager commits the transaction
and the log is cleared.
o If any component cannot prepare, the transaction manager broadcasts an abort
decision to all components involved in the transaction.
o While a component is prepared but not yet committed or aborted, it is in doubt
about whether the transaction committed or aborted. If a component or transaction
manager fails, it reconciles its in-doubt transactions when it reconnects.

When a transaction manager is in-doubt about a distributed transaction, the transaction manager
queries the superior transaction manager. The root transaction manager, also referred to as the
global commit coordinator, is the transaction manager on the system that initiates a transaction
and is never in-doubt. If an in-doubt transaction persists for too long, the system administrator
can force the transaction to commit or abort.

Note
Many aspects of a distributed transaction are identical to a transaction whose scope is a single database. For
example, a distributed transaction provides predictable behavior by enforcing the ACID properties that define
all transactions.


Two-phase commit algorithm


When a transaction involves multiple distributed resources, for example, a database server on
each of two different network hosts, the commit process is somewhat complex because the
transaction includes operations that span two distinct software systems, each with its own
resource manager, log records, and so on. (In this case, the distributed resources are the database
servers.)
Two-phase commit is a transaction protocol designed for the complications that arise with
distributed resource managers. With a two-phase commit protocol, the distributed transaction
manager employs a coordinator to manage the individual resource managers.
The commit process proceeds as follows:
Phase 1
Each participating resource manager coordinates its local operations and forces all log
records out:
o If successful, it responds "OK"
o If unsuccessful, it either allows a time-out or responds "NO"
Phase 2
If all participants respond "OK":
o The coordinator instructs the participating resource managers to "COMMIT"
o Participants complete the operation, writing the log record for the commit
Otherwise:
o The coordinator instructs the participating resource managers to "ROLLBACK"
o Participants complete their respective local undos

In order for the scheme to work reliably, both the coordinator and the participating resource
managers independently must be able to guarantee proper completion, including any necessary
restart/redo operations. The algorithms for guaranteeing success by handling failures at any stage
are provided in advanced database texts.
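The two phases above can be sketched in a few lines of Python. The Coordinator and Participant classes, the method names, and the vote strings are all illustrative, not part of any real transaction manager's API; a production implementation would also need the durable logging and restart handling discussed above.

```python
# Sketch of the two-phase commit flow described above. Names are illustrative;
# a real implementation must force log records to stable storage and recover
# in-doubt participants after a crash.

class Participant:
    """A resource manager enlisted in the transaction."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare    # simulate a prepare failure
        self.log = []                     # stand-in for forced log records

    def prepare(self):
        # Phase 1: coordinate local operations, force log records out.
        if self.can_prepare:
            self.log.append("prepared")
            return "OK"
        return "NO"

    def commit(self):
        self.log.append("commit")         # write the commit log record

    def rollback(self):
        self.log.append("rollback")       # perform the local undo


class Coordinator:
    """The global commit coordinator."""

    def run(self, participants):
        # Phase 1: ask every enlisted resource manager to prepare.
        votes = [p.prepare() for p in participants]
        # Phase 2: commit only if everyone voted "OK"; otherwise roll back.
        if all(v == "OK" for v in votes):
            for p in participants:
                p.commit()
            return "COMMIT"
        for p in participants:
            p.rollback()
        return "ROLLBACK"
```

With two healthy participants the run commits; if any participant answers "NO" in phase one, every participant is told to roll back, including those that had already prepared.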

Distributed System synchronization
Synchronization: single-CPU systems vs. distributed systems.
Single CPU:
critical regions, mutual exclusion, and other synchronization problems are solved using
methods such as semaphores and monitors.
DS:
semaphores and monitors are not appropriate, since they rely on the existence of shared
memory.
Problems to be tackled:
Time
Mutual exclusion
Election algorithms
Atomic transactions
Deadlocks

Computers clock
Each computer has a circuit for keeping track of time

The word "clock" is used to refer to these devices, but they are not actually clocks in the usual
sense: timer is perhaps a better word.
A computer timer is usually a precisely machined quartz crystal.
When kept under tension, quartz crystals oscillate at a well-defined frequency that depends on
the kind of crystal, how it is cut, and the amount of tension.
Associated with each crystal are two registers, a counter and a holding register.
Each oscillation of the crystal decrements the counter by one.
When the counter gets to zero, an interrupt is generated and the counter is reloaded from the
holding register.
In this way, it is possible to program a timer to generate an interrupt 60 times a second, or at
any other desired frequency.
Each interrupt is called one clock tick.
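The counter/holding-register mechanism above can be simulated in a few lines of Python (the Timer class and its names are illustrative, not a real hardware interface):

```python
# Simulation of the timer circuit described above: each crystal oscillation
# decrements the counter; at zero an interrupt fires (one clock tick) and the
# counter is reloaded from the holding register.

class Timer:
    def __init__(self, holding_register):
        self.holding = holding_register   # oscillations per clock tick
        self.counter = holding_register
        self.ticks = 0                    # interrupts generated so far

    def oscillate(self):
        self.counter -= 1
        if self.counter == 0:
            self.ticks += 1               # the interrupt: one clock tick
            self.counter = self.holding   # reload from the holding register

# With a 1 MHz crystal, a holding register of about 16667 yields ~60 ticks
# per second. A tiny value keeps the demonstration short:
timer = Timer(holding_register=5)
for _ in range(12):
    timer.oscillate()
# 12 oscillations with a holding register of 5 produce 2 ticks.
```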

Problems with physical clocks


Clock skew
Although the frequency at which a crystal oscillator runs is usually fairly stable, it is
impossible to guarantee that the crystals in different computers all run at exactly the same
frequency.
When a system has n computers, all n crystals will run at slightly different rates, causing the
(software) clocks gradually to get out of sync and give different values when read out.
Computer clocks, like any other clocks, tend not to be in perfect agreement!
Clock skew (offset): the difference between the times on two clocks, |Ci(t) - Cj(t)|
Clock drift: the clocks count time at different rates
Ordinary quartz clocks drift by ~1 sec in 11-12 days (a rate of ~10^-6 secs/sec).
High-precision quartz clocks have a drift rate of ~10^-7 or 10^-8 secs/sec
Causes: differences in crystal material, temperature variation, etc.
Consequence: programs that are based on the time associated with a file, object, process, or
message can fail
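The drift figures above are easy to check: at 10^-6 sec/sec, a clock accumulates one second of error after 10^6 seconds, which is roughly 11.6 days.

```python
# Back-of-the-envelope check of the drift rate quoted above for ordinary
# quartz clocks (~10^-6 seconds of error per second of real time).

DRIFT_RATE = 1e-6                 # sec of error per sec (ordinary quartz)
SECONDS_PER_DAY = 24 * 60 * 60    # 86400

# Days until a single clock is 1 second off real time:
days_to_one_second = 1 / (DRIFT_RATE * SECONDS_PER_DAY)   # ~11.6 days

# Two clocks drifting in opposite directions diverge twice as fast, so their
# mutual skew can reach 1 second in roughly half that time.
days_to_one_second_skew = days_to_one_second / 2          # ~5.8 days
```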

Lack of Global Time in DS


It is impossible to guarantee that physical clocks run at the same frequency.
The resulting lack of global time can cause problems:

When each machine has its own clock, an event that occurred after another event may
nevertheless be assigned an earlier time.

Coordinated Universal Time (UTC)


Coordinated Universal Time (UTC) is the primary time standard by which the world regulates
clocks and time. It has essentially replaced the old standard, Greenwich Mean Time.
UTC signals come from shortwave radio broadcasting stations or from satellites (GEOS, GPS)
with an accuracy of:
1.0 msec (broadcasting station)
1.0 μsec (GPS)
>> 1 msec (UTC available via phone line)
Receivers are available commercially and can be connected to PCs

Physical Clock Synchronization


Clock inaccuracies cause serious and troublesome problems in distributed systems. The clocks of
different processors need to be synchronized to limit errors and to allow efficient
communication and resource sharing. Hence the clocks need to be monitored and adjusted
continuously; otherwise they drift apart. Similarly, clock skew introduces a mismatch between
the time values of two clocks. Both drift and skew must be addressed to make efficient use of
the features of distributed systems.
External: synchronize with an external resource, e.g. a UTC source: |S(t) - Ci(t)| < D
Internal: synchronize without access to an external resource: |Ci(t) - Cj(t)| < D

Cristian's Algorithm - External Synch


External source S
Denote the clock value at process X by C(X)
Periodically, a process P:
1. sends a message to S, requesting the time
2. receives a message from S, containing the time C(S)
3. adjusts its own clock. Should we simply set C(P) = C(S)?
The reply takes time to travel, so by the time P adjusts C(P) to C(S), the server's clock
has already advanced (C(S) > C(P)). Cristian's remedy: estimate the transmission delay as
half the measured round-trip time and set C(P) = C(S) + RTT/2.
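A sketch of the idea, assuming the usual round-trip correction; the time source here is a plain callable, not a real network service:

```python
# Sketch of Cristian's algorithm. P measures the round trip of its request
# to S and assumes the reply spent about half of it in transit, so it sets
# C(P) = C(S) + round_trip / 2 rather than C(P) = C(S).
import time

def cristian_sync(request_server_time):
    """Return P's estimate of the server's current time."""
    t0 = time.monotonic()                 # P sends the request
    server_time = request_server_time()   # S replies with C(S)
    t1 = time.monotonic()                 # P receives the reply
    round_trip = t1 - t0
    return server_time + round_trip / 2   # compensate for transmission delay

# With a local (near-instant) "server" the correction is tiny:
estimate = cristian_sync(lambda: 100.0)
```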

Berkeley Algorithm - Internal Synch


Internal: synchronize without access to an external resource |Ci(t) - Cj(t)| < D
Periodically,
S: send C(S) to each client P

P: calculate ΔP = C(P) - C(S)
send ΔP to S
S: receive all ΔPs
compute the average of the offsets
send the adjustment (average - ΔP) to each client P
P: apply the adjustment to C(P)

a) The time daemon asks all the other machines for their clock values
b) The machines answer
c) The time daemon tells everyone how to adjust their clock
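The averaging in steps (b)-(c) can be sketched as follows; the function name and the dict-based representation of the machines' answers are illustrative:

```python
# Sketch of the Berkeley algorithm's averaging step. The time daemon S
# collects each client's offset ΔP = C(P) - C(S), averages over all machines
# (itself included, with offset 0), and tells each client how to adjust.

def berkeley_adjustments(server_time, client_times):
    # Step (b): offsets computed from the clients' answers.
    offsets = {p: c - server_time for p, c in client_times.items()}
    # Average over all machines, the daemon included (its own offset is 0).
    avg = sum(offsets.values()) / (len(offsets) + 1)
    # Step (c): the adjustment that moves each client to the average.
    return {p: avg - d for p, d in offsets.items()}

adjust = berkeley_adjustments(3.0, {"p1": 3.0, "p2": 2.0, "p3": 5.0})
# Offsets are 0, -1 and +2; the average over four machines is 0.25, so the
# daemon sends +0.25 to p1, +1.25 to p2 and -1.75 to p3.
```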

Network Time Protocol - External Synch


Synchronizes clients relative to UTC on an Internet-wide scale
Reliable, even in the presence of extensive loss of connectivity
Allows frequent synchronization (relative to the clock drift)
Tolerant of disturbances; accuracy < 1 ms within a LAN, 1-10 ms at Internet scale

Logical Clocks
For many DS algorithms, associating an event with an absolute real time is not essential; we
only need to know an unambiguous order of events.
Synchronization is based on relative time.
Example: Unix make (is output.c updated after the generation of output.o?)
Relative time may not relate to the real time.
What's important is that the processes in the Distributed System agree on the ordering in
which certain events occur.
Such clocks are referred to as Logical Clocks.
Lamport Algorithm

Clock synchronization does not have to be exact
Synchronization not needed if there is no interaction between machines
Synchronization only needed when machines communicate
i.e. must only agree on ordering of interacting events

Happened Before Relation


a Happened Before b: a → b
The happened-before relation captures the causal dependencies between events:
1. a → b if a and b are events in the same process and a occurred before b.
2. a → b if a is the event of sending a message m in a process and b is the event of
receipt of the same message m by another process.
3. If a → b and b → c, then a → c, i.e. the happened-before relation is transitive.
That is, past events causally affect future events.

Two distinct events a and b are concurrent (a || b) if neither a → b nor b → a.
In that case we cannot say that either event happened before the other.
For any two events a and b in a distributed system, either a → b, b → a, or a || b.

Logical Clocks
There is a clock Ci at each process pi
The clock Ci can be thought of as a function that assigns a number Ci(a) to any event a, called
the timestamp of event a, at pi
These clocks can be implemented by counters and have no relation to physical time.

Conditions Satisfied by the System of Clocks


For any events a and b: if a → b, then C(a) < C(b).
This implies the following two conditions:
[C1] For any two events a and b in a process Pi, if a occurs before b, then Ci(a) < Ci(b).
[C2] If a is the event of sending a message m in process Pi and b is the event of
receiving the same message m at process Pj, then Ci(a) < Cj(b).
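Conditions [C1] and [C2] are realized by Lamport's clock rules: increment the local counter before each event, attach it to outgoing messages, and on receipt advance the counter past the sender's timestamp. A minimal sketch (class and method names are illustrative):

```python
# Minimal Lamport logical clock. Counters only; no relation to physical time.

class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1                    # [C1]: later local events get larger stamps
        return self.time

    def send_event(self):
        self.time += 1
        return self.time                  # timestamp carried with the message

    def receive_event(self, msg_time):
        # [C2]: stamp the receipt after the send, whatever our clock read.
        self.time = max(self.time, msg_time) + 1
        return self.time

pi, pj = LamportClock(), LamportClock()
ts = pi.send_event()            # Ci(a) = 1
recv = pj.receive_event(ts)     # Cj(b) = max(0, 1) + 1 = 2, so Ci(a) < Cj(b)
```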

