
Distributed Computing

Lecture 1
Fundamentals

Jan 2007 Nishi Tiku V.E.S.I.T_M.C.A 1


Course Description

 This course introduces the concepts and design of
distributed computing systems.
 Topics covered include
 Message passing
 Remote procedure calls
 Resource and process management
 Migration
 Mobile agents
 Distributed coordination
 Distributed shared memory
 Distributed file systems, fault tolerance
 Naming
 Case studies
 The final project requires teamwork: each team of two
students picks a parallelizable application and compares its
programmability and performance.



Textbooks and References

 Distributed Operating Systems: Concepts and Design,
Pradeep K. Sinha
 Distributed Systems: Concepts and Design,
George Coulouris
 Distributed Systems: Principles and Paradigms,
Andrew S. Tanenbaum and Maarten van Steen



Definition of a Distributed System
 A distributed system is
a collection of independent computers that appears to its
users as a single coherent system
 A distributed system is one in which components
located at networked computers communicate and
coordinate their actions only by passing messages



Distributed System
(A distributed system organized as middleware.
Note that the middleware layer extends over multiple machines.)



EXAMPLES
 The Internet
 An intranet, which is a portion of the
Internet managed by an organization
 Mobile and ubiquitous computing (devices
such as laptop computers, PDAs, mobile phones, and refrigerators,
together with their ability to connect conveniently to networks
in different places; ubiquitous computing is the
harnessing of many small, cheap computational
devices that are present in users’ physical
environments, including the home, office and
elsewhere)



Evolution of distributed computing systems

 Mainframe computers
 Batch processing (to improve CPU utilization)
 Automatic sequencing of jobs (use of control cards)
 Off-line processing (buffering and spooling)
 Multiprogramming (the CPU always has something to
execute)
 Time sharing
 Multiple users
 Resource sharing



Parallel v.s. Distributed Systems

Coupling
 Parallel: tightly coupled multiprocessor systems
 Distributed: loosely coupled multicomputer systems
Memory
 Parallel: tightly coupled shared memory (UMA, NUMA)
 Distributed: distributed memory; message passing, RPC, and/or
distributed shared memory
Control
 Parallel: global clock control; SIMD, MIMD
 Distributed: no global clock control; synchronization algorithms needed
Processor interconnection network
 Parallel: order of Tbps; bus, mesh, tree, mesh of trees,
hypercube (and related)
 Distributed: order of Gbps; Ethernet (bus), token ring and SCI
(ring), Myrinet (switching network)
Main focus
 Parallel: performance; scientific computing
 Distributed: performance (cost and scalability);
reliability/availability; information/resource sharing
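Since distributed systems have no global clock control and need synchronization algorithms, event ordering must come from logical time. A minimal sketch (not from the slides) of Lamport's logical clocks: each process ticks on local events, and a receiver jumps past the sender's timestamp so that a send is always ordered before its receive.

```python
class LamportClock:
    """Per-process logical clock implementing Lamport's rules."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1          # rule 1: tick on every local event
        return self.time

    def send(self):
        self.time += 1          # sending is an event; stamp the message
        return self.time

    def receive(self, msg_time):
        # rule 2: advance past the message's timestamp, then tick
        self.time = max(self.time, msg_time) + 1
        return self.time

p, q = LamportClock(), LamportClock()
p.local_event()                 # p's clock: 1
t = p.send()                    # p's clock: 2; message carries timestamp 2
q.receive(t)                    # q's clock jumps to 3
assert q.time > t               # so the send happens-before the receive
```

This gives only a partial "happened-before" ordering; totally ordered variants break ties with process IDs.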
Parallel (Tightly coupled) v.s. Distributed (Loosely coupled) Systems
[Figure: in the tightly coupled system, several CPUs share a single
system-wide shared memory through interconnection hardware; in the
loosely coupled system, each CPU has its own local memory and the
machines communicate over a communication network.]
Milestones in Distributed
Computing Systems
1945-1950s         Loading monitor
1950s-1960s        Batch systems
1960s              Multiprogramming
1960s-1970s        Time-sharing systems       Multics, IBM 360
1969-1973          WAN and LAN                ARPAnet, Ethernet
1960s-early 1980s  Minicomputers              PDP, VAX
Early 1980s        Workstations               Alto
1980s-present      Workstation/server models  Sprite, V-system
1990s              Clusters                   Beowulf
Late 1990s         Grid computing             Globus, Legion
Distributed Computing System Models

 Minicomputer model
 Workstation model
 Workstation-server model
 Processor-pool model
 Cluster model
 Grid computing



Minicomputer Model
[Figure: minicomputers connected by the ARPAnet.]

 Extension of the time-sharing system
 A user must first log on to his/her home minicomputer.
 Thereafter, he/she can log on to a remote machine via
telnet.
 Resource sharing
 Databases
 High-performance devices
Workstation Model (diskful, with a local file
system)
[Figure: workstations connected by a 100Gbps LAN.]
 Process migration
 Users first log on to their personal workstations.
 If there are idle remote workstations, a heavy job may
migrate to one of them.
 Issues:
 How to find an idle workstation
 How to migrate a job
 What if a user logs onto a workstation which is running a
process of a remote machine (preemptive process migration
facility)
Workstation-Server Model (client and server processes can be on the
same machine)
[Figure: diskless workstations on a 100Gbps LAN, served by dedicated
minicomputers: a file server, an HTTP server, and a print server.]
 Client workstations
 Diskless
 Graphic/interactive applications processed locally
 All file, print, HTTP and even cycle-computation
requests are sent to servers.
 Server minicomputers
 Each minicomputer is dedicated to one or more
different types of services.
 Client-server model of communication
 RPC (Remote Procedure Call)
 RMI (Remote Method Invocation)
 A client process calls a server process’s
function.
 No process migration is invoked, so the
response time is guaranteed.
 Example: NFS
 Advantages
(cost, maintenance, access from anywhere, easy upgrades)
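The RPC style above can be sketched with Python's built-in xmlrpc modules (an illustration only; the systems these slides describe used technologies such as Sun RPC or DCE RPC). The `lookup` function and its "file server" behavior are invented for the example: the client calls a remote procedure as if it were local.

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def lookup(filename):
    """A toy 'file server' procedure exposed over RPC."""
    return f"contents of {filename}"

# Server side: register the procedure and serve in a background thread.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(lookup)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the proxy makes the remote call look like a local one.
client = ServerProxy(f"http://127.0.0.1:{port}")
result = client.lookup("report.txt")   # marshalled, sent, executed remotely
print(result)                          # "contents of report.txt"
server.shutdown()
```

Note that, as the slide says, no process migrates: only the call's arguments and result cross the network.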
Processor-Pool Model
[Figure: terminals on a 100Gbps LAN, served by a run server and a pool
of backend processors (server 1 ... server N).]
 Clients:
 They log in at one of the terminals (diskless
workstations or X terminals).
 All services are dispatched to servers.
 Servers:
 The necessary number of processors is
allocated to each user from the pool.
 Better utilization but less interactivity
 A pool of backend processors handles the processing.
 A run server manages the backend processors.
 The user does not log on to a home machine (unlike in
the other models).
 Better utilization of processing power (e.g., when
recompiling a program consisting of a large number of
files)
Cluster Model
[Figure: client workstations on a 100Gbps LAN; a cluster of HTTP
servers (a master node and slave nodes 1 ... N) connected by a
1Gbps SAN.]
 Client
 Takes a client-server model
 Server
 Consists of many PCs/workstations connected to a high-speed network.
 Puts more focus on performance: serves requests in parallel.
Grid Computing
[Figure: supercomputers, clusters, minicomputers and workstations
sparsely located and connected by a high-speed information highway.]
 Goal
 Collect the computing power of supercomputers and clusters
sparsely located over the nation and make it available as if it
were the electric power grid
 Distributed supercomputing
 Very large problems needing lots of CPU, memory, etc.
 High-throughput computing
 Harnessing many idle resources
 On-demand computing
 Remote resources integrated with local computation
 Data-intensive computing
 Using distributed data
 Collaborative computing
 Support communication among multiple parties
Reasons for Distributed
Computing Systems
 Inherently distributed applications
 Distributed DBs, worldwide airline reservation, banking systems
 Information sharing among distributed users
 CSCW or groupware
 Resource sharing
 Sharing DBs/expensive hardware and controlling remote lab
devices
 Better cost-performance ratio / Performance
 Emergence of Gbit networks and high-speed/cheap MPUs
 Effective for coarse-grained or embarrassingly parallel applications
 Reliability
 Tolerance against errors and component failures; non-stop operation
(availability) and voting features
 Scalability
 Loosely coupled connections and hot plug-in
 Flexibility
 Reconfigure the system to meet users’ requirements
Network v.s. Distributed
Operating Systems

SSI (Single System Image)
 Network OS: No (ssh, sftp; no view of remote memory)
 Distributed OS: Yes (process migration, NFS, DSM (distributed
shared memory))
Autonomy
 Network OS: High (a local OS at each computer; no global job
coordination)
 Distributed OS: Low (a single system-wide OS; global job
coordination)
Fault tolerance
 Network OS: Unavailability grows as faulty machines increase;
therefore little fault tolerance
 Distributed OS: Unavailability remains small even if faulty
machines increase, though performance is decreased
Issues in Distributed Computing System
Transparency
Transparency is defined as the concealment from the
user and the application programmer of the
separation of components in a distributed system,
so that the system is perceived as a whole (virtual
uniprocessor) rather than as a collection of
independent components



Transparency
 Access transparency
 Memory access: DSM
 Function call: RPC and RMI
 Global resource naming facility
 Location transparency
 File naming: NFS
 Domain naming: DNS (still location-dependent)
 Migration transparency
 Automatic state capturing and migration
 Concurrency transparency
 Event ordering: message delivery and memory
consistency
 Other transparencies:
 Failure, replication, performance, and scaling
Transparency
 Eight forms of transparency:
 Access transparency (the user need not know whether a resource is
local or remote)
 Location transparency (resource names should be unique system-wide
(name transparency); users should be able to log on from any machine
(user mobility))
 Concurrency transparency (each user has the feeling that he/she is
the sole user)
 Event-ordering property, mutual-exclusion property,
no-starvation property, no-deadlock property
 Replication transparency (mapping of names and replication control
are the issues)
 Failure transparency (communication link failure, machine failure,
storage device crash)
 Migration transparency (migration should take place automatically,
with no change in name)
 Performance transparency (for load balancing, the system should be
reconfigured automatically)
 Scaling transparency (allow the system to expand without disrupting
the activities of the users, i.e., an open system architecture)
Transparency
Transparency   Description

Access         Hide differences in data representation and how a
               resource is accessed
Location       Hide where a resource is located
Migration      Hide that a resource may move to another
               location
Relocation     Hide that a resource may be moved to another
               location while in use
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several
               competitive users
Failure        Hide the failure and recovery of a resource
Persistence    Hide whether a (software) resource is in memory
               or on disk
Issues in Distributed Computing
System: Reliability
A fault is a mechanical or algorithmic defect that may generate an error.
 Faults
 When faults occur in hardware or software, programs may produce incorrect
results (Byzantine failures, difficult to deal with) or they may stop (fail-stop)
before they have completed the intended computation.
 Fail-stop failure
 Byzantine failure
 Fault avoidance
 The more machines involved, the less avoidance capability
 Fault tolerance
 Techniques for dealing with failures:
 Redundancy techniques
 K-fault tolerance needs K + 1 replicas
 K Byzantine failures need 2K + 1 replicas
 Distributed control
 Avoiding a complete fail-stop
 Fault detection and recovery
 Atomic transactions
 Stateless servers
 Acknowledgement- and timeout-based transmission of messages
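The replica counts above can be checked with a small voting sketch (invented for illustration): with 2K + 1 replicas, a majority vote masks up to K Byzantine replicas, because the K + 1 correct replicas always outvote them. (For fail-stop faults, K + 1 replicas suffice: any one surviving replica still holds the correct value.)

```python
from collections import Counter

def vote(replies):
    """Return the majority value among replica replies.

    Safe against Byzantine replies as long as strictly more than half
    of the replicas are correct, i.e. K faults out of 2K + 1 replicas.
    """
    value, count = Counter(replies).most_common(1)[0]
    assert count > len(replies) // 2, "no majority: too many faults"
    return value

K = 1
# 2K + 1 = 3 replies, of which K = 1 is Byzantine (arbitrarily wrong)
replies = ["42"] * (K + 1) + ["13"] * K
result = vote(replies)
print(result)    # "42": the correct K + 1 replicas win the vote
```

With only 2K replicas, K Byzantine replies could tie the vote, which is exactly why the bound is 2K + 1.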
Flexibility
 Ease of modification
 Ease of enhancement

[Figure: a monolithic kernel (Unix) bundles process, memory, device,
file and name management inside the kernel; a microkernel (Mach) keeps
the kernel minimal and runs daemons (file, name, ...) as user-level
processes on each networked machine.]
Performance/Scalability

 Unlike parallel systems, distributed systems involve OS
intervention and a slow network medium for data transfer.
 Send messages in a batch: send chunks of data, and piggyback
acknowledgements of previous messages on the next message.
 Avoid OS intervention for every message transfer.
 Cache data (at the client's site)
 Avoid repeating the same data transfer.
 Minimize data copies (moving data in and out of buffers can be
reduced)
 Migrate a process closer to the resources.
 Avoid OS intervention (= zero-copy messaging).
 Avoid centralized entities and algorithms
 Avoid network saturation.
 Perform most operations on the client side
 Avoid heavy traffic between clients and servers
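The batching idea above can be sketched as follows (a toy illustration; the class name and the in-memory "transport" are invented). Buffering small messages and handing them to the transport as one chunk amortizes the per-send overhead (system calls, headers, acknowledgements) over many messages.

```python
class BatchingSender:
    """Buffer small messages and flush them to the transport in chunks."""

    def __init__(self, transport, batch_size=4):
        self.transport = transport    # callable that takes a list of messages
        self.batch_size = batch_size
        self.buffer = []

    def send(self, msg):
        self.buffer.append(msg)
        if len(self.buffer) >= self.batch_size:
            self.flush()              # one transport call carries many messages

    def flush(self):
        if self.buffer:
            self.transport(self.buffer)
            self.buffer = []

# Stand-in for the network: each transport call appends one chunk.
chunks = []
sender = BatchingSender(chunks.append, batch_size=4)
for i in range(10):
    sender.send(f"msg-{i}")
sender.flush()                        # push out the final partial batch
# 10 messages crossed the "network" in only 3 transport calls
```

The same buffer is a natural place to piggyback acknowledgements of previously received messages onto the next outgoing batch.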
Heterogeneity
 Different networks, hardware, operating systems, programming
languages, developers, platforms. E.g., some hosts use a 32-bit word
length while others may use 16 or 64 bits; therefore some form of data
translation is necessary.
 We set up protocols to resolve these heterogeneities.
 Middleware: a software layer that provides a programming abstraction as
well as masking the heterogeneity.
 Mobile code: code that can be sent from one computer to another and
run at the destination.

 Data and instruction formats depend on each machine architecture.
 If a system consists of K different machine types, each machine
needs translation software for the other K–1 types.
 If we have architecture-independent standard data/instruction
formats, each different machine prepares only one such standard
translation software.
 Java and the Java virtual machine
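The architecture-independent standard format idea can be sketched with Python's `struct` module: each host translates its native representation to a single agreed wire format (network byte order, big-endian), so a little-endian and a big-endian machine agree on the bytes exchanged.

```python
import struct

def to_wire(value: int) -> bytes:
    """Encode a 32-bit signed integer in the standard wire format."""
    return struct.pack("!i", value)        # "!" = network byte order

def from_wire(data: bytes) -> int:
    """Decode a 32-bit signed integer from the standard wire format."""
    return struct.unpack("!i", data)[0]

wire = to_wire(0x12345678)
# The bytes on the wire are the same regardless of the sender's
# native endianness:
assert wire == b"\x12\x34\x56\x78"
assert from_wire(wire) == 0x12345678
```

This is the single-translator scheme from the slide: each machine implements only "native ↔ standard format", instead of one translator per peer architecture.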
Security
 Security for information resources has three
components:
 Confidentiality: protection against disclosure to
unauthorized individuals.
 Integrity: protection against alteration or corruption.
 Availability: protection against interference with the means
to access the resources.
 Two new security challenges:
 Denial of service attacks (DoS).
 Security of mobile code.
These issues of security can, most of the time, be
addressed by cryptography.
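As one concrete instance, the integrity component above can be addressed with a message authentication code. A minimal sketch using Python's `hmac` module, assuming a shared secret key (the key and message here are invented for the example): the receiver recomputes the tag and rejects messages altered in transit.

```python
import hashlib
import hmac

key = b"shared-secret"      # assumed to be distributed securely in advance

def sign(message: bytes) -> bytes:
    """Compute an integrity tag for the message under the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(message), tag)

msg = b"transfer 100 to account 42"
tag = sign(msg)
assert verify(msg, tag)                              # intact message accepted
assert not verify(b"transfer 999 to account 42", tag)  # tampering detected
```

Note this provides integrity and authenticity but not confidentiality; the message itself still travels in the clear unless it is also encrypted.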



Security

 Lack of a single point of control


 Security concerns:
 Messages may be stolen by an intruder.
 Messages may be plagiarized by an intruder.
 Messages may be changed by an intruder.



Distributed Computing Environment

[Figure: DCE layering. DCE applications on top; threads, RPC, the
distributed time service, the name service, security and the
distributed file service in the middle; various operating systems and
networking below.]
Distributed Application Examples
 Automated banking systems
 Tracking roaming cellular phones
 Global positioning systems
 Retail point-of-sale terminals
 Air-traffic control
 The World Wide Web



Banking example
 One central office:
 Clients at different ATMs may access the same accounts
simultaneously
 Two central offices:
 ATMs use the nearest central office
 Each office acts as a backup for the other
 Outstanding transactions?
 Network failures?



Design goals
 Performance
 Reliability
 Scalability
 Consistency
 Security



Design issues
 Naming
 Communication
 Software structure
 well-defined interfaces
 abstractions/layering & support services
 Scale
 Partial failure
 detection, masking & tolerance
 recovery



Added Complexity
 Concurrency
 No global clock
 Inconsistent states
 Independent failures
 Scale



Exercises
1. In what respect are distributed computing systems superior
to parallel systems?
2. In what respect are parallel systems superior to distributed
computing systems?
3. Discuss the difference between the workstation-server and
the processor-pool model from the availability view point.
4. Discuss the difference between the processor-pool and the
cluster model from the performance view point.
5. What is a Byzantine failure? Why do we need 2K + 1 replicas for
this type of failure?
6. Discuss the pros and cons of microkernels.
7. Why can we avoid OS intervention by zero copy?

