Distributed System

Instructions for current semester
• Maximum 2 extra practicals.
• If a student comes for an extra practical, the lab
assignment will be different from the regular one.
• Minimum 80% attendance in theory and 100% in
practicals.
• Any student found playing games or surfing
websites that are not related to the subject will be
marked absent in that slot.
Introduction to Distributed Systems
PRASHASTI KANIKAR 2
Definitions
• Tanenbaum: “A distributed system is a collection of
independent computers that appears to its users as a
single coherent system”.
• Coulouris: “A system in which hardware or software
components located at networked computers
communicate and coordinate their actions only by
message passing”.
Introduction contd…
• Two aspects to the definition:
– Hardware: autonomous computers, network links
– Software: communication protocols, system and
application software.
• A distributed system is built on top of a network.
• Distributed computing is computing performed in a
distributed system.
Centralized vs. Distributed Computing
Figure: Centralized computing (terminals attached to a mainframe computer) vs.
distributed computing (workstations and network hosts joined by network links).
• Examples of distributed systems:
– Network of workstations (NOW): a group of
networked personal workstations connected to one or
more server machines.
– The Internet
– An intranet: a network of computers and workstations
within an organization.
• Why study distributed systems:
– Economics: distributed systems allow the pooling of
resources, including CPU cycles, data storage, input/output
devices, and services.
– Reliability: a distributed system allows replication of
resources and/or services.
– Speed: a distributed system may have more total computing power than a
mainframe.
– Inherent distribution: some applications are inherently distributed, e.g. a
supermarket chain.
– Incremental growth: computing power can be added in small
increments (modular expandability).
– Another driving force: the existence of a large number of personal
computers and the need for people to collaborate and share information.
Parallel vs Distributed Systems
• Memory
– Parallel systems: tightly coupled shared memory; UMA, NUMA
– Distributed systems: distributed memory; message passing, RPC, and/or use of
distributed shared memory
• Control
– Parallel systems: global clock control; SIMD, MIMD
– Distributed systems: no global clock control; synchronization algorithms needed
• Processor interconnection
– Parallel systems: order of Tbps; bus, mesh, tree, mesh of tree, and
hypercube(-related) network
– Distributed systems: order of Gbps; Ethernet (bus), token ring and SCI
(ring), Myrinet (switching network)
• Main focus
– Parallel systems: performance; scientific computing
– Distributed systems: performance (cost and scalability); reliability/availability;
information/resource sharing
Goals of Distributed Systems
• Goals :
– Connecting users and resources
– Transparency
– Openness
– Scalability
Goals of Distributed Systems contd…
1. Connecting Users and Resources:
– Access remote resources
– Share resources
– Data sharing: groupware for collaborative
editing, teleconferencing, video conferencing, etc.
– Problem: security
2. Transparency:
Transparency: Description
– Access: hide differences in data representation and how a resource is
accessed
– Location: hide where a resource is located
– Migration: hide that a resource may move to another location
– Relocation: hide that a resource may be moved to another location while in use
– Replication: enable multiple instances of resources to be used to increase
reliability and performance without knowledge of the replicas by
users or application programmers
– Concurrency: enable several processes to operate concurrently using shared
resources without interference between them
– Failure: hide the failure and recovery of a resource
– Persistence: hide whether a (software) resource is in memory or on disk
3. Openness: System that offers services according to
standard rules that describe the syntax and semantics
of those services.
• Services are specified through interfaces, written in an Interface
Definition Language (IDL)
– Interoperability
– Portability
– Flexibility
4. Scalability:
– Size scalability
• Can add more users and resources
– Geographical scalability
• Can spread across different geographical areas
– Administrative scalability
• Can be manageable even if it spans many
independent organizations
• Scalability Problems
• Scaling w.r.t. size:
Concept: Example
– Centralized services: a single server for all users
– Centralized data: a single online telephone book
– Centralized algorithms: doing routing based on complete
information
• Geographical scaling:
– Form of communication: synchronous (a client blocks until the reply
arrives, which is costly over wide-area links)
– Communication is inherently unreliable and point-to-point.
• Scalability Problems
• Administrative scaling:
– Conflicting policies
– If a distributed system expands to another domain, two types of
security measures need to be taken:
• The distributed system has to protect itself against malicious attacks
from the new domain.
• The new domain has to protect itself against
malicious attacks from the distributed system.
• Scaling Techniques:
– Hiding communication latencies
– Distribution
– Replication
• Scaling Techniques:
– Hiding communication latencies
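One common way to hide communication latency is to issue requests asynchronously, so the caller does not sit idle waiting for each reply in turn. The sketch below is only an illustration: `remote_call` is a hypothetical stand-in that sleeps instead of using a network, and the thread pool is an assumed mechanism, not something the slides prescribe.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def remote_call(x, delay=0.01):
    """Hypothetical remote request: waits `delay` seconds, then replies."""
    time.sleep(delay)
    return x + 1

def synchronous(xs):
    # Each request blocks until its reply arrives, so the delays add up.
    return [remote_call(x) for x in xs]

def asynchronous(xs):
    # All requests are issued first; replies are collected afterwards,
    # so the waiting periods overlap.
    with ThreadPoolExecutor(max_workers=max(1, len(xs))) as pool:
        futures = [pool.submit(remote_call, x) for x in xs]
        return [f.result() for f in futures]

if __name__ == "__main__":
    print(synchronous([1, 2, 3]))
    print(asynchronous([1, 2, 3]))
```

Both functions return the same replies; only the waiting pattern differs.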
• Scaling Techniques:
– Distribution
Figure: An example of dividing the DNS name space into zones.
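The DNS example can be sketched as a set of zones, each of which knows only its own records and the child zones it delegates to. The zone contents, the name `www.example.edu` and the address below are made up for illustration; real DNS resolution involves far more detail.

```python
# Hypothetical zone data: no single zone holds the whole name space.
ZONES = {
    ".": {"delegations": {"edu": "edu.", "com": "com."}, "records": {}},
    "edu.": {"delegations": {"example": "example.edu."}, "records": {}},
    "example.edu.": {"delegations": {}, "records": {"www": "192.0.2.10"}},
    "com.": {"delegations": {}, "records": {}},
}

def resolve(name):
    """Resolve a dotted name by walking from the root zone downwards.

    Returns (address or None, list of zones consulted).
    """
    labels = name.rstrip(".").split(".")[::-1]  # most significant label first
    zone = "."
    path = [zone]
    for label in labels:
        current = ZONES[zone]
        if label in current["delegations"]:
            zone = current["delegations"][label]  # hand over to the child zone
            path.append(zone)
        elif label in current["records"]:
            return current["records"][label], path
        else:
            return None, path
    return None, path

if __name__ == "__main__":
    print(resolve("www.example.edu"))
```

Each lookup touches only the zones along one path, which is why distributing the name space spreads the load.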
• Scaling Techniques:
– Replication
• Increases availability, balances load
• Caching: a special form of replication
• Drawback: leads to inconsistency problems
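The inconsistency drawback can be seen in a few lines. This is a minimal sketch, assuming a client that caches values on first read and a server that is updated behind its back; the class names and the explicit `invalidate` call are illustrative choices, not a scheme taken from the slides.

```python
class Server:
    """Holds the authoritative copy of the data."""
    def __init__(self):
        self.data = {"x": 1}

class CachingClient:
    """Keeps a local copy of every value it has read."""
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, key):
        if key not in self.cache:            # miss: fetch from the server
            self.cache[key] = self.server.data[key]
        return self.cache[key]               # hit: no communication needed

    def invalidate(self, key):
        self.cache.pop(key, None)            # force the next read to refetch

if __name__ == "__main__":
    server = Server()
    client = CachingClient(server)
    print(client.read("x"))   # value fetched from the server
    server.data["x"] = 2      # update made at the server only
    print(client.read("x"))   # stale cached value
    client.invalidate("x")
    print(client.read("x"))   # fresh value after invalidation
```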
Hardware Concepts
• Multiprocessors vs. Multicomputers
Hardware Concepts contd…
• Multiprocessors
– Property: all the CPUs have direct access to the
shared memory.
Figure: A bus-based multiprocessor.
• Multiprocessors
– Problem: the bus will usually be overloaded
– Solution: high-speed cache memory
– But caches introduce a new problem: cached copies can become
inconsistent with memory
– Problem: limited scalability
– Solution: using a crossbar switch or an omega network
• Multiprocessors
Figure: (a) A crossbar switch. (b) An omega switching network.
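The slides do not give switch counts, but the standard figures show why the omega network is attractive: a crossbar connecting n CPUs to n memories needs n × n crosspoint switches, while an omega network built from 2×2 switches needs log2(n) stages of n/2 switches each. A small sketch of that arithmetic, assuming n is a power of two:

```python
def crossbar_crosspoints(n):
    """Crosspoint switches in an n-by-n crossbar."""
    return n * n

def omega_switches(n):
    """2x2 switches in an omega network for n CPUs and n memories."""
    if n < 2 or n & (n - 1) != 0:
        raise ValueError("n must be a power of two, at least 2")
    stages = n.bit_length() - 1      # log2(n)
    return (n // 2) * stages

if __name__ == "__main__":
    for n in (8, 64, 1024):
        print(n, crossbar_crosspoints(n), omega_switches(n))
```

The price of fewer switches is that a request passes through several stages, so latency grows with log2(n).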
• Homogeneous Multicomputer systems
– Each CPU has a direct connection to its own local
memory.
– Also referred to as System Area Networks (SANs)
– Bus-based and switch-based
– Two popular topologies: grid and hypercube
• Homogeneous Multicomputer systems
Figure: (a) Grid. (b) Hypercube.
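The hypercube topology has a compact description, assuming the usual labelling in which nodes are numbered in binary and two nodes are linked exactly when their labels differ in one bit. The function names below are illustrative.

```python
def hypercube_neighbours(node, dim):
    """Neighbours of `node` in a `dim`-dimensional hypercube (2**dim nodes)."""
    return [node ^ (1 << bit) for bit in range(dim)]

def hops(a, b):
    """Minimum number of links between nodes a and b: the differing bits."""
    return bin(a ^ b).count("1")

if __name__ == "__main__":
    print(hypercube_neighbours(0, 3))   # [1, 2, 4]
    print(hops(0, 7))                   # 3
```

Every node has `dim` links, and no two nodes are more than `dim` hops apart.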
• Heterogeneous Multicomputer systems
– Computers may vary w.r.t. processor type,
memory size and I/O bandwidth.
– Varying interconnection networks
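Software Concepts
• Distributed systems are very much like operating systems
– Act as resource managers
– Hide the heterogeneous nature of the underlying
hardware
• Two categories of operating systems, plus middleware:
– Tightly-coupled systems: distributed operating systems (DOS)
– Loosely-coupled systems: network operating systems (NOS)
– Middleware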
Software Concepts
System: description; main goal
– DOS: tightly-coupled operating system for multiprocessors and
homogeneous multicomputers; main goal: hide and manage hardware resources
– NOS: loosely-coupled operating system for heterogeneous
multicomputers (LAN and WAN); main goal: offer local services to remote clients
– Middleware: additional layer atop NOS implementing general-purpose
services; main goal: provide distribution transparency
Software Concepts contd…
• Distributed Operating systems: two types
– Multiprocessor operating system
– Multicomputer operating system
• Multiprocessor Operating systems:
– Goal: make the number of CPUs transparent to the
application
– Idea: protect the data against simultaneous access
– Semaphores and Monitors
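Protecting shared data against simultaneous access can be sketched with a monitor-style class: the lock lives inside the object and every method that touches the data acquires it. The example uses Python threads and `threading.Lock` as a stand-in for CPUs sharing memory; the `Counter` class and `run` helper are illustrative names.

```python
import threading

class Counter:
    """Monitor-style object: all access to `value` goes through the lock."""
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:          # only one thread at a time may enter
            self.value += 1

def run(n_threads=4, n_increments=1000):
    """Let several threads update one shared counter and return the total."""
    counter = Counter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value

if __name__ == "__main__":
    print(run())
```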
• Multicomputer Operating systems:
• Network Operating systems:
Figure: Two clients and a server in a network operating system.
• Middleware:
Figure: General structure of a distributed system as middleware.
Role of Middleware (MW)
• In some early research systems: MW tried to provide
the illusion that a collection of separate machines was
a single computer.
– E.g. NOW project: GLUNIX middleware
• Today:
– Clustering software allows independent computers to
work together closely
– MW also supports seamless access to remote services;
it doesn’t try to look like a general-purpose OS
• Middleware Models:
– Treating everything as files: distributed file systems
(distribution transparency is supported for traditional
files)
– Based on Remote Procedure Calls (RPCs): hides
network communication by allowing a process to call a
procedure whose implementation is located on a
remote machine.
– Distributed Objects: each object implements an
interface that hides the internal details of the object
from its users.
– Distributed Documents
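The RPC model can be illustrated with a client stub that turns an ordinary-looking call into a request message and unpacks the reply. This is a sketch under simplifying assumptions: the "network" is just a function call that carries bytes, JSON is an arbitrary choice of message format, and `add`, `PROCEDURES`, `server_dispatch` and `ClientStub` are hypothetical names.

```python
import json

# --- server side ---------------------------------------------------
def add(a, b):
    return a + b

PROCEDURES = {"add": add}          # procedures the server exports

def server_dispatch(request_bytes):
    """Unpack a request, run the named procedure, pack the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

# --- client side ---------------------------------------------------
class ClientStub:
    """Makes remote procedures look like local methods."""
    def __init__(self, transport):
        self._transport = transport    # callable: request bytes -> reply bytes

    def __getattr__(self, proc):
        def call(*args):
            request = json.dumps({"proc": proc, "args": list(args)})
            reply = self._transport(request.encode("utf-8"))
            return json.loads(reply.decode("utf-8"))["result"]
        return call

if __name__ == "__main__":
    stub = ClientStub(server_dispatch)
    print(stub.add(2, 3))   # looks local, but travels as a message
```

Swapping `server_dispatch` for a function that sends the bytes over a socket would leave the calling code unchanged, which is the transparency the RPC model aims for.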
• Middleware Services:
– Communication: implements access transparency
– Naming: allows entities to be shared and looked up
– Persistence: for storage by means of distributed
file systems, integrated databases
– Distributed Transactions: allow multiple read
and write operations to occur atomically
– Security
Software Concepts
Item                      Distributed OS    Distributed OS       Network OS   Middleware-based OS
                          (multiproc.)      (multicomp.)
Degree of transparency    Very high         High                 Low          High
Same OS on all nodes?     Yes               Yes                  No           No
Number of copies of OS    1                 N                    N            N
Basis for communication   Shared memory     Messages             Files        Model specific
Resource management       Global, central   Global, distributed  Per node     Per node
Scalability               No                Moderately           Yes          Varies
Openness                  Closed            Closed               Open         Open
The Client-Server Model
• Server: process implementing a specific service, e.g., a file
system service
• Client: process that requests a service from a server.
• Client-server interaction is known as request-reply behavior.
Figure: General interaction between a client and a server.
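Request-reply behavior can be sketched with a toy file service. The example assumes an in-memory dictionary in place of a real file system and uses `socket.socketpair()` so that client and server can talk without opening a network port; the file name and helper names are made up.

```python
import socket
import threading

FILES = {"notes.txt": b"hello"}    # hypothetical in-memory "file system"

def file_server(conn):
    """Server: wait for one request (a file name) and send one reply."""
    with conn:
        name = conn.recv(1024).decode("utf-8")
        conn.sendall(FILES.get(name, b"ERROR: no such file"))

def request_file(name):
    """Client: send a request, then block until the reply arrives."""
    client_sock, server_sock = socket.socketpair()
    server = threading.Thread(target=file_server, args=(server_sock,))
    server.start()
    with client_sock:
        client_sock.sendall(name.encode("utf-8"))   # request
        reply = client_sock.recv(1024)              # reply
    server.join()
    return reply

if __name__ == "__main__":
    print(request_file("notes.txt"))
```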
Important Design Issues in Distributed Systems
• Concurrency
– Shared access to resources should be made
possible
• Openness and Extensibility
– Interfaces should be cleanly separated
• Migration and load balancing
– Allow movement of tasks and balance load among
available resources
• Security
– Resources to be secured
Advantages and Disadvantages of Distributed Systems
• Advantages:
– The affordability of computers and availability of
network access
– Resource sharing
– Scalability
– Fault Tolerance
• Disadvantages:
– Multiple points of failure
– Security concerns
Types of Distributed Systems
• Distributed Computing Systems
– Clusters
– Grids
– Clouds
• Distributed Information Systems
– Transaction Processing Systems
– Enterprise Application Integration
• Distributed Embedded Systems
– Home systems
– Health care systems
– Sensor networks
Cluster Computing
• A collection of similar processors (PCs,
workstations) running the same operating
system, connected by a high-speed LAN.
• Parallel computing capabilities using
inexpensive PC hardware
• Can replace big parallel computers
Cluster Types & Uses
• High Performance Clusters (HPC)
– run large parallel programs
– Scientific, military, engineering apps; e.g., weather
modeling
• Load Balancing Clusters
– Front end processor distributes incoming requests
– e.g., at banks or popular web site
• High Availability Clusters (HA)
– Provide redundancy – back up systems
– May be more fault tolerant than large mainframes
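The front end of a load-balancing cluster can be sketched as a dispatcher that hands each incoming request to the next back-end node. Round-robin is assumed here purely for illustration; the slides do not say which policy is used, and the node names are made up.

```python
import itertools

class FrontEnd:
    """Distributes incoming requests over back-end nodes in round-robin order."""
    def __init__(self, backends):
        self._next_backend = itertools.cycle(backends)

    def dispatch(self, request):
        backend = next(self._next_backend)
        return backend, request      # a real front end would forward the request

if __name__ == "__main__":
    front_end = FrontEnd(["node1", "node2", "node3"])
    for request_id in range(5):
        print(front_end.dispatch(request_id))
```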
Clusters – Beowulf model
• Linux-based
• Master-slave paradigm
– One processor is the master; allocates tasks to
other processors, maintains batch queue of
submitted jobs, handles interface to users
– Master has libraries to handle message-based
communication or other features (the
middleware).
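The master-slave paradigm can be sketched with one master that keeps a queue of submitted jobs and several workers that pull from it. Threads and in-process queues stand in for cluster nodes and message-based communication; the job (summing 0..n) and all names are illustrative.

```python
import queue
import threading

def worker(tasks, results):
    """Worker node: take jobs from the master until told to stop."""
    while True:
        job = tasks.get()
        if job is None:                  # stop message from the master
            break
        job_id, n = job
        results.put((job_id, sum(range(n + 1))))

def master(jobs, n_workers=3):
    """Master node: queue the jobs, start the workers, gather the results."""
    tasks, results = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for job_id, n in enumerate(jobs):
        tasks.put((job_id, n))
    for _ in workers:
        tasks.put(None)                  # one stop message per worker
    for w in workers:
        w.join()
    collected = dict(results.get() for _ in jobs)
    return [collected[i] for i in range(len(jobs))]

if __name__ == "__main__":
    print(master([10, 100, 3]))   # [55, 5050, 6]
```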
Cluster Computing Systems
Figure 1-6. An example of a (Beowulf) cluster computing system.
Clusters – MOSIX model
• Provides a symmetric, rather than hierarchical
paradigm
– High degree of distribution transparency (single
system image)
– Processes can migrate between nodes dynamically
and preemptively. Migration is automatic.
• Used to manage Linux clusters
More About MOSIX
• “Operating-system-like”; looks & feels like a
single computer with multiple processors
• Supports interactive and batch processes
• Provides resource discovery and workload
distribution among clusters
• Clusters can be partitioned for use by an
individual or a group
• Best for compute-intensive jobs
Grid Computing Systems
• Modeled loosely on the electrical grid.
• Highly heterogeneous with respect to
hardware, software, networks, security
policies, etc.
• Grids support virtual organizations: a
collaboration of users who pool resources
(servers, storage, databases) and share
them
• Grid software is concerned with managing
sharing across administrative domains.
Grids
• Similar to clusters but processors are more loosely
coupled, tend to be heterogeneous, and are not all in
a central location.
• Can handle workloads similar to those on
supercomputers, but grid computers connect over a
network and supercomputers’ CPUs connect to a
high-speed internal bus/network
• Problems are broken up into parts and distributed
across multiple computers in the grid.
• Less communication between parts than in clusters.
A Proposed Architecture for Grid Systems*
• Fabric layer: interfaces to local
resources at a specific site
• Connectivity layer: protocols to
support usage of multiple
resources for a single application,
e.g., access a remote resource or
transfer data between resources;
and protocols to provide security
• Resource layer: manages a single
resource, using functions supplied
by the connectivity layer
• Collective layer: resource
discovery, allocation, scheduling,
etc.
• Applications: use the grid
resources
• The collective, connectivity and
resource layers together form the
middleware layer for a grid
Figure 1-7. A layered architecture
for grid computing systems
OGSA – Another Grid Architecture
• Open Grid Services Architecture (OGSA) is a
service-oriented architecture
– Sites that offer resources to share do so by
offering specific Web services.
• The architecture of the OGSA model is more
complex than the previous layered model.
Globus Toolkit
• An example of grid middleware
• Supports the combination of heterogeneous
platforms into virtual organizations.
• Implements the OGSA standards, among
others.
Cloud Computing
• Provides scalable services as a utility over the
Internet.
• Often built on a computer grid
• Users buy services from the cloud
– Grid users may develop and run their own
software
• Cluster/grid/cloud distinctions blur at the
edges!
Types of Distributed Systems
• Distributed Computing Systems
– Clusters
– Grids
– Clouds
• Distributed Information Systems
• Distributed Embedded Systems
Distributed Information Systems
• Business-oriented
• Systems to make a number of separate
network applications interoperable and build
“enterprise-wide information systems”.
• Two types:
– Transaction processing systems
– Enterprise application integration (EAI)
Transaction Processing Systems
• Provide a highly structured client-server
approach for database applications
• Transactions are the communication model
• Obey the ACID properties:
– Atomic: all or nothing
– Consistent: invariants are preserved
– Isolated (serializable)
– Durable: committed operations can’t be
undone
Figure 1-8. Example primitives for transactions.
Transactions
• Transaction processing may be centralized
(traditional client/server system) or
distributed.
• A distributed database is one in which the
data storage is distributed – connected to
separate processors.
Nested Transactions
• A nested transaction is a transaction within
another transaction (a sub-transaction)
– Example: a transaction may ask for two things
(e.g., airline reservation info + hotel info) which
would spawn two nested transactions
• Primary transaction waits for the results.
– While children are active, the parent may only abort,
commit, or spawn other children
Figure 1-9. A nested transaction.
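The behavior of nested transactions can be sketched by giving every transaction a private workspace. A committing child hands its changes to its parent rather than to the store, so nothing becomes permanent until the top-level transaction commits. The class below is an illustrative model with made-up booking values, not an implementation from the slides.

```python
class Transaction:
    """Toy nested transaction with a private workspace per transaction."""
    def __init__(self, store, parent=None):
        self.store = store
        self.parent = parent
        self.workspace = {}

    def spawn(self):
        """Start a sub-transaction of this transaction."""
        return Transaction(self.store, parent=self)

    def read(self, key):
        t = self
        while t is not None:             # look in own and ancestors' workspaces
            if key in t.workspace:
                return t.workspace[key]
            t = t.parent
        return self.store.get(key)

    def write(self, key, value):
        self.workspace[key] = value

    def commit(self):
        # A child commits into its parent; only the top level reaches the store.
        target = self.parent.workspace if self.parent else self.store
        target.update(self.workspace)
        self.workspace = {}

    def abort(self):
        self.workspace = {}              # discard all uncommitted changes

if __name__ == "__main__":
    store = {}
    trip = Transaction(store)
    flight = trip.spawn()
    flight.write("flight", "booked")
    flight.commit()
    hotel = trip.spawn()
    hotel.write("hotel", "booked")
    hotel.commit()
    trip.abort()                         # parent aborts: children's work is undone
    print(store)                         # {}
```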

Implementing Transactions
• Conceptually, private copy of all data
• Actually, usually based on logs
• Multiple sub-transactions – commit, abort
– Durability is a characteristic of top-level
transactions only
• Nested transactions are suitable for
distributed systems
– A transaction processing monitor may interface
between the client and multiple databases.
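A log-based implementation can be sketched with an undo log: writes go straight to the data, but the old value is recorded first so that an abort can roll everything back. The slides do not say which logging scheme is meant, so the undo log here is an assumption, and `LoggedStore` is an illustrative name.

```python
class LoggedStore:
    """Toy store supporting one transaction at a time via an undo log."""
    def __init__(self, data):
        self.data = dict(data)
        self.log = []                    # entries: (key, old value, key existed?)

    def write(self, key, value):
        # Record the old state before changing the data in place.
        self.log.append((key, self.data.get(key), key in self.data))
        self.data[key] = value

    def commit(self):
        self.log.clear()                 # changes stay; the log is no longer needed

    def abort(self):
        while self.log:                  # undo in reverse order
            key, old_value, existed = self.log.pop()
            if existed:
                self.data[key] = old_value
            else:
                self.data.pop(key, None)

if __name__ == "__main__":
    accounts = LoggedStore({"a": 100})
    accounts.write("a", 70)
    accounts.write("b", 30)
    accounts.abort()
    print(accounts.data)                 # {'a': 100}
```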
Enterprise Application Integration
• Less structured than transaction-based systems
• EA components communicate directly
– Enterprise applications are things like HR data,
inventory programs, …
– May use different OSs, different DBs but need to
interoperate sometimes.
• Communication mechanisms to support this include
CORBA, Remote Procedure Call (RPC) and Remote
Method Invocation (RMI)
Enterprise Application Integration
Figure 1-11. Middleware as a communication facilitator in enterprise
application integration.
Thank you!