
Instructions for current semester

• Maximum of 2 extra practicals.

• If a student comes for an extra practical, the lab
assignment will be different from the regular one.

• Minimum 80% attendance in theory and 100% in
practicals.

• Any student found playing games or surfing websites
unrelated to the subject will be marked absent for
that slot.
Introduction to Distributed Systems

PRASHASTI KANIKAR 2
Definitions
• Tanenbaum: “A distributed system is a collection of
independent computers that appears to its users as a
single coherent system”.

• Coulouris: “A system in which hardware or software
components located at networked computers
communicate and coordinate their actions only by
message passing”.

Introduction contd…
• 2 aspects to the definition:
– Hardware: autonomous computers, network links
– Software: communication protocols, system and
application software.

• A distributed system is built on top of a network.

• Distributed computing is computing performed in a
distributed system.

Introduction contd…

[Figure: Centralized vs. Distributed Computing. Panels: centralized computing, distributed computing. Labels: terminal, mainframe computer, workstation, network link, network host.]

Introduction contd…
• Examples of Distributed systems:
– Network of workstations (NOW): a group of
networked personal workstations connected to one or
more server machines.

– The Internet

– An intranet: a network of computers and workstations
within an organization.

Introduction contd…
• Why study distributed systems:
– Economics: distributed systems allow the pooling of
resources, including CPU cycles, data storage, input/output
devices, and services.
– Reliability: a distributed system allows replication of
resources and/or services.
– Speed: a distributed system may have more total computing power than a
mainframe.
– Inherent distribution: some applications are inherently distributed, e.g., a
supermarket chain.
– Incremental growth: computing power can be added in small
increments (modular expandability).
– Another driving force: the existence of a large number of personal
computers and the need for people to collaborate and share information.

Parallel vs Distributed Systems
Aspect | Parallel Systems | Distributed Systems
Memory | Tightly coupled shared memory; UMA, NUMA | Distributed memory; message passing, RPC, and/or use of distributed shared memory
Control | Global clock control; SIMD, MIMD | No global clock control; synchronization algorithms needed
Processor interconnection | Order of Tbps; bus, mesh, tree, mesh of trees, and hypercube(-related) networks | Order of Gbps; Ethernet (bus), token ring and SCI (ring), Myrinet (switching network)
Main focus | Performance; scientific computing | Performance (cost and scalability); reliability/availability; information/resource sharing
Goals of Distributed Systems
• Goals :
– Connecting users and resources

– Transparency

– Openness

– Scalability

Goals of Distributed Systems contd…
1. Connecting Users and Resources:

– Access remote resources

– Share resources

– Data sharing: groupware for collaborative
editing, teleconferencing, video conferencing, etc.

– Problem: Security
Goals of Distributed Systems contd…
2. Transparency:
Transparency | Description
Access | Hide differences in data representation and how a resource is accessed
Location | Hide where a resource is located
Migration | Hide that a resource may move to another location
Relocation | Hide that a resource may be moved to another location while in use
Replication | Enables multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers
Concurrency | Enables several processes to operate concurrently using shared resources without interference between them
Failure | Hide the failure and recovery of a resource
Persistence | Hide whether a (software) resource is in memory or on disk


Goals of Distributed Systems contd…
3. Openness: System that offers services according to
standard rules that describe the syntax and semantics
of those services.

• Services specified through interfaces: Interface
Definition Language (IDL)

– Interoperability
– Portability
– Flexibility

Goals of Distributed Systems contd…
4. Scalability:
– Size scalability
• Can add more users and resources
– Geographical scalability
• Can spread across different geographical areas
– Administrative scalability
• Can be manageable even if it spans many
independent organizations

Goals of Distributed Systems contd…
• Scalability Problems
• Scaling w.r.t. size:
Concept | Example
Centralized services | A single server for all users
Centralized data | A single online telephone book
Centralized algorithms | Doing routing based on complete information

• Geographical scaling:
– Form of communication: synchronous communication blocks the
client until the reply arrives, which is costly over wide-area links
– Wide-area communication is inherently unreliable and
point-to-point.

Goals of Distributed Systems contd…
• Scalability Problems

• Administrative Scaling:
– Conflicting policies
– If a distributed system (DS) expands into another domain, two types of
security measures need to be taken:
• The DS has to protect itself against malicious attacks
from the new domain.
• The new domain has to protect itself against
malicious attacks from the DS.

Goals of Distributed Systems contd…
• Scaling Techniques:
– Hiding communication latencies
– Distribution
– Replication

Goals of Distributed Systems contd…
• Scalability Techniques:
– Hiding communication latencies
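
One common way to hide communication latency is asynchronous communication: issue several requests at once and do other work instead of blocking on each reply in turn. The following is a minimal sketch, assuming simulated remote calls; `fetch`, `fetch_all`, `timed_fetch` and the delay values are illustrative names, not part of any real service.

```python
import asyncio
import time


async def fetch(name: str, delay: float) -> str:
    """Stand-in for a remote request; `delay` simulates network latency."""
    await asyncio.sleep(delay)
    return f"reply from {name}"


async def fetch_all(names: list[str], delay: float) -> list[str]:
    """Issue all requests at once instead of waiting for each reply in turn."""
    return await asyncio.gather(*(fetch(n, delay) for n in names))


def timed_fetch(names: list[str], delay: float = 0.2) -> tuple[list[str], float]:
    """Run the overlapped requests and report the total elapsed time."""
    start = time.perf_counter()
    replies = asyncio.run(fetch_all(names, delay))
    return replies, time.perf_counter() - start
```

With four servers and a 0.2 s simulated latency, waiting for each reply in turn would take about 0.8 s, whereas the overlapped version should finish in roughly the time of a single round trip.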

Goals of Distributed Systems contd…
• Scaling Techniques:
– Distribution

An example of dividing the DNS name space into zones.
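
The idea of dividing a name space into zones can be sketched as follows: each zone is handled by its own server, which knows only its own records and the child zones it has delegated. This is a simplified illustration, not the real DNS protocol; the `ZoneServer` class, the zone names, and the address (taken from the documentation range 192.0.2.0/24) are assumptions made for the example.

```python
class ZoneServer:
    """A hypothetical name server responsible for a single zone."""

    def __init__(self, zone: str):
        self.zone = zone
        self.records = {}      # full host name -> address, for names in this zone
        self.delegations = {}  # child zone name -> ZoneServer handling it


def resolve(root: ZoneServer, name: str):
    """Walk from the root zone downwards, following delegations."""
    server, visited = root, []
    while True:
        visited.append(server.zone)
        if name in server.records:
            return server.records[name], visited
        # Find a delegated child zone whose suffix matches the name.
        child = next(
            (s for zone, s in server.delegations.items()
             if name == zone or name.endswith("." + zone)),
            None,
        )
        if child is None:
            raise KeyError(name)
        server = child


# Build a tiny three-level name space: "." -> "org" -> "example.org".
root = ZoneServer(".")
org = ZoneServer("org")
example = ZoneServer("example.org")
root.delegations["org"] = org
org.delegations["example.org"] = example
example.records["www.example.org"] = "192.0.2.10"
```

No single server holds the whole name space, so the lookup load is spread across the zone servers.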

Goals of Distributed Systems contd…
• Scaling Techniques:
– Replication
• Increases availability, balances load
• Caching: a special form of replication
• Drawback: leads to inconsistency problems
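
The inconsistency drawback can be seen in a small sketch, assuming a client-side cache with no automatic invalidation. The `Origin` and `CachingClient` classes are hypothetical and exist only for this illustration.

```python
class Origin:
    """The authoritative copy of the data."""

    def __init__(self):
        self.data = {}

    def write(self, key, value):
        self.data[key] = value

    def read(self, key):
        return self.data[key]


class CachingClient:
    """Keeps a local copy of every value it has read."""

    def __init__(self, origin: Origin):
        self.origin = origin
        self.cache = {}

    def read(self, key):
        if key not in self.cache:
            # Cache miss: fetch from the origin and keep a copy.
            self.cache[key] = self.origin.read(key)
        return self.cache[key]

    def invalidate(self, key):
        # Drop the local copy so the next read goes back to the origin.
        self.cache.pop(key, None)
```

If the origin is updated after the client has cached a value, the client keeps returning the stale copy until `invalidate` is called; keeping copies consistent is the price paid for the faster reads.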

Hardware Concepts
• Multiprocessors vs. Multicomputers

Hardware Concepts contd…
• Multiprocessors
– Property: all the CPUs have direct access to the
shared memory.

A bus-based multiprocessor.

Hardware Concepts contd…
• Multiprocessors
– Problem: the bus will usually be overloaded
– Solution: high-speed cache memory
– Caches introduce a new problem: copies of the same data
in different caches can become inconsistent
– Problem: limited scalability
– Solution: use a crossbar switch or an omega network

Hardware Concepts contd…
• Multiprocessors

a) A crossbar switch
b) An omega switching network
Hardware Concepts contd…
• Homogeneous Multicomputer systems
– Each CPU has direct connection to its own local
memory.
– Also referred to as System Area Networks (SANs)
– Bus-based and switch-based
– Two popular topologies: Grid and Hypercube

Hardware Concepts contd…
• Homogeneous Multicomputer systems

a) Grid
b) Hypercube
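
The two topologies differ in how a node finds its neighbours. A minimal sketch, assuming the usual labelling in which hypercube nodes are numbered 0 to 2^d - 1 and two nodes are linked when their labels differ in exactly one bit; the function names are illustrative.

```python
def hypercube_neighbors(node: int, dim: int) -> list[int]:
    """Neighbours of `node` in a `dim`-dimensional hypercube (flip one bit each)."""
    if not 0 <= node < 2 ** dim:
        raise ValueError("node label out of range")
    return [node ^ (1 << bit) for bit in range(dim)]


def grid_neighbors(row: int, col: int, rows: int, cols: int) -> list[tuple[int, int]]:
    """Neighbours of (row, col) in a rows x cols grid without wrap-around."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]
```

In a hypercube every node has exactly `dim` neighbours, while in a grid the count depends on whether the node sits in a corner, on an edge, or in the interior.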

Hardware Concepts contd…
• Heterogeneous Multicomputer systems
– Computers may vary w.r.t. processor type,
memory sizes, and I/O bandwidth.
– Varying interconnection networks

Software Concepts
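• Distributed systems are very much like operating systems:
– They act as resource managers
– They hide the heterogeneous nature of the underlying
hardware

• Two categories of operating systems for distributed computers:
– Tightly coupled systems: distributed operating system (DOS)
– Loosely coupled systems: network operating system (NOS)

• Middleware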
Software Concepts
System | Description | Main Goal
DOS | Tightly-coupled operating system for multi-processors and homogeneous multi-computers | Hide and manage hardware resources
NOS | Loosely-coupled operating system for heterogeneous multi-computers (LAN and WAN) | Offer local services to remote clients
Middleware | Additional layer atop a NOS implementing general-purpose services | Provide distribution transparency
Software Concepts contd…
• Distributed Operating systems: two types
– Multiprocessor operating system
– Multicomputer operating system

Software Concepts contd…
• Multiprocessor Operating systems:
– Goal: make the number of CPUs transparent to the
application
– Idea: protect the data against simultaneous access
– Semaphores and Monitors
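
Protecting shared data against simultaneous access can be sketched with a binary semaphore. In this minimal example, threads stand in for CPUs sharing memory; the `SharedCounter` class and `run_workers` function are illustrative names.

```python
import threading


class SharedCounter:
    """A counter in shared memory, guarded by a binary semaphore."""

    def __init__(self):
        self.value = 0
        self._mutex = threading.Semaphore(1)  # at most one holder at a time

    def increment(self):
        with self._mutex:  # acquire on entry, release on exit
            self.value += 1


def run_workers(n_threads: int = 4, n_increments: int = 10_000) -> int:
    """Let several threads update the same counter and return the final value."""
    counter = SharedCounter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Because every update happens while the semaphore is held, no increment is lost, and the result is the same however many threads run.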

Software Concepts contd…
• Multicomputer Operating systems:

Software Concepts contd…
• Network Operating systems:

Two clients and a server in a network operating system.

Software Concepts contd…
• Middleware:

General structure of a distributed system as middleware.

Role of Middleware (MW)
• In some early research systems, MW tried to provide
the illusion that a collection of separate machines was
a single computer.
– E.g., NOW project: GLUNIX middleware
• Today:
– Clustering software allows independent computers to
work together closely
– MW also supports seamless access to remote services;
it doesn’t try to look like a general-purpose OS

Software Concepts contd…
• Middleware Models:

– Treating everything as files: distributed file systems
(distribution transparency is supported for traditional
files)
– Based on Remote Procedure Calls (RPCs): hides
network communication by allowing a process to call a
procedure whose implementation is located on a
remote machine.
– Distributed Objects: each object implements an
interface that hides the internal details of the object
from its users.
– Distributed Documents
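
The RPC model above can be sketched with a client stub that marshals the call into a message and a server-side dispatcher that unmarshals it and runs the procedure. This is a simplified illustration in which a plain function call stands in for the network; `ClientStub`, `server_dispatch`, and the message layout are assumptions, not a real RPC library.

```python
import json


def add(a, b):
    return a + b


# Procedures exported by the (simulated) remote machine.
PROCEDURES = {"add": add}


def server_dispatch(request_bytes: bytes) -> bytes:
    """Server side: unmarshal the request, run the procedure, marshal the reply."""
    request = json.loads(request_bytes)
    try:
        result = PROCEDURES[request["proc"]](*request["args"])
        reply = {"ok": True, "result": result}
    except Exception as exc:
        reply = {"ok": False, "error": repr(exc)}
    return json.dumps(reply).encode()


class ClientStub:
    """Client side: makes a remote procedure look like a local method."""

    def __init__(self, transport):
        self._transport = transport  # callable: request bytes -> reply bytes

    def __getattr__(self, proc):
        def call(*args):
            request = json.dumps({"proc": proc, "args": args}).encode()
            reply = json.loads(self._transport(request))
            if not reply["ok"]:
                raise RuntimeError(reply["error"])
            return reply["result"]
        return call


stub = ClientStub(server_dispatch)
```

The caller writes `stub.add(2, 3)` as if `add` were local; the marshalling and message exchange stay hidden inside the stub.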

Software Concepts contd…
• Middleware Services:
– Communication: implements access transparency
– Naming: allows entities to be shared and looked up
– Persistence: for storage by means of distributed
file systems, integrated databases
– Distributed Transactions: it allows multiple read
and write operations to occur atomically
– Security

Software Concepts
Item | Distributed OS (Multiproc.) | Distributed OS (Multicomp.) | Network OS | Middleware based OS
Degree of Transparency | Very high | High | Low | High
Same OS on all nodes? | Yes | Yes | No | No
Number of copies of OS | 1 | N | N | N
Basis for communication | Shared memory | Messages | Files | Model specific
Resource Management | Global, central | Global, distributed | Per node | Per node
Scalability | No | Moderately | Yes | Varies
Openness | Closed | Closed | Open | Open

The Client-Server Model
• Server: a process implementing a specific service, e.g., a file
system service
• Client: a process that requests a service from a server.
• Client-server interaction is known as request-reply behavior

General interaction between a client and a server.
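
The request-reply behavior can be sketched with a pair of connected sockets. In this minimal example a thread plays the server and `socket.socketpair()` stands in for a real network connection; the message contents are made up for the illustration.

```python
import socket
import threading


def serve_one(conn: socket.socket) -> None:
    """Server: wait for one request, send back one reply, then close."""
    with conn:
        request = conn.recv(1024)  # short messages only in this sketch
        conn.sendall(b"reply to: " + request)


def request_reply(message: bytes) -> bytes:
    """Client: send a request and block until the reply arrives."""
    client_end, server_end = socket.socketpair()
    server = threading.Thread(target=serve_one, args=(server_end,))
    server.start()
    with client_end:
        client_end.sendall(message)    # request
        reply = client_end.recv(1024)  # wait for the reply
    server.join()
    return reply
```

The client does nothing between sending the request and receiving the reply, which is exactly the waiting period shown in the interaction diagram.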


Important Design Issues in Distributed Systems
• Concurrency
– Shared access to resources should be made
possible

• Openness and Extensibility
– Interfaces should be cleanly separated

• Migration and load balancing
– Allow movement of tasks and balance load among
available resources

• Security
– Resources to be secured
Advantages and Disadvantages of Distributed
Systems
• Advantages:
– The affordability of computers and availability of
network access
– Resource sharing
– Scalability
– Fault Tolerance
• Disadvantages:
– Multiple points of failure
– Security concerns
Types of Distributed Systems
• Distributed Computing Systems
– Clusters
– Grids
– Clouds
• Distributed Information Systems
– Transaction Processing Systems
– Enterprise Application Integration
• Distributed Embedded Systems
– Home systems
– Health care systems
– Sensor networks
Cluster Computing
• A collection of similar processors (PCs,
workstations) running the same operating
system, connected by a high-speed LAN.
• Parallel computing capabilities using
inexpensive PC hardware
• Replace big parallel computers

Cluster Types & Uses
• High Performance Clusters (HPC)
– run large parallel programs
– Scientific, military, engineering apps; e.g., weather
modeling
• Load Balancing Clusters
– Front end processor distributes incoming requests
– e.g., at banks or popular web sites
• High Availability Clusters (HA)
– Provide redundancy – back up systems
– May be more fault tolerant than large mainframes
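
A front end that distributes incoming requests can be sketched as follows. Round-robin is assumed here as the distribution policy purely for illustration (real load balancers may weigh load or health); the class and node names are hypothetical.

```python
import itertools
from collections import Counter


class RoundRobinFrontEnd:
    """Front-end processor that hands each request to the next back end in turn."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one back end is required")
        self._cycle = itertools.cycle(backends)

    def dispatch(self, request):
        backend = next(self._cycle)
        return backend, request


def distribute(requests, backends) -> Counter:
    """Count how many requests each back end receives."""
    front = RoundRobinFrontEnd(backends)
    return Counter(front.dispatch(r)[0] for r in requests)
```

With six requests and three back ends, each node receives two requests.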
Clusters – Beowulf model

• Linux-based
• Master-slave paradigm
– One processor is the master; allocates tasks to
other processors, maintains batch queue of
submitted jobs, handles interface to users
– Master has libraries to handle message-based
communication or other features (the
middleware).
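
The master's role of keeping a batch queue and allocating tasks can be sketched on a single machine. In this minimal example threads stand in for the other processors of the cluster; a real Beowulf cluster would use message-passing libraries between nodes, and `run_master` is an illustrative name.

```python
import queue
import threading


def run_master(jobs, n_workers: int = 3) -> dict:
    """Master: keep a batch queue of (job_id, func, arg) jobs and collect results."""
    batch_queue = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            job = batch_queue.get()
            if job is None:  # sentinel: no more work for this worker
                return
            job_id, func, arg = job
            value = func(arg)
            with lock:
                results[job_id] = value

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for w in workers:
        w.start()
    for job in jobs:
        batch_queue.put(job)
    for _ in workers:
        batch_queue.put(None)  # one sentinel per worker
    for w in workers:
        w.join()
    return results
```

Users submit jobs only to the master; which worker ends up running a given job is not visible to them.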

Cluster Computing Systems
Figure 1-6. An example of a (Beowulf) cluster computing system.
Clusters – MOSIX model
• Provides a symmetric, rather than hierarchical
paradigm
– High degree of distribution transparency (single
system image)
– Processes can migrate between nodes dynamically
and preemptively. Migration is automatic.
• Used to manage Linux clusters

More About MOSIX
• “Operating-system-like”; looks & feels like a
single computer with multiple processors
• Supports interactive and batch processes
• Provides resource discovery and workload
distribution among clusters
• Clusters can be partitioned for use by an
individual or a group
• Best for compute-intensive jobs

Grid Computing Systems
• Modeled loosely on the electrical grid.
• Highly heterogeneous with respect to
hardware, software, networks, security
policies, etc.
• Grids support virtual organizations: a
collaboration of users who pool resources
(servers, storage, databases) and share
them
• Grid software is concerned with managing
sharing across administrative domains.
Grids
• Similar to clusters but processors are more loosely
coupled, tend to be heterogeneous, and are not all in
a central location.
• Can handle workloads similar to those on
supercomputers, but grid computers connect over a
network and supercomputers’ CPUs connect to a
high-speed internal bus/network
• Problems are broken up into parts and distributed
across multiple computers in the grid.
• Less communication between parts than in clusters.

A Proposed Architecture for Grid Systems*

• Fabric layer: interfaces to local resources at a specific site
• Connectivity layer: protocols to support usage of multiple
resources for a single application (e.g., access a remote
resource or transfer data between resources), and protocols
to provide security
• Resource layer: manages a single resource, using functions
supplied by the connectivity layer
• Collective layer: resource discovery, allocation, scheduling,
etc.
• Applications: use the grid resources
• The collective, connectivity and resource layers together
form the middleware layer for a grid

Figure 1-7. A layered architecture for grid computing systems
OGSA – Another Grid Architecture
• Open Grid Services Architecture (OGSA) is a
service-oriented architecture
– Sites that offer resources to share do so by
offering specific Web services.
• The architecture of the OGSA model is more
complex than the previous layered model.

Globus Toolkit
• An example of grid middleware
• Supports the combination of heterogeneous
platforms into virtual organizations.
• Implements the OGSA standards, among
others.

Cloud Computing
• Provides scalable services as a utility over the
Internet.
• Often built on a computer grid
• Users buy services from the cloud
– Grid users may develop and run their own
software
• Cluster/grid/cloud distinctions blur at the
edges!

Types of Distributed Systems
• Distributed Computing Systems
– Clusters
– Grids
– Clouds
• Distributed Information Systems
• Distributed Embedded Systems

Distributed Information Systems
• Business-oriented
• Systems to make a number of separate
network applications interoperable and build
“enterprise-wide information systems”.
• Two types :
– Transaction processing systems
– Enterprise application integration (EAI)

Transaction Processing Systems
• Provide a highly structured client-server
approach for database applications
• Transactions are the communication model
• Obey the ACID properties:
– Atomic: all or nothing
– Consistent: invariants are preserved
– Isolated (serializable)
– Durable: committed operations can’t be
undone
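
The "all or nothing" and "invariants are preserved" properties can be illustrated with Python's built-in `sqlite3` module. This is a minimal sketch using a made-up `account` table; the names and amounts are assumptions for the example.

```python
import sqlite3


def make_bank() -> sqlite3.Connection:
    """Create an in-memory database with two accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money as one transaction: both updates happen, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint failed; the credit was rolled back


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT name, balance FROM account"))
```

When a transfer would overdraw the source account, the second update violates the constraint and the already-applied credit is undone, so the total amount of money stays the same.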

Transaction Processing Systems
Figure 1-8. Example primitives for transactions.

Transactions
• Transaction processing may be centralized
(traditional client/server system) or
distributed.
• A distributed database is one in which the
data storage is distributed – connected to
separate processors.

Nested Transactions
• A nested transaction is a transaction within
another transaction (a sub-transaction)
– Example: a transaction may ask for two things
(e.g., airline reservation info + hotel info) which
would spawn two nested transactions
• Primary transaction waits for the results.
– While children are active parent may only abort,
commit, or spawn other children

Transaction Processing Systems

Figure 1-9. A nested transaction.


Implementing Transactions
• Conceptually, each transaction works on a private copy of all data
• In practice, implementations are usually based on logs
• Multiple sub-transactions: each can commit or abort
– Durability is a characteristic of top-level
transactions only
• Nested transactions are suitable for
distributed systems
– A transaction processing monitor may interface
between the client and multiple databases.
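
The conceptual private-copy model, and the rule that only top-level commits are durable, can be sketched as follows. This is an illustration of the conceptual model only, not of the log-based approach used in practice; the `Transaction` class is a hypothetical name and the store is a plain dictionary.

```python
class Transaction:
    """Buffers writes privately; they become visible one level up on commit."""

    def __init__(self, store: dict, parent: "Transaction | None" = None):
        self._store = store
        self._parent = parent
        self._writes = {}  # private workspace of this transaction

    def begin_child(self) -> "Transaction":
        return Transaction(self._store, parent=self)

    def read(self, key):
        if key in self._writes:
            return self._writes[key]
        if self._parent is not None:
            return self._parent.read(key)
        return self._store[key]

    def write(self, key, value):
        self._writes[key] = value

    def commit(self):
        # A child commits into its parent's workspace; only a top-level
        # transaction commits into the store itself.
        if self._parent is not None:
            target = self._parent._writes
        else:
            target = self._store
        target.update(self._writes)
        self._writes = {}

    def abort(self):
        self._writes = {}  # discard the private workspace
```

A committed sub-transaction changes only what its parent sees; if the parent later aborts, those changes are discarded along with everything else in the parent's workspace.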
Enterprise Application Integration
• Less structured than transaction-based systems
• EA components communicate directly
– Enterprise applications are things like HR data,
inventory programs, …
– May use different OSs, different DBs but need to
interoperate sometimes.
• Communication mechanisms to support this include
CORBA, Remote Procedure Call (RPC) and Remote
Method Invocation (RMI)

Enterprise Application Integration

Figure 1-11. Middleware as a communication facilitator in enterprise application integration.
Thank you!

