
Distributed Databases

Presentation-I


Mr. Gadakh Prashant J.
UNIT-II
DISTRIBUTED DATABASES
Study of DDBMS architectures,
Comparison of Homogeneous and Heterogeneous Databases,
Analysis of Concurrency control in distributed databases,
Implementation of Distributed query processing,
Distributed data storage,
Distributed transactions,
Commit protocols, Availability,
Distributed query processing,
Directory systems-LDAP,
Distributed data storage and transactions.

Distributed Database
In a distributed database system, the
database is stored on several computers.

The computers in distributed system
communicate with one another through
various communication media.

They do not share main memory or
disks.

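Difference from Parallel Databases
Unlike parallel systems, in which the processors are
tightly coupled and constitute a single database system,
a distributed database system consists of loosely
coupled sites that share no physical components.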
INTRODUCTION
The computers in a distributed
system are referred to by a number
of different names, such as sites or
nodes, depending on the context in
which they are mentioned.

We mainly use the term site, to
emphasize the physical distribution
of these systems.



Objectives
Sharing data

Availability

Location Transparency
An Example of a Distributed Database
Consider a banking system consisting of four branches
in four different cities. Each branch has its own
computer, with a database of all the accounts
maintained at that branch. Each such installation is
thus a site. There also exists one single site that
maintains information about all the branches of the
bank. Each branch maintains
(among others) a relation account(Account-schema),
where
Account-schema = (account-number, branch-name,
balance)

Cont..
The site containing information about all
the branches of the bank maintains the
relation branch(Branch-schema), where
Branch-schema = (branch-name, branch-city, assets)
There are other relations maintained at
the various sites; we ignore them for
the purpose of our example.

Cont..
To illustrate the difference between the two types of
transactions, local and global, at the sites,
consider a transaction to add $50 to account
number A-177 located at the Valleyview branch. If
the transaction was initiated at the Valleyview
branch, then it is considered local;
otherwise, it is considered global. A transaction to
transfer $50 from account A-177 to account A-305,
which is located at the Hillside branch, is a global
transaction, since accounts in two different sites are
accessed as a result of its execution.
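
The local/global distinction can be sketched in a few lines of Python. This is only an illustration: the ACCOUNT_SITE mapping and the classify_transaction function are hypothetical names, and the sketch assumes every account is stored at exactly one site.

```python
# Hypothetical catalog: which site (branch) stores each account.
ACCOUNT_SITE = {"A-177": "Valleyview", "A-305": "Hillside"}


def classify_transaction(initiating_site, accounts):
    """Return 'local' if all accessed accounts live at the initiating site,
    otherwise 'global'."""
    sites_touched = {ACCOUNT_SITE[account] for account in accounts}
    return "local" if sites_touched == {initiating_site} else "global"


# Add $50 to A-177, initiated at Valleyview: local.
print(classify_transaction("Valleyview", ["A-177"]))
# The same update initiated at Hillside: global.
print(classify_transaction("Hillside", ["A-177"]))
# Transfer from A-177 to A-305: global wherever it starts.
print(classify_transaction("Valleyview", ["A-177", "A-305"]))
```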

Types of distributed database

Homogeneous distributed database System

Heterogeneous distributed database system
In a Homogeneous distributed database System

All the sites have identical database
management system software.

The sites are aware of each other and agree to
cooperate in processing user requests.

The system appears to the user as a single system.

Figure: Homogeneous distributed database system. Four sites, each running identical DBMS software, form one distributed database.

In a heterogeneous distributed database System

Different sites may use different schemas and
software.

Sites may not be aware of each other and may
provide only limited facilities for cooperation
in transaction processing.
Figure: Heterogeneous distributed database system. Four sites, each running non-identical DBMS software, form one distributed database.
Distributed DBMS Architectures

Client Server System

Collaborating server system

Middleware system
Client - Server System

It has one or more client processes and one or
more server processes.

Clients are responsible for user interface issues,
and servers manage data and execute
transactions.
Figure: Client-server system. Clients #1, #2 and #3, and Servers #1 and #2, each server with its own database.
Collaborating server system

In this system we have a collection of
database servers.

When a server receives a query that requires
access to data at other servers, it generates
appropriate subqueries to be executed by the other
servers and puts the results together to
compute the answer to the original query.
Figure: Collaborating server system. A query arrives at one of four servers, and that server returns the result.
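
The collaborating-server idea can be sketched as a toy Python class. Everything here is an assumption made for illustration: the Server class, its method names, the balances, and the account numbers A-402 and A-155 are invented, and the "network" is just a direct method call.

```python
class Server:
    """A toy database server holding the accounts of one branch."""

    def __init__(self, name, accounts):
        self.name = name
        self.accounts = accounts  # list of (account_number, balance)
        self.peers = []           # the other collaborating servers

    def run_subquery(self, min_balance):
        # Evaluate the selection on locally stored data only.
        return [(num, bal) for num, bal in self.accounts if bal >= min_balance]

    def query(self, min_balance):
        # Answer the local part, send the same subquery to every peer,
        # and merge the partial results into one answer.
        result = self.run_subquery(min_balance)
        for peer in self.peers:
            result.extend(peer.run_subquery(min_balance))
        return sorted(result)


valleyview = Server("Valleyview", [("A-177", 500), ("A-402", 50)])
hillside = Server("Hillside", [("A-305", 350), ("A-155", 62)])
valleyview.peers.append(hillside)
hillside.peers.append(valleyview)

# Either server can accept the query and produce the combined answer.
print(valleyview.query(100))
```

Because every server can decompose and merge, there is no single coordinating server in this sketch.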
Middleware Systems

The idea is that we need just one database server that
is capable of managing queries and transactions
spanning multiple servers; the remaining servers only
need to handle local queries and transactions.

We can think of this special server as a layer of
software that coordinates the execution of queries and
transactions across one or more independent database
servers; such software is often called middleware.
Distributed Data Storage
Consider a relation r that is to be stored in the
database. There are two approaches to storing this
relation in the distributed database:

Replication. The system maintains several identical
replicas (copies) of the relation, and stores each replica
at a different site. The alternative to replication is to
store only one copy of relation r.

Fragmentation. The system partitions the relation
into several fragments, and stores each fragment at a
different site.
Storing data in distributed system
Fragmentation:
It consists of breaking a relation into smaller
relations, or fragments, and storing the fragments,
possibly at different sites.

1. Horizontal fragmentation
2. Vertical fragmentation

Horizontal Fragmentation


Each fragment consists of a subset of the
rows of the original relation.

Tuples that belong to a given horizontal
fragment are identified by a selection query.
Vertical Fragmentation

Each fragment consists of a subset of the
columns of the original relation.

Tuples in a given vertical
fragment are identified by a projection query.

Original relation

%    Address     Semester   Roll No    Name
68   Lucknow     8          19991012   Amit
63   Ghaziabad   8          19991020   Dhruv
61   Dehradun    8          19991041   Rishi
64   Mumbai      8          19991011   Amber
60   Banaras     8          19991014   Anurag

Vertical fragmentation

Roll No    Name
19991012   Amit
19991020   Dhruv
19991041   Rishi
19991011   Amber
19991014   Anurag

Horizontal fragmentation

%    Address     Semester   Roll No    Name
61   Dehradun    8          19991041   Rishi
64   Mumbai      8          19991011   Amber
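
The two kinds of fragmentation can be sketched in Python with the student relation from the example. The function names are hypothetical, and the selection predicate (address is Dehradun or Mumbai) is an assumption chosen only because it reproduces the two-row horizontal fragment; the slides do not state the actual predicate. The sketch also checks that the original relation can be rebuilt: horizontal fragments by union, vertical fragments by a join on the key (Roll No).

```python
STUDENTS = [
    {"percent": 68, "address": "Lucknow", "semester": 8, "roll_no": 19991012, "name": "Amit"},
    {"percent": 63, "address": "Ghaziabad", "semester": 8, "roll_no": 19991020, "name": "Dhruv"},
    {"percent": 61, "address": "Dehradun", "semester": 8, "roll_no": 19991041, "name": "Rishi"},
    {"percent": 64, "address": "Mumbai", "semester": 8, "roll_no": 19991011, "name": "Amber"},
    {"percent": 60, "address": "Banaras", "semester": 8, "roll_no": 19991014, "name": "Anurag"},
]


def horizontal_fragment(relation, predicate):
    # Selection: keep whole rows that satisfy the predicate.
    return [dict(row) for row in relation if predicate(row)]


def vertical_fragment(relation, columns):
    # Projection: keep only the named columns of every row.
    return [{col: row[col] for col in columns} for row in relation]


def join_on_key(left, right, key):
    # Rebuild rows from two vertical fragments that both carry the key.
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


def in_fragment(row):
    # Assumed predicate, not given in the slides.
    return row["address"] in {"Dehradun", "Mumbai"}


h1 = horizontal_fragment(STUDENTS, in_fragment)
h2 = horizontal_fragment(STUDENTS, lambda row: not in_fragment(row))

v1 = vertical_fragment(STUDENTS, ["roll_no", "name"])
v2 = vertical_fragment(STUDENTS, ["roll_no", "percent", "address", "semester"])

print([row["name"] for row in h1])
print(v1[0])
print(join_on_key(v1, v2, "roll_no") == STUDENTS)
```

Keeping the key column in every vertical fragment is what makes the join-based reconstruction possible.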
Replication

The system maintains several identical replicas
(copies) of the relation, and stores each replica
at a different site.
Advantages

Availability
Faster query evaluation

Disadvantages
Increased overhead on update
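
The trade-off between availability and update overhead can be sketched as follows. This is a deliberately simplified model, not a real replication protocol: the ReplicatedRelation class is hypothetical, a read uses any one available replica, and a write is assumed to update every replica.

```python
class ReplicatedRelation:
    """Toy model of a relation replicated in full at several sites."""

    def __init__(self, sites):
        self.replicas = {site: {} for site in sites}
        self.site_accesses = 0  # how many replicas have been touched so far

    def read(self, key, available_sites):
        # Any single available replica can answer a read.
        for site, replica in self.replicas.items():
            if site in available_sites:
                self.site_accesses += 1
                return replica.get(key)
        raise RuntimeError("no replica is available")

    def write(self, key, value):
        # Assumed write-all policy: every replica must be updated.
        for replica in self.replicas.values():
            replica[key] = value
            self.site_accesses += 1


r = ReplicatedRelation(["S1", "S2", "S3"])
r.write("A-177", 100)             # touches all three sites
print(r.site_accesses)
print(r.read("A-177", {"S3"}))    # still readable with S1 and S2 down
print(r.site_accesses)
```

A read costs one site access and survives site failures, while each update costs one access per replica.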


Distributed Query Processing
In a distributed system, we must take into account
several other matters, including:

The cost of data transmission over the network.

The potential gain in performance from having several
sites process parts of the query in parallel.

The relative cost of data transfer over the network and
data transfer to and from disk varies widely depending
on the type of network and on the speed of the disks.
Thus, in general, we cannot focus solely on disk costs
or on network costs. Rather, we must find a good
tradeoff between the two.
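
A back-of-the-envelope cost model shows why neither cost can be ignored. The model, the two plans, and all the numbers below are assumptions made up for illustration: cost is simply disk transfer time plus network transfer time, and parallelism is not modeled.

```python
def plan_cost(mb_from_disk, mb_over_network, disk_mb_per_s, network_mb_per_s):
    """Estimated seconds = disk transfer time + network transfer time."""
    return mb_from_disk / disk_mb_per_s + mb_over_network / network_mb_per_s


def cheapest(plans, disk_mb_per_s, network_mb_per_s):
    # Pick the plan name with the lowest estimated cost.
    return min(
        plans,
        key=lambda name: plan_cost(*plans[name], disk_mb_per_s, network_mb_per_s),
    )


# Hypothetical plans: (MB read from disk, MB sent over the network).
PLANS = {
    "ship whole relation": (100, 100),
    "filter at remote site first": (300, 10),
}

# Slow network (1 MB/s): reducing network traffic wins.
print(cheapest(PLANS, disk_mb_per_s=100, network_mb_per_s=1))
# Fast network (1000 MB/s): the extra disk work is no longer worth it.
print(cheapest(PLANS, disk_mb_per_s=100, network_mb_per_s=1000))
```

The same pair of plans ranks differently once the network speed changes, which is why an optimizer has to weigh both costs together.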
Continued in the next presentation.
