Introduction
A major motivation behind the development of database systems is the
desire to integrate the operational data of an organization and to provide
controlled access to the data. Although integration and controlled access
may imply centralization, this is not the intention.
In fact, the development of computer networks promotes a decentralized
mode of work. This decentralized approach mirrors the organizational
structure of many companies, which are logically distributed into
divisions, departments, projects, and so on, and physically distributed
into offices, plants, and factories, where each unit maintains its own
operational data. A distributed database system that reflects this
organizational structure should improve both the shareability of the data
and the efficiency of data access: it makes the data in all units
accessible and stores data close to the location where it is most
frequently used.
Distributed DBMS
The software system that permits the management of the
distributed database and makes the distribution transparent to
users.
A Distributed Database Management System (DDBMS) consists of a
single logical database that is split into a number of fragments. Each
fragment is stored on one or more computers under the control of a
separate DBMS, with the computers connected by a communications
network. Each site is capable of independently processing user requests
that require access to local data and is also capable of processing data
stored on other computers in the network.
Users access the distributed database via applications. Applications are
classified as those that do not require data from other sites (local
applications) and those that do require data from other sites (global
applications). We require a DDBMS to have at least one global
application.
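The local/global distinction can be sketched in a few lines of Python. This is only an illustration: the catalog FRAGMENT_SITE, the site names, and the function classify are hypothetical and do not come from any particular DDBMS.

```python
# Hypothetical catalog: the site that stores each fragment (illustrative names).
FRAGMENT_SITE = {"EMP1": "S1", "EMP2": "S2", "EMP3": "S3"}


def classify(app_site, fragments_needed):
    """Return 'local' if every needed fragment is stored at the
    application's own site, otherwise 'global'."""
    sites = {FRAGMENT_SITE[name] for name in fragments_needed}
    return "local" if sites <= {app_site} else "global"


# An application at S1 that reads only EMP1 is local;
# one that also needs EMP2 (stored at S2) is global.
kind_a = classify("S1", ["EMP1"])
kind_b = classify("S1", ["EMP1", "EMP2"])
```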
Banking Example
♦ Fragmentation
A relation may be divided into a number of subrelations, called
fragments, which are then distributed.
EMP2 = σ_{DEPTNO=20}(EMP)
EMP3 = σ_{DEPTNO=30}(EMP)
r = r1 * r2 * r3 * ... * rn
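As a minimal sketch, horizontal fragmentation by selection can be tried out on a toy EMP relation in Python. The rows and column names are illustrative assumptions, and the fragment EMP1 (DEPTNO = 10) is assumed by analogy with EMP2 and EMP3. Horizontal fragments are put back together with a union, as the last step shows.

```python
# Toy EMP relation; rows and column names are illustrative assumptions.
EMP = [
    {"EMPNO": 7782, "ENAME": "CLARK", "DEPTNO": 10},
    {"EMPNO": 7369, "ENAME": "SMITH", "DEPTNO": 20},
    {"EMPNO": 7566, "ENAME": "JONES", "DEPTNO": 20},
    {"EMPNO": 7499, "ENAME": "ALLEN", "DEPTNO": 30},
]


def select(relation, predicate):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def union(*fragments):
    """Union of fragments, dropping duplicate rows."""
    seen, result = set(), []
    for fragment in fragments:
        for row in fragment:
            key = tuple(sorted(row.items()))
            if key not in seen:
                seen.add(key)
                result.append(row)
    return result


# EMP1 is assumed here; EMP2 and EMP3 follow the selections given above.
EMP1 = select(EMP, lambda r: r["DEPTNO"] == 10)
EMP2 = select(EMP, lambda r: r["DEPTNO"] == 20)
EMP3 = select(EMP, lambda r: r["DEPTNO"] == 30)

# The union of the horizontal fragments gives back the original relation.
reconstructed = union(EMP1, EMP2, EMP3)
```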
Mixed Fragmentation
Mixed fragmentation, also known as hybrid fragmentation, intermixes
horizontal and vertical fragmentation.
The relation r is divided into a number of fragment relations r1, r2, ...,
rn. Each fragment is obtained by applying either the horizontal or the
vertical fragmentation scheme to relation r, or to a fragment of r that
was obtained previously.
For example, combining the horizontal and vertical fragmentation of the
EMP relation results in a mixed fragmentation. The relation is divided
initially into the vertical fragments EMP1 and EMP2. We can then further
divide fragment EMP1, using the horizontal-fragmentation scheme, into
the following fragments:
EMP1a = σ_{DEPTNO=10}(EMP1)
EMP2a = σ_{DEPTNO=20}(EMP1)
EMP3a = σ_{DEPTNO=30}(EMP1)
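The following Python sketch illustrates one possible mixed fragmentation of EMP. It rests on several assumptions: the attribute lists of the vertical fragments are invented for the example, both vertical fragments keep the key EMPNO so that they can be joined again, and all horizontal fragments are taken from EMP1.

```python
# Toy EMP relation; rows and attribute names are illustrative assumptions.
EMP = [
    {"EMPNO": 7782, "ENAME": "CLARK", "SAL": 2450, "DEPTNO": 10},
    {"EMPNO": 7839, "ENAME": "KING", "SAL": 5000, "DEPTNO": 10},
    {"EMPNO": 7369, "ENAME": "SMITH", "SAL": 800, "DEPTNO": 20},
    {"EMPNO": 7499, "ENAME": "ALLEN", "SAL": 1600, "DEPTNO": 30},
]


def project(relation, attrs):
    """Relational projection onto the given attributes."""
    return [{a: row[a] for a in attrs} for row in relation]


def select(relation, predicate):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def natural_join(left, right, key):
    """Join two fragments on a shared key attribute."""
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


# Step 1: vertical fragmentation (both fragments keep the key EMPNO).
EMP1 = project(EMP, ["EMPNO", "ENAME", "DEPTNO"])
EMP2 = project(EMP, ["EMPNO", "SAL"])

# Step 2: horizontal fragmentation of the vertical fragment EMP1.
EMP1a = select(EMP1, lambda r: r["DEPTNO"] == 10)
EMP2a = select(EMP1, lambda r: r["DEPTNO"] == 20)
EMP3a = select(EMP1, lambda r: r["DEPTNO"] == 30)

# Reconstruction: union the horizontal fragments (they are disjoint,
# so list concatenation is enough), then join with EMP2.
rebuilt = natural_join(EMP1a + EMP2a + EMP3a, EMP2, "EMPNO")
```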
Data Replication and Fragmentation
The techniques described for data replication and data fragmentation can
be applied successively to the same relation. That is, a fragment can be
replicated, replicas of fragments can be fragmented further, and so on.
For example, consider a distributed system consisting of sites S1, S2,
..., S11. We can fragment EMP into EMP1a, EMP2a and EMP2 and then store,
for example, a copy of EMP1a at sites S1, S3 and S7; a copy of
EMP2a at sites S4 and S11; and a copy of EMP2 at sites S2, S8 and S9.
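The placement in this example can be written down as a small replica catalog. The Python sketch below is illustrative only: the functions site_for_read and sites_for_update are hypothetical, and the update rule assumes a simple write-all policy in which every copy of a fragment receives each update.

```python
# Replica catalog for the example: fragment -> sites holding a copy.
REPLICAS = {
    "EMP1a": ["S1", "S3", "S7"],
    "EMP2a": ["S4", "S11"],
    "EMP2": ["S2", "S8", "S9"],
}


def site_for_read(fragment, local_site):
    """Read from the local copy if one exists, else from the first listed replica."""
    sites = REPLICAS[fragment]
    return local_site if local_site in sites else sites[0]


def sites_for_update(fragment):
    """Under the assumed write-all policy, an update goes to every replica."""
    return list(REPLICAS[fragment])


# A reader at S3 is served locally; a reader at S2 must go to a remote site.
read_at_s3 = site_for_read("EMP1a", "S3")
read_at_s2 = site_for_read("EMP1a", "S2")
```

The sketch shows the trade-off behind replication: more copies make it more likely that a read is local, but every extra copy is one more site that each update has to reach.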
Complete replication
This strategy consists of maintaining a complete copy of the database at
each site. As a result, locality of reference, reliability and
availability, and performance are maximized. However, storage costs and
the communication costs of updates are the highest of any strategy. To
overcome some of these problems, snapshots are sometimes used. A snapshot
is a copy of the data at a given time. The copies are updated
periodically, for example, hourly or weekly, so they may not always be up
to date. Snapshots are also sometimes used to implement views in a
distributed database, to reduce the time it takes to perform a database
operation on a view.
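The behaviour of a periodically refreshed snapshot can be sketched as follows. The class Snapshot and its parameters are hypothetical, and time is passed in explicitly as a number of seconds so that the example is deterministic.

```python
import copy


class Snapshot:
    """A copy of the source data that is refreshed only periodically."""

    def __init__(self, source, refresh_interval, now):
        self.source = source
        self.refresh_interval = refresh_interval
        self.data = copy.deepcopy(source)
        self.taken_at = now

    def read(self, now):
        """Return the copy, refreshing it first if the interval has elapsed."""
        if now - self.taken_at >= self.refresh_interval:
            self.data = copy.deepcopy(self.source)
            self.taken_at = now
        return self.data


master = {"SMITH": 800}
snap = Snapshot(master, refresh_interval=3600, now=0)  # hourly refresh

master["SMITH"] = 900                      # update applied to the master copy
stale = snap.read(now=10)["SMITH"]         # still the old value
fresh = snap.read(now=3600)["SMITH"]       # refreshed after one hour
```

Between refreshes, reads are cheap and local but may return out-of-date values, which is the price paid for avoiding update traffic to every copy.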
Selective replication
This strategy is a combination of fragmentation, replication and
centralization. Some data items are fragmented to achieve high locality
of reference and others, which are used at many sites and are not
frequently updated, are replicated; otherwise, the data items are
centralized. The objective of this strategy is to have all the advantages
of the other approaches but none of the disadvantages. This is the most
commonly used strategy because of its flexibility.
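The decision rule described above can be sketched as a small function. Everything concrete here is an assumption made for illustration: the function choose_strategy, its parameters, and the thresholds for "many sites" and "not frequently updated" are hypothetical, and a real design would derive them from measured access patterns.

```python
def choose_strategy(has_locality, sites_using, updates_per_day,
                    many_sites=3, low_update_rate=10):
    """Pick an allocation strategy for one data item.

    has_locality    -- True if accesses concentrate on identifiable sites
    sites_using     -- number of sites that access the item
    updates_per_day -- how often the item is updated
    The two thresholds are illustrative defaults, not recommended values.
    """
    if has_locality:
        return "fragment"
    if sites_using >= many_sites and updates_per_day <= low_update_rate:
        return "replicate"
    return "centralize"


# Employee rows used mostly by their own department: fragment.
# A rarely changing lookup table read at many sites: replicate.
# Anything else: keep at a central site.
s1 = choose_strategy(has_locality=True, sites_using=3, updates_per_day=500)
s2 = choose_strategy(has_locality=False, sites_using=8, updates_per_day=1)
s3 = choose_strategy(has_locality=False, sites_using=2, updates_per_day=500)
```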