
COIT20247

DATABASE DESIGN & DEVELOPMENT

Module 7 – Distributed databases


OBJECTIVES
 Define what is meant by a distributed database
 Describe how this differs from a decentralised
database
 Describe the difference between homogeneous
and heterogeneous distributed databases
 Describe location transparency and local
autonomy
 Describe options for distributing data: replication
and partitioning.

2
DISTRIBUTED DATABASE
 So far we have mostly used Access databases
 Access databases consist of a single file in a
single location
 However, it is not always ideal to have all data
stored at the same physical location
 A distributed database is a single logical
database that is spread physically across
computers in multiple locations that are
connected by data communications.

3
SINGLE LOGICAL DATABASE

 “Physically spread” means that different
subsets of data are stored on different servers
in different locations.
 A “single logical database” means that:
 The database should appear the same as a single
local database to the user/program.
 Any user or program that accesses the database
should be unable to tell that the data is distributed.
 Users/programs should not have to “navigate”
(provide a file or network path) to find the data

4
ARCHITECTURE

 A DBMS runs at each physical site
 Each DBMS manages the data at that site.
 Each site has a subset (or possibly a complete
set) of the data in the database
 The next two slides illustrate how data might be
distributed

5
EXAMPLE

Notice the horizontal partitioning?

6
EXAMPLE

From: McFadden, F., Hoffer, J. & Prescott, M. 1998, Modern Database Management, 4th edn, Addison-Wesley, New Jersey.
7
HOMOGENEOUS VS HETEROGENEOUS

 Note the DBMS at each site in the example slides.
 If each site uses the same DBMS (e.g. if each
site uses Oracle), it is known as a
homogeneous system
 When the DBMSs are not all the same, it is
known as a heterogeneous system
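
The distinction above comes down to whether every site runs the same DBMS product. A minimal sketch, assuming hypothetical site names and DBMS assignments that are not taken from the slides:

```python
def classify_system(dbms_by_site):
    """Return 'homogeneous' if every site runs the same DBMS, else 'heterogeneous'."""
    distinct_products = set(dbms_by_site.values())
    return "homogeneous" if len(distinct_products) == 1 else "heterogeneous"


# Hypothetical site-to-DBMS assignments, for illustration only.
all_oracle = {"Brisbane": "Oracle", "Sydney": "Oracle", "Perth": "Oracle"}
mixed = {"Brisbane": "Oracle", "Sydney": "PostgreSQL", "Perth": "Oracle"}

print(classify_system(all_oracle))
print(classify_system(mixed))
```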

8
DISTRIBUTED VS DECENTRALISED

 “Distributed” is not the same as “decentralised”
 Both types of databases are physically spread
 In a distributed database, the users should not
be aware of the physical spread/location of the
data
 In a decentralised database, the users typically
have to provide a navigation path to the data.

9
DISTRIBUTED VS DECENTRALISED

 Distributed database
 Appears as one database to the user
 Users should not normally be aware of the
location of any given data
 Decentralised database:
 Does not appear as one database to the user
 User will have to manually navigate to data at
another site – will have to know where it is.

10
GOALS OF DISTRIBUTED DATABASE

 Goals of a distributed database include:
 Local autonomy
 Location transparency
 No reliance on central site
 Continuous operation
 Fragmentation independence
 Replication independence
 Optimised distributed query processing

11
GOALS OF DISTRIBUTED DATABASE

 Goals continued:
 Distributed transaction management
 Hardware independence
 Operating system independence
 Network independence
 DBMS independence
 We will look at just these two:
 Local autonomy
 Location transparency

12
LOCATION TRANSPARENCY

 Location transparency means that the user or
program need not know the location of the data
 Any request for data is automatically forwarded
to the appropriate DBMS at the appropriate
site.
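
The forwarding described above can be sketched in a few lines. This is only an illustration: the `Site` and `DistributedDatabase` classes, the catalog dictionary and the sample customers are all hypothetical, and a real distributed DBMS does this routing internally.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One physical site; the dict stands in for the data its local DBMS manages."""
    name: str
    rows: dict = field(default_factory=dict)  # customer_id -> record


class DistributedDatabase:
    """Presents several sites to the caller as a single logical database."""

    def __init__(self, sites, catalog):
        self._sites = {site.name: site for site in sites}
        self._catalog = catalog  # customer_id -> name of the site holding the row

    def get_customer(self, customer_id):
        # The caller never names a site: the catalog decides where to forward.
        site_name = self._catalog[customer_id]
        return self._sites[site_name].rows[customer_id]


brisbane = Site("Brisbane", {1: {"name": "Ann"}})
sydney = Site("Sydney", {2: {"name": "Bob"}})
db = DistributedDatabase([brisbane, sydney], {1: "Brisbane", 2: "Sydney"})

print(db.get_customer(2))
```

The calling code asks for customer 2 without supplying any site name or path, which is the behaviour that separates a distributed database from a decentralised one.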

13
LOCAL AUTONOMY

 Local autonomy means that a DBMS should
continue to operate even if other nodes
have failed (although data held at a failed node
may be unavailable).
 Each site should be able to provide
local users with access to local data, administer
security, log transactions, etc., even when any
central or coordinating site is unavailable.
 This means no reliance on a central site.
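
A minimal sketch of this behaviour, assuming a hypothetical `Site` class with an `online` flag and its own log (none of these names come from the slides): the Brisbane site keeps serving and logging local reads while the Sydney site is down, and only the data held at Sydney becomes unavailable.

```python
class SiteUnavailable(Exception):
    """Raised when a request needs a site that is currently down."""


class Site:
    def __init__(self, name, local_rows):
        self.name = name
        self.local_rows = dict(local_rows)
        self.online = True
        self.log = []  # each site keeps its own transaction log

    def read_local(self, key):
        if not self.online:
            raise SiteUnavailable(self.name)
        self.log.append(("read", key))
        return self.local_rows[key]


brisbane = Site("Brisbane", {"order-1": "paid"})
sydney = Site("Sydney", {"order-2": "pending"})
sydney.online = False  # simulate a failed node

# Local users at Brisbane are unaffected by the failure elsewhere.
local_result = brisbane.read_local("order-1")

# Data stored only at the failed node cannot be read until it recovers.
try:
    sydney.read_local("order-2")
    remote_result = "available"
except SiteUnavailable:
    remote_result = "unavailable"

print(local_result, remote_result)
```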

14
OPTIONS FOR DISTRIBUTING

 Data can be distributed among nodes in a
number of ways:
 Data replication
 Horizontal partitioning
 Vertical partitioning
 Combinations of the above.

15
DATA REPLICATION

 Data replication involves duplicating some
or all data at each site, e.g.:

16
DATA REPLICATION

 Advantages include:
 Faster local access
 Greater autonomy
 Disadvantages include:
 Difficulty maintaining consistent copies of the
data
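
The consistency problem can be shown with a small sketch. The `Replica` class and the two write functions are hypothetical: propagating a write to every copy keeps the replicas in step, whereas a write applied at one site only leaves the copies disagreeing.

```python
class Replica:
    """A full copy of the data held at one site."""

    def __init__(self, site_name):
        self.site_name = site_name
        self.data = {}


def write_local_only(replica, key, value):
    """Update a single copy; the other replicas are not told about the change."""
    replica.data[key] = value


def write_all(replicas, key, value):
    """Apply the same update to every copy."""
    for replica in replicas:
        replica.data[key] = value


def is_consistent(replicas, key):
    """True when every replica holds the same value for `key`."""
    return len({replica.data.get(key) for replica in replicas}) == 1


replicas = [Replica("Brisbane"), Replica("Sydney")]

write_all(replicas, "price", 10)
consistent_after_write_all = is_consistent(replicas, "price")

write_local_only(replicas[0], "price", 12)
consistent_after_local_write = is_consistent(replicas, "price")

print(consistent_after_write_all, consistent_after_local_write)
```

Reads can be served by whichever copy is local, which is the source of the speed and autonomy advantages, but every write now has to reach every copy.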

17
PARTITIONING

 Horizontal partitioning is where different rows
of a relation are distributed to different physical
locations (see, e.g., the diagram on slide 6).
 Vertical partitioning is where different
attributes of a relation are distributed to
different physical locations.
 We discussed this in the physical design lecture.
The concept is the same here.
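
A minimal sketch of both kinds of partitioning, using a made-up employee relation (the column names and values are assumptions for illustration). Horizontal partitioning splits by row, here by branch; vertical partitioning splits by attribute and repeats the key in each fragment so the rows can be matched up again.

```python
# Hypothetical relation, for illustration only.
employees = [
    {"emp_id": 1, "name": "Ann", "branch": "Brisbane", "salary": 70000},
    {"emp_id": 2, "name": "Bob", "branch": "Sydney", "salary": 65000},
    {"emp_id": 3, "name": "Cy", "branch": "Brisbane", "salary": 80000},
]


def horizontal_partition(rows, column):
    """Group whole rows into fragments according to the value of `column`."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def vertical_partition(rows, key, columns):
    """Keep only `columns` from each row, plus the key needed to rejoin later."""
    return [{key: row[key], **{c: row[c] for c in columns}} for row in rows]


by_branch = horizontal_partition(employees, "branch")
public_part = vertical_partition(employees, "emp_id", ["name", "branch"])
payroll_part = vertical_partition(employees, "emp_id", ["salary"])

print(by_branch["Sydney"])
print(payroll_part)
```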

18
PHYSICAL DISTRIBUTION

 It is important to remember that such
partitioning or replication occurs at the physical
level just like the partitioning described in the
physical design lecture.
 The users should always see a complete table.
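
One way to picture how a complete table is presented is to rebuild it from its fragments: horizontal fragments are combined with a union, vertical fragments with a join on the key. The sketch below uses hypothetical fragment data and helper functions; in practice the DBMS does this reconstruction for the user.

```python
def union_fragments(fragments, key):
    """Combine horizontal fragments (same columns, different rows)."""
    rows = [row for fragment in fragments for row in fragment]
    return sorted(rows, key=lambda row: row[key])


def join_fragments(left, right, key):
    """Combine vertical fragments (same rows, different columns) on `key`."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Hypothetical fragments stored at different sites.
brisbane_rows = [{"emp_id": 1, "name": "Ann"}, {"emp_id": 3, "name": "Cy"}]
sydney_rows = [{"emp_id": 2, "name": "Bob"}]
salaries = [
    {"emp_id": 1, "salary": 70000},
    {"emp_id": 2, "salary": 65000},
    {"emp_id": 3, "salary": 80000},
]

all_rows = union_fragments([brisbane_rows, sydney_rows], "emp_id")
complete_table = join_fragments(all_rows, salaries, "emp_id")

for row in complete_table:
    print(row)
```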

19
SUMMARY

 A distributed database is one that appears as a
single local database to the user but is stored
across different physical locations.
 A distributed database appears as a single
database to the user; a decentralised database
does not.
 Two goals of a distributed database are local
autonomy and location transparency.

20
SUMMARY

 A homogeneous distributed database uses the
same DBMS at each site
 A heterogeneous distributed database does
not use the same DBMS at each site.
 Options for distributing a database include:
 Data replication
 Horizontal partitioning
 Vertical partitioning
 Combinations of these.

21
