
Overview of Database

A database is a collection of related data organised so that the data can be easily accessed,
managed and updated. Any piece of information can be data, for example the name of your school.
A database is a place where related pieces of information are stored and on which various
operations can be performed.

DBMS
A DBMS (database management system) is software that allows the creation, definition and
manipulation of a database. It is the tool used to perform any kind of operation on the data in
a database. A DBMS also provides protection and security for the database, and it maintains data
consistency when there are multiple users. Popular examples include MySQL, Oracle, Sybase,
Microsoft Access and IBM DB2.
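
As a minimal sketch of the creation, definition and manipulation that a DBMS provides, the following uses SQLite through Python's built-in sqlite3 module. SQLite is not one of the products listed above, and the school table, its columns and the sample names are assumptions made up for illustration.

```python
import sqlite3

# Creation: an in-memory database that disappears when the connection closes.
conn = sqlite3.connect(":memory:")

# Definition: describe the structure of the data.
conn.execute("CREATE TABLE school (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Manipulation: insert, update and read data through the DBMS.
conn.execute("INSERT INTO school (name) VALUES (?)", ("Green Valley School",))
conn.execute("UPDATE school SET name = ? WHERE id = ?", ("Green Valley High School", 1))
conn.commit()

rows = conn.execute("SELECT id, name FROM school").fetchall()
print(rows)
```

The application never touches the stored bytes directly; every operation goes through the DBMS.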

Components of Database System


The database system can be divided into four components.

Users : Users may be of various types, such as database administrators, system developers and
end users.

Database application : A database application may be personal, departmental, enterprise or
internal.

DBMS : Software that allows users to define, create and manage a database and access to it, e.g.
MySQL, Oracle.

Database : A collection of logically related data.

Functions of DBMS

Provides data Independence

Concurrency Control

Provides Recovery services

Provides Utility services

Provides a clear and logical view of the process that manipulates data.

Advantages of DBMS

Segregation of application programs.

Minimal data duplication.

Easy retrieval of data.

Reduced development time and maintenance needs.

Disadvantages of DBMS

Complexity

Costly

Large in size

Database Architecture
Database architecture is logically divided into two types.

1. Logical two-tier Client / Server architecture

2. Logical three-tier Client / Server architecture

Two-tier Client / Server Architecture

In the two-tier Client / Server architecture, the user interface programs and application
programs run on the client side. An interface called ODBC (Open Database
Connectivity) provides an API that allows client-side programs to call the DBMS. Most
DBMS vendors provide ODBC drivers. A client program may connect to several
DBMSs. Variations are possible: in some DBMSs more functionality, including the data
dictionary and optimization, is transferred to the client. In that arrangement the server is
called a data server.
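
The idea of a client program calling one or more DBMSs through a standard API can be sketched as follows. This is only an illustration: Python's sqlite3 module follows the Python DB-API rather than ODBC, SQLite is an embedded library rather than a separate server, and the open_connection helper and the sale and stock tables are hypothetical.

```python
import sqlite3

def open_connection(path):
    # Stand-in for obtaining a connection through an ODBC driver.
    return sqlite3.connect(path)

# One client program, two separate databases (both in memory here).
sales_db = open_connection(":memory:")
stock_db = open_connection(":memory:")

sales_db.execute("CREATE TABLE sale (item TEXT, qty INTEGER)")
sales_db.execute("INSERT INTO sale VALUES ('pen', 3)")
stock_db.execute("CREATE TABLE stock (item TEXT, on_hand INTEGER)")
stock_db.execute("INSERT INTO stock VALUES ('pen', 10)")

# The application logic runs on the client and combines both results.
sold = sales_db.execute("SELECT qty FROM sale WHERE item = 'pen'").fetchone()[0]
on_hand = stock_db.execute("SELECT on_hand FROM stock WHERE item = 'pen'").fetchone()[0]
remaining = on_hand - sold
print(remaining)
```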

Three-tier Client / Server Architecture

The three-tier Client / Server architecture is commonly used for web
applications. An intermediate layer, called the application server or web server, holds the
web connectivity software and the business logic (rules and constraints) of the application,
which determine what data is accessed from the database server. This layer acts as a conduit
for passing partially processed data between the database server and the client.
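
A rough sketch of the three tiers, with each tier reduced to a few lines of Python, might look like this. The account table, the get_balance and client_view functions and the "users may only read their own account" rule are assumptions chosen for illustration, and an in-memory SQLite database stands in for the database server.

```python
import sqlite3

# Tier 3: the database server (an in-memory SQLite database stands in for it).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (owner TEXT PRIMARY KEY, balance INTEGER)")
db.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])

# Tier 2: the application server holds the business rules.
def get_balance(requesting_user, owner):
    # Hypothetical business rule: users may only read their own account.
    if requesting_user != owner:
        raise PermissionError("users may only view their own balance")
    row = db.execute("SELECT balance FROM account WHERE owner = ?", (owner,)).fetchone()
    return None if row is None else row[0]

# Tier 1: the client calls the application server, never the database.
def client_view(user):
    return f"{user}: {get_balance(user, user)}"

print(client_view("alice"))
```

Because the client cannot reach the database directly, the rule in the middle tier decides which data it is allowed to see.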

Database Model
A database model defines the logical design of data. The model describes the
relationships between different parts of the data. Historically, three
models have been in use in database design.

Hierarchical Model

Network Model

Relational Model

Hierarchical Model
In this model each entity has only one parent but can have several children. At the top
of the hierarchy there is a single entity, called the root.
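
The one-parent, many-children rule can be sketched with a small tree class. The Node class and the college example are hypothetical and serve only to show the structure.

```python
class Node:
    """An entity in a hierarchy: one parent (none for the root), many children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path_from_root(self):
        # Follow the single parent link upward; only one such path exists.
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))

root = Node("College")
dept = Node("Science", parent=root)
course = Node("Physics", parent=dept)
Node("Chemistry", parent=dept)

print(course.path_from_root())
```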

Network Model
In the network model, entities are organised in a graph, in which some entities can be
accessed through several paths.
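
To illustrate how one entity can be reached along more than one path, here is a small sketch that stores the links as a directed graph and lists every path between two entities. The entity names and the all_paths helper are made up for this example, and the graph is assumed to contain no cycles.

```python
# Each entity maps to the entities it links to (a directed graph).
links = {
    "Company": ["Supplier", "Project"],
    "Supplier": ["Part"],
    "Project": ["Part"],
    "Part": [],
}

def all_paths(graph, start, goal, path=()):
    """Return every path from start to goal (graph assumed acyclic)."""
    path = path + (start,)
    if start == goal:
        return [list(path)]
    found = []
    for nxt in graph[start]:
        found.extend(all_paths(graph, nxt, goal, path))
    return found

paths = all_paths(links, "Company", "Part")
print(paths)
```

Here "Part" has two parents, which a strict hierarchy would not allow.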

Relational Model
In this model, data is organised in two-dimensional tables called relations. The tables
(relations) are related to each other.
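
As a small sketch of two related tables, the following creates a department table and a student table in SQLite and joins them on a shared column. The table names, column names and rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, "
    "dept_id INTEGER REFERENCES department(dept_id))"
)
conn.executemany("INSERT INTO department VALUES (?, ?)", [(1, "Science"), (2, "Arts")])
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(101, "Asha", 1), (102, "Ravi", 2)])

# The shared dept_id column is what relates the two tables.
joined = conn.execute(
    "SELECT s.name, d.name FROM student AS s "
    "JOIN department AS d ON s.dept_id = d.dept_id "
    "ORDER BY s.student_id"
).fetchall()
print(joined)
```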

Database Keys
Keys are a very important part of a relational database. They are used to establish and
identify relationships between tables. They also ensure that each record within a table can be
uniquely identified by a combination of one or more fields.

Super Key
A super key is a set of attributes within a table that uniquely identifies each
record within the table. Every candidate key is a super key, and a super key may contain
extra attributes in addition to those of a candidate key.

Candidate Key
Candidate keys are the keys from which the primary key can be selected.
A candidate key is an attribute or minimal set of attributes that can act as a primary key
for a table, uniquely identifying each record in that table.
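
The difference between super keys and candidate keys can be sketched by testing attribute combinations against sample rows. Strictly, keys are a property of the table design rather than of one set of rows, so this only shows which combinations the sample data permits. The rows, attribute names and the is_super_key helper are assumptions for illustration.

```python
from itertools import combinations

rows = [
    {"roll_no": 1, "email": "a@x.org", "name": "Asha", "city": "Pune"},
    {"roll_no": 2, "email": "r@x.org", "name": "Ravi", "city": "Pune"},
    {"roll_no": 3, "email": "a2@x.org", "name": "Asha", "city": "Pune"},
]
attributes = ["roll_no", "email", "name", "city"]

def is_super_key(attrs):
    # A super key gives every row a distinct combination of values.
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

super_keys = [set(c) for n in range(1, len(attributes) + 1)
              for c in combinations(attributes, n) if is_super_key(c)]

# A candidate key is a super key with no smaller super key inside it.
candidate_keys = [k for k in super_keys
                  if not any(other < k for other in super_keys)]
print(candidate_keys)
```

In this sample, {roll_no, name} is a super key but not a candidate key, because roll_no alone is already unique.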

Primary Key
The primary key is the candidate key that is most appropriate to become the main key of the
table. It uniquely identifies each record in the table.
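
A DBMS enforces the uniqueness of a primary key. The sketch below, which assumes a hypothetical student table in SQLite, shows a second row with the same key value being rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")

try:
    # Same roll_no again: the DBMS refuses the row.
    conn.execute("INSERT INTO student VALUES (1, 'Ravi')")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

print(duplicate_accepted)
```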

Composite Key
A key that consists of two or more attributes that together uniquely identify an entity
occurrence is called a composite key. None of the attributes that make up the composite key
is a key on its own.
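
A composite key can be sketched with an enrolment table in which neither the student nor the course is unique alone, but the pair is. The table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrolment (student_id INTEGER, course_id INTEGER, grade TEXT, "
    "PRIMARY KEY (student_id, course_id))"
)
conn.execute("INSERT INTO enrolment VALUES (1, 10, 'A')")
conn.execute("INSERT INTO enrolment VALUES (1, 20, 'B')")  # same student, other course
conn.execute("INSERT INTO enrolment VALUES (2, 10, 'C')")  # same course, other student

try:
    # The same (student_id, course_id) pair again is rejected.
    conn.execute("INSERT INTO enrolment VALUES (1, 10, 'D')")
    pair_repeated = True
except sqlite3.IntegrityError:
    pair_repeated = False

count = conn.execute("SELECT COUNT(*) FROM enrolment").fetchone()[0]
print(pair_repeated, count)
```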

Secondary or Alternative key
The candidate keys that are not selected as the primary key are known as secondary keys
or alternative keys.

Non-key Attribute
Non-key attributes are attributes other than candidate key attributes in a table.

Non-prime Attribute
Non-prime attributes are attributes other than prime attributes, that is, attributes that
are not part of any candidate key.
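
Given the candidate keys of a table, the alternative keys and the non-key attributes follow by simple set operations. The sketch below assumes a hypothetical table with four attributes and two candidate keys, and uses "prime" in its usual sense of "belonging to at least one candidate key".

```python
attributes = {"roll_no", "email", "name", "city"}
candidate_keys = [{"roll_no"}, {"email"}]
primary_key = {"roll_no"}  # the candidate key chosen as the main key

# Candidate keys that were not chosen are the secondary (alternative) keys.
alternate_keys = [k for k in candidate_keys if k != primary_key]

# Prime attributes belong to at least one candidate key; the rest are non-prime.
prime = set().union(*candidate_keys)
non_prime = attributes - prime

print(alternate_keys, sorted(prime), sorted(non_prime))
```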

DBMS: Centralised vs Distributed

Large commercial databases may exist in two different topologies.

Centralised - the database is physically in one location and users typically use an
Internet connection to access it. Banks (such as ANZ) tend to use centralised databases.

Distributed - the database is in many locations, often where a national or
international company has customers who tend to interact regularly with a local branch. For example,
Google uses Bigtable, a distributed DBMS, as searching tends to be by users in a particular
region of the world.

In both cases the database "looks" like one database.

Objective Data and database administrator


By the end of this page you will be able to:

Describe the features, advantages and disadvantages of:

Distributed Database, and

Centralised

Centralised

ANZ: Banking online

A single database maintained in one location.

Usually managed by a database administrator.

Access via a communications network

LAN

WAN

Terminals provide distributed access

Examples

Some major banks do all their processing on a mainframe, in some cases in a different
country.

Clients may use several branches, and online banking for transactions.

Airline reservation systems need to be centralised to avoid double bookings.

Inland Revenue in New Zealand is countrywide

In NZ, Police and ambulance calls are sent to a central call centre.

Question: For a national call centre (NZ Police/Ambulance), what are the potential problems?

In an emergency, the operator must remember to ask which town/city the caller is in.

Missing local knowledge, e.g. "next to the lime works".

Advantages

Increased reliability and availability

Modular (incremental) growth

Lower communication costs

Faster Response

Disadvantages

Software cost and complexity

Processing overheads

Data integrity

Distributed database

Google search: Pacman at http://www.google.com/pacman


A single logical database that is spread physically across computers in multiple locations that are
connected by a data communications link.

Most processing is local

Need for local ownership of data

Data sharing required

Note that users think they are working with a single corporate database
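
The idea of one logical database spread over several sites can be sketched with a toy key-value store. This is not how Bigtable or any real distributed DBMS works internally. The DistributedStore class, the site names and the rule that a key's prefix names its owning site are all assumptions for illustration.

```python
class DistributedStore:
    """One logical key-value database spread over several sites (a toy model)."""

    def __init__(self, sites):
        self.sites = {name: {} for name in sites}

    def _site_for(self, key):
        # Hypothetical placement rule: the key prefix names the owning site.
        return key.split(":", 1)[0]

    def put(self, key, value):
        self.sites[self._site_for(key)][key] = value

    def get(self, key):
        return self.sites[self._site_for(key)].get(key)

store = DistributedStore(["auckland", "wellington"])
store.put("auckland:cust42", "Moana")
store.put("wellington:cust7", "Tane")

# The caller uses one interface and never names a site explicitly.
print(store.get("auckland:cust42"))
```

Each record is held only at its local site, yet the user of put and get sees what looks like a single database.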

Examples

Chain Stores like the MSD Spears (50% locally owned)

Google: uses a DBMS called Bigtable (note that it is not a relational database) ("What database
does Google use?", 2010 [1]; Chang, F., et al., 2006 [2]).

Advantages

Minimise communications costs

Local control

Disadvantages

Adds to complexity and cost

Processing overheads

Data Integrity

Centralised Vs. Decentralised Data Processing:

Broadly speaking, there are two distinct approaches to the
organisation of IT infrastructure:

1. Centralised IT Infrastructure:
In a centralised IT infrastructure, a central computing facility comprising one or more large
computers is set up at one location, and all the applications are mounted on it. All the data,
irrespective of its source, origin and type, is located and processed there.
A typical centralised IT infrastructure consists of a large central
computer system with a variety of highly configured peripheral
devices concentrated at that location. A battery of dumb terminals,
not necessarily physically close to the central computer system, is
connected to it by communication links so that users can
interact with the system and initiate the flow of information to and from
the central computer system.
The advantages of centralizing IT infrastructure include:
(a) Economies of scale in procurement of hardware
(b) Software and maintenance facilities
(c) Convenience in effective enforcement of standards with regard
to programming, data structures and communication equipment/
protocol, and security systems.
Centralised IT infrastructure was the most popular form of organisation
in business enterprises that used mainframes.
These mainframes required highly skilled computer professionals to
run and maintain even the daily routines of the computer systems
and communication links. The control systems were not very
efficient, and thus maintaining physical control over the information
and other costly resources was considered essential.

2. Decentralised IT Infrastructure:
With advances in data communication technologies and the availability of reliable data
communication facilities at declining costs, business enterprises are switching over from
centralised data processing to varying degrees of decentralised data processing (DDP). A DDP
facility generally consists of relatively smaller computers located at different places in an
enterprise, with or without a central computing facility.
Advantages:
The rationale behind switching over to varying types of DDP
can be traced from the following advantages that are associated with DDP:
(a) The data processing facility in DDP at each location is oriented
to satisfy the specific needs and is developed in the light of the local
constraints.
(b) DDP also covers, to some extent, the risk of putting all eggs in
one basket. It also ensures that any failure (of hardware, software or
personnel) has minimum possible impact on the overall functioning
of the system.
(c) DDP helps in the efficient use of computing facilities by personnel
located at different places through resource sharing.
(d) DDP increases user involvement, which is
one of the important success factors for any system. It also permits
local development of small applications for local use.

(e) DDP offers the necessary flexibility for gradual growth in hardware and software, and ease in their replacement.
(f) It provides quicker response to users, more particularly when a
local portion of the facility is to be used. This, in turn, ensures higher
end-user productivity.
(g) As individual responsibilities can be assigned easily for security
and privacy of information in case of DDP, security system is likely
to be more effective.
(h) Vendor independence at each location also provides the flexibility of adapting systems and application software to the changing
needs at each location without adversely affecting the work at other
locations.
Disadvantages:
However, DDP also has its own limitations that warrant
selective use of the DDP approach. Some of these limitations
include:
(a) Data generated by one application may not be useful for another
application due to lack of standardization of data structures resulting
in problems of incompatibility and duplication of data. Thus, it is
necessary to draw a central plan for generation of information.

(b) Duplication may also occur in software effort as similar applications may be developed by different technical personnel.
(c) DDP depends for its day-to-day operations on data communication facilities that may not be operational all the time. Maintenance
problems in communication facilities may adversely affect the
functioning.
(d) As the data, in DDP, may be dispersed and stored in diverse
forms, it may become difficult to update data and exercise control
over it. As it can be observed, the impact of these limitations can be
minimised, if not eliminated, by adopting suitable managerial
policies.
It may be noted that the choice is not exclusive between centralisation and decentralisation. In fact, depending upon the application,
volume of data, nature of processing, communication facilities, information needs and other critical factors, a choice is made for various
applications with regard to the scope and degree of
decentralisation.

Scope of Decentralisation: What to Decentralise:


Decentralisation need not be limited to databases, with
which the concept is normally associated.

Rather, any function may be decentralised, and the
scope and degree of DDP is defined by the decentralisation
that takes place in the following functions:
(a) Applications development and use
(b) Databases management
(c) Communication and
(d) Control

(a) Decentralised applications:


A data processing system may be decentralised primarily by
allocating application functions, either by a) splitting up an
application into various sections and allocating them to various
locations or b) replicating the same application at different locations.
A typical example of a split-up application is the DDP system in
banks. Each branch processes data regarding its own transactions, and
summary information is passed on to the head office/zonal
office/main branch.
The application software for each branch is common, and the head
office has the necessary application software for processing the
summary information. Where the operations are similar at all
locations and no hierarchical relationship for that operation
exists among the locations, the application may simply be
replicated at each location. A typical example of such replication is
the hotel reservation system. Each location books rooms at its own
location and at other locations as well.
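
The split-up bank example can be sketched as two small functions, one that runs at every branch and one that runs at head office. The branch names, the transaction amounts and both function names are hypothetical.

```python
branch_transactions = {
    "north": [120, -40, 300],
    "south": [75, 25],
}

def branch_summary(transactions):
    # Runs at each branch: the detailed processing stays local.
    return {"count": len(transactions), "net": sum(transactions)}

def head_office_total(summaries):
    # Runs at head office: it receives only the summaries.
    return sum(s["net"] for s in summaries.values())

summaries = {name: branch_summary(tx) for name, tx in branch_transactions.items()}
total = head_office_total(summaries)
print(summaries, total)
```

Every branch runs the same branch_summary code, while head office runs different software over far less data.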

(b) Decentralised databases:


Databases may be maintained at a central computing facility
(centralised databases) or their parts may be dispersed at different
locations (decentralised databases). Centralised databases
have the advantages that they are easier to manage and that data
consistency and integrity can be ensured easily. However,
distributed databases are preferred for their high reliability,
expandability and lower communication expenses.
Tony Gunton identifies three basic options in decentralising
databases. They are:
(a) In dispersed databases, all updating of information is
done at the place where the information originates, and thus information is
divided on the basis of entities such as function, project or profit
centre, or whatever matches the way the business activities are
organised.
Such an arrangement is more suitable where the entities
have little dependence on each other or the co-ordination among
them is not time critical. The transmission of information in such
cases is less frequent (normally periodic or need based) and
minimal in terms of the number of data elements. As a result,
communication costs are lower and control problems are less
difficult.
(b) Distributed databases are most useful where the business entities or functions need closer
co-ordination but most of the processing takes place locally, as in the dispersed
arrangement. In this arrangement, the relationships between
various data elements are stored either at various places or centrally and are maintained
automatically.
Thus, the relationships are updated almost in real time
or with a short delay between the transaction and the update.
The update is quick because the user at the other end
may need up-to-date information. Access to data at the various
entities is controlled by a data dictionary. A part of the commonly
used data may be stored at the user end and updated periodically,
to minimise data communication needs and costs.
Such an arrangement is more useful where the dependence of the
activities is sequential in nature and the activities at one place must
be carried out only after checking the previous activities at some
other place. In applications where scheduling of activities is
essential, such an arrangement is highly desirable. However, such
an arrangement is the most demanding in terms of technology and
requires a more mature IT infrastructure.
(c) Replicated arrangement is similar to the typical centralised data
processing except that a subset of the total database is sent to the
place where the transactions originate so that the subset is updated
immediately after the transaction has taken place.
The central data is updated either at the end of the day or almost
immediately after the transaction has taken place, depending on
how soon users at the other ends need the update. Replicating
data files minimises communication needs between the origin of
information and the user.
Such an arrangement is suitable even for time-critical
centralised activities such as centralised procurement of materials
for use at various locations. However, where the degree of time
criticality is very high, as in the operation of bank deposit
accounts, it would not be advisable.
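
The replicated arrangement with an end-of-day update can be sketched as follows. The LocalReplica class, the stock items and the quantities are hypothetical, and the sketch ignores conflicts between sites.

```python
central = {"bolt": 500, "nut": 800, "gear": 40}

class LocalReplica:
    def __init__(self, central_db, keys):
        self.central_db = central_db
        self.data = {k: central_db[k] for k in keys}  # subset sent to this site
        self.pending = []                             # changes not yet at the centre

    def issue(self, item, qty):
        # The local copy is updated as soon as the transaction happens.
        self.data[item] -= qty
        self.pending.append((item, qty))

    def end_of_day_sync(self):
        # Deferred update of the central data.
        for item, qty in self.pending:
            self.central_db[item] -= qty
        self.pending.clear()

site = LocalReplica(central, ["bolt", "nut"])
site.issue("bolt", 20)
stale_central = central["bolt"]  # the centre has not seen the change yet
site.end_of_day_sync()
print(stale_central, central["bolt"])
```

Until the sync runs, the central figure is out of date, which is why this arrangement is a poor fit for highly time-critical data such as bank deposit balances.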
It may be remembered that these arrangements are not mutually
exclusive in an enterprise. One can use any combination of these
arrangements depending upon the nature of the business activity and
the relationships between the origin of the transaction and the use of information.

If the information system is spread over a large geographical area,
replicated or dispersed type of arrangement may be preferred in
order to reduce the cost of communication. If all the users in a group
need the same kind of data at different occasions, again these two
arrangements would be preferred. If data gets updated quite
frequently and up-to-date data is essential for decision making at
the other end, distributed arrangement would be more suitable.

(c) Decentralised communication:


In a large DDP system, the communication function may be performed
very frequently. In such a case, some of the components of the DDP
system may be dedicated solely to the communication function. A
front end processor may act as an interface between the back end
processor and the other computer systems, performing the function
of communication and relieving the back end processor of
considerable burden.

(d) Decentralised control:


For efficient functioning of DDP, some control over
access to each facility, and over the management and communication
facilities, is essential. For the smooth working of such a control
system, some management and control mechanism is required at
each location. Such a control mechanism may be centralised or
decentralised to varying degrees.
