A Database Management System (DBMS) is a set of computer programs that controls the creation,
maintenance, and use of a database. It allows organizations to place control of database
development in the hands of database administrators (DBAs) and other specialists. A DBMS is a system
software package that supports the use of an integrated collection of data records and files known as
a database. It allows different user application programs to easily access the same database. DBMSs may
use any of a variety of database models, such as the network model or the relational model. In large
systems, a DBMS allows users and other software to store and retrieve data in a structured way. Instead
of having to write computer programs to extract information, users can ask simple questions in a query
language. Thus, many DBMS packages provide fourth-generation programming languages (4GLs) and
other application development features. A DBMS helps to specify the logical organization of a database and
to access and use the information within it. It provides facilities for controlling data access,
enforcing data integrity, managing concurrency, and restoring the database from backups. A DBMS also
provides the ability to logically present database information to users.
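
As a small illustration of asking a question in a query language rather than writing a retrieval program, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The employee table, its columns, and the rows are assumptions made up for this example; they do not come from any particular system.

```python
import sqlite3

# In-memory database; the table and its rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee (id, name, salary) VALUES (?, ?, ?)",
    [(1, "Asha", 52000.0), (2, "Ben", 48000.0), (3, "Chen", 61000.0)],
)

# The question is stated declaratively; the DBMS decides how to locate the rows.
high_earners = conn.execute(
    "SELECT name FROM employee WHERE salary > ? ORDER BY name", (50000.0,)
).fetchall()
print(high_earners)
```

The program never loops over records itself: it only states which rows it wants, and the DBMS performs the retrieval.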

Overview
A DBMS is a set of software programs that controls the organization, storage, management, and
retrieval of data in a database. DBMSs are categorized according to their data structures or types.
The DBMS accepts requests for data from an application program and instructs the operating
system to transfer the appropriate data. The queries and responses must be submitted and
received according to a format that conforms to one or more applicable protocols. When a
DBMS is used, information systems can be changed more easily as the organization's
information requirements change. New categories of data can be added to the database without
disruption to the existing system.
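
The point about adding new categories of data without disrupting the existing system can be sketched as follows, again with sqlite3 and an in-memory database. The employee table, the email column, and the helper employee_names are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee (id, name) VALUES (1, 'Asha')")


def employee_names(connection):
    """Stand-in for an existing application query that only knows id and name."""
    return connection.execute("SELECT id, name FROM employee ORDER BY id").fetchall()


before = employee_names(conn)

# Later, a new category of data (an email address) is added to the table.
conn.execute("ALTER TABLE employee ADD COLUMN email TEXT")
conn.execute(
    "INSERT INTO employee (id, name, email) VALUES (2, 'Ben', 'ben@example.com')"
)

# The old query still runs unchanged because it names the columns it needs.
after = employee_names(conn)
print(before, after)
```

The existing query keeps working because it lists its columns explicitly; a query written as SELECT * would see its result shape change when the column is added.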

Database servers are dedicated computers that hold the actual databases and run only the DBMS
and related software. Database servers are usually multiprocessor computers, with generous
memory and RAID disk arrays used for stable storage. Hardware database accelerators,
connected to one or more servers via a high-speed channel, are also used in large volume
transaction processing environments. DBMSs are found at the heart of most database
applications. DBMSs may be built around a custom multitasking kernel with built-in networking
support, but modern DBMSs typically rely on a standard operating system to provide these
functions.

Database normalization
From Wikipedia, the free encyclopedia

In the field of relational database design, normalization is a systematic way of ensuring that a
database structure is suitable for general-purpose querying and free of certain undesirable
characteristics—insertion, update, and deletion anomalies—that could lead to a loss of data
integrity.[1]
Edgar F. Codd, the inventor of the relational model, introduced the concept of normalization and
what we now know as the First Normal Form (1NF) in 1970.[2] Codd went on to define the
Second Normal Form (2NF) and Third Normal Form (3NF) in 1971,[3] and Codd and Raymond
F. Boyce defined the Boyce-Codd Normal Form (BCNF) in 1974.[4] Higher normal forms were
defined by other theorists in subsequent years, the most recent being the Sixth normal form
(6NF) introduced by Chris Date, Hugh Darwen, and Nikos Lorentzos in 2002.[5]

Informally, a relational database table (the computerized representation of a relation) is often
described as "normalized" if it is in the Third Normal Form.[6] Most 3NF tables are free of
insertion, update, and deletion anomalies, i.e. in most cases 3NF tables adhere to BCNF, 4NF,
and 5NF (but typically not 6NF).

A standard piece of database design guidance is that the designer should create a fully
normalized design; selective denormalization can subsequently be performed for performance
reasons.[7] However, some modeling disciplines, such as the dimensional modeling approach to
data warehouse design, explicitly recommend non-normalized designs, i.e. designs that in large
part do not adhere to 3NF.[8]

Objectives of normalization
A basic objective of the first normal form defined by Codd in 1970 was to permit data to be
queried and manipulated using a "universal data sub-language" grounded in first-order logic.[9]
(SQL is an example of such a data sub-language, albeit one that Codd regarded as seriously
flawed.)[10]

The objectives of normalization beyond 1NF (First Normal Form) were stated as follows by
Codd:

1. To free the collection of relations from undesirable insertion, update and
deletion dependencies;
2. To reduce the need for restructuring the collection of relations as new types of
data are introduced, and thus increase the life span of application programs;
3. To make the relational model more informative to users;
4. To make the collection of relations neutral to the query statistics, where these
statistics are liable to change as time goes by.
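
The first objective, freedom from update anomalies, can be illustrated with a minimal sketch. The tables employee_flat, department, and employee, and all of their rows, are assumptions invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalized design: the department location is repeated on every employee row.
conn.execute(
    "CREATE TABLE employee_flat ("
    "emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, dept_location TEXT)"
)
conn.executemany(
    "INSERT INTO employee_flat VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", "Leeds"), (2, "Ben", "Sales", "Leeds")],
)
# An update that reaches only one copy leaves the table contradicting itself.
conn.execute("UPDATE employee_flat SET dept_location = 'York' WHERE emp_id = 1")
flat_locations = conn.execute(
    "SELECT DISTINCT dept_location FROM employee_flat "
    "WHERE dept = 'Sales' ORDER BY dept_location"
).fetchall()

# Normalized design: the location is stored once, in its own table.
conn.execute("CREATE TABLE department (dept TEXT PRIMARY KEY, dept_location TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT REFERENCES department(dept))"
)
conn.execute("INSERT INTO department VALUES ('Sales', 'Leeds')")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)", [(1, "Asha", "Sales"), (2, "Ben", "Sales")]
)
conn.execute("UPDATE department SET dept_location = 'York' WHERE dept = 'Sales'")
normalized_locations = conn.execute(
    "SELECT DISTINCT d.dept_location FROM employee e JOIN department d ON d.dept = e.dept"
).fetchall()

print(flat_locations)        # two conflicting locations for one department
print(normalized_locations)  # a single location
```

In the flat table the Sales department ends up with two locations at once; in the normalized design there is only one place to update, so the contradiction cannot arise.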

Normal forms
The normal forms (abbrev. NF) of relational database theory provide criteria for determining a
table's degree of vulnerability to logical inconsistencies and anomalies. The higher the normal
form applicable to a table, the less vulnerable it is to inconsistencies and anomalies. Each table
has a "highest normal form" (HNF): by definition, a table always meets the requirements of its
HNF and of all normal forms lower than its HNF; also by definition, a table fails to meet the
requirements of any normal form higher than its HNF.
The normal forms are applicable to individual tables; to say that an entire database is in normal
form n is to say that all of its tables are in normal form n.

Newcomers to database design sometimes suppose that normalization proceeds in an iterative
fashion, i.e. a 1NF design is first normalized to 2NF, then to 3NF, and so on. This is not an
accurate description of how normalization typically works. A sensibly designed table is likely to
be in 3NF on the first attempt; furthermore, if it is 3NF, it is overwhelmingly likely to have an
HNF of 5NF. Achieving the "higher" normal forms (above 3NF) does not usually require an
extra expenditure of effort on the part of the designer, because 3NF tables usually need no
modification to meet the requirements of these higher normal forms.

The main normal forms are summarized below.

Normal form: First normal form (1NF)
Defined by: Two versions: E.F. Codd (1970), C.J. Date (2003)[12]
Brief definition: Table faithfully represents a relation and has no repeating groups

Normal form: Second normal form (2NF)
Defined by: E.F. Codd (1971)[13]
Brief definition: No non-prime attribute in the table is functionally dependent on a proper subset of a candidate key

Normal form: Third normal form (3NF)
Defined by: E.F. Codd (1971)[14]; see also Carlo Zaniolo's equivalent but differently-expressed definition (1982)[15]
Brief definition: Every non-prime attribute is non-transitively dependent on every candidate key in the table
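
The 2NF definition above rests on functional dependencies, so a small checker can make it concrete. The sketch below is only a hint-finder: a dependency is a rule about all valid data, whereas this code can only test whether the rule holds in one sample. The order_items rows, the column names, and the assumed candidate key (order_id, product_id) are all invented for the example.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for a key and returns it later.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Assumed candidate key: (order_id, product_id).
order_items = [
    {"order_id": 1, "product_id": "P1", "product_name": "Bolt", "quantity": 10},
    {"order_id": 1, "product_id": "P2", "product_name": "Nut", "quantity": 5},
    {"order_id": 2, "product_id": "P1", "product_name": "Bolt", "quantity": 7},
]

# product_name depends on product_id alone, a proper subset of the key,
# which suggests a 2NF violation.
partial_dependency = fd_holds(order_items, ("product_id",), ("product_name",))

# quantity does not depend on either part of the key on its own.
quantity_on_product = fd_holds(order_items, ("product_id",), ("quantity",))
quantity_on_order = fd_holds(order_items, ("order_id",), ("quantity",))

print(partial_dependency, quantity_on_product, quantity_on_order)
```

Under these assumptions the usual remedy would be to move product_name into a separate product table keyed by product_id.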

DBMS Keys
A key is an attribute (also known as a column or field) or a combination of attributes that is
used to identify records. Sometimes we have to retrieve data from more than one
table; in those cases we join the tables with the help of keys. The purpose of a key
is to bind data together across tables without repeating all of the data in every table.

The various types of key used in SQL databases are described below, with examples. (For the
examples, suppose we have an Employee table with the attributes 'ID', 'Name', 'Address',
'Department_ID', and 'Salary'.)

(I) Super Key – An attribute or a combination of attributes that identifies
records uniquely is known as a Super Key. A table can have many Super Keys.
E.g. of Super Key
1 ID
2 ID, Name
3 ID, Address
4 ID, Department_ID
5 ID, Salary
6 Name, Address
7 Name, Address, Department_ID
And so on: any combination of attributes that identifies the records uniquely is a Super Key.
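
The Super Key idea can be checked mechanically on sample data: a set of attributes is a Super Key for those rows if no two rows share the same values for it. The sketch below is illustrative only. The rows are invented, and uniqueness in a sample does not prove that the attributes are a Super Key for every possible state of the table.

```python
def is_super_key(rows, attrs):
    """True if no two rows share the same combination of values for attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


# Made-up rows for the Employee table described above.
employees = [
    {"ID": 1, "Name": "Asha", "Address": "12 Mill Lane", "Department_ID": 10, "Salary": 52000},
    {"ID": 2, "Name": "Asha", "Address": "4 Park Road", "Department_ID": 10, "Salary": 48000},
    {"ID": 3, "Name": "Ben", "Address": "12 Mill Lane", "Department_ID": 20, "Salary": 52000},
]

print(is_super_key(employees, ("ID",)))              # ID alone is unique
print(is_super_key(employees, ("ID", "Name")))       # adding attributes keeps it unique
print(is_super_key(employees, ("Name", "Address")))  # unique as a pair
print(is_super_key(employees, ("Name",)))            # two employees share a name
```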

(II) Candidate Key – It can be defined as a minimal Super Key or irreducible Super Key.
In other words, it is an attribute or a combination of attributes that identifies a record uniquely
while none of its proper subsets does.
E.g. of Candidate Key
1 ID
2 Name, Address
For the above table we have only two Candidate Keys (i.e. irreducible Super Keys) that
identify the records in the table uniquely. ID can identify a record uniquely, and
similarly the combination of Name and Address can identify a record uniquely, but neither
Name nor Address alone can do so, because we might have
two employees with the same name or two employees living at the same address.
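
Since a Candidate Key is a minimal Super Key, the candidates for a sample of rows can be found by trying attribute combinations from smallest to largest and skipping any combination that already contains a smaller key. This is a brute-force sketch for small tables. The rows are invented, and the result only reflects this sample, not the rules of the real data.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows share the same combination of values for attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def candidate_keys(rows, columns):
    """Return the minimal unique attribute combinations, smallest first."""
    keys = []
    for size in range(1, len(columns) + 1):
        for attrs in combinations(columns, size):
            if any(set(key) <= set(attrs) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_super_key(rows, attrs):
                keys.append(attrs)
    return keys


columns = ("ID", "Name", "Address", "Department_ID", "Salary")
employees = [
    {"ID": 1, "Name": "Asha", "Address": "12 Mill Lane", "Department_ID": 10, "Salary": 50000},
    {"ID": 2, "Name": "Asha", "Address": "4 Park Road", "Department_ID": 10, "Salary": 50000},
    {"ID": 3, "Name": "Ben", "Address": "12 Mill Lane", "Department_ID": 10, "Salary": 50000},
]

found = candidate_keys(employees, columns)
print(found)
```

For these rows the search finds the identifier column and the Name and Address pair, matching the two Candidate Keys discussed above.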

(III) Primary Key – A Candidate Key that the database designer chooses for unique
identification of each row in a table is known as the Primary Key. A Primary Key can consist of
one or more attributes of a table.
E.g. of Primary Key – The database designer can use any one of the Candidate Keys as the Primary
Key. In this case we have "ID" and "Name, Address" as Candidate Keys; we will choose
"ID" as the Primary Key because the other key is a combination of more than one
attribute.

(IV) Foreign Key – A foreign key is an attribute or combination of attributes in one base
table that points to a candidate key (generally the primary key) of another table. The
purpose of the foreign key is to ensure referential integrity of the data, i.e. only values that
exist in the referenced table are permitted.
E.g. of Foreign Key – Suppose we have another table, the Department table, with the
attributes "Department_ID", "Department_Name", "Manager_ID", and "Location_ID", with
Department_ID as the Primary Key. The Department_ID attribute of the Employee table
(the dependent or child table) can then be defined as a Foreign Key that references the
Department_ID attribute of the Department table (the referenced or parent table). A
Foreign Key value must match an existing value in the parent table or be NULL.
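
The foreign key rule can be tried out with sqlite3. This is a minimal sketch with simplified versions of the Employee and Department tables. The lower-case table and column names and the rows are assumptions for the example. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

conn.execute(
    "CREATE TABLE department (department_id INTEGER PRIMARY KEY, department_name TEXT)"
)
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, "
    "name TEXT, "
    "department_id INTEGER REFERENCES department(department_id))"
)

conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 10)")    # matches a parent row
conn.execute("INSERT INTO employee VALUES (2, 'Ben', NULL)")   # NULL is permitted

try:
    conn.execute("INSERT INTO employee VALUES (3, 'Chen', 99)")  # no department 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The same key is what lets the two tables be joined.
joined = conn.execute(
    "SELECT e.name, d.department_name "
    "FROM employee e JOIN department d ON d.department_id = e.department_id"
).fetchall()
print(rejected, joined)
```

The insert that refers to a non-existent department is refused, while the row with a NULL department is accepted but does not appear in the join.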

(V) Composite Key – If we use multiple attributes to create a Primary Key, then that
Primary Key is called a Composite Key (also called a Compound Key or Concatenated Key).
E.g. of Composite Key – If we had used "Name, Address" as the Primary Key, it would be
our Composite Key.
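
A Composite Key can be declared in SQL with a table-level PRIMARY KEY clause. The sketch below assumes, purely for illustration, that the designer chose Name and Address as the Primary Key of a simplified employee table. The rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "name TEXT NOT NULL, "
    "address TEXT NOT NULL, "
    "salary REAL, "
    "PRIMARY KEY (name, address))"  # uniqueness applies to the pair, not each column
)

conn.execute("INSERT INTO employee VALUES ('Asha', '12 Mill Lane', 52000)")
# Same name with a different address is a different key value, so it is accepted.
conn.execute("INSERT INTO employee VALUES ('Asha', '4 Park Road', 48000)")

try:
    # Same name and same address repeats an existing key value.
    conn.execute("INSERT INTO employee VALUES ('Asha', '12 Mill Lane', 60000)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(duplicate_rejected, row_count)
```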

(VI) Alternate Key – An Alternate Key is any Candidate Key other than the
Primary Key.
E.g. of Alternate Key – "Name, Address", as it is the only other Candidate Key that is not
the Primary Key.

(VII) Secondary Key – Attributes that are not even a Super Key but can still be
used to look up records (not uniquely) are known as a Secondary Key.
E.g. of Secondary Key – Name, Address, Salary, Department_ID, etc., as they can
identify records but their values might not be unique.
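
One common way to support lookups on a Secondary Key is a non-unique index on the attribute. The sketch below shows this with sqlite3. The simplified employee table, the index name idx_employee_department, and the rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 10), (2, "Ben", 20), (3, "Chen", 10)],
)

# A plain (non-unique) index: duplicates in department_id are allowed.
conn.execute("CREATE INDEX idx_employee_department ON employee (department_id)")

# A lookup on the secondary key can return several rows.
in_department_10 = conn.execute(
    "SELECT name FROM employee WHERE department_id = ? ORDER BY id", (10,)
).fetchall()
print(in_department_10)
```

Unlike a lookup on the Primary Key, this query returns every employee in the department, which is exactly the non-unique identification described above.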

Entity-Relationship Diagram
Definition: An entity-relationship (ER) diagram is a specialized graphic that illustrates the
interrelationships between entities in a database. ER diagrams often use symbols to represent three
different types of information. Boxes are commonly used to represent entities. Diamonds are normally
used to represent relationships and ovals are used to represent attributes.

Developing an ERD
Developing an ERD requires an understanding of the system and its
components. Before discussing the procedure, let's look at a narrative
created by Professor Harman.

Consider a hospital:
Patients are treated in a single ward by the doctors assigned to them.
Usually each patient will be assigned a single doctor, but in rare cases they will
have two.
Healthcare assistants also attend to the patients; a number of these are
associated with each ward.
Initially the system will be concerned solely with drug treatment. Each patient is
required to take a variety of drugs a certain number of times per day and for
varying lengths of time.
The system must record details concerning patient treatment and staff
payment. Some staff are paid part time and doctors and care assistants work
varying amounts of overtime at varying rates (subject to grade).
The system will also need to track what treatments are required for which
patients and when and it should be capable of calculating the cost of treatment
per week for each patient (though it is currently unclear to what use this
information will be put).

How do we start an ERD?

1. Define Entities: these are usually nouns used in descriptions of the
system, in the discussion of business rules, or in documentation; identified
in the narrative (see highlighted items above).

2. Define Relationships: these are usually verbs used in descriptions of the
system or in discussion of the business rules (entity ______ entity);
identified in the narrative (see highlighted items above).
3. Add attributes to the relations; these are determined by the queries, and may
also suggest new entities, e.g. grade; or they may suggest the need for keys or
identifiers.
What questions can we ask?
a. Which doctors work in which wards?
b. How much will be spent in a ward in a given week?
c. How much will a patient cost to treat?
d. How much does a doctor cost per week?
e. Which assistants can a patient expect to see?
f. Which drugs are being used?
4. Add cardinality to the relations
A many-to-many relationship must be resolved into two one-to-many relationships with an additional entity
This usually happens automatically
Sometimes it involves introducing a link entity (which consists entirely of foreign keys)
Example: Patient-Drug
5. This flexibility allows us to consider a variety of questions such as:
a. Which beds are free?
b. Which assistants work for Dr. X?
c. What is the least expensive prescription?
d. How many doctors are there in the hospital?
e. Which patients are family related?

6. Represent that information with symbols. Generally E-R Diagrams
require the use of the following symbols:
Reading an ERD

Reading an ERD takes some practice, but ERDs can be used with clients to
discuss business rules.

These allow us to represent the information from above such as the E-R
Diagram below:
ERD brings out issues:
Many-to-Manys
Ambiguities
Entities and their relationships
What data needs to be stored
The Degree of a relationship
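
The Patient-Drug many-to-many relationship from step 4 can be sketched as three tables, with a link entity between patient and drug. Everything here is an assumption made for illustration: the table names, the columns, the rows, and the simplification that every prescription runs for the whole week. The link table also carries a doses_per_day attribute, which goes slightly beyond a link entity made only of foreign keys, so that the weekly treatment cost mentioned in the narrative can be computed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE drug (drug_id INTEGER PRIMARY KEY, name TEXT, cost_per_dose REAL);

    -- Link entity: one row per (patient, drug) pair, so each side is one-to-many.
    CREATE TABLE prescription (
        patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
        drug_id INTEGER NOT NULL REFERENCES drug(drug_id),
        doses_per_day INTEGER NOT NULL,
        PRIMARY KEY (patient_id, drug_id)
    );

    INSERT INTO patient VALUES (1, 'Patient A'), (2, 'Patient B');
    INSERT INTO drug VALUES (1, 'Drug X', 2.0), (2, 'Drug Y', 0.5);
    INSERT INTO prescription VALUES (1, 1, 3), (1, 2, 2), (2, 1, 1);
    """
)

# Weekly cost per patient = sum over drugs of doses_per_day * cost_per_dose * 7.
weekly_cost = conn.execute(
    """
    SELECT p.name, SUM(pr.doses_per_day * d.cost_per_dose * 7) AS weekly_cost
    FROM prescription pr
    JOIN patient p ON p.patient_id = pr.patient_id
    JOIN drug d ON d.drug_id = pr.drug_id
    GROUP BY p.patient_id, p.name
    ORDER BY p.patient_id
    """
).fetchall()
print(weekly_cost)
```

Each patient can have many prescriptions and each drug can appear in many prescriptions, but every prescription row belongs to exactly one patient and one drug, which is how the many-to-many relationship is resolved.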