File-based systems were the first method used to store data in computers. The data was stored
on disk and retrieved sequentially.
File-based systems were an early attempt to computerise the manual filing system (organising
paper files). For example, a manual file can be set up to hold all the details
relating to a particular matter such as a project, product, task, client or employee. An organisation
may hold many such files, each labelled and stored.
The manual filing system works well when the number of items to be stored is small. It even
works quite well when the number of items is large, as long as they only need to be
stored and retrieved. However, a manual filing system breaks down when files are not
cross-referenced properly or when the information in the files has to be processed.
As shown in Figure 1.0, in a file-based system, different programs in the same application may
interact with different private data files.
Thus the drawbacks of the file system are:
a. Data Redundancy and Inconsistency
Since data resides in different private data files, the file system leads to uncontrolled duplication of
data. This duplication wastes storage space, and entering the same data more than once costs time
and money. For example, the address information of a student may have to be
duplicated in the transport details data file (Figure 1.0). The data in a file system can also become
inconsistent when copies are not updated together or when several people modify the data
concurrently. For example, a student may change residence and the change may be recorded only in
his/her own file and not in the bus list. Entering wrong data is another cause of inconsistency.
b. Unanticipated Queries
Handling sudden/ad-hoc queries can be difficult in a file-based system, because it requires changes
to the existing programs. For example, finding the number of faculty members (Faculty Details data
file in Figure 1.0) who used the transport facility (Transport Details data file in Figure 1.0) this year
may be difficult because the details are stored in separate files.
c. Data Isolation
Though the data used by different programs in the application may be related, it resides in isolated
data files. For example, there is a relationship between the course details (Course Details data file)
and the student details (Student Personal Details data file), but they stand isolated as shown in Figure
1.0.
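To make the isolation problem concrete, the following minimal Python sketch treats two private data files as separate pieces of CSV text. The file contents, the column names and the helper faculty_using_transport are assumptions made for illustration and are not taken from Figure 1.0. It shows that relating the two files requires custom code written just for that one question.

```python
import csv
import io

# Hypothetical contents of two isolated data files (assumed layout).
FACULTY_FILE = """faculty_id,name
F1,Asha
F2,Ravi
F3,Meena
"""

TRANSPORT_FILE = """member_id,route
F1,R10
S7,R10
F3,R12
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per record."""
    return list(csv.DictReader(io.StringIO(text)))


def faculty_using_transport(faculty_text, transport_text):
    """Relate the two files by hand, because nothing links them for us."""
    riders = {row["member_id"] for row in read_rows(transport_text)}
    return [row["name"] for row in read_rows(faculty_text)
            if row["faculty_id"] in riders]


names = faculty_using_transport(FACULTY_FILE, TRANSPORT_FILE)
print(len(names), names)
```

Every new question of this kind would need another hand-written function, which is the difficulty described for ad-hoc queries above.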
d. Concurrent Access Anomalies
In large multi-user systems the same file or record may need to be accessed by multiple users
simultaneously. Handling this in a file-based system is difficult. For example, the transport details
may be accessed by both faculty and students simultaneously (Figure 1.0).
e. Security Issues
In data-intensive applications, security of data is a major concern. Users should be given access
only to required data and not the whole database. The data in a file-based system can be made
secure only by additional programming in each application. For example, the student marks should
be accessed only by the faculty members and not by the fellow students (Figure 1.0).
f. Integrity Issues
In any application, there are some data integrity rules that need to be maintained. These could be in
the form of certain conditions/constraints on the elements of the data records. In a file-based system,
all these rules need to be explicitly programmed in the application program. For example, the salary
structure of each faculty member depends on their designation as well as on any other conditions,
all of which need to be explicitly programmed (Figure 1.0).
g. Recovery Issues
System failures or loss of connection to remote systems must be dealt with by the file system. In the
event of an operating system failure or a "soft" power failure, special routines in the file system must be
invoked, as they are for an individual program failure. The file system must also repair damaged
structures. Such damage may occur as a result of an operating system failure of which the OS cannot
notify the file system, a power failure or a reset. The file system must also record events to allow the
analysis of systemic issues as well as problems with specific files or directories. For example,
if there is a system failure during the admission process of students, failing to recover the data will
cause problems (Figure 1.0).
It may be noted that addressing the above issues, such as concurrent access, integrity and security
problems, is possible in a file-based system. However, although these are general concerns for any
data-intensive application (an application which processes a huge volume of data), each
application has to handle all these problems on its own. The application programmer has to worry
not only about implementing the application's business process but also about handling these general
issues.
Database Approach
In order to overcome the limitations of the file-based approach, the concept of the database and the
Database Management System (DBMS) was introduced in the 1960s.
The Database Management System (DBMS) is the software that manages a database, which is the
stored collection of data. The user interacts with the database through the DBMS. The purpose of a
DBMS is to provide an efficient way of storing and retrieving data quickly in both single-user and
multi-user systems.
Database Management largely involves:
Storage of Data
The following are the types of databases.
The related fields or records are grouped together so that there are higher-level records
and lower-level records.
The root record is the parent record at the top of the pyramid (i.e., the segment which has
no parent record).
A "leaf" is a segment with no children. In Figure 1.2 the leaf segment is at the bottom and it
does not have children.
A child record maps to only one parent record to which it is linked. In contrast, a parent
record may have one or more child records linked to it.
A record search is performed by starting at the top of the pyramid and working down
through the tree from parent to child until the correct child record is found.
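The top-down record search described above can be sketched in a few lines of Python. The Segment class, the find_path helper and the record names are hypothetical and serve only to illustrate the idea: each child has exactly one parent, and a search starts at the root and works down from parent to child.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """One record (segment) in a hierarchical database."""
    name: str
    children: List["Segment"] = field(default_factory=list)

    def add_child(self, child: "Segment") -> "Segment":
        # A child is attached to exactly one parent: this segment.
        self.children.append(child)
        return child


def find_path(root: Segment, target: str) -> Optional[List[str]]:
    """Search from the root downwards; return the parent-to-child path."""
    if root.name == target:
        return [root.name]
    for child in root.children:
        sub_path = find_path(child, target)
        if sub_path is not None:
            return [root.name] + sub_path
    return None


# Hypothetical tree; the names are illustrative only.
root = Segment("University")
department = root.add_child(Segment("Department"))
department.add_child(Segment("Course"))      # a leaf: no children
department.add_child(Segment("Lecturer"))    # a leaf: no children

path = find_path(root, "Lecturer")
print(path)
```

Because the only links are parent-to-child, there is no direct way to go from Course to Lecturer in this sketch, which matches the restriction on linkages between children listed under the disadvantages.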
Example of HDBMS
Advantages
This database type can be accessed and updated rapidly because of the tree-like
structure and because the relationships between records are defined in advance.
Disadvantages
Each child in a tree may have only one parent, and relationships or linkages between
the children are not permitted, even if they seem right from a logical standpoint.
Hierarchical databases are so rigid in their design that adding a new field or record
requires that the entire database be redefined.
In a network database, each child or member can have one or more parents (or owners).
Example of NDBMS
In Figure 1.4, the Store node is the parent or owner. The Customer, Manager and Salesman
nodes are the children or members of the Store node. They also act as parents or owners of the Order
node, so that store, customer, manager and salesman are interconnected. The Items node has a single
parent or owner, which is the Salesman node.
Advantages
Many connections can be made between the different types of data, so network
databases work more flexibly.
Disadvantages
There is a limit to the number of connections which can be made between records.
Relational databases connect the information (data) in different files by using common
data elements or a key field.
Information (data) is stored in different tables in a relational database, each having a key
field which is used to uniquely identify each row.
These key fields can be used to connect one table of information (data) to another.
In relational databases, a relation is a table or file filled with data, rows or records are
termed tuples, and columns are termed attributes or fields.
Please refer to Figure 1.6, which illustrates an RDBMS. It has three tables (i.e., Database1, Database2
and Database3). All three tables have a column called Social Security No. (SSN*), which is
identified as the key field that links the three tables. This key field is used to uniquely identify each
row. This represents the relationship between the tables.
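As a rough illustration of connecting tables through a shared key field, the sketch below uses Python's built-in sqlite3 module. The table names (person, address), the columns and the sample rows are assumptions and do not reproduce the contents of Figure 1.6.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person  (ssn TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE address (ssn TEXT PRIMARY KEY, city TEXT);
    INSERT INTO person  VALUES ('111', 'Scott'), ('222', 'Nancy');
    INSERT INTO address VALUES ('111', 'Chennai'), ('222', 'Pune');
""")

# The shared key field (ssn) connects a row in one table to a row in the other.
rows = conn.execute("""
    SELECT person.name, address.city
    FROM person JOIN address ON person.ssn = address.ssn
    ORDER BY person.name
""").fetchall()
print(rows)
```

Each tuple in the result combines attributes from both tables for the same key value.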
Advantage
The database server and application tools can be easily installed and upgraded.
Authorization and privilege control features in an RDBMS permit the DBA (Database
Administrator) to restrict the access to authorized users and grant privileges to individual users
derived from the types of database tasks they need to perform.
RDBMSs support a generic language called "Structured Query Language" (SQL). The SQL
syntax is simple and the language uses standard English language keywords and phrasing,
making it fairly intuitive and easy to learn.
Disadvantage
Searching for data can take more time compared to other methods.
This table works as a mapping between the student and course details, and it is used to produce
reports such as the number of students taking a particular course.
Relational Database Management System (RDBMS)
Example of RDBMS
The following are the functions performed by a typical DBMS:
a. Data Definition
The DBMS provides the functions to describe the structure of the data in the application. These
include defining and modifying the record structure, the type and size of fields and the various
constraints/conditions to be satisfied by the data in each field.
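A data definition step might look like the sketch below, which uses Python's built-in sqlite3 module. The student table, its columns and the age rule are hypothetical. The example defines a record structure with field types and constraints, then modifies the structure by adding a field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define the record structure: field names, types and constraints.
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        age        INTEGER CHECK (age >= 16)
    )
""")

# Modify the structure later without rewriting any application program.
conn.execute("ALTER TABLE student ADD COLUMN city TEXT")

# cursor.description lists the columns of a result set; index 0 is the name.
columns = [col[0] for col in conn.execute("SELECT * FROM student").description]
print(columns)
```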
b. Data Manipulation
The DBMS must be able to handle requests from users to retrieve, update and delete
existing data in the database, and to add new data to it. The DBMS performs these
operations on the database.
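The four kinds of request named above (add, retrieve, update, delete) can be sketched with Python's built-in sqlite3 module. The table and the sample rows are assumptions used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")

# Add new data.
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(101, "Scott"), (102, "Nancy"), (103, "Lindsey")],
)

# Update existing data.
conn.execute("UPDATE student SET name = ? WHERE student_id = ?", ("Nancy M.", 102))

# Delete existing data.
conn.execute("DELETE FROM student WHERE student_id = ?", (103,))

# Retrieve what is left.
remaining = conn.execute(
    "SELECT student_id, name FROM student ORDER BY student_id"
).fetchall()
print(remaining)
```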
c. Data Dictionary Management
The data dictionary stores the definitions of data elements (information elements) and their
relationships. This information is called metadata. The metadata contains the definition of the data,
data types, integrity constraints, relationships between data elements, etc. Any change made to the
database structure is automatically reflected in the data dictionary. The DBMS thereby provides data
abstraction and removes structural and data dependency from the system.
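As one concrete example of a data dictionary, SQLite keeps table definitions in a catalogue table called sqlite_master. The sketch below, with a hypothetical lecturer table and a hypothetical table_definition helper, reads that catalogue before and after a structural change to show that the stored definition follows the change.

```python
import sqlite3


def table_definition(conn, table):
    """Return the definition SQLite has stored for a table, or None."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lecturer (lecturer_id INTEGER PRIMARY KEY, name TEXT)")
before = table_definition(conn, "lecturer")

# Change the structure; the catalogue entry is updated by the DBMS itself.
conn.execute("ALTER TABLE lecturer ADD COLUMN designation TEXT")
after = table_definition(conn, "lecturer")

print(before)
print(after)
```

Other database systems expose their metadata differently, so the catalogue name used here applies to SQLite only.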
d. Data Security & Integrity
Data integrity is an important component of information security. It refers to the consistency and
accuracy of the data stored in a database. Data integrity ensures that the data entered into the
database is accurate, valid, and consistent. The DBMS includes functions which manage the
integrity and security of data in the application. These can be easily invoked by the application, and
hence the application programmer need not code these functions in his/her programs.
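The sketch below illustrates, with Python's built-in sqlite3 module, how integrity rules declared once in the database are enforced for every insert without extra application code. The faculty table, the salary rule and the try_insert helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE faculty (
        faculty_id  INTEGER PRIMARY KEY,
        designation TEXT NOT NULL,
        salary      INTEGER NOT NULL CHECK (salary > 0)
    )
""")
conn.execute("INSERT INTO faculty VALUES (1, 'Professor', 90000)")


def try_insert(row):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO faculty VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    try_insert((2, "Lecturer", -5)),      # violates the CHECK rule
    try_insert((1, "Lecturer", 40000)),   # duplicate primary key
    try_insert((3, "Lecturer", 40000)),   # valid row
]
for line in results:
    print(line)
```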
e. Data Concurrency & Consistency
The DBMS makes sure that multiple users can access the database concurrently without
compromising the integrity of the database. In a single-user database, the user can modify data in
the database without being concerned that other users might modify the same data at the same
time. However, in a multi-user database, several simultaneous transactions can update the same data
at the same time. Transactions executing at the same time should produce meaningful and consistent
results. Hence, the control of data concurrency and data consistency is vital in a multi-user database.
Data concurrency means that many users can access data at the same time.
Data consistency means that each user sees a consistent view of data, including visible changes
made by the user's own transactions and transactions of other users.
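A small experiment can show what a consistent view means in practice. The sketch below assumes SQLite in its default journal mode, accessed through Python's sqlite3 module with two connections standing in for two users; the bus table and the file name are hypothetical. The reader does not see the writer's change until the writer's transaction is committed.

```python
import os
import sqlite3
import tempfile

# Two connections to the same database file act as two users.
db_path = os.path.join(tempfile.mkdtemp(), "transport.db")
writer = sqlite3.connect(db_path)
reader = sqlite3.connect(db_path)

writer.execute("CREATE TABLE bus (route TEXT PRIMARY KEY, free_seats INTEGER)")
writer.execute("INSERT INTO bus VALUES ('R10', 10)")
writer.commit()


def free_seats(conn):
    """Read the committed number of free seats as seen by this connection."""
    rows = conn.execute("SELECT free_seats FROM bus WHERE route = 'R10'").fetchall()
    return rows[0][0]


# The writer changes the data inside a transaction that is still open.
writer.execute("UPDATE bus SET free_seats = free_seats - 1 WHERE route = 'R10'")
seen_before_commit = free_seats(reader)   # the reader still sees the old value

writer.commit()
seen_after_commit = free_seats(reader)    # now the change is visible

print(seen_before_commit, seen_after_commit)
writer.close()
reader.close()
```

Other database systems offer different isolation levels, so the exact visibility rules vary from product to product.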
f. Data Backup & Recovery
The DBMS provides backup and data recovery procedures to ensure data safety and integrity.
It provides special utilities which allow the DBA to perform routine and special backup
and restore procedures. Recovery management handles the recovery of the database after a failure.
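Both ideas, taking a backup and recovering from a failed operation, can be sketched with Python's sqlite3 module, whose Connection.backup method copies one database into another. The admission table and the simulated failure are assumptions for illustration.

```python
import sqlite3

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE admission (student_id INTEGER PRIMARY KEY, name TEXT)")
live.execute("INSERT INTO admission VALUES (101, 'Scott')")
live.commit()

# Routine backup: copy the whole database into a second connection.
backup_copy = sqlite3.connect(":memory:")
live.backup(backup_copy)

# Simulated failure in the middle of a transaction.
try:
    live.execute("INSERT INTO admission VALUES (102, 'Nancy')")
    raise RuntimeError("simulated crash during admission")
except RuntimeError:
    live.rollback()   # recovery: undo the incomplete transaction

live_rows = live.execute("SELECT * FROM admission").fetchall()
backup_rows = backup_copy.execute("SELECT * FROM admission").fetchall()
print(live_rows, backup_rows)
```

After the rollback the live database is back in its last committed state, and the backup copy holds the same committed data.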
g. Performance
Optimizing the performance of queries is one of the important functions of a DBMS. The DBMS
has a set of programs that form the Query Optimizer, which evaluates the different implementations
of a query and selects the best among them.
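One way to watch a query optimizer choose between implementations is SQLite's EXPLAIN QUERY PLAN statement, used here through Python's sqlite3 module. The student table, the index name idx_student_city and the plan helper are hypothetical, and the wording of the plan text differs between SQLite versions, so the sketch only checks whether the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO student (name, city) VALUES (?, ?)",
    [(f"name{i}", f"city{i % 10}") for i in range(1000)],
)


def plan(sql):
    """Return the optimizer's chosen plan as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)   # column 3 holds the description


query = "SELECT * FROM student WHERE city = 'city3'"
plan_without_index = plan(query)

# Give the optimizer a second implementation to choose from.
conn.execute("CREATE INDEX idx_student_city ON student (city)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

The query text is identical in both runs; only the plan chosen by the DBMS changes.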
Thus the DBMS provides a convenient and effective environment to use when a large volume of
data and many transactions are to be processed.
entities and relationships.
Let us take a University database as an example and try to understand how an ER model is arrived at.
Example:
A university consists of a number of departments. Each department offers several courses. Each
course includes a number of modules. Students enroll in a particular course and study modules
towards the completion of that course. Each module is taught by a lecturer from the appropriate
department, and each lecturer teaches a group of students.
Entities
Entities are real world items or concepts that exist on their own and are represented as objects or
things of interest. An entity type is a collection of entities that share a common definition.
Identify all the nouns in our university example:
A university consists of a number of departments. Each department offers several courses. Each
course includes a number of modules. Students enroll in a particular course and study modules
towards the completion of that course. Each module is taught by a lecturer from the appropriate
department, and each lecturer teaches a group of students.
This scenario consists of students, lecturers, modules, courses and departments. Here physical
things (things which exist in the world and which we can touch and feel), like students and
lecturers, and abstract things (ideas or concepts in the mind, which cannot be physically touched,
smelled, heard, tasted or seen), like modules and departments, each make an entity type. If we take
students as an entity type, then each student in the university is an entity. The entities are
represented as nouns in the description because they are objects or things.
An entity type is simply an idea, whereas its entities are the individual instances. Student is an
entity type for physical things, while Scott, Nancy, Lindsey and Mackenzie (individual students)
are entities. Department is an entity type for abstract things, while IT, CSE, ECE and CIVIL are
entities.
Entity Diagrams
The box is labeled with the name of the entity type. The entities identified in our
example are shown in Figure 2.1.
Weak Entity
If an entity depends on another existing entity then it is considered weak. A weak entity cannot be
identified by its own attributes alone. A weak entity is represented by a double rectangle in an E-R
diagram.
Example:
SubModule is a good example of a weak entity. A SubModule is meaningless without a Module
entity, and so it depends on the existence of Module as shown in Figure 2.2.
that describe each entity.
In our University database each student in the university will have a Student ID, Name, Course taken,
etc. Similarly each lecturer will have his/her own properties such as ID, Name, Department, etc.
An attribute has a name and an associated entity, and it describes a property of that entity.
Attributes are also often nouns.
Attributes in ER diagram
The figure below represents the entities and their corresponding attributes in the University
database.
Multivalued Attribute
A multivalued attribute is an attribute that has more than one value attached to it. For instance, if
phone number and graduating degree are attributes of an entity called Person, then those
attributes could have multiple values, as a person could have multiple phone numbers or hold
multiple degrees. We represent a multivalued attribute by a double oval in an E-R diagram.
Single-Valued Attribute: an attribute that holds a single value. In our example the attributes of
Student such as Roll number, Age, Date of Birth, City, etc., can have only a single value.
In our example, a Student can have multiple phone numbers, and so Phone number is a multivalued
attribute.
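The difference between the two kinds of attribute can be mirrored in a small Python data class. The Student class, its field names and the sample values are assumptions for illustration: single-valued attributes hold one value each, while the multivalued attribute holds a list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Student:
    # Single-valued attributes: exactly one value per student.
    roll_number: int
    name: str
    city: str
    # Multivalued attribute: a student may have several phone numbers.
    phone_numbers: List[str] = field(default_factory=list)


student = Student(roll_number=1, name="Scott", city="Chennai")
student.phone_numbers.append("555-0100")
student.phone_numbers.append("555-0101")

print(student.city, student.phone_numbers)
```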
between entities.
in an ER diagram
The name of the relationship is given in a diamond box (for example, Belongs to, as
shown in Figure 5.1).
Cardinality Ratio
Each entity can be involved in three types of relationships as shown:
Each student belongs to one University. We can illustrate this ratio by writing ones on
the lines indicating the relationship as shown in Figure 2.5.
A lecturer teaches many students, and this One to Many relationship is illustrated in
Figure 2.7.
Each student takes many modules, and each module is taken by many students, as
shown in Figure 2.9.
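The cardinality ratios above can be explored with a short Python sketch that classifies a relationship from sample pairs. The cardinality function and the sample data are hypothetical. Note that a cardinality ratio is a design decision; sample data can only suggest it, so this is an illustration rather than a way to derive a model.

```python
from collections import defaultdict


def cardinality(pairs):
    """Classify a relationship given (left, right) pairs of related entities."""
    left_to_right = defaultdict(set)
    right_to_left = defaultdict(set)
    for left, right in pairs:
        left_to_right[left].add(right)
        right_to_left[right].add(left)
    # Does some left entity relate to several right entities, and vice versa?
    left_has_many = any(len(v) > 1 for v in left_to_right.values())
    right_has_many = any(len(v) > 1 for v in right_to_left.values())
    if left_has_many and right_has_many:
        return "many-to-many"
    if left_has_many:
        return "one-to-many"
    if right_has_many:
        return "many-to-one"
    return "one-to-one"


# Hypothetical sample data.
teaches = [("Lecturer1", "Scott"), ("Lecturer1", "Nancy")]
takes = [("Scott", "Databases"), ("Scott", "Networks"), ("Nancy", "Databases")]

print(cardinality(teaches))
print(cardinality(takes))
```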
E/R Models
Till now we have seen how to identify the basic elements in an ER Diagram. Finally, to make an E/R
model you need to identify:
Entities
Attributes
Relationships
Cardinality ratios
Now let's see what an ER model looks like when all these elements are put together. The final ER
model of our University database is shown in Figure 2.10. In this figure we have shown the
entities and the relationships between them, which depict the complete ER model of a
University. Here Department, Course, Module, Lecturer and Student are the entities.
The relationships in Figure 2.10 are defined as follows. A Department offers many Courses
(One (1) to Many (n)). A Department assigns many Lecturers (One (1) to Many (n)).
Each Lecturer teaches many Students (One (1) to Many (n)). Every Student takes several
Modules (Many (n) to Many (n)). Every Module is included in many Courses (Many (n) to Many (n)).
A Course has many Students enrolled in it (One (1) to Many (n)).
The ER Model for the above example is given below:
The complete ER Model for our University database will be as shown in the diagram below. It is an
Integrated ER model containing the Entities and Relationships for a University database.
Entities represent real-world things. They can be conceptual, such as a transaction, or physical,
such as a bank.
the advantages of normalization.
Advantages
Helps to avoid update anomalies. That is, it isolates data so that additions, deletions, and
modifications of a field can be made in just one table. The changes are then propagated to the
rest of the database through the defined relationships.
Edgar Codd invented the relational model and proposed the theory of normalization with the
introduction of First Normal Form. He continued to extend the theory with Second and Third Normal
Forms. Later Edgar Codd joined with Raymond F. Boyce to develop the theory of Boyce-Codd
Normal Form (BCNF).
The theory of normalization is still developing. For example, discussions on Sixth Normal Form are in
progress. However, in most practical applications normalization achieves its best in Third Normal
Form. The evolution of normalization theories is illustrated below:
Let us understand a few things before we proceed:
What is a KEY?
A KEY is a value used to uniquely identify a row in a table. It could be a single column or a
combination of multiple columns.
Note: The columns in a table that are NOT used to uniquely identify a record or row in a table are
called non-key columns.
What is a Primary Key?
A primary key is a single column value that is used to uniquely identify a database record.
The primary key column in a table cannot have duplicate values. Each primary key value
must be unique.
The primary key column should have a value when a new record is inserted into the table.
Example:
The table below contains the details of students. Here StudentId is the primary key, which is used to
uniquely identify the details of a student in the table.
Composite Key
If two or more columns are used to uniquely identify a record, then the combination of those
columns constitutes a composite key.
In the Student table given below, we have StudentId, TestId and Mark. Here one student can take
multiple tests and one test can be taken by multiple students. In this case in order to uniquely
identify the mark of a student in a test we require both StudentId and TestId. This is a composite key.
Student Table
Table 2.1
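The composite key described above can be tried out with Python's built-in sqlite3 module. The column names StudentId, TestId and Mark follow the text; the table name student_mark and the sample marks are assumptions, since the contents of Table 2.1 are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student_mark (
        StudentId INTEGER NOT NULL,
        TestId    TEXT    NOT NULL,
        Mark      INTEGER,
        PRIMARY KEY (StudentId, TestId)   -- composite key
    )
""")
conn.executemany(
    "INSERT INTO student_mark VALUES (?, ?, ?)",
    [(101, "T1", 78), (101, "T2", 85), (102, "T1", 91)],
)

# The same student may appear twice and the same test may appear twice,
# but the same (student, test) combination may not.
try:
    conn.execute("INSERT INTO student_mark VALUES (101, 'T1', 60)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Both parts of the key are needed to find one specific mark.
mark = conn.execute(
    "SELECT Mark FROM student_mark WHERE StudentId = ? AND TestId = ?",
    (101, "T2"),
).fetchone()[0]
print(duplicate_rejected, mark)
```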
Functional Dependency
In simple terms, functional dependency can be explained as follows. If knowing the value of one
attribute lets you determine the value of another attribute, then the second attribute is said to be
functionally dependent on the first. In the Student table given below, we can get the attribute 'Name'
if we know the attribute 'StudentId', so Name is functionally dependent on StudentId. Here StudentId
is the determinant and Name is the dependent.
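A functional dependency can be checked against sample rows with a few lines of Python. The holds function and the rows below are hypothetical. A check like this can only show that a dependency is violated in the data at hand; it cannot prove that the dependency holds in general.

```python
def holds(rows, determinant, dependent):
    """Return True if each determinant value maps to a single dependent value."""
    seen = {}
    for row in rows:
        key = row[determinant]
        value = row[dependent]
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows.
rows = [
    {"StudentId": 101, "Name": "Scott", "Dept_No": "D001"},
    {"StudentId": 102, "Name": "Nancy", "Dept_No": "D001"},
    {"StudentId": 101, "Name": "Scott", "Dept_No": "D001"},
]

studentid_determines_name = holds(rows, "StudentId", "Name")
deptno_determines_name = holds(rows, "Dept_No", "Name")
print(studentid_determines_name, deptno_determines_name)
```

In these rows StudentId determines Name, while Dept_No does not, because two different students share the same department.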
For example, let's consider the Student table given below. Table 2.2 stores student details (StudentId,
Name, Languages Known), the student's department details (Dept_No, Dept_Name) and lecturer details
(LecturerInCharge, Designation) for students.
In this approach, we keep repeating the languages known and the department details for all the
students within the same table. This is called an unnormalized table. Instead of storing the same data
again and again, we could normalize the data and create related tables.
Let's see how we can normalize the table, create related tables and learn the normal forms using the
Student table (which is not normalized):
Table 2.2
First Normal Form
To move from unnormalized form to first normal form (1NF), all multi-valued attributes (called
repeating groups) must be eliminated, so that all attributes are atomic.
Table 2.2 is not in 1NF since there are repeating groups (more than one value in a field). The column
"Languages Known" has (English, Hindi and Tamil) in row (tuple) 1 and (English and Hindi) in
row (tuple) 2. To satisfy 1NF we can create a separate row for each value in Languages Known by
duplicating the values in the remaining columns. Table 2.3 represents the same.
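The step from the unnormalized table to 1NF can be sketched in Python. The language values follow the text (three languages in the first row, two in the second); the student ids, the names and the to_first_normal_form helper are assumptions, since the contents of Table 2.2 are not reproduced here.

```python
# Unnormalized rows: one field holds several values (a repeating group).
unnormalized = [
    {"StudentId": 101, "Name": "Scott",
     "LanguagesKnown": ["English", "Hindi", "Tamil"]},
    {"StudentId": 102, "Name": "Nancy",
     "LanguagesKnown": ["English", "Hindi"]},
]


def to_first_normal_form(rows, repeating_column):
    """Create one row per value, duplicating the remaining columns."""
    flat = []
    for row in rows:
        for value in row[repeating_column]:
            new_row = dict(row)                # copy the other columns
            new_row[repeating_column] = value  # keep a single atomic value
            flat.append(new_row)
    return flat


first_nf = to_first_normal_form(unnormalized, "LanguagesKnown")
for row in first_nf:
    print(row)
```

Every field now holds a single atomic value, at the cost of repeating StudentId and Name on several rows.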
1NF Rules
Second Normal Form
Partial functional dependencies must be removed. If two attributes of a table are combined to form a
composite key, then the non-key attributes of that table must depend on both attributes of the
composite key. They must not depend on only one attribute that is part of the composite
key.
2NF Rules
A relation in 1NF will be in second normal form (2NF) if there are no partial dependencies.
Partial dependency
It is the functional dependency on part of the primary key instead of the entire primary key.
It is clear that we cannot bring our simple database into second normal form unless
we partition the columns in Table 2.3. Here, assume that StudentId and Dept_No together act as the
key (composite key). As per 2NF, all non-key attributes must be dependent on the whole key.
In Table 2.3 the attribute 'Dept_Name' is functionally dependent on the whole key (StudentId + Dept_No).
That is, you can get the department name only if you know both StudentId and Dept_No. All other
column attributes can be identified by providing just 'StudentId', so for all other columns StudentId
acts as the primary key. We therefore split the table as given below to satisfy 2NF.
Student
Table 2.4
Department
Table 2.5
Languages
Table 2.6
Introducing Foreign Key
A foreign key is a field in a table that matches the primary key column of another table.
Cross-referencing between tables can be achieved by a foreign key.
Table 2.7
The foreign key ensures that a row in a table is mapped to a corresponding row in another table.
A foreign key does not have to be unique; most often it is not unique.
Why do you need a foreign key?
A foreign key is required in an RDBMS for the concept of referential integrity.
Referential integrity
It is a concept used in databases to ensure that there is consistency in table relationships. If one
table has a foreign key to another table, then referential integrity states that you cannot add a
record to the table that contains the foreign key unless there is a corresponding record in the
linked table.
For example, consider Figure 2.16 given on the previous page, where Dept_No in the Student table is
a foreign key referencing Dept_No in the Department table. Let's try to add a student with StudentId
"103" and Dept_No "D003" to the Student table as shown below. The entry for Dept_No "D003" is not
present in the Department table, which means we would have added a student to a department which
does not exist. This leads to inconsistency of data across related tables. Hence an RDBMS has the
concept of referential integrity, which does not allow adding a record to the table that contains
the foreign key unless there is a corresponding record in the table to which it is linked.
Student
Table 2.8
Department
Table 2.9
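The rejected insert described above can be tried out with a small script. This is only a sketch: it
assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for the RDBMS discussed
here, and the student names, department names and the D001/D002 rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE Department (Dept_No TEXT PRIMARY KEY, Dept_Name TEXT)")
conn.execute(
    "CREATE TABLE Student ("
    " StudentId INTEGER PRIMARY KEY,"
    " Name TEXT,"
    " Dept_No TEXT REFERENCES Department(Dept_No))"
)
# Hypothetical sample rows: only departments D001 and D002 exist.
conn.executemany(
    "INSERT INTO Department VALUES (?, ?)",
    [("D001", "Physics"), ("D002", "Chemistry")],
)
conn.execute("INSERT INTO Student VALUES (101, 'Asha', 'D001')")

try:
    # D003 is not present in Department, so this row has no parent.
    conn.execute("INSERT INTO Student VALUES (103, 'Ravi', 'D003')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
print(rejected, student_count)
```

The insert for StudentId 103 fails and the Student table keeps only the row that points at an
existing department.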
Transitive functional dependencies
When changing a non-key column might cause any of the other non-key columns to change, it is
called a transitive functional dependency. Attributes that are not a part of the key must not depend
on any non-key attribute.
Consider Table 2.9. Changing the non-key column 'Lecturer In Charge' may change 'Designation'.
Here Dept_No acts as the key. All other columns are non-key attributes. As per 3NF, non-key
attributes should not be dependent on any other non-key attributes, but 'Designation' is
dependent on 'Lecturer In Charge'. Both Lecturer In Charge and Designation are non-key attributes,
so this forms a transitive dependency. To satisfy 3NF, we will split the table shortly.
Third Normal Form
Third normal form (3NF) is the third step in database normalization and it builds on the first (1NF)
and second (2NF) normal forms.
The Third Normal Form (3NF) states that all column references in the referenced data that are not
dependent on the primary key should be removed. Another way of putting this is that only foreign
key columns should be used to reference another table, and the other columns from the parent table
should not exist in the referencing table.
The Second Normal Form (2NF) covers the case of multi-column primary keys. 3NF is meant to cover
single-column keys, as mentioned in transitive functional dependencies above.
3NF Rules
Rule 1 - The table is in second normal form (2NF).
Rule 2 - The table has no transitive functional dependencies, as explained above.
We need to divide our table to move it from second normal form (2NF) into third normal form (3NF).
In Table 2.9, Dept_No acts as the key. All other columns are non-key attributes. As per third normal
form, the non-key attributes should not be dependent on any other non-key attributes.
'Designation' is dependent on 'Lecturer In Charge', and both are non-key attributes, as explained
above. This forms a transitive dependency. So, to satisfy 3NF, split the table as follows.
Student
Table 2.10
Department
Table 2.11
Lecturer
Table 2.12
Languages
Table 2.13
The example given above cannot be decomposed further to attain higher forms of normalization
because it is already normalized to the highest level. Normally, only complex databases need the
next levels of normalization.
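The four tables produced by the 3NF split can be sketched as follows. This is an illustration under
assumptions, not the exact tables of this document: it uses SQLite through Python's sqlite3 module,
the column name Language and all sample rows are hypothetical, and only the column names StudentId,
Name, Dept_No, Dept_Name, LecturerInCharge and Designation come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Lecturer (LecturerInCharge TEXT PRIMARY KEY, Designation TEXT);
CREATE TABLE Department (
    Dept_No TEXT PRIMARY KEY,
    Dept_Name TEXT,
    LecturerInCharge TEXT REFERENCES Lecturer(LecturerInCharge)
);
CREATE TABLE Student (
    StudentId INTEGER PRIMARY KEY,
    Name TEXT,
    Dept_No TEXT REFERENCES Department(Dept_No)
);
CREATE TABLE Languages (
    StudentId INTEGER REFERENCES Student(StudentId),
    Language TEXT,
    PRIMARY KEY (StudentId, Language)
);
-- Hypothetical sample data.
INSERT INTO Lecturer VALUES ('Dr. Rao', 'Professor');
INSERT INTO Department VALUES ('D001', 'Physics', 'Dr. Rao');
INSERT INTO Student VALUES (101, 'Asha', 'D001'), (102, 'Ravi', 'D001');
INSERT INTO Languages VALUES (101, 'English'), (101, 'Hindi'), (102, 'English');
""")

# The designation is stored once, however many students the department has.
designation_rows = conn.execute("SELECT COUNT(*) FROM Lecturer").fetchone()[0]

# Matching the key columns rebuilds the wide, repeated rows only when needed.
wide_rows = conn.execute("""
    SELECT s.StudentId, s.Name, l.Language, d.Dept_Name, le.Designation
    FROM Student s, Languages l, Department d, Lecturer le
    WHERE l.StudentId = s.StudentId
      AND d.Dept_No = s.Dept_No
      AND le.LecturerInCharge = d.LecturerInCharge
    ORDER BY s.StudentId, l.Language
""").fetchall()
print(designation_rows, wide_rows)
```

Each fact (a department name, a lecturer's designation) now lives in exactly one row, so changing
it means updating one row instead of every student record.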
2.3. Joins
What are Joins?
A join is a technique where records from two or more tables are retrieved through a single SQL
query and shown as a single output. As the result forms a set, it can be saved as a table or used as
it is. A join is a means of combining columns from two tables by using values common to both tables.
It allows us to combine data from more than one table into a single result set. A join condition is
used in the WHERE clause of select, update and delete queries.
Note: If the join condition is omitted, the query returns the Cartesian product of the two tables (a
Cartesian product is defined as all possible combinations of rows in all tables): each row of the
first table is joined with every row of the second table. For example, if the first table has 30 rows
and the second table has 10 rows, the result will have 30 * 10, or 300 rows. Such a query can take a
long time to execute on large tables.
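The 30 * 10 arithmetic in the note can be checked with a short script. This is a sketch that assumes
SQLite through Python's sqlite3 module; the table names first_table and second_table and their
single id column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE first_table (id INTEGER)")
conn.execute("CREATE TABLE second_table (id INTEGER)")
conn.executemany("INSERT INTO first_table VALUES (?)", [(i,) for i in range(30)])
conn.executemany("INSERT INTO second_table VALUES (?)", [(i,) for i in range(10)])

# No join condition: every row of first_table pairs with every row of second_table.
cartesian_rows = conn.execute(
    "SELECT COUNT(*) FROM first_table, second_table"
).fetchone()[0]

# With a join condition only the matching pairs remain.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM first_table, second_table"
    " WHERE first_table.id = second_table.id"
).fetchone()[0]
print(cartesian_rows, joined_rows)
```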
Let's use the two tables below to explain the join conditions.
Table "Student"
Table 2.14
Table "Department"
Table 2.15
In the above example the column that is common between both the tables is Dept_No. Using
Dept_No, the Student and Department tables can be joined to combine data from both the tables as
shown below.
Table 2.16
A database management system (DBMS) that is based on the relational model is called an RDBMS.
The relational model is the most successfully used DBMS model.
The relational model represents data in the form of a table. A table is a two-dimensional array
which contains rows and columns.
Consider a scenario of a college where we need to maintain a huge amount of student details. All
these student details are stored in a table as shown in Figure 3.1.
In Figure 3.1 (as discussed in Section 2, ER model), students is the entity and Name is one of the
attributes of this students entity. Other attributes are RollNo and Phone. The table given below
contains rows and columns. Each row contains data related to one entity (a student). Each column
contains the data related to an attribute.
Tuple / Row
A single row in the table is called a tuple. Each row in the table represents the data of a single
entity. For example, in Figure 3.1, (s1, Louis Figo, 454333) represents a row.
Attribute / Column
A column in the table stores an attribute of the entity. For example, in the Students table
(Figure 3.1), Louis Figo, Rahul, etc. are values of the Name attribute, as highlighted in the figure.
Column Name
Each column in the table is given a name. This name is used to refer to values in the column. In the
Students table (Figure 3.1), RollNo, Name and Phone are the column names of the table.
Table Name
Each table is provided with a name, which is used to refer to the table. The name of the table
depicts the contents of the table. In Figure 3.1 above, Students is the name of the table.
Structured Query Language (SQL)
Relational database management systems (RDBMS) use SQL (Structured Query Language) for
data manipulation and retrieval. SQL is the standard language for relational database systems. It is
a non-procedural language.
A non-procedural language requires the programmer to specify what the program should do, rather
than providing the sequential steps indicating how the program should perform a task.
SQL Commands are divided into three categories, depending upon what they do:
CREATE TABLE table_name (
column_name1 data_type constraints,
column_name2 data_type constraints,
...
column_nameN data_type constraints
);
where,
table_name is the name of the table.
column_name1, column_name2, ..., column_nameN are the names of the columns.
data_type is the data type for the column, like char, date, number etc.
constraints - constraints are used to validate or limit the type of data that can go into a
table. Constraints are optional for the columns.
We will focus on a few constraints now:
NOT NULL
PRIMARY KEY
FOREIGN KEY
UNIQUE
NOT NULL: The NOT NULL constraint enforces a column to not accept NULL values. This means
that this column must contain some value while inserting or updating a record.
PRIMARY KEY: A primary key uniquely identifies each record in the table. So a primary key
column cannot contain NULL values. (Refer Section 2 for more details about Primary Key.)
FOREIGN KEY: A foreign key is a column in a table that matches the primary key column of another
table. The foreign key can be used to map two tables. (Refer Section 2 for more details about
Foreign Key.)
UNIQUE: Unique constraints are used to make sure that no duplicate values are entered in specific
columns that do not participate in a primary key. A column defined as UNIQUE can contain NULL
values.
Basic SQL data types:
CHAR: The CHAR data type is used for storing fixed length character strings with a
maximum size of 2000 bytes (in Oracle). CHAR(n) holds a fixed length of n characters.
DATE: It allows defining date attributes as date fields in the database. The
DATE data type stores year, month, and day values.
NUMBER: It allows defining a column as a number field. Only number values can be stored
in such a column.
Now let's see how to implement the SQL queries with examples:
Example1:
With the help of CREATE statement, let's create Students table with columns as RollNo, Name and
Phone as shown below.
CREATE TABLE Students (
RollNo NUMBER PRIMARY KEY,
Name CHAR(25) NOT NULL,
Phone NUMBER
);
Here "Students" is the name of the table. RollNo, Name and Phone are the columns of the table.
NUMBER and CHAR(25) are the data types which convey what kind of data that particular column
will hold.
Here RollNo is given as PRIMARY KEY, which means that this particular column will not accept any
duplicate or NULL values. The Name column is defined as NOT NULL, which conveys that it will not
accept NULL values. Phone has no constraint, so it can be left empty.
Note: NULL specifies that the column doesn't have any value or the column is empty.
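The effect of these constraints can be observed with a small script. This is a sketch that assumes
SQLite through Python's sqlite3 module, with INTEGER and TEXT used as stand-ins for NUMBER and
CHAR(25); the helper is_rejected and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same shape as Example1; INTEGER and TEXT stand in for NUMBER and CHAR(25).
conn.execute(
    "CREATE TABLE Students ("
    " RollNo INTEGER PRIMARY KEY,"
    " Name TEXT NOT NULL,"
    " Phone INTEGER)"
)
# Phone has no constraint, so a NULL phone is accepted.
conn.execute("INSERT INTO Students VALUES (1, 'Asha', NULL)")


def is_rejected(sql):
    """Return True when the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# RollNo 1 already exists, so the primary key blocks the duplicate.
duplicate_key_rejected = is_rejected("INSERT INTO Students VALUES (1, 'Ravi', 12345)")
# Name is NOT NULL, so a missing name is blocked as well.
null_name_rejected = is_rejected("INSERT INTO Students VALUES (2, NULL, 12345)")
row_count = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
print(duplicate_key_rejected, null_name_rejected, row_count)
```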
Example2:
Let's see another example using the Create Statement.
The query below is used to create "employees" table with columns such as employee_id, first_name,
etc.
CREATE TABLE employees (
employee_id NUMBER PRIMARY KEY,
first_name CHAR(10) NULL,
last_name CHAR(10) NOT NULL,
email CHAR(25) NOT NULL,
phone_number NUMBER NOT NULL,
hire_date DATE NOT NULL,
job_id CHAR(10),
salary NUMBER,
commission_pct NUMBER,
manager_id NUMBER,
department_id NUMBER
);
Example3:
In Example1 and Example2 we explained how to create primary key and NOT NULL constraints.
Now let's see how to implement foreign key constraints. To implement a foreign key we need two
tables that are dependent on each other.
In Example2 we have the employees table which contains department_id as one of the columns but
does not have department details. Now let's create a department table which contains details of the
department such as department_id and department name.
CREATE TABLE department (
department_id NUMBER PRIMARY KEY,
department_name CHAR(10)
);
Consider a scenario where we need to identify the department_name of an employee. In this case
the employees table is dependent on the department table to get the department name based on the
common column department_id. The foreign key constraint comes into the picture in this case. The
syntax below creates a foreign key. (For more details about Foreign key refer Section 2.)
CREATE TABLE employees (
employee_id NUMBER PRIMARY KEY,
first_name CHAR(10) NULL,
last_name CHAR(10) NOT NULL,
email CHAR(25) NOT NULL,
phone_number NUMBER NOT NULL,
hire_date DATE NOT NULL,
job_id CHAR(10),
salary NUMBER,
commission_pct NUMBER,
manager_id NUMBER,
department_id NUMBER,
FOREIGN KEY (department_id) REFERENCES department(department_id)
);
DROP statement
The DROP command is used to remove a table from the database. If you drop a table, all the rows
in the table are deleted and the table structure is removed from the database permanently. Once a
table is dropped using the DROP command, we cannot retrieve the data or the table back. So we
should be careful while using this command.
Syntax:
DROP TABLE table_name;
Example1:
The following command is used to permanently remove the Students table structure/definition along
with the data that was created.
DROP TABLE Students;
After execution of the above command the entire Students table is removed from the database. We
cannot get back any data about the Students table.
Example2:
Let's see how the employees table definition/structure is removed from the database.
DROP TABLE employees;
After execution of the above command the entire employees table is removed from the database.
We cannot get back any data about the employees table.
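What "removed from the database" means in practice can be seen in a short script. This sketch
assumes SQLite through Python's sqlite3 module, where querying a dropped table raises an
OperationalError; other database systems report their own error for a missing table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (RollNo INTEGER PRIMARY KEY, Name TEXT NOT NULL)")
conn.execute("INSERT INTO Students VALUES (100, 'David')")

# DROP removes the rows and the table definition together.
conn.execute("DROP TABLE Students")

try:
    conn.execute("SELECT * FROM Students")
    table_exists = True
except sqlite3.OperationalError:
    # SQLite reports "no such table" once the definition is gone.
    table_exists = False
print(table_exists)
```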
ALTER statement
The ALTER statement helps to modify the structure of an existing table in the database.
Once you've created a table within a database, you may wish to modify its definition at some
point. The ALTER statement allows you to make changes to the structure of a table without deleting
or recreating it.
General syntax of the ALTER statement is given below.
Syntax for adding a column to the existing table:
ALTER TABLE table_name ADD column_name data_type;
Example:
Let's see how we can alter or edit the structure of the Students table that we created using the
CREATE statement in Section 3.2.1. Let's assume that we have to add a new column called 'gender'
to the existing Students table.
ALTER TABLE Students ADD gender CHAR(10);
In the above example the "gender" column with the data type CHAR(10) has been added to the
existing Students table.
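The following sketch shows what happens to rows that already exist when a column is added. It
assumes SQLite through Python's sqlite3 module (TEXT stands in for CHAR(10)); PRAGMA table_info
is SQLite's own way of listing columns and is not part of standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (RollNo INTEGER PRIMARY KEY, Name TEXT NOT NULL, Phone INTEGER)"
)
conn.execute("INSERT INTO Students VALUES (100, 'David', 9830028200)")

# Add the new column without deleting or recreating the table.
conn.execute("ALTER TABLE Students ADD gender TEXT")

# Each table_info row is (cid, name, type, notnull, dflt_value, pk).
column_names = [row[1] for row in conn.execute("PRAGMA table_info(Students)")]
existing_row = conn.execute("SELECT RollNo, Name, gender FROM Students").fetchone()
print(column_names, existing_row)
```

The row inserted before the change is kept, and its value in the new column is NULL until it is
updated.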
Syntax for adding constraints:
ALTER TABLE table_name ADD [CONSTRAINT constraint_name] constraint_definition;
where: the CONSTRAINT constraint_name clause is optional in the above ALTER TABLE statement; it
gives a name to the constraint being defined.
Example:
In the example below a UNIQUE constraint is applied to the phone_number column in order to avoid
duplicate phone numbers getting inserted into the table.
ALTER TABLE employees ADD UNIQUE (phone_number);
The UNIQUE constraint has been added on the phone_number column of the employees table so that
it holds only unique values.
INSERT Statement
The INSERT statement inserts new rows into an existing table.
The syntax for INSERT statement is as follows.
Syntax:
INSERT INTO table_name (col1,col2,col3,....)
VALUES (value1,value2,value3,.....);
Example 1:
Now let's see how to insert the details of students into Students table. The following is the structure
of the Students table. Let's see how to insert the details of a student named David.
Table 3.1
In the query given below RollNo, Name, Phone and Gender are the columns defined in the Students
table. Using the INSERT statement the corresponding values 100, David, 9830028200, Male are
inserted into those columns.
INSERT INTO Students (RollNo, Name, Phone, Gender)
VALUES (100,'David',9830028200,'Male');
Similarly, we can insert details of another student named 'Peter'. Let's try to ignore a column which
accepts NULL values during insertion.
INSERT INTO Students (RollNo, Name, Gender)
VALUES (200,'Peter','Male');
In the above query we have given values only for three columns (RollNo, Name, Gender). Though
we didn't mention Phone, the record will be successfully inserted because it is not mandatory to
provide values for the columns which can accept NULL values during insertion. In the Students table
Phone and Gender are the columns which can accept NULL values. For Peter's record the Phone
column will be empty.
Table 3.2
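The two inserts above can be replayed to confirm that the omitted column ends up as NULL. This is a
sketch assuming SQLite through Python's sqlite3 module, where NULL is returned to Python as None;
the column types are SQLite stand-ins for NUMBER and CHAR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    " RollNo INTEGER PRIMARY KEY,"
    " Name TEXT NOT NULL,"
    " Phone INTEGER,"
    " Gender TEXT)"
)
conn.execute(
    "INSERT INTO Students (RollNo, Name, Phone, Gender)"
    " VALUES (100, 'David', 9830028200, 'Male')"
)
# Phone is left out of the column list, so it is stored as NULL.
conn.execute(
    "INSERT INTO Students (RollNo, Name, Gender) VALUES (200, 'Peter', 'Male')"
)

peter = conn.execute(
    "SELECT RollNo, Name, Phone, Gender FROM Students WHERE RollNo = 200"
).fetchone()
print(peter)
```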
Example 2:
Let us assume employees table structure as below:
Table 3.3
Query:
INSERT INTO employees (First_Name,Last_Name,Email,Phone_Number,
Hire_Date,Job_ID,Salary,Commission_PCT,Manager_ID,Age,Department_ID)
VALUES ('George', 'Gordon','GGORDON',6505062222,
'01-JAN-07','SA_REP',9000,.1,148,25,80);
Result:
ERROR at line 1:
ORA-01400: cannot insert NULL into ("E668292"."EMPLOYEES"."EMPLOYEE_ID")
In the above query we are trying to insert the details of an employee without providing a value for
the Employee_ID column. Employee_ID is the primary key, so it is a NOT NULL column and it is
mandatory to provide a value for it. Since we didn't give any value for the Employee_ID column, the
value that would get into the table is NULL. So the query fails with the error "cannot insert NULL
into EMPLOYEE_ID column".
Let's now insert a row by providing Employee_ID column value.
INSERT INTO employees
(Employee_ID,First_Name,Last_Name,Email,Phone_Number,
Hire_Date,Job_ID,Salary,Commission_PCT,Manager_ID,Age,Department_ID)
VALUES (10,'George', 'Gordon','GGORDON',6505062222,
'01-JAN-07','SA_REP',9000,.1,148,25,80);
Inserting another employee:
INSERT INTO employees
(Employee_ID,First_Name,Last_Name,Email,Phone_Number,
Hire_Date,Job_ID,Salary,Commission_PCT,Manager_ID,Age,Department_ID)
VALUES (11,'James', 'Keats','j_keats@gm',6505062221,
'01-JAN-07','SA_REP',7000,.1,148,25,80);
The inserted data is represented in table format below:
Table 3.4
SELECT Statement
The SELECT statement is used to retrieve the data from the database table.
General syntax of the SELECT statement is given below:
Syntax:
SELECT column_list FROM table_name WHERE search_condition;
where
table_name is the name of the table from which the information is retrieved.
Table 3.5
Example 1:
If we want to view the details of all students after inserting the values in the Students table, the
query below can be executed.
SELECT * FROM Students;
Result:
Table 3.6
Here * denotes all the columns of the table. Since there is no WHERE clause, all the rows of the
table are returned.
Example 2:
Let's assume that we want to select the name of the student whose roll no is 200 from the Students
table.
To retrieve this the following query is executed.
SELECT name FROM Students WHERE RollNo=200;
Result:
Table 3.7
Example 3:
Now let's consider a scenario where we need to retrieve the salary of the employee whose first
name is George from the employees table. The query for the scenario will be as follows:
SELECT Salary FROM employees WHERE first_name = 'George';
There is a chance that more than one employee has the first name George. The above query will
retrieve the salaries of all the employees whose first name is George. But if we need only one
specific employee whose first name is George, then we can add one more condition in the WHERE
clause, which will help in retrieving exactly the required data.
SELECT Salary FROM employees WHERE first_name = 'George' AND employee_id = 10;
Result:
Table 3.8
In the above query we have added two conditions with the help of the "AND" keyword. AND checks
both the conditions and retrieves only the records which match both. So the salary of the employee
(i.e. George, as shown in the result) whose FIRST_NAME is "George" and EMPLOYEE_ID is equal to 10
is retrieved from the employees table.
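The difference between the two WHERE clauses can be sketched as below, assuming SQLite through
Python's sqlite3 module and a cut-down employees table. The second George (employee 12 with salary
8000) is a hypothetical row added only to show why the extra condition matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    " employee_id INTEGER PRIMARY KEY, first_name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(10, "George", 9000), (11, "James", 7000), (12, "George", 8000)],
)

# One condition: every employee named George matches.
all_georges = [
    row[0]
    for row in conn.execute(
        "SELECT salary FROM employees WHERE first_name = 'George' ORDER BY employee_id"
    )
]

# Two conditions joined by AND: only the row satisfying both is returned.
one_george = conn.execute(
    "SELECT salary FROM employees WHERE first_name = ? AND employee_id = ?",
    ("George", 10),
).fetchall()
print(all_georges, one_george)
```

The second query passes its values as parameters (the ? placeholders) instead of writing them into
the SQL text, which is the usual way to supply values from a program.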
So far we saw how to retrieve data from one table. Now let's see how to retrieve data from more
than one table. To retrieve or combine data from more than one table we use Joins.
Joins:
The Join command is used to combine records from two or more tables in a database. The Join
command creates a set that can be saved as a table or used as it is.
A Join is a means of combining fields from two tables by using values common to each other.
A Join condition can be used in the WHERE clause of SELECT, UPDATE, DELETE statements.
(Refer to the Joins section in this document for more details.)
The following is the syntax for joining two tables:
SELECT col1, col2, col3... FROM table_name1, table_name2
Table 3.9
Table 3.10
The column that is common between the two tables is Department_ID. So using Department_ID we
can join the department table and the employees table. Please find below the query for the same.
Because department_id exists in both tables, it must be prefixed with a table name in the column
list.
SELECT employee_id, first_name, department_name, employees.department_id
FROM department, employees
WHERE department.department_id = employees.department_id;
Result:
Table 3.11
Here data from the employees table and the department table are joined and displayed.
Department_Id from the department table is compared with Department_Id from the employees table
and the records that have the same value for Department_Id (i.e. 80) have been displayed.
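The join can be tried on a cut-down version of the two tables. This is a sketch that assumes SQLite
through Python's sqlite3 module; the department names Sales and Finance are hypothetical, while the
employee rows follow the inserts shown earlier. It also shows that SQLite refuses an unqualified
department_id, since the column exists in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE department (department_id INTEGER PRIMARY KEY, department_name TEXT)"
)
conn.execute(
    "CREATE TABLE employees ("
    " employee_id INTEGER PRIMARY KEY, first_name TEXT, department_id INTEGER)"
)
conn.executemany("INSERT INTO department VALUES (?, ?)", [(80, "Sales"), (90, "Finance")])
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)", [(10, "George", 80), (11, "James", 80)]
)

# Rows are paired only where the department_id values match.
joined = conn.execute(
    "SELECT employees.employee_id, employees.first_name,"
    " department.department_name, employees.department_id"
    " FROM department, employees"
    " WHERE department.department_id = employees.department_id"
    " ORDER BY employees.employee_id"
).fetchall()

try:
    conn.execute("SELECT department_id FROM department, employees")
    ambiguous = False
except sqlite3.OperationalError:
    # The column name alone does not say which table it comes from.
    ambiguous = True
print(joined, ambiguous)
```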
UPDATE Statement
Let's see how to modify the existing rows in a table.
In Section 3.3.1 we saw how to insert records into a table. Here we will see how to update the
inserted records.
The UPDATE statement modifies a set of existing table rows.
General syntax for the UPDATE statement is given below.
Syntax:
UPDATE table_name
SET column_name1 = value, column_name2 = value, ...
WHERE condition;
Note: The WHERE clause in the above syntax specifies which record or records should be updated.
All records will be updated if we omit the WHERE clause in the UPDATE statement.
Let's see a few examples for the UPDATE statement.
Example 1:
Table 3.12
Let's update the Name "David" in the Students table to "John".
We use the query below for the same.
UPDATE students SET name = 'John' WHERE rollno = 100;
Result:
1 row updated.
Students table will look like this now:
Table 3.13
Example 2:
Table 3.14
Now we want to update the salary of the employee whose manager_ID is 148.
We use the below query for the same.
Table 3.15
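The effect of the WHERE clause on UPDATE, as described in the note above, can be sketched as
follows. The script assumes SQLite through Python's sqlite3 module, where the rowcount attribute of
the returned cursor gives the number of rows an UPDATE changed; the sample rows are those of the
Students examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (RollNo INTEGER PRIMARY KEY, Name TEXT NOT NULL, Phone INTEGER)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(100, "David", 9830028200), (200, "Peter", None)],
)

# With WHERE: only the row with RollNo 100 is changed.
updated_with_where = conn.execute(
    "UPDATE Students SET Name = 'John' WHERE RollNo = 100"
).rowcount

# Without WHERE: every row in the table is changed.
updated_without_where = conn.execute("UPDATE Students SET Phone = NULL").rowcount

names = [row[0] for row in conn.execute("SELECT Name FROM Students ORDER BY RollNo")]
print(updated_with_where, updated_without_where, names)
```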
DELETE Statement
The DELETE statement is used to delete the rows from a table.
The DELETE statement syntax is given below.
Syntax:
DELETE FROM table_name WHERE condition;
If we include the WHERE clause, the statement deletes only those records that satisfy the condition.
If we omit the WHERE clause, the statement deletes all records from the table, but the table still
exists without records.
Example 1:
Table 3.16
If we want to delete a row from the above table whose RollNo is 100, we use the below query.
DELETE FROM Students WHERE ROLLNO = 100;
Result:
1 row deleted.
After deleting the record, Students table will look like this:
Table 3.17
Example 2:
Table 3.18
Now let's delete the employee details whose hire date is 1st Jan 07.
DELETE FROM employees WHERE hire_date = '01-JAN-07';
Result:
2 rows deleted.
Two rows which have the Hire_Date value as '01-JAN-07' have been deleted from employees table
(Refer Example 2 of Sec3.2.1).
Note: The DELETE statement is different from the DROP statement. The DELETE statement
deletes some (or all) data from the table, but the table still exists in the database. The DROP
statement removes the table permanently from the database.
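The point of the note, that DELETE empties a table but leaves it in place, can be sketched as below.
This assumes SQLite through Python's sqlite3 module and uses the two sample students from the
earlier examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (RollNo INTEGER PRIMARY KEY, Name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)", [(100, "David"), (200, "Peter")]
)

# With WHERE: only the matching row is removed.
conn.execute("DELETE FROM Students WHERE RollNo = 100")
after_conditional_delete = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]

# Without WHERE: all remaining rows are removed.
conn.execute("DELETE FROM Students")
after_full_delete = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]

# The table itself is still there and accepts new rows.
conn.execute("INSERT INTO Students VALUES (300, 'Maria')")
after_reinsert = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
print(after_conditional_delete, after_full_delete, after_reinsert)
```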
GRANT command
In order to do anything within a database you must be given the appropriate privileges. A database
operates as a closed system where you cannot perform any action at all unless you have been
authorized to do so. This includes logging onto the database, creating tables, manipulating data
(i.e. select, insert, update and delete) in tables created by other users, etc.
Syntax:
GRANT privilege_name ON table_name TO user_name;
Where,
user_name is the name of the user to whom an access right is being granted.
Example:
GRANT SELECT ON employees TO user10;
This command grants a SELECT permission on employees table to user10.
REVOKE command
The REVOKE command is used to revoke a privilege on a table.
Syntax:
REVOKE privilege_name ON table_name FROM user_name;
Where,
user_name is the name of the user from whom an access right is being revoked.
Example:
REVOKE SELECT ON employees FROM user10;
This command will REVOKE a SELECT privilege on employees table from user10. If you REVOKE
SELECT privilege on a table from a user, then the user is not able to SELECT data from that table
anymore.