0 Introduction
Data: Data are known facts and figures that can be recorded and that have implicit meaning.
Data Base: A data base is a collection of interrelated data, stored more or less permanently in a computer's memory. A data base has the following implicit properties.
A data base represents some aspect of the real world. A data base is logically coherent. A data base is designed, built and populated with data for a specific purpose.
Application Programmers: They implement the specifications (given by the system analysts) as programs. They test, debug, document and maintain the transactions. They are also known as Software Engineers.
Application Programs: These are programs written by the Software Engineers to define, construct or manipulate the database.
Queries: Queries are used to retrieve specific data or to update the database. Queries are generally issued by end users.
Software to Process Queries / Programs: The applications or queries are processed by the DBMS software. This software handles programs as well as queries and converts them into a format that is understood by the next level.
Software to Access Stored Data: It receives the converted request from the query-processing software and interprets it. It generates hardware interrupts to access the storage devices. These interrupts may be to store, list, manipulate or even delete the data.
(4) Difficulty in accessing data: To access data from the files, programs must be written in a programming language.
(5) Information is available only in reports: Information is available only through reports; ad hoc queries in between are not supported.
(6) Data Isolation: Data are accessed only through reports. This results in data isolation.
(7) Inadequate Security: Only operating system level security is provided. In-depth security is not provided.
(8) High Maintenance Cost: Software maintenance is very expensive.
(9) Each data file of an application is a separate entity.
The data base allows multiple users to access it at the same time. A DBMS includes concurrency control: if several users try to update the same data, the DBMS controls the updates so that their combined result is correct. Multi-user applications of this kind are called Online Transaction Processing (OLTP).
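The effect of concurrency control can be illustrated with a minimal sketch. It is not how any particular DBMS implements it: the `Account` class, the lock and the function names are assumptions made for this example, and a plain thread lock stands in for the DBMS lock manager. Several simulated users update the same data item, and the lock ensures that no update is lost.

```python
import threading


class Account:
    """A single shared data item, standing in for one data base record."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        # Hypothetical stand-in for the lock manager of a DBMS.
        self._lock = threading.Lock()

    def deposit(self, amount: int) -> None:
        # The read-modify-write happens while the lock is held, so two
        # concurrent deposits cannot overwrite each other's result.
        with self._lock:
            current = self.balance
            self.balance = current + amount


def run_concurrent_deposits(account: Account, users: int, deposits_each: int) -> int:
    """Simulate several users updating the same record at the same time."""

    def worker() -> None:
        for _ in range(deposits_each):
            account.deposit(1)

    threads = [threading.Thread(target=worker) for _ in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance


final_balance = run_concurrent_deposits(Account(0), users=8, deposits_each=1000)
print(final_balance)
```

Without the lock, two users could read the same old balance and one of the two deposits could be lost; with it, every one of the 8 x 1000 deposits is reflected in the final balance.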
(c) Sophisticated End Users: Such users are familiar with the facilities of the DBMS. The sophisticated end users are engineers, scientists, business analysts and others. They learn most of the DBMS facilities.
(d) Stand-alone Users: Such users maintain personal data bases by using ready-made programs that provide easy-to-use menu or graphics based interfaces. They are proficient in using a specific software package.
(iv) System Analysts: They determine the requirements of end users and develop specifications for transactions that meet these requirements.
(v) Application Programmers: Application programmers implement the specifications as programs. The specifications are defined by the System Analysts. The application programmers test, debug, document and maintain the transactions.
Storage space is wasted when the same data is stored repeatedly. Files that represent the same data may become inconsistent.
ii. Restricting Unauthorized Access: When multiple users share a data base, not all users will be authorized to access all information in the data base. Some users may be permitted only to retrieve data, while others may be permitted both to retrieve and to update.
iii. Providing Persistent Storage for Program Objects & Data Structures: Databases are used to provide persistent storage for program objects and data structures.
iv. Permitting Inferences & Actions using Rules: A data base system allows deduction rules to be defined for inferring new information from the stored data base facts.
v. Providing Multiple User Interfaces: The data base is used by various types of users, so the DBMS provides multiple user interfaces. They include query languages for casual users; programming language interfaces for application programmers; forms and command codes for parametric users; and menu-driven and natural language interfaces for stand-alone users.
vi. Representing Complex Relationships among Data: The DBMS has the capability to represent a variety of complex relationships among the data, as well as to retrieve and update related data easily and efficiently.
vii. Integrity Constraints: The DBMS must provide capabilities to define and enforce constraints. The simplest type of integrity constraint involves specifying a data type for each data item.
viii. Providing Backup & Recovery: The DBMS must provide facilities to recover from hardware or software failures. The backup and recovery subsystem of the DBMS is responsible for recovery.
The benefits of the Data Base Approach are as follows:
(i) Potential for Enforcing Standards: The data base approach permits the DBA to define and enforce standards among data base users in a large organization. It allows communication and co-operation among various departments, projects and users within the organization.
(ii) Reduced Application Development Time: The DBMS provides facilities to create new applications in a short time. The development time is estimated at one-sixth to one-fourth of that needed with a traditional file system.
(iii) Flexibility: It is necessary to change the structure of a data base as requirements change. The DBMS allows certain types of changes to the structure of the data base without affecting the stored data and the existing application programs.
(iv) Availability of Up-to-Date Information: A DBMS makes the data base available to all users. As soon as one user's update is applied to the data base, all other users can immediately see this update.
(v) Economies of Scale: The DBMS permits consolidation of data and applications, thus reducing the amount of wasteful overlap between activities of data processing personnel in different departments.
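The integrity-constraint capability listed above (item vii) can be sketched with Python's built-in `sqlite3` module. The `employee` table, its columns and the sample rows are assumptions made for illustration. SQLite is lenient about declared column types by default, so this sketch demonstrates `NOT NULL` and `CHECK` constraints rather than strict data type checking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the constraints are declared once, in the schema,
# and the DBMS enforces them for every program that inserts data.
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT    NOT NULL,
        salary INTEGER NOT NULL CHECK (salary >= 0)
    )
    """
)


def try_insert(emp_id, name, salary) -> bool:
    """Return True if the DBMS accepted the row, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO employee (emp_id, name, salary) VALUES (?, ?, ?)",
            (emp_id, name, salary),
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted_valid = try_insert(1, "Asha", 30000)           # satisfies all constraints
accepted_negative = try_insert(2, "Ravi", -5)           # violates CHECK (salary >= 0)
accepted_missing_name = try_insert(3, None, 1000)       # violates NOT NULL on name
accepted_duplicate_key = try_insert(1, "Mala", 20000)   # violates PRIMARY KEY uniqueness

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(accepted_valid, accepted_negative, accepted_missing_name, accepted_duplicate_key, row_count)
```

Only the first row is stored; the other three are rejected by the DBMS itself, without any checking code in the application.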
Categories of Data Models: Data models are classified based on the types of concepts they provide.
(i) High level or Conceptual Data Models
(ii) Low level or Physical Data Models
(iii) Representational or Implementation Data Models
(i) High Level or Conceptual Data Models: This level provides concepts that are close to the way the users understand the data. Conceptual data models use concepts such as entities, attributes and relationships.
(ii) Low Level or Physical Data Models: This level provides concepts that describe the details of how the data is stored within the storage devices. This level is meant for computer specialists, not for end users.
(iii) Representational Data Models: This level provides concepts that are understood by end users but that are not far removed from the way data is organized within the computer. It hides some details of data storage but can be implemented on a computer system directly. Representational or implementation data models include the relational, network and hierarchical data models. SQL is the standard language for relational data bases. These data models represent data by using record structures and hence are known as record-based data models.
Physical data models describe how data is stored in the computer by representing information such as record formats, record ordering and access paths. An access path is a structure that makes the search for particular data base records efficient.
Entity: An entity represents a real-world object or concept. Example: employee, project.
Attribute: An attribute represents a property that describes an entity. Example: an employee's salary.
Relationship: A relationship represents an interaction among two or more entities. Example: the works-on relationship between an employee and a project.
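The three conceptual-model notions above (entity, attribute, relationship) can be sketched with plain Python dataclasses. This is only an illustration of the concepts, not a data base implementation; the class names, field names and sample values are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:          # entity: a real-world object
    name: str            # attribute
    salary: int          # attribute


@dataclass(frozen=True)
class Project:           # entity: a real-world concept
    title: str           # attribute


@dataclass(frozen=True)
class WorksOn:           # relationship: an interaction between two entities
    employee: Employee
    project: Project


asha = Employee(name="Asha", salary=30000)
ravi = Employee(name="Ravi", salary=25000)
payroll = Project(title="Payroll")
inventory = Project(title="Inventory")

# Each WorksOn instance links one employee to one project.
works_on = [WorksOn(asha, payroll), WorksOn(asha, inventory), WorksOn(ravi, payroll)]

projects_of_asha = sorted(w.project.title for w in works_on if w.employee == asha)
staff_of_payroll = sorted(w.employee.name for w in works_on if w.project == payroll)
print(projects_of_asha, staff_of_payroll)
```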
[Schema diagram: EXAM (Reg. No, Sub 1, Sub 2, Sub 3, Sub 4); SPORTS (Roll No, Game, Sfees)]
A schema diagram displays the names of record types and data items, and a few constraints.
Data Base State: The actual data in a data base may change frequently. The data in the data base at a particular moment in time is called a data base state or snapshot. It is also known as the current set of occurrences or instances in the data base.
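The difference between a schema and a data base state can be shown with a small `sqlite3` sketch. The `student` table and its rows are assumptions made for this example. The schema is read back from SQLite's catalog table `sqlite_master`, and the state is simply the current set of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")


def schema(connection):
    """The description of the data base: the stored table definitions."""
    rows = connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def state(connection):
    """The data base state (snapshot): the rows present at this moment."""
    return connection.execute(
        "SELECT roll_no, name FROM student ORDER BY roll_no"
    ).fetchall()


schema_before, state_before = schema(conn), state(conn)

conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("INSERT INTO student VALUES (2, 'Ravi')")

schema_after, state_after = schema(conn), state(conn)

print(schema_before == schema_after)  # the schema did not change
print(state_before, state_after)      # the state did
```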
The three schemas are only descriptions of data; the data actually exists only at the physical level. Each user refers to his or her own external schema. Hence, the DBMS transforms a request specified on an external schema into a request against the conceptual schema, and then into a request on the internal schema for processing over the stored data base. These processes of transforming requests and results between levels are called mappings.
Data Independence:
Data independence is the capacity to change the schema at one level of a data base system without having to change the schema at the next higher level. There are two types of data independence: (i) Logical Data Independence and (ii) Physical Data Independence.
Logical Data Independence: It is the capacity to change the conceptual schema without having to change external schemas or application programs. The conceptual schema is changed when the data base is expanded or reduced.
Physical Data Independence: It is the capacity to change the internal schema without having to change the conceptual schema. Changes to the internal schema are needed when some physical files have to be reorganized.
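Both kinds of data independence can be sketched with `sqlite3`, under some simplifying assumptions: a view plays the role of an external schema, the base table plays the role of the conceptual schema, and an index stands in for a change to the internal schema. The table, view and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the base table.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 30000)")

# External level: a view that one group of users works with.
conn.execute("CREATE VIEW employee_names AS SELECT emp_id, name FROM employee")


def external_query(connection):
    """The application's query, written only against the external schema."""
    return connection.execute(
        "SELECT emp_id, name FROM employee_names ORDER BY emp_id"
    ).fetchall()


before = external_query(conn)

# Logical data independence: the conceptual schema is expanded with a new
# column, yet the view and the application query stay unchanged.
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")
after_logical_change = external_query(conn)

# Physical data independence: a new access path (an index) is added at the
# storage level, yet the table definition and the query stay unchanged.
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")
after_physical_change = external_query(conn)

print(before, after_logical_change, after_physical_change)
```

In each case only the mapping handled by the DBMS changes; the query issued through the external schema returns the same result.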
(c) View Definition Language (VDL): This language specifies user views and their mappings to the conceptual schema.
(d) Data Manipulation Language (DML): This language allows the data to be manipulated. The manipulations include retrieval, insertion, deletion and modification of the data.
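In a relational system these languages are usually all parts of SQL. The sketch below, using `sqlite3`, shows one statement of each kind; the `student` table, the `passed` view and the pass mark of 40 are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: describe the structure of the data.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")

# View definition: describe a user view and its mapping to the base table.
conn.execute("CREATE VIEW passed AS SELECT roll_no, name FROM student WHERE marks >= 40")

# Data manipulation: insertion ...
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", 72), (2, "Ravi", 35), (3, "Mala", 58)],
)
# ... modification ...
conn.execute("UPDATE student SET marks = 45 WHERE roll_no = 2")
# ... deletion ...
conn.execute("DELETE FROM student WHERE roll_no = 3")
# ... and retrieval, here through the user view.
passed = conn.execute("SELECT roll_no, name FROM passed ORDER BY roll_no").fetchall()

print(passed)
```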
DBMS Interfaces:
The user-friendly interfaces provided by the DBMS are:
(i) Menu based Interfaces for Browsing: These interfaces present the user with lists of options called menus. Menus avoid the need to memorize the specific commands and syntax of a query language.
(ii) Forms based Interfaces: A forms based interface displays a form to each user. Users fill out the form entries to insert new data. Forms are designed and programmed as interfaces for naive users. The DBMS provides various front-end tools to design the forms.
(iii) Graphical User Interfaces: A GUI displays a schema to the user in diagrammatic form. GUIs use both menus and forms. GUIs use a pointing device, such as a mouse, to pick parts of the displayed schema diagram.
(iv) Natural Language Interfaces: Such an interface receives requests written in human language and tries to understand them. It processes these requests and submits them to the DBMS.
(v) Interfaces for Parametric Users: Parametric users have a small set of operations that must be performed repeatedly. System analysts and programmers design and implement a special interface for them.
(vi) Interfaces for the DBA: The data base system contains privileged commands that are used only by the DBA's staff.
DML Compiler: Accepts data manipulation language (DML) commands and compiles them into object code.
Host Language Compiler: The commands other than DML commands are sent to the host language compiler, which generates their object code.
Canned Transaction: It contains the object code of the DML commands and of the rest of the program, and also includes calls to the run-time data base processor.
DBA: The responsibility of the DBA is to administer the data base and the secondary resource (the DBMS and related software).
End Users: The job of this class of users is to access the data base for querying, updating and generating reports.
Application Programmer: Implements the canned transactions from the specifications that the system analysts develop to meet the requirements of the end users.
Well, fine. Up to this point the ERD shows how boy and ice cream are related. Now, every boy must have a name, address, phone number etc. and every ice cream has a manufacturer, flavor, price etc. Without these the diagram is not complete. These items which we mentioned here are known as attributes, and they must be incorporated in the ERD as connected ovals.
But can only entities have attributes? Certainly not. If we want, relationships can have attributes of their own too. These attributes do not tell us anything more about either the boy or the ice cream, but they provide additional information about the relationship between the boy and the ice cream.
Step 3 We are almost complete now. If you look carefully, we now have defined structures for at least three tables like the following:

Boy:        Name | Address | Phone
Ice Cream:  Manufacturer | Flavor | Price
Eats:       Date | Time
However, this is still not a working database, because by definition a database should be a collection of related tables. To make them connected, the tables must have some common attributes. If we choose the attribute Name of the Boy table to play the role of the common attribute, then the revised structure of the above tables becomes something like the following.

Boy:        Name | Address | Phone
Ice Cream:  Manufacturer | Flavor | Price | Name
Eats:       Date | Time | Name

This is as complete as it can be. We now have information about the boy, about the ice cream he has eaten and about the date and time when the eating was done.
Cardinality of Relationship
While creating a relationship between two entities, we often have to face the cardinality problem. This simply means how many entities of the first set are related to how many entities of the second set. Cardinality can be of the following types.
One-to-One Only one entity of the first set is related to only one entity of the second set. E.g. A teacher teaches a student. Only one teacher is teaching only one student. This can be expressed in the following diagram as:
One-to-Many Only one entity of the first set is related to multiple entities of the second set. E.g. A teacher teaches students. Only one teacher is teaching many students. This can be expressed in the following diagram as:
Many-to-One Multiple entities of the first set are related to only one entity of the second set. E.g. Teachers teach a student. Many teachers are teaching only one student. This can be expressed in the following diagram as:
Many-to-Many Multiple entities of the first set are related to multiple entities of the second set. E.g. Teachers teach students. In any school or college many teachers are teaching many students. This can be considered as a two-way one-to-many relationship. This can be expressed in the following diagram as:
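The cardinality types described above can also be checked mechanically for a given set of related pairs. The function below is a minimal sketch: it classifies only the pairs it is given (one instance of the relationship), not the rule intended by the designer, and the function name and sample identifiers are assumptions.

```python
from collections import Counter


def cardinality(pairs):
    """Classify (first, second) pairs, e.g. (teacher, student), by cardinality."""
    unique_pairs = set(pairs)
    # How many second-set entities each first-set entity is related to.
    per_first = Counter(first for first, _ in unique_pairs)
    # How many first-set entities each second-set entity is related to.
    per_second = Counter(second for _, second in unique_pairs)

    first_has_many = max(per_first.values(), default=0) > 1
    second_has_many = max(per_second.values(), default=0) > 1

    if first_has_many and second_has_many:
        return "many-to-many"
    if first_has_many:
        return "one-to-many"   # one teacher, many students
    if second_has_many:
        return "many-to-one"   # many teachers, one student
    return "one-to-one"


one_to_one = cardinality([("T1", "S1")])
one_to_many = cardinality([("T1", "S1"), ("T1", "S2")])
many_to_one = cardinality([("T1", "S1"), ("T2", "S1")])
many_to_many = cardinality([("T1", "S1"), ("T1", "S2"), ("T2", "S1")])
print(one_to_one, one_to_many, many_to_one, many_to_many)
```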
In this discussion we have not included the attributes, but you can understand that they can be used without any problem if we want to.
The Concept of Keys
A key is an attribute of a table which helps to identify a row. There are many different types of keys, which are explained here.
Super Key or Candidate Key: It is an attribute (or set of attributes) of a table that can uniquely identify a row in the table. Strictly speaking, a candidate key is a minimal super key, that is, one from which no attribute can be removed without losing uniqueness. Such keys contain unique values and generally cannot contain NULL values. There can be more than one candidate key in a table, e.g. within a STUDENT table Roll and Mobile No. can both serve to uniquely identify a student.
Primary Key: It is the one candidate key that is chosen to be the identifying key for the entire table. E.g. although there are two candidate keys in the STUDENT table, the college would obviously use Roll as the primary key of the table.
Alternate Key: This is a candidate key which is not chosen as the primary key of the table. Alternate keys are named so because, although not the primary key, they can still identify a row.
Composite Key: Sometimes one attribute is not enough to uniquely identify a row. E.g. in a single class Roll is enough to find a student, but in the entire school merely searching by Roll is not enough, because there could be 10 classes in the school and each one of them may contain a roll no 5. To uniquely identify the student we have to say something like class VII, roll no 5. So, two or more attributes are combined to create a unique combination of values, such as Class + Roll.
Foreign Key: Sometimes we may have to work with a table that does not have a primary key of its own. To identify its rows, we have to use the primary key of a related table. Such a copy of a related table's primary key is called a foreign key.
Strong and Weak Entity
Based on the concept of foreign key, a situation may arise in which we have to relate an entity having a primary key of its own to an entity not having a primary key of its own. In such a case, the entity having its own primary key is called a strong entity and the entity not having its own primary key is called a weak entity. Whenever we need to relate a strong and a weak entity, the ERD changes just a little. Say, for example, we have the statement A Student lives in a Home. STUDENT is obviously a strong entity having a primary key Roll. But HOME may not have a unique primary key, as its only attribute Address may be shared by many homes (what if it is a housing estate?). HOME is a weak entity in this case. The ERD of this statement would be like the following
As you can see, the weak entity itself and the relationship linking a strong and a weak entity are drawn with a double border.
Different Types of Database
There are three different types of data base. The difference lies in the organization of the database and the storage structure of the data. We shall briefly mention them here.
Relational DBMS
This is our subject of study. A DBMS is relational if the data is organized into relations, that is, tables. In an RDBMS, all data are stored in the well-known row-column format.
Hierarchical DBMS
In an HDBMS, data is organized in a tree-like manner. There is a parent-child relationship among data items, and the data model is very suitable for representing one-to-many relationships. To access the data items, some kind of tree-traversal technique is used, such as preorder traversal. Because an HDBMS is built on the one-to-many model, we face a little difficulty in organizing a hierarchical database into row-column format. For example, consider the following hierarchical database that shows four employees (E01, E02, E03, and E04) belonging to the same department D1.
There are two ways to represent the above one-to-many information in a relation, which is built on a one-to-one relationship. The first is called Replication, where the department id is replicated a number of times in the table like the following.

Dept-Id | Employee Code
D1      | E01
D1      | E02
D1      | E03
D1      | E04
Replication makes the same data item redundant and is an inefficient way to store data. A better way is to use a technique called the Virtual Record. While using this, the repeating data item is not used in the table. It is kept at a separate place. The table, instead of containing the repeating information, contains a pointer to that place where the data item is stored.
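The two techniques can be compared in a small sketch. It uses ordinary Python dictionaries rather than a real hierarchical DBMS, and the field name `dept_ptr` and the integer used as a pointer are assumptions made for illustration.

```python
# Replication: the department id is repeated in every row.
replicated = [("D1", "E01"), ("D1", "E02"), ("D1", "E03"), ("D1", "E04")]

# Virtual record: the department is stored once, in a separate place ...
departments = {0: {"dept_id": "D1"}}
# ... and each employee row holds only a pointer (here the key 0) to it.
employees = [
    {"emp_code": code, "dept_ptr": 0}
    for code in ("E01", "E02", "E03", "E04")
]


def dept_of(employee):
    """Follow the pointer to reach the single stored department record."""
    return departments[employee["dept_ptr"]]["dept_id"]


# Following the pointers reproduces exactly the replicated table.
resolved = [(dept_of(e), e["emp_code"]) for e in employees]

# Because the department is stored once, changing it in one place is
# visible through every employee row.
departments[0]["dept_id"] = "D9"
resolved_after_rename = [(dept_of(e), e["emp_code"]) for e in employees]

print(resolved == replicated, resolved_after_rename[0])
```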
This organization saves a lot of space as data is not made redundant.
Network DBMS
The NDBMS is built primarily on a one-to-many relationship, but one where a parent-child representation among the data items cannot be ensured. This may happen in any real-world situation where any entity can be linked to any other entity. The NDBMS was proposed by a group of theorists known as the Database Task Group (DBTG). What they said looks like this: In an NDBMS, all entities are called Records and all relationships are called Sets. The record from where the relationship starts is called the Owner Record and the record where it ends is called the Member Record. The relationship or set is strictly one-to-many.
In case we need to represent a many-to-many relationship, an interesting thing happens. In an NDBMS, Owner and Member can only have a one-to-many relationship. We have to introduce a third, common record with which both of the original records can have a one-to-many relationship. Using this common record, the two original records can be linked by a many-to-many relationship. Suppose we have to represent the statement Teachers teach students. We have to introduce a third record, say CLASS, with which both the teacher and the student can have a one-to-many relationship. Using the class in the middle, teacher and student can be linked in a virtual many-to-many relationship.
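The use of a common CLASS record can be sketched as follows. This is plain Python, not a DBTG implementation; the `ClassRecord` type, the helper functions and the sample identifiers are assumptions. Each CLASS record is a member of exactly one teacher's set and exactly one student's set, so both sets stay strictly one-to-many.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassRecord:
    """The common record: owned by one teacher and by one student."""
    teacher: str   # owner in the TEACHER -> CLASS set
    student: str   # owner in the STUDENT -> CLASS set


class_records = [
    ClassRecord(teacher="T1", student="S1"),
    ClassRecord(teacher="T1", student="S2"),
    ClassRecord(teacher="T2", student="S1"),
]


def members(owner_field, owner):
    """All CLASS records that are members of the set owned by `owner`."""
    return [c for c in class_records if getattr(c, owner_field) == owner]


def students_of(teacher):
    # Walk TEACHER -> CLASS, then read the student owner of each CLASS record.
    return sorted({c.student for c in members("teacher", teacher)})


def teachers_of(student):
    # Walk STUDENT -> CLASS, then read the teacher owner of each CLASS record.
    return sorted({c.teacher for c in members("student", student)})


print(students_of("T1"), teachers_of("S1"))
```

Here teacher T1 reaches two students and student S1 reaches two teachers, which is the virtual many-to-many relationship built from two one-to-many sets.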