This document is prepared for the IBPS SO (IT-Officer) Examination 2014. The key concepts of DBMS are explained in a precise and lucid way to assist aspirants in their preparation. If you have any queries, doubts, or suggestions, please share them with us in our Forum. We wish you all the best. Team Engistan
Engistan.com [DBMS CONCEPTS FOR IBPS IT-OFFICER 2014]
Data: Data is the quantities, characters, or symbols on which operations are performed by a computer.
Data (or) Information Processing: The process of converting facts into meaningful information is known as Data processing. It is also known as Information processing.
Meta Data: The term Metadata refers to "data about data". Metadata is defined as data providing information about one or more aspects of the data, such as:
o Means of creation of the data
o Purpose of the data
o Time and date of creation
o Creator or author of the data
o Location on a computer network where the data were created
o Standards used
Database: A database is a structured collection of data, organized into files called tables.
o A logically coherent collection of related data that (i) describes the entities and their inter-relationships, and (ii) is designed, built and populated for a specific reason.
Database Model
A database model defines the logical design of data. The model describes the relationships between different parts of the data. In the history of database design, three models have been in use.
Hierarchical Model: In this model each entity has only one parent but can have several children. At the top of the hierarchy there is only one entity, which is called the Root.
Network Model: In the network model, entities are organised in a graph, in which some entities can be accessed through several paths.
Relational Model: In this model, data is organised in two-dimensional tables called relations. The tables or relations are related to each other.
RDBMS Concepts
A Relational Database Management System (RDBMS) is a database management system based on the relational model introduced by E. F. Codd. In the relational model, data is represented in terms of tuples (rows). An RDBMS is used to manage a relational database, which is an organized collection of tables from which data can be accessed easily. The relational database is the most commonly used type of database. It consists of a number of tables, and each table has its own primary key.
What is Table ? In a relational database, a table is a collection of data elements organised in terms of rows and columns. A table is also considered a convenient representation of a relation. But a table can have duplicate tuples while a true relation cannot have duplicate tuples. A table is the simplest form of data storage. Below is an example of an Employee table.
ID   Name     Age   Salary
1    Adam     34    13000
2    Alex     28    15000
3    Stuart   20    18000
4    Ross     42    19020
What is a Record ? A single entry in a table is called a Record or Row. A record in a table represents a set of related data. For example, the above Employee table has 4 records. Following is an example of a single record.
1    Adam     34    13000
What is Field ? A table consists of several records (rows), and each record can be broken into several smaller entities known as Fields. The above Employee table consists of four fields: ID, Name, Age and Salary.
What is a Column ? In a relational table, a column is a set of values of a particular type. The term Attribute is also used to represent a column. For example, in the Employee table, Name is a column that represents the names of employees.
Name
Adam
Alex
Stuart
Ross
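The terms table, record, field and column can be tried out with a small sketch. It uses Python's built-in sqlite3 module and the Employee rows given above; the column types (INTEGER, TEXT) and the choice of ID as primary key are assumptions made for illustration, not something the notes specify.

```python
import sqlite3

# In-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER, Salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee (ID, Name, Age, Salary) VALUES (?, ?, ?, ?)",
    [(1, "Adam", 34, 13000), (2, "Alex", 28, 15000),
     (3, "Stuart", 20, 18000), (4, "Ross", 42, 19020)],
)

# Records (rows): each tuple returned is one record.
cursor = conn.execute("SELECT * FROM Employee ORDER BY ID")
records = cursor.fetchall()

# Fields: the column names of the result set.
fields = [column[0] for column in cursor.description]

# A column (attribute): every value stored under Name.
names = [row[0] for row in conn.execute("SELECT Name FROM Employee ORDER BY ID")]

print(len(records), fields)
print(records[0])
print(names)
```

Here `records[0]` corresponds to the single record shown above, and `names` corresponds to the Name column.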
Sample Databases
Shown below is an extract from a (relational) database that might be part of a University's Academic Information System:
Engistan.com | Engineers Community
Terminology:
o relation = table (file)
o attribute = column (field)
o tuple = row (record)
Database Keys:
Keys are a very important part of a relational database. They are used to establish and identify relationships between tables. They also ensure that each record within a table can be uniquely identified by a combination of one or more fields within that table.
Super Key: A super key is a set of attributes within a table that uniquely identifies each record within the table. Every candidate key is a super key; a super key may also contain extra attributes that are not needed for uniqueness.
Candidate Key: Candidate keys are the minimal sets of fields from which the primary key can be selected. A candidate key is an attribute or set of attributes that can act as a primary key for a table, uniquely identifying each record in that table.
Primary Key: The primary key is the candidate key that is most appropriate to become the main key of the table. It is a key that uniquely identifies each record in a table.
Foreign Key: A foreign key is generally a primary key from one table that appears as a field in another, where the first table has a relationship to the second. In other words, if table A has a primary key X, and X also appears as a field in table B to link B to A, then X is a foreign key in B.
Composite Key: A key that consists of two or more attributes that together uniquely identify an entity occurrence is called a Composite key. No single attribute that makes up the composite key is a key on its own.
Secondary or Alternative key: The candidate keys which are not selected as the primary key are known as secondary keys or alternative keys.
Non-key Attribute: Non-key attributes are attributes that do not belong to any candidate key of a table.
Non-prime Attribute: Non-prime attributes are attributes that are not prime, where a prime attribute is one that belongs to at least one candidate key.
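The difference between super keys and candidate keys can be sketched in a few lines of Python. The rows below are modeled on the Subject table from the normalization example in these notes. Note the assumption: a real key is a property of the table design, so uniqueness in a handful of sample rows only suggests which attribute sets could be keys.

```python
from itertools import combinations

attributes = ("subject_id", "student_id", "subject")
rows = [
    (10, 401, "Biology"),
    (11, 401, "Physics"),
    (12, 402, "Math"),
    (12, 403, "Math"),
]

def is_unique(columns):
    """True if no two sample rows agree on all of the given columns."""
    indexes = [attributes.index(name) for name in columns]
    projected = [tuple(row[i] for i in indexes) for row in rows]
    return len(set(projected)) == len(projected)

# Super keys: every attribute set that identifies each sample row uniquely.
super_keys = [
    set(combo)
    for size in range(1, len(attributes) + 1)
    for combo in combinations(attributes, size)
    if is_unique(combo)
]

# Candidate keys: super keys with no smaller super key inside them.
candidate_keys = [
    key for key in super_keys
    if not any(other < key for other in super_keys)
]

print(super_keys)
print(candidate_keys)
```

In this sample no single attribute is unique, so both candidate keys are composite keys. Whichever one is chosen as the primary key, the other becomes an alternative key.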
Database Users:
Database Administrators (DBA):
o Individual(s) who determine and implement policy regarding users, their permissions on a database, and the design and construction of that database.
Database Designers:
o Individual(s), possibly also software engineers, who apply design techniques to produce database structures pertinent to a specific application.
End Users:
o People who, from time to time, access the contents of a database:
- Casual end users may submit ad-hoc queries as the need arises, using a high-level query language.
- Naive, or parametric, end users access the database through pre-written programs that provide an appropriate interface to the database.
- Database programmers write code, using a relevant programming language and the high-level query language, that can later be used by parametric users.
Normalization
Normalization is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics like Insertion, Update and Deletion Anomalies. It is a two-step process that puts data into tabular form by removing duplicated data from the relation tables. Its goals are:
o Eliminating redundant (useless) data.
o Ensuring data dependencies make sense, i.e. data is logically stored.
Problem Without Normalization
Without normalization, it becomes difficult to handle and update the database without facing data loss. Insertion, Updation and Deletion Anomalies are very frequent if the database is not normalized. To understand these anomalies let us take an example of a Student table.
S_id   S_Name   S_Address   Subject_opted
401    Adam     Noida       Bio
402    Alex     Panipat     Maths
403    Stuart   Jammu       Maths
404    Adam     Noida       Physics
Updation Anomaly: To update the address of a student who occurs twice or more in a table, we have to update the S_Address column in all of those rows, else the data will become inconsistent.
Insertion Anomaly: Suppose for a new admission we have the student id (S_id), name and address of a student, but the student has not opted for any subjects yet. We then have to insert NULL in the subject column, leading to an Insertion Anomaly.
Deletion Anomaly: If the student with S_id 401 has only one subject and temporarily drops it, then when we delete that row the entire student record is deleted along with it.
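The updation and deletion anomalies can be reproduced with a short sketch on the Student table above, using Python's built-in sqlite3 module. Two things are assumptions made for illustration: the new address 'Delhi' is a made-up value, and the two Adam rows are treated as the same student, as the updation anomaly description implies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (S_id INTEGER, S_Name TEXT, S_Address TEXT, Subject_opted TEXT)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(401, "Adam", "Noida", "Bio"), (402, "Alex", "Panipat", "Maths"),
     (403, "Stuart", "Jammu", "Maths"), (404, "Adam", "Noida", "Physics")],
)

# Updation anomaly: the address is changed in only one of Adam's rows.
conn.execute("UPDATE Student SET S_Address = 'Delhi' WHERE S_id = 401")
adam_addresses = sorted(
    row[0] for row in conn.execute(
        "SELECT DISTINCT S_Address FROM Student WHERE S_Name = 'Adam'"
    )
)

# Deletion anomaly: dropping Alex's only subject removes Alex entirely.
conn.execute("DELETE FROM Student WHERE S_id = 402 AND Subject_opted = 'Maths'")
alex_rows = conn.execute(
    "SELECT COUNT(*) FROM Student WHERE S_Name = 'Alex'"
).fetchone()[0]

print(adam_addresses)  # two different addresses for the same student
print(alex_rows)       # no trace of Alex is left
```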
Normalization Rule
Normalization rules are divided into the following normal forms.
1. First Normal Form
2. Second Normal Form
3. Third Normal Form
4. BCNF
1. First Normal Form (1NF): A row of data cannot contain a repeating group of data, i.e. each column must hold a single value. Each row of data must have a unique identifier, i.e. a Primary key. For example, consider a table which is not in First Normal form.
Student Table :
S_id   S_Name   subject
401    Adam     Biology
401    Adam     Physics
402    Alex     Maths
403    Stuart   Maths
You can clearly see here that student name Adam is used twice in the table and subject Maths is also repeated, so S_id cannot act as a unique identifier for a row. This violates the First Normal form as stated above. To reduce the above table to First Normal form, break the table into two different tables.
New Student Table :
S_id   S_Name
401    Adam
402    Alex
403    Stuart
Subject Table :
subject_id   student_id   subject
10           401          Biology
11           401          Physics
12           402          Math
12           403          Math
In the Subject table, the concatenation of subject_id and student_id is the Primary key. Now both the Student table and the Subject table are normalized to First Normal form.
2. Second Normal Form (2NF): A table in Second Normal form must be in First Normal form and must have no partial dependency of any column on the primary key. For example, consider the following table.
Customer Table:
customer_id   Customer_Name   Order_id   Order_name   Sale_detail
101           Adam            10         order1       sale1
101           Adam            11         order2       sale2
102           Alex            12         order3       sale3
103           Stuart          13         order4       sale4
In the Customer table, the concatenation of Customer_id and Order_id is the primary key. This table is in First Normal form but not in Second Normal form, because there are partial dependencies of columns on the primary key. Customer_Name depends only on customer_id, Order_name depends only on Order_id, and there is no link between sale_detail and Customer_name. To reduce the Customer table to Second Normal form, break the table into the following three different tables.
Customer_Detail Table :
customer_id   Customer_Name
101           Adam
102           Alex
103           Stuart
Order_Detail Table :
Order_id   Order_Name
10         Order1
11         Order2
12         Order3
13         Order4
Sale_Detail Table :
customer_id   Order_id   Sale_detail
101           10         sale1
101           11         sale2
102           12         sale3
103           13         sale4
Now all three tables comply with Second Normal form.
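The decomposition into three tables can be checked with a small sketch using Python's built-in sqlite3 module. The column types are assumptions, and so is the layout of Sale_Detail, which is taken here to hold customer_id, Order_id and Sale_detail with the composite primary key described above. Joining the three tables should give back the rows of the original Customer table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_Detail (customer_id INTEGER PRIMARY KEY, Customer_Name TEXT);
CREATE TABLE Order_Detail (Order_id INTEGER PRIMARY KEY, Order_Name TEXT);
-- Assumed layout: sale details depend on the whole composite key.
CREATE TABLE Sale_Detail (
    customer_id INTEGER REFERENCES Customer_Detail(customer_id),
    Order_id INTEGER REFERENCES Order_Detail(Order_id),
    Sale_detail TEXT,
    PRIMARY KEY (customer_id, Order_id)
);
INSERT INTO Customer_Detail VALUES (101, 'Adam'), (102, 'Alex'), (103, 'Stuart');
INSERT INTO Order_Detail VALUES (10, 'order1'), (11, 'order2'), (12, 'order3'), (13, 'order4');
INSERT INTO Sale_Detail VALUES
    (101, 10, 'sale1'), (101, 11, 'sale2'), (102, 12, 'sale3'), (103, 13, 'sale4');
""")

# Joining the three tables rebuilds the rows of the original Customer table.
rebuilt = conn.execute("""
    SELECT c.customer_id, c.Customer_Name, o.Order_id, o.Order_Name, s.Sale_detail
    FROM Sale_Detail AS s
    JOIN Customer_Detail AS c ON c.customer_id = s.customer_id
    JOIN Order_Detail AS o ON o.Order_id = s.Order_id
    ORDER BY o.Order_id
""").fetchall()

# Adam's name is now stored once, although he has two orders.
adam_rows = conn.execute(
    "SELECT COUNT(*) FROM Customer_Detail WHERE Customer_Name = 'Adam'"
).fetchone()[0]

for row in rebuilt:
    print(row)
```

Because no information is lost by the split, the redundancy is removed without changing what the database can answer.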
3. Third Normal Form (3NF): Third Normal form requires that every non-prime attribute of a table depend directly on the primary key. Any transitive functional dependency should be removed from the table. The table must also be in Second Normal form. For example, consider a table with the following fields.
Student_id   Student_name   DOB   Street   city   State   Zip
In this table Student_id is the Primary key, but street, city and state depend upon Zip. Since Zip in turn depends on Student_id, these fields depend on the primary key only through Zip. Such a dependency is called a transitive dependency. Hence to apply 3NF, we need to move street, city and state to a new table, with Zip as its primary key.
The advantages of removing transitive dependency are:
o The amount of data duplication is reduced.
o Data integrity is achieved.
4. Boyce and Codd Normal Form (BCNF): Boyce and Codd Normal Form is a higher version of the Third Normal form. This form deals with a certain type of anomaly that is not handled by 3NF. A 3NF table which does not have multiple overlapping candidate keys is said to be in BCNF. In general, a table is in BCNF when, for every non-trivial functional dependency X -> Y, X is a super key.
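One way to test a table against the higher normal forms is the textbook attribute-closure method: compute everything a set of attributes determines, and flag each non-trivial functional dependency whose left side is not a super key. The sketch below is a minimal illustration under stated assumptions. The function names are hypothetical, and the dependencies are the ones stated in the 3NF example (Student_id determines every field; Zip determines Street, city and State).

```python
def closure(attributes, dependencies):
    """Return every attribute determined by `attributes` under the given FDs."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for left, right in dependencies:
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

def bcnf_violations(all_attributes, dependencies):
    """List the non-trivial FDs whose left side is not a super key."""
    return [
        (left, right)
        for left, right in dependencies
        if not right <= left                      # skip trivial FDs
        and closure(left, dependencies) != all_attributes
    ]

student = frozenset(
    {"Student_id", "Student_name", "DOB", "Street", "city", "State", "Zip"}
)
student_fds = [
    (frozenset({"Student_id"}), student),                           # primary key
    (frozenset({"Zip"}), frozenset({"Street", "city", "State"})),   # from the text
]
violations = bcnf_violations(student, student_fds)

# After moving Street, city and State to a table keyed by Zip, that table passes.
address = frozenset({"Zip", "Street", "city", "State"})
address_fds = [(frozenset({"Zip"}), address)]
address_violations = bcnf_violations(address, address_fds)

print(violations)
print(address_violations)
```

The only dependency flagged for the original table is the one on Zip, which is the transitive dependency removed in the 3NF step.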
E-R Diagram
An E-R Diagram is a visual representation of data that describes how data items are related to each other.
Weak Entity
A weak entity is an entity that depends on another entity. A weak entity doesn't have a key attribute of its own. A double rectangle represents a weak entity.
2) Attribute
An attribute describes a property or characteristic of an entity. For example, Name, Age, Address etc. can be attributes of a Student. An attribute is represented using an ellipse.
Key Attribute
A key attribute represents the main characteristic of an entity. It is used to represent the Primary key. An ellipse with the attribute name underlined represents a key attribute.
Composite Attribute
An attribute can also have its own attributes. Such an attribute is known as a Composite attribute.
3) Relationship
A Relationship describes relations between entities. A relationship is represented using a diamond.
Binary Relationship
Binary Relationship means a relation between two entities. This is further divided into three types.
1. One to One : This type of relationship is rarely seen in the real world. For example, one student can enroll for only one course and a course will also have only one student. This is not what you will usually see in a relationship.
2. One to Many : It reflects the business rule that one entity is associated with many instances of another entity. For example, a Student enrolls for only one Course but a Course can have many Students. In the diagram for this case, the arrows indicate that one student can enroll for only one course.
3. Many to Many : Many students can enroll for more than one course, i.e. a student can take several courses and a course can have several students.
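When an E-R design is turned into tables, a many-to-many relationship is commonly stored in a separate linking table. The sketch below illustrates this with Python's built-in sqlite3 module. All table names, column names and rows are hypothetical and chosen only to mirror the Student and Course example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Course (course_id INTEGER PRIMARY KEY, title TEXT);
-- Linking table: one row per (student, course) pair.
CREATE TABLE Enrollment (
    student_id INTEGER REFERENCES Student(student_id),
    course_id INTEGER REFERENCES Course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO Student VALUES (1, 'Adam'), (2, 'Alex');
INSERT INTO Course VALUES (10, 'Maths'), (11, 'Physics');
INSERT INTO Enrollment VALUES (1, 10), (1, 11), (2, 10);
""")

# One student, many courses.
courses_of_adam = [r[0] for r in conn.execute("""
    SELECT c.title FROM Enrollment e JOIN Course c ON c.course_id = e.course_id
    WHERE e.student_id = 1 ORDER BY c.title""")]

# One course, many students.
students_in_maths = [r[0] for r in conn.execute("""
    SELECT s.name FROM Enrollment e JOIN Student s ON s.student_id = e.student_id
    WHERE e.course_id = 10 ORDER BY s.name""")]

print(courses_of_adam)
print(students_in_maths)
```

For a one-to-many relationship the linking table is usually not needed: a course_id foreign key column in the Student table would be enough, since each student then refers to at most one course.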
Recursive Relationship
When an entity is related with itself, it is known as a Recursive Relationship.
Ternary Relationship
A relationship of degree three is called a Ternary relationship.
Specialization: Specialization is the opposite of Generalization. It is a top-down approach in which one higher-level entity can be broken down into two or more lower-level entities. In specialization, some higher-level entities may not have lower-level entity sets at all.
Aggregation: Aggregation is a process in which a relation between two entities is treated as a single entity. For example, the relation between Center and Course can act as an entity in a relation with Visitor.
SQL Basics
Introduction to SQL
Structured Query Language (SQL) is a programming language used for storing and managing data in an RDBMS. SQL was the first commercial language introduced for E. F. Codd's relational model. Today almost all RDBMS (MySQL, Oracle, Informix, Sybase, MS Access) use SQL as the standard database language. SQL is used to perform all types of data operations in an RDBMS.
SQL Command
SQL defines the following data languages to manipulate the data of an RDBMS.
DDL : Data Definition Language
In many RDBMS, for example Oracle and MySQL, DDL commands are auto-committed. That means all their changes are saved permanently in the database.
Command    Description
create     to create a new table or database
alter      for alteration
truncate   to delete data from a table
drop       to drop a table
rename     to rename a table
DML : Data Manipulation Language
DML commands are not auto-committed. This means the changes are not permanent in the database, and they can be rolled back.
Command   Description
insert    to insert a new row
update    to update an existing row
delete    to delete a row
merge     to merge two rows or two tables
TCL : Transaction Control Language
These commands keep a check on other commands and their effect on the database. They can annul changes made by other commands by rolling back to the original state, and they can also make changes permanent.
Command     Description
commit      to permanently save
rollback    to undo changes
savepoint   to save temporarily
DCL : Data Control Language
Data control language provides commands to grant and take back authority.
Command   Description
grant     to grant a permission or right
revoke    to take back a permission
DQL : Data Query Language
Command   Description
select    to retrieve records from one or more tables
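The command groups above can be seen working together in a short sketch with Python's built-in sqlite3 module. The Employee table and its values are illustrative. Note the limits of the sketch: SQLite does not implement the DCL commands grant and revoke, so they are not shown, and transaction behaviour differs in detail between database products.

```python
import sqlite3

# isolation_level=None stops the sqlite3 module from opening transactions
# on its own, so BEGIN, SAVEPOINT, ROLLBACK and COMMIT below are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER)"
)  # DDL

conn.execute("BEGIN")
conn.execute("INSERT INTO Employee VALUES (1, 'Adam', 13000)")    # DML
conn.execute("SAVEPOINT before_raise")                            # TCL
conn.execute("UPDATE Employee SET Salary = 99999 WHERE ID = 1")   # DML
conn.execute("ROLLBACK TO SAVEPOINT before_raise")  # undo only the update
conn.execute("COMMIT")                              # make the insert permanent

conn.execute("BEGIN")
conn.execute("DELETE FROM Employee WHERE ID = 1")   # DML
conn.execute("ROLLBACK")                            # undo the delete

rows = conn.execute("SELECT ID, Name, Salary FROM Employee").fetchall()  # DQL
print(rows)
```

The committed insert survives, while the update undone at the savepoint and the rolled-back delete leave no trace.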
Examples:
1. SELECT * FROM annual_summaries WHERE sd_duration_code = 1
2. SELECT state_name FROM states WHERE state_population > 15000000
3. SELECT state_name, state_population FROM states WHERE state_name LIKE '%NORTH%'
4. SELECT * FROM annual_summaries WHERE sd_duration_code IN ('1', 'W', 'X') AND annual_summary_year = 2000
OR means at least one of the conditions is TRUE.
You may group conditions with ( ).
Be careful when mixing AND and OR conditions.
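The warning about mixing AND and OR can be made concrete with a small sketch using Python's built-in sqlite3 module. The rows in the states table are made-up placeholders, not real figures, and the helper function `names` is hypothetical. In SQL, AND is evaluated before OR, so the same conditions give different results with and without parentheses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE states (state_name TEXT, state_population INTEGER)")
# Made-up placeholder rows, not real census figures.
conn.executemany("INSERT INTO states VALUES (?, ?)", [
    ("NORTH ALPHA", 20000000),
    ("NORTH BETA", 500000),
    ("SOUTH GAMMA", 30000000),
    ("EAST DELTA", 100000),
])

def names(where_clause):
    """Run SELECT state_name ... WHERE <where_clause>; fixed literals only."""
    sql = "SELECT state_name FROM states WHERE " + where_clause
    return sorted(row[0] for row in conn.execute(sql))

like_north = names("state_name LIKE '%NORTH%'")
in_list = names("state_name IN ('NORTH BETA', 'EAST DELTA')")

# AND binds tighter than OR, so these two conditions differ.
without_parens = names(
    "state_name LIKE '%NORTH%' OR state_name LIKE '%SOUTH%' "
    "AND state_population > 15000000"
)
with_parens = names(
    "(state_name LIKE '%NORTH%' OR state_name LIKE '%SOUTH%') "
    "AND state_population > 15000000"
)

print(without_parens)
print(with_parens)
```

Without parentheses the population test applies only to the SOUTH condition, so the small NORTH BETA row is still returned. With parentheses it applies to both.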
Transaction Management:
Transaction: A transaction is a unit of program execution that accesses and possibly updates various data items. Or, in simple words, a transaction is an event which occurs on the database. Generally a transaction reads a value from the database or writes a value to the database.
Goal Of Transactions: The ACID properties
o Atomicity: Either all actions are carried out, or none are.
o Consistency: If each transaction is consistent, and the database is initially consistent, then it is left consistent.
o Isolation: Transactions are isolated, or protected, from the effects of other scheduled transactions.
o Durability: If a transaction completes successfully, then its effects persist.
1. Atomicity: A transaction can Commit after completing its actions, or Abort because of
- Internal DBMS decision: restart
- System crash: power, disk failure, ...
- Unexpected situation: unable to access disk, data value, ...
A transaction interrupted in the middle could leave the database inconsistent. The DBMS needs to remove the effects of partial transactions to ensure atomicity: either all of a transaction's actions are performed or none.
2. Consistency: Database consistency is the property that every transaction sees a consistent database instance. It follows from transaction atomicity, isolation and transaction consistency. Users are responsible for ensuring transaction consistency: when run to completion against a consistent database instance, the transaction leaves the database consistent. For example, a consistency criterion could be that an inter-account-transfer transaction does not change the total amount of money in the accounts.
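Atomicity and the inter-account-transfer criterion can be illustrated together with a minimal sketch using Python's built-in sqlite3 module. The account table, the balances and the `transfer` function are hypothetical. A CHECK constraint is used here as a simple way to make the second step of a transfer fail, so that the rollback of the first step can be observed.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")

def total():
    """Total money in all accounts: the consistency criterion."""
    return conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]

def transfer(source, target, amount):
    """Move `amount` between accounts; either both updates apply or neither."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, target)
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, source)
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")  # abort: remove the partial effect
        return False

total_before = total()
ok = transfer("A", "B", 30)        # succeeds
failed = transfer("A", "B", 500)   # would overdraw A, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
total_after = total()

print(ok, failed, balances, total_before, total_after)
```

In the failed transfer the credit to B has already been applied when the debit from A is rejected. The rollback removes that partial effect, so the total amount of money is unchanged.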
3. Isolation: A guarantee that even though transactions may be interleaved, the net effect is identical to executing the transactions serially. For example, if transactions T1 and T2 are executed concurrently, the net effect is equivalent to executing
- T1 followed by T2, or
- T2 followed by T1
NOTE: The DBMS provides no guarantee of the effective order of execution.
4. Durability: The DBMS uses the log to ensure durability. If the system crashes before the changes made by a completed transaction are written to disk, the log is used to remember and restore these changes when the system is restarted. Again, this is handled by the recovery manager.
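The isolation guarantee above, and the note that the effective order is not fixed, can be illustrated without a database. In the sketch below T1 and T2 are hypothetical transactions modeled as plain Python functions on a single balance. Any serial order is an acceptable result, while an interleaving in which both read the old value produces a result that matches neither order.

```python
from itertools import permutations

def t1(balance):
    """T1: deposit 100."""
    return balance + 100

def t2(balance):
    """T2: double the balance."""
    return balance * 2

def run_serially(order, balance):
    """Apply the transactions one after another in the given order."""
    for transaction in order:
        balance = transaction(balance)
    return balance

start = 1000
# Every serial order is an acceptable outcome of running T1 and T2 concurrently.
serial_results = {run_serially(order, start) for order in permutations([t1, t2])}

# An interleaving where both transactions read 1000 before either writes:
# T2's write overwrites T1's deposit (a lost update).
read_by_t1 = start
read_by_t2 = start
after_t1_write = t1(read_by_t1)   # written first, then overwritten
lost_update = t2(read_by_t2)      # computed from the stale read

print(serial_results)
print(lost_update)
```

The two serial orders give different balances and both are valid. The interleaved result is not in that set, which is the kind of outcome isolation is meant to rule out.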