A DBMS is a software system whose main purpose is to store data, but not every program
that stores data is a DBMS.
Characteristics of DBMS
- It must provide an easy language for retrieving and manipulating data.
The language should not require complex programming techniques, and it should support
structured programming.
The language supported by a DBMS is SQL (Structured Query Language).
- It must provide concurrent access to data (multiple transactions can operate on the data at
the same time).
- It must provide data integrity (no replicated data).
- It must provide security (preventing unauthorized users from accessing the data).
A DBMS stores data in the form of tables (relations). A table consists of rows (also called records or tuples)
and columns (also called fields or attributes). A database is a collection of tables. Consider the Student table
given below.
RollNo  Name     Address          DOB
1       Vivek    12, jivajinagar  18/8/87
2       Priyesh  M25-gandhinagar  15/6/90
This Student table has four columns (RollNo, Name, Address, DOB) and two
rows (records). Each column is defined with a data type. Some general data types supported
by a DBMS are:
1. Varchar - for strings of characters
2. Number - for numeric values
3. Date - for dates and times
Constraints are conditions that must be satisfied whenever data is inserted,
deleted or modified. Suppose a constraint on the Student table must ensure that no student
is older than 20; we can then apply the check DOB > '31/12/1990'. If we try to insert
a student whose DOB is on or before 31/12/1990, the insertion raises an error and
the transaction does not complete successfully.
Note: varchar and date values used in a query must be enclosed in single quotes.
SQL keywords, column names and table names are not case sensitive, but data is case
sensitive.
If we try to insert a tuple without a name, the query fails on execution, because the
Name field was defined with the 'not null' constraint.
Example: insert into student (Rollno) values (3);
This query generates an error because a null value is not accepted in the Name field.
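The two constraint failures described above can be reproduced with a small sketch using Python's built-in sqlite3 module. The table definition, the sample row for 'Asha' and the ISO date format are assumptions made for illustration; other DBMSs use their own date types and error messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        rollno  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        address TEXT,
        dob     TEXT CHECK (dob > '1990-12-31')  -- ISO dates compare correctly as text
    )
    """
)
# Hypothetical row that satisfies both constraints.
conn.execute("INSERT INTO student VALUES (1, 'Asha', '12, Jivajinagar', '1991-08-18')")

def try_insert(sql):
    """Run the statement and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

# Name is missing, so the NOT NULL constraint fails.
missing_name = try_insert("INSERT INTO student (rollno) VALUES (3)")
# DOB is before the cut-off, so the CHECK constraint fails.
too_old = try_insert("INSERT INTO student VALUES (4, 'Ravi', 'M25', '1987-06-15')")
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(missing_name)
print(too_old)
print(row_count)
```

Only the first row is stored; both rejected statements leave the table unchanged.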
But if we want to change the address of particular rows only, we can use a where clause.
Example: update student set Address = '12,Jivajinagar' where rollno = 1
This query updates the address of the student with rollno 1 to '12,Jivajinagar'.
10. Where clause: to understand the where clause and its conditions, consider the relation
Employee (empid, ename, salary, job, deptid).
Q4. Write a query to find the names of employees having a salary between 1000 and 2000.
Ans. Select ename from employee where salary >= 1000 and salary <= 2000
Q5. Write a query to find the names of employees whose name starts with A, B, C or D.
Note: Before solving this query, we first need to understand how strings are handled in SQL.
Suppose S1 = 'ABC' and S2 = 'X'; which of S1 and S2 is greater? SQL
compares strings from left to right, position by position, according to the
ASCII value of each character. 'A' from S1 is compared with 'X' from S2; since the ASCII value of 'X' is
greater than the ASCII value of 'A', S2 > S1. If the first characters of S1 and S2 are the
same, the strings are compared on the next character position, and so on.
Examples:
Ans. To find the names of employees whose name starts with A, B, C or D, suppose we use
this query:
Select ename from employee where ename between 'A' and 'D'
It displays all names starting with 'A', 'B' and 'C' but only the exact name 'D', not every name
starting with 'D' (any longer name beginning with 'D' compares greater than 'D'). Instead, we can use this query:
Select ename from employee where ename >= 'A' and ename < 'E'
To compare with Null, SQL provides the 'is' and 'is not' operators. The above query is correctly
written as:
Select ename from employee where deptid is null
Q7. Write a query to find the names of employees whose name starts with 'C' and ends with
'T'.
Ans. For this type of query SQL provides wildcards and the Like operator. The wildcard '%'
matches a string of any length, including zero. The wildcard '_' matches any single character. The Like
and Not Like operators are used when a string is compared against a pattern containing wildcards. The query for this
question is:
Select ename from employee where ename like 'C%T'
Q8. Write a query to find the names of employees whose name has 'C' as the second
character.
Ans. Select ename from employee where ename like '_C%'
Q9. Write a query to find the names of employees whose name contains at least two 'T'.
Ans. Select ename from employee where ename like '%T%T%'
Q10. Write a query to find the names of employees whose name contains exactly two 'T'.
Q11. Write a query to find the names of employees whose name contains exactly two characters.
Ans. Select ename from employee where ename like '__'
Q12. Write a query to find the names of employees whose name contains '_'.
Ans. If a wildcard character itself has to be matched in the data, '/' is placed before it as an
escape character, and the escape character is declared with the escape clause.
Select ename from employee where ename like '%/_%' escape '/'
Q13. Write a query to find the names of employees whose name contains '/'.
Ans. Select ename from employee where ename like '%//%' escape '/'
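The wildcard patterns from Q7 to Q13 can be tried out with the following sketch, which uses Python's built-in sqlite3 module. The employee names are made-up test data, and the helper names_like is hypothetical. Note that SQLite's LIKE ignores case for ASCII letters by default, so the data is kept in upper case to avoid surprises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ename TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?)",
    [("CLINT",), ("SCOTT",), ("TT",), ("A_B",), ("A/B",), ("ADAMS",)],
)

def names_like(pattern, escape=False):
    """Return the sorted names matching a LIKE pattern; '/' is the escape character if requested."""
    sql = "SELECT ename FROM employee WHERE ename LIKE ?"
    if escape:
        sql += " ESCAPE '/'"
    return sorted(row[0] for row in conn.execute(sql, (pattern,)))

starts_c_ends_t = names_like("C%T")                # Q7
second_is_c = names_like("_C%")                    # Q8
two_or_more_t = names_like("%T%T%")                # Q9
exactly_two_chars = names_like("__")               # Q11
has_underscore = names_like("%/_%", escape=True)   # Q12
has_slash = names_like("%//%", escape=True)        # Q13

for result in (starts_c_ends_t, second_is_c, two_or_more_t,
               exactly_two_chars, has_underscore, has_slash):
    print(result)
```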
11. Order by: used to display data in either increasing or decreasing order. A table holds data
in the order in which it was inserted, but while retrieving data we can use the order by
keyword to display it in increasing or decreasing order.
Example: select ename from employee order by ename
This displays the names in increasing alphabetical order. Suppose the ename column of the employee
table contains the values Null, 123, Adams, SCOTT, Allen and Smith.
[Table: the ename values listed in increasing alphabetical order]
If two or more names are the same, we can order those rows by another
column:
Select ename from employee order by ename, job desc
Desc specifies decreasing order.
If you count a tuple of two or more fields, only a tuple whose values are all null, such as {Null,
Null}, is counted as 0; a tuple like {22000, Null} is counted as one.
13. As: used for renaming a field returned by select. In the above query avg(salary) is the column
name returned by select; we can rename it using the 'as' operator.
Query: select avg(salary) as average_salary from employee
Output:
Average_salary
30500
14. Group value function (group by): suppose we want to find the average salary in each
department.
Query: select deptid, avg(salary) from employee group by deptid
Output:
Deptid Avg(salary)
25 35000
If deptid contains null entries, a separate group is created for the rows whose deptid is null, and
the aggregate function is then applied to it.
If you try to select avg(salary) and deptid without group by, as in
Select deptid, avg(salary) from employee
the query gives an error because the result is not compatible (to see why, try
to draw the result table for the above query).
Note: a column can be selected alongside aggregate functions only if that column appears in the group
by clause.
Query: select deptid, job, avg(salary) from employee group by deptid, job
This query first divides the table into groups by deptid, then divides each deptid group into subgroups
by job.
Output:
Deptid Job Avg(salary)
25 Manager 50000
25 Programmer 20000
26 HR 30000
Null Manager Null
Null Programmer 22000
15. Having: it is like the where clause, but it applies to the groups produced by group by. A where
clause cannot contain aggregate functions. So if you want to apply a condition on an aggregate
function to each group produced by group by, use having:
Q14. Write a query to find the deptid of departments with more than 5 employees having a salary of more than
5000.
Ans. select deptid from employee where salary > 5000 group by deptid having
count(empid) > 5
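As a minimal sketch of how where, group by and having interact, the query from Q14 can be run with Python's built-in sqlite3 module. The twelve employee rows are invented test data: department 25 has six employees above 5000, department 26 only two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (empid INTEGER PRIMARY KEY, deptid INTEGER, salary INTEGER)"
)
# Department 25: six employees, all earning 6000.
rows = [(i, 25, 6000) for i in range(1, 7)]
# Department 26: six employees, only two of them above 5000.
rows += [(i, 26, 6000 if i <= 8 else 4000) for i in range(7, 13)]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)

# One output row per group.
avg_by_dept = dict(
    conn.execute("SELECT deptid, AVG(salary) FROM employee GROUP BY deptid")
)

# WHERE filters rows before grouping; HAVING filters the groups afterwards.
big_depts = [
    row[0]
    for row in conn.execute(
        "SELECT deptid FROM employee WHERE salary > 5000 "
        "GROUP BY deptid HAVING COUNT(empid) > 5"
    )
]

print(avg_by_dept)
print(big_depts)
```

Department 26 has six employees in total, but only two survive the where filter, so its group fails the having condition.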
Example: select ename from employee where salary > all (select salary from employee
where ename = 'Hermesh')
If the table has more than one employee named 'Hermesh', the inner query returns a set of salaries.
Suppose it is {1000, 30000, 15000, 20000}; then '> All' compares the salary value of each row
with every value in the set. Put more simply, '> All' picks
the maximum value from the set and compares the table values with that maximum.
There is no need to compare against the other values in the set: if a salary in the table is
greater than the maximum value in the set, then it is also greater than all other values in the set.
Similarly,
< All - compares with the minimum in the set
= All - true only if every value in the set equals the compared value
> Any - compares with the minimum in the set
< Any - compares with the maximum in the set
= Any - same as the 'In' operator
Some is the same as Any.
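The shortcuts listed above can be checked with a few lines of plain Python (SQLite does not support All/Any, so this sketch models the comparisons directly). The set of salaries is the one from the text; the candidate values and helper names are assumptions chosen for illustration.

```python
# The set returned by the inner query in the text.
inner_salaries = [1000, 30000, 15000, 20000]

def gt_all(x, values):
    """x > ALL (values)"""
    return all(x > v for v in values)

def lt_all(x, values):
    """x < ALL (values)"""
    return all(x < v for v in values)

def gt_any(x, values):
    """x > ANY (values)"""
    return any(x > v for v in values)

def lt_any(x, values):
    """x < ANY (values)"""
    return any(x < v for v in values)

def eq_any(x, values):
    """x = ANY (values)"""
    return any(x == v for v in values)

# Hypothetical salaries from the outer table.
candidates = [500, 1000, 15000, 30000, 40000]
for x in candidates:
    print(x, gt_all(x, inner_salaries), gt_any(x, inner_salaries))
```

For a non-empty set, each quantified comparison gives the same answer as a single comparison against the maximum or minimum, which is the simplification the list describes.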
Primary key - This is defined as a unique key with a Not Null constraint. If a field is a
primary key then it cannot have null values; similarly, if a group of fields is the primary key then
none of those fields can be null in any record. Example: if (empid, ename) is the primary
key, then neither (1, Null) nor (Null, Null) is allowed in the table.
Candidate key: a key or combination of keys that can be assigned as the primary key. In
other words, the keys that are candidates for the primary key are called candidate keys. For
example, if both empid and (ename, deptno) can be assigned as the primary key of a table, then
both are candidate keys. Only one of them is chosen as the primary key; we should choose the
candidate key that is used most in queries. The primary key is used for indexing the
table, and indexes speed up searching.
Super key: any key or combination of keys that uniquely identifies a record is
called a super key.
A candidate key (and hence the primary key) is a minimal super key that uniquely identifies a record
in the table. Super keys can be subdivided, while a candidate key cannot be divided further;
that is, if a combination of fields is a candidate key and we remove any field from
that combination, the remainder is neither a candidate key nor a super key. If we
add more fields to the primary key, the combination is no longer a candidate key but
is still a super key. Suppose empid and (ename, deptno) are the candidate keys of the employee
relation; then the possible super keys are: empid, (empid, ename), (empid, deptno), (ename,
deptno) and (empid, ename, deptno).
All candidate keys are super keys, but not all super keys are candidate keys. Similarly, every
primary key is a candidate key, but not vice versa.
A unique key requires the values in a field or combination of
fields to be unique but, unlike a primary key, it may contain nulls.
While defining a table we can create a primary key and a unique key.
Example: create table employee (empid number(6) primary key,
ename varchar(20) unique,
deptno number(2));
If we try to insert the tuple (5, XYZ, 40) into the employee table, the query generates an error
because deptno 40 is not in the Department table.
Suppose we delete the row with deptno = 10 from the department table; then the two records (with
empid 1 and empid 2) in the employee table given above become invalid, so SQL does not allow
the row to be deleted from the department table until these two rows are deleted.
On delete cascade: this is used while defining the foreign key to handle the above situation.
Example: Create table employee (empid number(6),
ename varchar(20), deptno number(2), foreign key (deptno) references
department(deptno) on delete cascade);
Now the rows having deptno 10 in the employee table are deleted first, and then the row with deptno 10 is
deleted from the department table.
On delete set null: in the above case data from the employee table is lost. For example, if a
department is removed from a company, this does not mean that the employees belonging to
that department are also removed; they can be shifted to another department or remain in no
department. For this case 'on delete set null' can be used instead of 'on delete cascade'. Then,
when we delete the row with deptno = 10 from the department table, the deptno of 'Hermesh' and
'Priyesh' is first set to null and then the row is deleted from the department table.
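The three behaviours discussed above (delete refused, cascade, set null) can be compared with a sketch built on Python's sqlite3 module. The department names and the employee 'Aditya' in department 20 are assumed sample data, and the helpers build and delete_dept_10 are hypothetical. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; other systems enforce them by default.

```python
import sqlite3

def build(on_delete):
    """Create department/employee tables whose foreign key uses the given ON DELETE action."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
    conn.execute("CREATE TABLE department (deptno INTEGER PRIMARY KEY, dname TEXT)")
    conn.execute(
        "CREATE TABLE employee (empid INTEGER PRIMARY KEY, ename TEXT, "
        f"deptno INTEGER REFERENCES department(deptno) ON DELETE {on_delete})"
    )
    conn.executemany("INSERT INTO department VALUES (?, ?)", [(10, "Sales"), (20, "HR")])
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "Hermesh", 10), (2, "Priyesh", 10), (3, "Aditya", 20)],
    )
    return conn

def delete_dept_10(on_delete):
    """Try to delete department 10; return (deleted?, remaining (empid, deptno) rows)."""
    conn = build(on_delete)
    try:
        conn.execute("DELETE FROM department WHERE deptno = 10")
        deleted = True
    except sqlite3.IntegrityError:
        deleted = False
    rows = conn.execute("SELECT empid, deptno FROM employee ORDER BY empid").fetchall()
    return deleted, rows

def insert_unknown_dept():
    """Try to insert an employee whose deptno does not exist in department."""
    conn = build("NO ACTION")
    try:
        conn.execute("INSERT INTO employee VALUES (5, 'XYZ', 40)")
        return True
    except sqlite3.IntegrityError:
        return False

print(insert_unknown_dept())
for action in ("NO ACTION", "CASCADE", "SET NULL"):
    print(action, delete_dept_10(action))
```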
18. Join: to understand joins we first look at the Cartesian product of tables. In the
Cartesian product of the two tables from the previous example, each row of one table is
combined with each row of the other table:
Employee × Department
If the 1st table has m rows and the 2nd table has n rows, the Cartesian product has m×n
rows.
Query for above Cartesian product is:
Select * from employee , department
If two tables have the same column name, the columns are distinguished by prefixing the table name
and a dot ('.').
Now if we want the department information of each employee, the query is:
Select empid, ename, department.deptno, dname, city from employee, department where
employee.deptno = department.deptno;
Result of this query will be:
This is called a natural join, in which two or more tables are joined on their
common fields.
Note: Some problems can be solved by both a join and a nested query, some
problems can only be solved by a join and, similarly, some problems can only be solved by nested
queries. There are also some problems that can be solved by neither joins nor nested
queries.
Now if we want to find out who the manager of Aditya is, we have to use a self join:
Select e2.ename from employee e1, employee e2 where e1.mgrid = e2.empid and
e1.ename = 'Aditya';
Here e1 and e2 are aliases of the employee table. They behave like two copies of the employee
table, which are joined by equating e1.mgrid and e2.empid.
Outer Joins: Notice that much of the data is lost when applying a join to two relations. In
some cases this lost data might hold useful information. An outer join retains the
information that would have been lost from the tables, replacing missing data with nulls.
There are three forms of the outer join, depending on which data is to be kept.
LEFT OUTER JOIN - keep data from the left-hand table
RIGHT OUTER JOIN - keep data from the right-hand table
FULL OUTER JOIN - keep data from both tables
Example:
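A small runnable sketch of the Cartesian product, the join on deptno, the self join and a left outer join, using Python's sqlite3 module. The rows, department names and cities are invented for illustration, and only the left outer join is shown because older SQLite versions do not support the right and full forms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (deptno INTEGER PRIMARY KEY, dname TEXT, city TEXT);
    CREATE TABLE employee (empid INTEGER PRIMARY KEY, ename TEXT,
                           deptno INTEGER, mgrid INTEGER);
    INSERT INTO department VALUES (10, 'Sales', 'Gwalior'), (20, 'HR', 'Bhopal');
    INSERT INTO employee VALUES
        (1, 'Hermesh', 10, NULL), (2, 'Aditya', 10, 1), (3, 'Priyesh', NULL, 1);
""")

def q(sql):
    """Run a query and return all rows."""
    return conn.execute(sql).fetchall()

# Cartesian product: every employee row paired with every department row.
cartesian = q("SELECT * FROM employee, department")

# Join on the common field; Priyesh (no department) is lost.
joined = q("SELECT ename, dname FROM employee, department "
           "WHERE employee.deptno = department.deptno ORDER BY empid")

# Self join: two aliases of the same table.
manager = q("SELECT e2.ename FROM employee e1, employee e2 "
            "WHERE e1.mgrid = e2.empid AND e1.ename = 'Aditya'")

# Left outer join keeps every employee and fills missing department data with NULL.
left_outer = q("SELECT ename, dname FROM employee LEFT OUTER JOIN department "
               "ON employee.deptno = department.deptno ORDER BY empid")

print(len(cartesian), joined, manager, left_outer)
```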
19. Correlated queries: this is a special type of nested query in which the inner query is executed for
every row selected by the outer query.
Example: for the relation employee(empid, ename, deptno, salary), write a query to find the
names of the employees earning the highest salary in their department.
Select ename from employee e1 where e1.salary = (select max(salary) from employee e2
where e1.deptno = e2.deptno)
Now suppose the outer query selects 10 records from the employee table; then the inner
query is executed once for each of those records (i.e. 10 times in total). For example, if the employee
table is:
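The correlated query can be run on a small table with Python's sqlite3 module. The four employee rows below are assumed sample data, not the table from the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (empid INTEGER, ename TEXT, deptno INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "Aditya", 10, 1000),
        (2, "Hermesh", 10, 3000),
        (3, "Priyesh", 20, 2000),
        (4, "Vivek", 20, 2000),
    ],
)

# The inner query refers to e1.deptno, so it is evaluated for each outer row.
top_earners = [
    row[0]
    for row in conn.execute(
        "SELECT ename FROM employee e1 WHERE e1.salary = "
        "(SELECT MAX(salary) FROM employee e2 WHERE e1.deptno = e2.deptno) "
        "ORDER BY ename"
    )
]
print(top_earners)
```

Both employees of department 20 are returned because they tie for the highest salary there.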
20. Grant and Revoke: used to give other users access permissions for select/update/delete/insert
queries on tables/views:
Example:
Grant update, delete on employee to Rahul
This query gives Rahul permission to delete from and update the employee table.
Query: Grant all on employee to Rahul with grant option
This query gives Rahul permission for all operations on the employee table, and also
permission to grant those permissions to other users.
Similarly, Revoke is used to withdraw a granted permission from a user.
Example: revoke update on employee from Rahul
21. Views: A SQL view is a virtual table, which is based on a SQL SELECT query. Essentially
a view is very close to a real database table (it has columns and rows just like a regular
table), except for the fact that the real tables store data, while the views don’t. The view’s
data is generated dynamically when the view is referenced. A view references one or more
existing database tables or other views. In effect every view is a filter of the table data
referenced in it and this filter can restrict both the columns and the rows of the referenced
tables.
Here is an example of how to create a SQL view using already familiar employee and
department table
Create view employeeinfo as
Select empid ,ename, department.deptno, city
From employee, department where employee.deptno=department.deptno
Importance of views: if we want another user to have access to only some fields of a
table, we can create a view using only those fields and make the view accessible to that user
instead of the original table.
22. Set operations: There are four set operations supported by SQL:
UNION ALL: Combines the results of two SELECT statements into one result set.
UNION: Combines the results of two SELECT statements into one result set, and then
eliminates any duplicate rows from that result set.
MINUS (called EXCEPT in standard SQL): Takes the result set of one SELECT statement, and removes those rows that are
also returned by a second SELECT statement.
INTERSECT: Returns only those rows that are returned by each of two SELECT
statements.
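The four set operations can be compared side by side with a short sqlite3 sketch. The tables a and b and the helper combine are hypothetical test fixtures. SQLite spells MINUS as EXCEPT, which is the keyword used below.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
""")

def combine(op):
    """Combine the two SELECT results with the given set operator and sort the values."""
    sql = f"SELECT x FROM a {op} SELECT x FROM b"
    return sorted(row[0] for row in conn.execute(sql))

union_all = combine("UNION ALL")   # keeps duplicates
union = combine("UNION")           # removes duplicates
minus = combine("EXCEPT")          # rows of a that are not in b
intersect = combine("INTERSECT")   # rows present in both

print(union_all, union, minus, intersect)
```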
RELATIONAL ALGEBRA
In order to implement a DBMS, there must exist a set of rules which state how the database
system will behave. For instance, somewhere in the DBMS there must be a set of statements which
indicate that when someone inserts data into a row of a relation, it has the effect which the
user expects. One way to specify this is to use words to write an 'essay' as to how the DBMS
will operate, but words tend to be imprecise and open to interpretation. Instead, relational
databases are more usually defined using Relational Algebra.
Relational Algebra is :
2. Select (σ): the same as the where clause in SQL. The only difference is that in SQL the where clause only checks
its conditions, whereas σ returns the complete rows of the table that satisfy the condition.
σ_{salary > 5000}(employee)
This relational expression returns the records of employees having salary > 5000.
If we want to select only the names of employees having salary > 5000, the relational
expression is:
Π_{ename}(σ_{salary > 5000}(employee))
3. Cartesian product (×):
Example: Π_{ename, dname}(employee × department)
This relational expression is the same as the SQL statement:
Select ename, dname from employee, department
4. Joins: in relational algebra special operators are used for joins. Joins are performed by
equating the fields with the same name in the two tables.
Natural join: ⋈
Full outer join: ⟗
Right outer join: ⟖
Left outer join: ⟕
6. Group by (G):
Example: write a relational expression to find the average salary of each department.
_{deptno} G _{avg(salary)}(employee)
ENGINEER’S CIRCLE, GWALIOR Page 15
Q21. Let R1 (A, B, C) and R2 (D, E) be two relation schemas, where the primary keys are
shown underlined, and let C be a foreign key in R1 referring to R2. Suppose there is no
violation of the above referential integrity constraint in the corresponding relation instances r1
and r2. Which one of the following relational algebra expressions would necessarily produce
an empty relation?
(a) Π_D(r2) – Π_C(r1)
(b) Π_C(r1) – Π_D(r2)
(c) Π_D(r1 ⋈_{C=D} r2)
(d) Π_C(r1 ⋈_{C=D} r2)   (CS2004)
Ans. b
Explanation: C is a foreign key referring to R2 (column D of R2), which means C contains only values that are
already in D. Applying the minus operator as Π_C(r1) – Π_D(r2) therefore returns an empty set.
We can also verify this by taking an example:
R1              R2
A   B   C       D   E
a1  b1  c1      c1  e1
a2  b2  c2      c2  e2
a3  b3  c2      c3  e2
a4  b4  c3      c4  e4
                c5  e5
Q22. Consider the relation Student (name, sex, marks), where the primary key is
shown underlined, pertaining to students in a class that has at least one boy and one girl. What
does the following relational algebra expression produce? (Note: ρ is the rename operator.)
Π_name(σ_{sex=female}(Student)) − Π_name[Student ⋈_{sex=female ∧ x=male ∧ marks ≤ m} ρ_{n,x,m}(Student)]
(a) names of girl students with the highest marks
(b) names of girl students with more marks than some boy student
(c) names of girl students with marks not less than some boy student
(d) names of girl students with more marks than all the boy students   (CS2004)
Ans. d
Explanation: Π_name(σ_{sex=female}(Student)) returns only the names of the female students.
Π_name[Student ⋈_{sex=female ∧ x=male ∧ marks ≤ m} ρ_{n,x,m}(Student)] returns the names of female students
having marks less than or equal to those of at least one male student.
Subtracting the result of the second expression from the result of the first returns the names of the female
students whose marks are not less than or equal to those of any male student, i.e. who have more marks than all the boys.
RELATIONAL CALCULUS
Relational calculus consists of two calculi, the tuple relational calculus and the domain
relational calculus, that are part of the relational model for databases and provide a
declarative way to specify database queries. This is in contrast to the relational algebra which is
also part of the relational model but provides a more procedural way for specifying queries.
Relational calculus query specifies what is to be retrieved rather than how to retrieve it.
Tuple Relational Calculus: Interested in finding tuples for which a predicate is true. Based
on the use of tuple variables. A tuple variable is a variable that 'ranges over' a named relation, i.e. a
variable whose only permitted values are tuples of the relation. Specify the range of a tuple
variable S as the Staff relation as:
Staff(S)
To find set of all tuples S such that P(S) is true:
{S | P(S)}
Examples:
To find details of all staff earning more than 10,000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
In relational calculus two quantifiers are used to tell how many instances the predicate
applies to:
– Existential quantifier ∃ ('there exists')
– Universal quantifier ∀ ('for all')
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called free
variables.
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧
(B.branchNo = S.branchNo) ∧ B.city = 'London')
Means 'There exists a Branch tuple with the same branchNo as the branchNo of the current Staff
tuple, S, and is located in London'.
The universal quantifier is used in statements about every instance, such as:
(∀B) (B.city ≠ 'Paris')
Means 'For all Branch tuples, the address is not in Paris'.
Examples:
List the names of all managers who earn more than £25,000.
{S.fName, S.lName | Staff(S) ∧
S.position = 'Manager' ∧ S.salary > 25000}
List the names of staff who currently do not manage any properties.
{S.fName, S.lName | Staff(S) ∧ (~(∃P) (PropertyForRent(P) ∧ (S.staffNo = P.staffNo)))}
Or
{S.fName, S.lName | Staff(S) ∧ ((∀P) (~PropertyForRent(P) ∨ ~(S.staffNo = P.staffNo)))}
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
Expressions of this type are called unsafe expressions. To avoid this, add the restriction that all
values in the result must be values in the domain of the expression.
Q23. With regard to the expressive power of the formal relational query languages, which
of the following statements is true?
(a) Relational algebra is more powerful than relational calculus
(b) Relational algebra has the same power as relational calculus.
(c) Relational algebra has the same power as safe relational calculus.
(d) None of the above   (CS2002)
Ans. c
Explanation: relational algebra can express exactly the queries of safe relational calculus. An unsafe
expression such as {S | ~Staff(S)} has no equivalent in relational algebra.
FUNCTIONAL DEPENDENCY
Definition: A set of attributes X functionally determines a set of attributes Y if the value of X
determines a unique value for Y.
This is similar to functions in mathematics. In mathematics, f is said to be a valid
function when, for every value of x, f(x) returns a single value of y. For example, the function y = x²
returns a single value of y for every value of x (at x=0 y=0, at x=1 y=1, at x=2 y=4). Now
consider the relation y² = x (y = ±√x): it gives two values of y for a value of x (for x=4
it gives y = -2 and y = +2). So it is not a valid function. For a valid function we can say that x
functionally determines y.
Note: by using sample data we can only decide which functional dependencies do not hold. If
a functional dependency holds in the sample data, it may or may not hold in the whole
relation.
Q25. From the following instance of a relation schema R(A,B,C), we can conclude that:
A B C
1 1 1
1 1 0
2 3 2
2 3 2
Ans. d
Explanation: from an instance of a schema we can only prove that a particular functional
dependency does not hold; we cannot establish that a functional dependency holds. So
options (a) and (b) are wrong. Option (c) is wrong because for the value '1' in B there are two
values in C.
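Checking a dependency against an instance is mechanical, as the following Python sketch shows for the four rows of Q25. The helper name holds_in_instance is hypothetical. A True result only means the instance does not contradict the dependency; it does not prove that the dependency holds for the schema.

```python
def holds_in_instance(rows, lhs, rhs):
    """Return False if two rows agree on lhs but differ on rhs, else True."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value and compare later ones.
        if seen.setdefault(key, value) != value:
            return False
    return True

# The instance of R(A, B, C) from Q25.
instance = [
    {"A": 1, "B": 1, "C": 1},
    {"A": 1, "B": 1, "C": 0},
    {"A": 2, "B": 3, "C": 2},
    {"A": 2, "B": 3, "C": 2},
]

a_to_b = holds_in_instance(instance, ["A"], ["B"])  # not contradicted by this instance
b_to_c = holds_in_instance(instance, ["B"], ["C"])  # contradicted: B=1 gives C=1 and C=0
a_to_c = holds_in_instance(instance, ["A"], ["C"])  # contradicted: A=1 gives C=1 and C=0

print(a_to_b, b_to_c, a_to_c)
```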
Closure of an attribute (*): the closure of an attribute contains all attributes that are directly or
indirectly derived from this attribute (using the above rules). Example: for a relation R(A,B,C,D), the
functional dependencies are: A→B, B→C, BC→D.
Closure of A: By the 1st rule an attribute derives itself, so its closure contains A (i.e. {A}* = {A}). Now
from A→B, B can be directly derived from A. If A→B and B→C then A→C (2nd rule), so C can
be derived from A. Similarly, if A→B and A→C then A→BC (4th rule), and if A→BC and
BC→D then A→D, so D can be derived from A.
A* = {A,B,C,D}
Similarly, B* = {B,C,D}, C* = {C}, D* = {D}
Note: How to find the closure of a group of attributes: suppose we want to find the closure of BC in
the above example; the closure of BC contains the attributes directly or indirectly derived from B, C and
BC. {BC}* = {B,C,D}
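The closure computation can be sketched in Python as a loop that keeps applying dependencies until nothing new is added. This is one common way to compute it, not necessarily the procedure the notes intend. The function name closure and the encoding of each dependency as a pair of attribute strings are choices made for this example.

```python
def closure(attrs, fds):
    """Return the set of attributes derivable from attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already derivable, the right side is too.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# A→B, B→C, BC→D from the example above.
fds = [("A", "B"), ("B", "C"), ("BC", "D")]

closure_a = closure("A", fds)
closure_b = closure("B", fds)
closure_c = closure("C", fds)
closure_bc = closure("BC", fds)

print(sorted(closure_a), sorted(closure_b), sorted(closure_c), sorted(closure_bc))
```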
Ans. B
Explanation: Find the closure of the attributes on the left side of each option.
(A) {CD}+ = {C,D,E,A,B} - AC is in the closure, so AC can be derived from CD
(B) {BD}+ = {B,D} - CD is not in the closure, so CD cannot be derived from BD
(C) {BC}+ = {B,C,D,E,A} - CD is in the closure, so CD can be derived from BC
(D) {AC}+ = {A,C,B,D,E} - CD is in the closure, so CD can be derived from AC
Step 2: to check whether a functional dependency is redundant, first hide that
functional dependency from the set, then find the closure of the attributes on the left side of that functional
dependency using only the remaining dependencies. If the closure contains the attributes on the right side of
the hidden dependency, the functional dependency is redundant; remove it
from the set.
1st Normal Form: a relation is said to be in 1st normal form if its data is represented in
tabular form with atomic values and no row is duplicated (a whole row must not be
duplicated: two rows must differ in the value of at least one field).
Example:
Consider the following data in the employee table.
The above table appears to be in tabular form, but it is not. A table is in tabular
form if, for every row, each column has a single value.
The above table will be in 1NF if it is represented as
A relation is said to be in 2NF if and only if it is in 1NF and every non-key attribute is fully
dependent on the primary key. In other words, a relation is in 2NF if and only
if it is in 1NF and there is no partial dependency.
3rd Normal Form (3NF): A relation R is in third normal form (3NF) if and only if it is in 2NF
and every non-key (non-prime) attribute is non-transitively dependent on the primary key.
A functional dependency X→Y does not violate the 3NF conditions if either X is a candidate key (or, more
generally, a super key) or Y is a prime attribute, where X and Y are attributes or groups of attributes. If any of the functional
dependencies violates the 3NF conditions then the relation is not in 3NF.
An attribute C is transitively dependent on attribute A if there exists an attribute B such that
A→B and B→C. Note that 3NF is concerned with transitive dependencies which do not
involve candidate keys.
If a 3NF relation has more than one candidate key then it can have transitive dependencies
of the form: primary_key → other_candidate_key → any_non-key_column.
A relation R having just one candidate key is in third normal form (3NF) if and only if the
non-key attributes of R (if any) are:
1) mutually independent (attributes that are not present in any functional dependency are
mutually independent), and
2) fully dependent on the primary key of R.
A non-key attribute is any column which is not part of the primary key. Two or more
attributes are mutually independent if none of the attributes is functionally dependent on any
of the others.
A relation R having just one candidate key is in third normal form (3NF) if and only if no
non-key(non-prime) column (or group of columns) determines another non-key(non-prime)
column (or group of columns).
Example: consider a relation ShipDetails (Ship, Capacity, Date, Cargo, Value) with the following
functional dependencies:
Ship, Date → Cargo, Capacity
Cargo → Value
Capacity → Value
To find whether the given relation is in 3NF, first find all candidate keys of the relation using
the closure of attributes, then check whether the relation is in 2NF, then check for 3NF.
Step 1: The candidate key of the above relation is {Ship, Date}.
Step 2: There is no partial dependency, so the relation is in 2NF.
Step 3:
Ship, Date → Cargo, Capacity does not violate the 3NF conditions (candidate key → non-prime
attribute)
Cargo → Value violates 3NF (non-prime → non-prime)
Capacity → Value violates 3NF (non-prime → non-prime)
A relation in 3NF does not have any anomalies, but it can still have redundancy.
Boyce-Codd Normal Form (BCNF): A relation is in BCNF if every one of its functional
dependencies has the form X→Y where X is a super key. This is stronger than 3NF.
Ans. D
Explanation: the functional dependencies applicable to the relation (Roll_Number, Name,
Date_of_Birth, Age) are:
Date_of_Birth → Age
Name → Roll_Number
Roll_Number → Name
To determine which normal form a relation is in, we apply the tests starting from the lowest level. First
apply the test for 2NF.
The candidate keys of the relation are: {Name, Date_of_Birth} and {Roll_Number, Date_of_Birth}
Now check for partial dependencies:
Date_of_Birth → Age - partial
Name → Roll_Number - partial
Roll_Number → Name - partial
A partial dependency exists in the relation, so the relation is not in 2NF, and hence it is in
neither 3NF nor BCNF.
Q29. The relation scheme Student Performance (name, courseNo, rollNo, grade) has
the following functional dependencies:
name, courseNo → grade
rollNo, courseNo → grade
name → rollNo
rollNo → name
The highest normal form of this relation scheme is
(a) 2 NF (b) 3 NF (c) BCNF (d) 4 NF   (CS2004)
Now check for BCNF: for every X→Y, X should be a super key.
name, courseNo → grade - not violating BCNF (super key on the left side)
rollNo, courseNo → grade - not violating BCNF (super key on the left side)
name → rollNo - violating BCNF
rollNo → name - violating BCNF
The relation is not in BCNF.
The highest normal form of the relation is 3NF.
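The reasoning for Q29 can be cross-checked with a brute-force Python sketch: it enumerates attribute subsets to find the candidate keys, then tests each dependency against the BCNF and 3NF conditions. The function names are hypothetical, and the exhaustive search is only practical for small schemas like this one.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes derivable from attrs under fds (pairs of attribute sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Find all minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            subset = set(combo)
            if any(key <= subset for key in keys):
                continue  # contains a smaller key: a super key, but not minimal
            if closure(subset, fds) == attributes:
                keys.append(subset)
    return keys

R = {"name", "courseNo", "rollNo", "grade"}
F = [
    ({"name", "courseNo"}, {"grade"}),
    ({"rollNo", "courseNo"}, {"grade"}),
    ({"name"}, {"rollNo"}),
    ({"rollNo"}, {"name"}),
]

keys = candidate_keys(R, F)
prime = set().union(*keys)  # attributes that belong to some candidate key

# BCNF: the left side of every dependency must be a super key.
bcnf_violations = [(lhs, rhs) for lhs, rhs in F if closure(lhs, F) != R]
# 3NF also accepts a dependency whose right side consists of prime attributes.
third_nf_violations = [(lhs, rhs) for lhs, rhs in bcnf_violations
                       if not (rhs - lhs) <= prime]

print(keys, bcnf_violations, third_nf_violations)
```

Two dependencies break BCNF but none breaks 3NF, which agrees with the conclusion that the highest normal form is 3NF.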
Suppose we decompose the above relation into two relations enrol1 and enrol2 as follows
enrol1 (sno, cno, date-enrolled)
enrol2 (date-enrolled, room-No., instructor)
There are problems with this decomposition but we wish to focus on one aspect at the
moment.
Let the decomposed relations enrol1 and enrol2 be:
Step 2: Now put an X into cell (m, n), where m is a decomposed relation and n is a field that is
present in relation m.
      A  B  C  D  E
R1    X  X     X
R2          X  X  X
Step 3: Now search for every column in the table that has an X in both rows (which is D here).
Step 4: Find the functional dependencies that have the column found in step 3 on the left side
(D→C and D→E in the above example).
Put an X into cell (m, n), where m is a row selected in step 3 and n is an attribute on the right side of these
functional dependencies (C and E for the rows selected in D).
      A  B  C  D  E
R1    X  X  X  X  X
R2          X  X  X
If any row contains an X in all columns then the decomposition is lossless-join; otherwise it is lossy.
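For a decomposition into exactly two relations, the table method above gives the same answer as a well-known shortcut: the decomposition is lossless if the closure of the common attributes covers all of R1 or all of R2. The Python sketch below applies that shortcut rather than building the table. It assumes R1(A,B,D) and R2(C,D,E), which is what the X marks in the tables imply, together with D→C and D→E; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attributes derivable from attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def lossless_binary(r1, r2, fds):
    """Binary lossless-join test: common attributes must determine all of r1 or all of r2."""
    common = set(r1) & set(r2)
    derived = closure(common, fds)
    return set(r1) <= derived or set(r2) <= derived

# Assumed decomposition and dependencies, read off the tables above.
with_both_fds = lossless_binary("ABD", "CDE", [("D", "C"), ("D", "E")])
# Without D→E the common column D no longer determines all of R2.
without_d_to_e = lossless_binary("ABD", "CDE", [("D", "C")])

print(with_both_fds, without_d_to_e)
```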
Suppose we had taken B→C first instead of C→D; then R and R1 after the first
decomposition would be
R(A,B,D) R1(B,C)
Now C→D does not hold in either R or R1, so this dependency is lost, and no further decomposition
is possible. This decomposition is lossless but not dependency preserving.
Note: if a relation R has no functional dependencies, then the highest normal form satisfied
by such a relation is BCNF.
Ans. a
Explanation: if a relation is in BCNF then there is no redundancy left in the relation, but if a
relation is in 3NF then there can be redundancy with no anomalies.
Ans. C
Explanation: if we apply the BCNF test to both F and G, only one of them passes the
test (the one in BCNF); the other fails (the one in 3NF).
Ans. c
TRANSACTION MANAGEMENT
A transaction is a sequence of actions which are considered to be one atomic unit of
work. Transactions in a DBMS use the following operations:
– Read, write, commit, abort
Each transaction has a unique starting point, some actions and one end point. A transaction is
a unit of work which completes as a unit or fails as a unit.
Properties of transactions (ACID)
Atomicity: All actions in the transaction happen, or none happen. In other words, an event
either happens and is committed or fails and is rolled back. E.g. in a money transfer, debit
one account, credit the other. Either both the debiting and crediting operations succeed, or
neither of them does. Transaction failure is called Abort. Commit and abort are irrevocable
actions. There is no undo for these actions. An Abort undoes operations that have already
been executed. For database operations, restore the data's previous value from before the
transaction (roll it back); a Rollback command will undo all actions taken since the last
commit for that user. But some real world operations are not undoable. Examples - transfer
money, print ticket, fire missile
ENGINEER’S CIRCLE, GWALIOR Page 27
Consistency: If each transaction is consistent and the DB starts consistent, it ends up
consistent. Consistency preservation is a property of a transaction, not of the database
mechanisms for controlling it (unlike the A, I, and D of ACID). If each transaction
maintains consistency, then a serial execution of transactions does also. A database state
consists of the complete set of data values in the database. A database state is consistent if
the database obeys all the integrity constraints. A transaction brings the database from one
consistent state to another consistent state.
Isolation: Execution of one transaction is isolated from that of other transactions.
Durability: If a transaction commits, its effects persist. When a transaction commits, its
results will survive failures (e.g. of the application, OS, DB system … even of the disk).
Durability makes it possible for a transaction to be a legal contract. Implementation is
usually via a log:
– The DB system writes all transaction updates to a log file. To commit, it adds a record
“commit(Ti)” to the log. When the commit record is on disk, the transaction is committed.
The system waits for the disk acknowledgement before acknowledging to the user.
A transaction can be in one of five states:
1. Active: transaction is started and is issuing reads and writes to the database.
2. Partially committed: operations are done and values are ready to be written to the
database.
3. Committed: writing to the database is permitted and successfully completed.
4. Abort: the transaction or the system detects a fatal error.
5. Terminated: transaction leaves the system.
A transaction T reaches its commit point when all operations accessing the database are
completed and the result has been recorded in the log. It then writes a [commit, T] entry and
terminates.
When a system failure occurs, search the log file for entries [start, T]; if there is no
logged entry [commit, T], then undo all operations of T that have logged entries [write, T, X,
old_value, new_value].
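The undo rule above can be sketched in a few lines. This is only an illustration: the tuple-based log format, the function name undo_uncommitted and the sample log are assumptions made for the example, not the format used by any particular DBMS.

```python
def undo_uncommitted(log, db):
    """Undo the writes of every transaction that started but never committed."""
    started = {rec[1] for rec in log if rec[0] == "start"}
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    losers = started - committed
    # Scan the log backwards so that the oldest old_value is restored last.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] in losers:
            _, _, item, old_value, _new_value = rec
            db[item] = old_value
    return db


# Hypothetical log: T1 committed, T2 was still active when the system failed.
log = [
    ("start", "T1"),
    ("write", "T1", "A", 100, 130),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "A", 130, 650),
    ("write", "T2", "B", 200, 40),
]
print(undo_uncommitted(log, {"A": 650, "B": 40}))  # {'A': 130, 'B': 200}
```

Only T2's writes are rolled back; the committed write of T1 stays in place.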
If two transactions are valid and they are executed serially (either <T1,T2> or <T2,T1> in the above
case), then the system always moves from one valid state to another valid state. This type of
execution is called serial execution and the schedule is a serial schedule.
A concurrent schedule is called serializable if it behaves like (is equivalent to) a serial schedule.
Consider the following transactions T1 and T2:
T1 T2
Read A Read A
A=A+30 A=A*5
Write A Write A
Read B Read B
B=B-30 B=B/5
Write B Write B
There are two possible serial schedules for the above two transactions:
S1:<T1 ,T2> execute T1 first, then T2
S2:<T2 ,T1> execute T2 first, then T1
Let's analyze these schedules; assume initially A is 100 and B is 200.
S1:<T1 ,T2> Initially After T1 T2
Value of A 100 130 650
Value of B 200 170 34
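As a quick check of the table, the two serial orders can be executed directly. This is a minimal sketch; the function names t1, t2 and run_serial are chosen for the example, and the transactions are taken from the definitions of T1 and T2 given earlier.

```python
def t1(db):
    db["A"] = db["A"] + 30
    db["B"] = db["B"] - 30


def t2(db):
    db["A"] = db["A"] * 5
    db["B"] = db["B"] / 5


def run_serial(order, initial):
    """Run the transactions one after another on a copy of the initial state."""
    db = dict(initial)
    for txn in order:
        txn(db)
    return db


initial = {"A": 100, "B": 200}
s1 = run_serial([t1, t2], initial)  # <T1, T2>
s2 = run_serial([t2, t1], initial)  # <T2, T1>
print(s1)  # {'A': 650, 'B': 34.0}
print(s2)  # {'A': 530, 'B': 10.0}
```

The two serial schedules end in different states, and both are valid; a concurrent schedule only has to match one of them to be serializable.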
This schedule is equivalent to the serial schedule S1 (the values of A and B read by T1 and T2 are the
same as in S1).
A schedule is called serializable if it is equivalent to a serial schedule. So S3 is serializable.
In other words, if changing the order of instructions in a serial schedule results in a concurrent
schedule that exhibits the same behavior as that serial schedule, then the concurrent schedule is a
serializable schedule.
Conflict Serializable: Conflicting actions are the actions whose relative order must not be
changed, for every data item, if serializability is to be maintained. As in the above example, the two transactions
are concurrent and execute by interleaving their actions, but for data item A the sequence
T1--->T2 is maintained; similarly, for data item B the sequence T1--->T2 is maintained. So all
actions on data items execute in the same sequence as in the serial schedule. This schedule
is conflict serializable.
This schedule is equivalent to S4 since the order of conflicting actions is the same as in S4.
     T1          T2
1    Read A
2                Read A
3    A=A+30
4    Write A
5    Read B
6                A=A*5
7    B=B-50
8                Write A
9    Write B
10
11               Read B
12               B=B / 5
                 Write B
To find whether a schedule is conflict serializable or not, draw its dependency graph. This graph
is directed and contains transactions as nodes and conflicting actions as edges (labels on edges can
be given; a label contains the name of the data item for which the conflict occurs). For S4 and S5 the
dependency graph would be:
If the dependency graph contains a cycle then the schedule is not conflict serializable. This graph
contains a cycle, so S4 and S5 are not conflict serializable.
If the dependency graph does not contain a cycle, then we can find the serial schedule to which the
schedule is equivalent by a topological sort of the dependency graph.
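The dependency-graph test can be sketched as follows. The schedule notation ("r1(X)" for a read of X by T1, "w2(Y)" for a write of Y by T2), the function names and the sample schedules are assumptions made for this illustration; the topological sort uses Python's standard graphlib module (Python 3.9 or later).

```python
import re
from graphlib import CycleError, TopologicalSorter


def parse(schedule):
    """Turn 'r1(X) w2(Y)' into [('r', '1', 'X'), ('w', '2', 'Y')]."""
    return re.findall(r"([rw])(\d+)\((\w+)\)", schedule)


def precedence_edges(schedule):
    """Return the set of edges (Ti, Tj) meaning Ti must come before Tj."""
    ops = parse(schedule)
    edges = set()
    for i, (kind_i, txn_i, item_i) in enumerate(ops):
        for kind_j, txn_j, item_j in ops[i + 1:]:
            # Two actions conflict if they belong to different transactions,
            # touch the same data item, and at least one of them is a write.
            if txn_i != txn_j and item_i == item_j and "w" in (kind_i, kind_j):
                edges.add((txn_i, txn_j))
    return edges


def serial_order(schedule):
    """Return an equivalent serial order, or None if the graph has a cycle."""
    sorter = TopologicalSorter({txn: set() for _, txn, _ in parse(schedule)})
    for before, after in precedence_edges(schedule):
        sorter.add(after, before)  # 'after' depends on 'before'
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


print(serial_order("r1(A) r2(A) w1(A) w2(A)"))  # None (cycle T1 -> T2 -> T1)
print(serial_order("r1(A) w1(A) r2(A) w2(A)"))  # ['1', '2']
```

A result of None means the schedule is not conflict serializable; otherwise the returned list is one serial order with the same order of conflicting actions.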
Q33. Consider three data items D1, D2, and D3, and the following execution schedule of transactions
T1,T2,and T3. In the diagram, R(D) and W(D) denotes the actions reading and writing the data item D
respectively.
T1 T2 T3
R(D3)
R(D2)
W(D2)
R(D2)
R(D3)
R(D1)
W(D1)
W(D2)
W(D3)
R(D1)
R(D2)
Ans. A
Explanation: If a schedule is conflict serializable then schedule is serializable. So first we
apply conflict serializability test on schedule.
Step 1. Find all conflicts
T1 T2 T3
R(D3)
R(D2)
W(D2)
R(D2)
R(D3)
R(D1)
W(D1)
W(D2)
W(D3)
R(D1)
R(D2)
W(D2)
W(D1)
T2 T3 T1
X(A)
A=A+30 growing phase
write(A)
X(B)
unlock(A)
read(B) Shrinking phase
B=B+30
write(B)
unlock(B)
T1 T2
To avoid cascading rollback, 2PL is modified; the modified protocol is called
Strict 2PL.
All data items are unlocked only at the end of the transaction, so that interleaving of transactions is
minimal (concurrent execution is restricted to a certain limit). This increases the waiting time of other
transactions.
In this case T1 will wait for B and T2 will wait for A. This situation is called deadlock:
two or more transactions are waiting for other transactions to unlock data items, but no
transaction can make any progress.
Q34. Which of the following scenarios may lead to an irrecoverable error in a database system?
(A) A transaction writes a data item after it is read by an uncommitted transaction
(B) A transaction reads a data item after it is read by an uncommitted transaction
(C) A transaction reads a data item after it is written by a committed transaction
(D) A transaction reads a data item after it is written by an uncommitted transaction
CS2003
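Ans. D
Explanation: if a transaction reads a value written by an uncommitted transaction and commits
before the writer aborts, the committed reader cannot be rolled back, so the error is irrecoverable.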
A relation can have attributes. For example, in above ER diagram relation issued has attribute
“issuedate” which shows the date on which book is issued.
Cardinality of relations: expresses the number of entities to which another entity can be
associated via a relationship. For binary relationship sets between entity sets A and B, the
mapping cardinality must be one of:
1. One to one: An entity in A is associated with at most one entity in B, and an entity in B is
associated with at most one entity in A. E.g. if every student is allowed to borrow one book
only.
2. One to many: An entity in A is associated with any number of entities in B. An entity in B is
associated with at most one entity in A. E.g. if every student is allowed to borrow multiple
books.
3. Many to many: Entities in A and B are associated with any number of entities from each other. E.g. if
a book can be issued to many students and a student can borrow many books.
Figure: Student (1) --- issue --- (n) Book
Partial participation: some entries of an entity may not take part in the relation, so the
entity participates only partially in the relation. Example: there may be students who have not
borrowed any books, or there may be books that are not issued to any student.
Total participation: every entry of the entity participates in the relation. Total participation of an
entity in the relation is denoted by a double line.
Types of entities:
1. Strong entity: entities that have a primary key are called strong entities. These are
represented by a single rectangle.
2. Weak entity: entities that do not have a primary key are called weak entities. These are
represented by a doubly-outlined rectangle. A relation joining a weak entity is also represented by
a doubly-outlined diamond.
Example: consider ER Diagram for Bank-Loan system.
Figure: ER diagram relating Loan and Payment, with cardinality labels n and 1; the attributes
shown are installmentNo, amount, LoanNo and Date.
In this system Payment is a weak entity because the installment number can be the same for two
customers (Customer A pays 3000 for his 1st installment and Customer B pays 3000 for his 1st
installment). So Payment has no primary key.
Facts about Weak Entities
A weak entity always has total participation in the relation.
If a weak entity has a relation with a strong entity, then the cardinality is one to many as shown in
the above figure (with one at the weak entity and many at the strong entity).
A weak entity can also be represented as a multivalued attribute. In other words, if a
multivalued dependency has a composite attribute (more than one), then the multivalued
dependency can be represented as a weak entity.
Ans. B
Explanation: According to rules table created will be
M(M1,M2,M3,P1)
P(P1,P2)
N(N1,N2,P1)
No relation is many-to-many, so no table created for relation.
36. Which of the following is a correct attribute set for one of the tables for the correct answer
to the above question?
(A) {M1,M2,M3,P1} (B) {M1,P1,N1,N2}
(C) {M1,P1,N1} (D) {M1,P1} CS2008
Ans. A
Explanation: see explanation of previous question
In spanned organization, accessing a record requires more time than in unspanned organization;
in the above example, accessing the 6th record requires accessing two blocks.
The main purpose of indexing is to speed up searching. Searching in a table using linear or binary
search (if records are sorted) is not practical, since both take a long time (for the above example,
searching in 100 blocks requires loading 100 blocks into memory and unloading them again, so the
access time will be large). A more sophisticated algorithm is needed for searching in tables.
An index entry is created for every record using some key (generally the primary key) that identifies
each record uniquely. Each index entry is associated with a record pointer which points to the record
stored on secondary storage. An index entry together with its record pointer is called an
index record. These index records are kept in a separate file, called the index file. If we want to
search for a record, we first search for its index in the index file; if it is found, we can locate the whole
record using the record pointer.
Suppose we have index records of 6 bytes; then for 500 records in the table we have 500 index
records. A block of 512 bytes can store 512/6 = 85 index records, and for 500 index records we
require 500/85 = 5.88, i.e. 6 blocks. Now to search for a record in the table we have to access only 6
blocks.
Drawbacks of indexing:
- requires extra storage
- if we want to search using another key, then we have to create another index file using
that key.
Still, searching the index file using linear search or binary search is time consuming. To
reduce the time complexity of searching we use a B-tree/B+-tree in index files. B-trees/B+-trees are
created using index records.
If we have a B-tree of degree n, then every internal node can have at most n-1 keys, n-1
record pointers associated with the keys, and n child pointers or block pointers (here the term “block
pointer” is used for a child pointer because, generally, every node of a B-tree is stored in a
separate block).
For the above example, the height of a B-tree of degree 29 for 1000 index records is about
log_29(1000) ≈ 2.05, which means that to search for a record we have to access only about 3 blocks
(instead of 1000/85 ≈ 12 blocks in linear search).
Q37. Consider a table T in a relational database with a key field K. A B-tree of order p is used
as an access structure on K, where p denotes the maximum number of tree pointers in a B-tree
index node. Assume that K is 10 bytes long; disk block size is 512 bytes; each data
pointer D is 8 bytes long and each block pointer PB is 5 bytes long. In order for each B-tree
node to fit in a single disk block, the maximum value of p is:
(A) 20 (B) 22 (C) 23 (D) 32 IT2004
Ans. C
Explanation: use the formula n*b + (n-1)*k + (n-1)*r <= block size, where b is the block pointer
size, k the key size and r the record (data) pointer size. Here n is p.
p*5 + (p-1)*10 + (p-1)*8 <= 512
23p <= 530
p = 23
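The same calculation can be written as a small helper. This is a sketch of the formula used in the explanation; the function and parameter names are chosen for the example.

```python
def btree_order(block_size, key_size, record_ptr_size, block_ptr_size):
    """Largest n with n*b + (n-1)*k + (n-1)*r <= block_size."""
    # Rearranged: n * (b + k + r) <= block_size + k + r
    return (block_size + key_size + record_ptr_size) // (
        block_ptr_size + key_size + record_ptr_size
    )


p = btree_order(block_size=512, key_size=10, record_ptr_size=8, block_ptr_size=5)
print(p)  # 23
```

Integer division gives the floor, which is what is needed because a node must fit entirely inside one block.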
B+-trees have a different leaf structure. In a B+-tree a leaf node contains keys, the record pointer
associated with each key, and a block pointer pointing to the next leaf node. Non-leaf nodes contain
only keys and child pointers; there is no need to store record pointers in non-leaf nodes, because
all keys are ultimately present in the leaf nodes.
For a leaf node the order is the maximum number of (key, record pointer) pairs the node can hold,
but the order of a non-leaf node is determined by the maximum number of child pointers it can have.
For a leaf node the equation is:
n*k + n*r + b = block size
For a non-leaf node the equation is:
(n-1)*k + n*b = block size
Q38. A B+-tree index is to be built on the Name attribute of the relation STUDENT.
Assume that all student names are of length 8 bytes, disk blocks are of size 512 bytes, and
index pointers are of size 4 bytes. Given this scenario, what would be the best choice of the
degree (i.e. the number of pointers per node) of the B+-tree?
(a) 16 (b) 42 (c) 43 (d) 44 CS2002
Q39. The order of an internal node in a B+ tree index is the maximum number of children it
can have. Suppose that a child pointer takes 6 bytes, the search field value takes 14 bytes,
and the block size is 512 bytes. What is the order of the internal node?
(a) 24 (b) 25 (c) 26 (d) 27 CS2004
Ans. c
Explanation: by formula for internal node of B+ tree of n degree
(n-1) k+ n b = block size
(n-1)*14 + n*6= 512
20 n=526
n=26
Q40. The order of a leaf node in a B+-tree is the maximum number of (value, data record
pointer) pairs it can hold. Given that the block size is 1K bytes, a data record pointer is 7 bytes
long, the value field is 9 bytes long and a block pointer is 6 bytes long, what is the order of the
leaf node?
(A) 63 (B) 64 (C) 67 (D) 68
CS2007
Ans. A
Explanation: order of leaf node B+ tree can be determined by formula
n*k+ n* r + b = block size
n*9 + n*7 + 6=1024
n*16=1018
n=63
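The two B+-tree formulas can be checked against Q39 and Q40 with a short sketch. The function names are chosen for the example; each returns the largest n that still fits in one block.

```python
def bplus_internal_order(block_size, key_size, block_ptr_size):
    """Largest n with (n-1)*k + n*b <= block_size."""
    return (block_size + key_size) // (key_size + block_ptr_size)


def bplus_leaf_order(block_size, key_size, record_ptr_size, block_ptr_size):
    """Largest n with n*k + n*r + b <= block_size."""
    return (block_size - block_ptr_size) // (key_size + record_ptr_size)


# Q39: child pointer 6 bytes, search field 14 bytes, block 512 bytes
print(bplus_internal_order(block_size=512, key_size=14, block_ptr_size=6))  # 26

# Q40: block 1024 bytes, value 9 bytes, record pointer 7 bytes, block pointer 6 bytes
print(bplus_leaf_order(block_size=1024, key_size=9, record_ptr_size=7, block_ptr_size=6))  # 63
```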
Hashing
In a B/B+-tree, searching is faster but we still have to search. Hashing is used to remove the
searching cost. We use a hash function, and indexes are mapped into a hash
table according to that hash function. If we want to locate an index, we use the hash function to
find it. In hashing, searching is removed completely; only the hash function is used to map and
locate indexes.
Hash function: the hash function is chosen in such a way that it can map all keys into the hash
table. For example, if we have the hash function ‘mod 10’ and we want to map the keys 2010, 4011,
3127, 4256, 3214, then the hash table will look like
0 2010
1 4011
2
3
4 3214
5
6 4256
7 3127
8
9
If a new key maps to a position which is already filled in the hash table, then a collision
occurs. Ex. if the new entry is 5414 for the above hash table, then the hash function returns location 4 in
the table, which is already filled. There are two ways to handle these collisions.
1. Open Addressing / Rehashing: slightly change the hash function for the new key which is
causing the collision. Ex. use (key+5) mod 10 when a collision occurs.
Linear probing: This is a type of rehashing function. If a new entry collides, search for the
next free slot in the table and fill this slot with the new entry. If in the above example 5414 and 5444 arrive,
then the hash table will be
0 2010
1 4011
2
3
4 3214
5 5414
6 4256
7 3127
8 5444
9
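Linear probing as described above can be sketched like this. The function name and the use of None for an empty slot are choices made for the example; the keys are the ones from the key list given above (2010, 4011, 3127, 4256, 3214) followed by 5414 and 5444.

```python
def linear_probe_insert(table, key, hash_fn):
    """Place key in the first free slot at or after hash_fn(key), wrapping around."""
    size = len(table)
    start = hash_fn(key)
    for step in range(size):
        slot = (start + step) % size
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


table = [None] * 10
for key in [2010, 4011, 3127, 4256, 3214, 5414, 5444]:
    linear_probe_insert(table, key, lambda k: k % 10)

for index, value in enumerate(table):
    print(index, "" if value is None else value)
```

5414 hashes to slot 4, which is taken, so it lands in slot 5; 5444 finds slots 4 to 7 occupied and lands in slot 8.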
In linear probing there is the problem of primary clustering, meaning data is concentrated in one
place. The hash function should distribute data uniformly to avoid primary clustering. Ex. a
quadratic function can be used to distribute data:
Hashing function = (key + n^2) mod 10, where n denotes the number of collisions
Key Collision number Hash table index
2010 0 (2010+0^2)%10 = 0
4010 1 (4010+1^2)%10 = 1
7120 2 (7120+2^2)%10 = 4
9650 3 (9650+3^2)%10 = 9
3250 4 (3250+4^2)%10 = 6
2. Chaining: here we make an array of pointers instead of data. Entries with the same hash function
value form a linked list. A new entry is added at the end of the linked list. Example: chaining for the data
2010, 4011, 3127, 4256, 3214, 5414, 5444, 6457, 9666, 8888 using the key mod 10 function will
be as:
0 2010 Null
1 4011 Null
2 Null
3 Null
4 3214 5414 5444 Null
5 Null
6 4256 9666 Null
7 3127 6457 Null
8 8888 Null
9 Null
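Chaining can be sketched with a list of lists, where each inner list stands in for one linked list. This is an illustration only; the function name build_chains is chosen for the example and Python lists replace explicit pointers.

```python
def build_chains(keys, size=10):
    """Return one chain per table slot; each key is appended to the end of its chain."""
    buckets = [[] for _ in range(size)]
    for key in keys:
        buckets[key % size].append(key)
    return buckets


chains = build_chains([2010, 4011, 3127, 4256, 3214, 5414, 5444, 6457, 9666, 8888])
for index, chain in enumerate(chains):
    print(index, " ".join(str(key) for key in chain), "Null")
```

No probing is needed: colliding keys such as 3214, 5414 and 5444 simply share the chain of slot 4.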
Q41. Consider a hash table of size seven, with starting index zero, and a hash function (3x +
4) mod 7. Assuming the hash table is initially empty, which of the following is the contents of
the table when the sequence 1, 3, 8, 10 is inserted into the table using closed hashing? Note
that − denotes an empty location in the table.
(A) 8, −, −, −, −, −, 10 (B) 1, 8, 10, −, −, −, 3
(C) 1, −, −, −, −, −, 3 (D) 1, 10, 8, −, −, −, 3 CS2007
Ans. B
Explanation: hash function is (3x+4) mod 7
Key hash table index
1 0
3 6
8 0
10 6
Final hash table will be(if linear probing is used)
0 1
1 8
2 10
3
4
5
6 3
Q42. Consider the following SQL query
Select distinct a1, a2, ..., an from r1, r2, ..., rm where p
For an arbitrary predicate p, this query is equivalent to which of the following
relational algebra expressions?
(A) Π_{a1, a2, ..., an} σ_p(r1 × r2 × … × rm)
(B) Π_{a1, a2, ..., an} σ_p(r1 ⋈ r2 ⋈ … ⋈ rm)
(C) Π_{a1, a2, ..., an} σ_p(r1 ∪ r2 ∪ … ∪ rm)
(D) Π_{a1, a2, ..., an} σ_p(r1 ∩ r2 ∩ … ∩ rm) CS2003
Ans. A
Q43. Consider set of relation shown below and SQL query that follows.
Students: (Roll_Number , Name, Date_of_Birth)
Course: (Course_Number, Course_Name, Instructor)
Grades: (Roll_Number, Course_Number, Grade)
Ans. C
Q44. Given the following input (4322, 1334, 1471, 9679, 1989, 6171, 6173, 4199) and the
hash function x mod 10, which of the following statements are true?
i) 9679, 1989, 4199 hash to the same value
ii) 1471, 6171 hash to the same value
iii) All elements hash to the same value
iv) Each element hashes to a different value
(a) i only (b) ii only (c) i and ii only (d) iii or iv CS2004
Ans. C
Explanation: when we apply hash function x mod 10 to 9679,1989,4199 , result is 9 , so
statement (i) is correct. Similarly, for 1471 and 6171 hash function returns 1, so statement (ii)
is also correct.
Ans. a
Explanation: The natural join is performed by equating the rollno attribute of both tables.
In the Students relation there are 120 tuples and each tuple has a unique rollno, since rollno is
the primary key of the Students relation. In the enroll relation there are only 8 tuples and there can be two
extreme conditions:
1. Minimum condition: each tuple in enroll has a different student; in other words, there are
only 8 students enrolled for courses.
2. Maximum condition: each tuple in enroll has the same student but a different course;
in other words, there is one student enrolled for 8 courses.
In both cases, the natural join of the two relations results in only 8 tuples.
Q46. Which one of the following is a key factor for preferring B-trees to binary search trees
for indexing database relations?
(a) Database relations have a large number of records
Ans. a
Q47. Let r be a relation instance with schema R = (A, B, C, D). We define r1 = Π_{A,B,C}(r)
and r2 = Π_{A,D}(r). Let s = r1 * r2 where * denotes natural join. Given that the decomposition of r
into r1 and r2 is lossy, which one of the following is TRUE?
(a) s ⊂ r (b) r ∪ s = r (c) r ⊂ s (d) r * s = s CS2005
Ans. c
Explanation: Decomposition is lossy means when decomposed tables are joined then some
spurious(extra, meaning-less) tuples will be generated and because of these spurious tuples
we can’t obtain actual data after joining. So option(c) is correct.
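The appearance of spurious tuples can be demonstrated on a tiny instance. The relation r below is hypothetical data invented for the illustration, and the helper project is a name chosen for the example; the decomposition into (A,B,C) and (A,D) follows the question.

```python
def project(rows, attrs):
    """Projection: keep only the listed attributes and drop duplicate tuples."""
    return {tuple(row[a] for a in attrs) for row in rows}


# Hypothetical instance of r(A, B, C, D); A does not determine the other attributes.
r = [
    {"A": 1, "B": 1, "C": 1, "D": 1},
    {"A": 1, "B": 2, "C": 2, "D": 2},
]

r1 = project(r, ["A", "B", "C"])
r2 = project(r, ["A", "D"])

# Natural join of r1 and r2 on the common attribute A
s = {(a, b, c, d) for (a, b, c) in r1 for (a2, d) in r2 if a == a2}

original = project(r, ["A", "B", "C", "D"])
print(sorted(s - original))  # [(1, 1, 1, 2), (1, 2, 2, 1)]
print(original < s)          # True
```

Every original tuple is still in s, but s also contains tuples that were never in r, so r is a proper subset of s.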
Q48. The following table has two attributes A and C where A is the primary key and C is
the foreign key referencing A with on-delete cascade. The set of all tuples that must
be additionally deleted to preserve referential integrity when the tuple (2,4) is deleted is:
A C
2 4
3 4
4 3
5 2
7 2
9 5
6 4
Ans. c
Explanation: C is a foreign key referencing A, so C cannot have values other than those present in
A. If we delete tuple (2,4), then the rows containing the value 2 in column C become
invalid; these are tuples (5,2) and (7,2), and they must be deleted from the table.
If (5,2) is deleted then one more row becomes invalid (tuple (9,5)).
So by deleting (2,4) from the table, tuples (5,2), (7,2) and (9,5) are also deleted.
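The chain of deletions can be traced with a short sketch. It only simulates the cascade on (A, C) pairs; the function name cascade_delete is chosen for the example and no database engine is involved.

```python
def cascade_delete(rows, start):
    """Return the extra tuples removed when `start` is deleted with on-delete cascade."""
    remaining = set(rows) - {start}
    deleted = []
    pending = [start[0]]  # values of A that no longer exist in the table
    while pending:
        gone = pending.pop()
        for row in sorted(remaining):
            if row[1] == gone:  # C references an A value that was deleted
                remaining.discard(row)
                deleted.append(row)
                pending.append(row[0])
    return deleted


rows = [(2, 4), (3, 4), (4, 3), (5, 2), (7, 2), (9, 5), (6, 4)]
print(cascade_delete(rows, (2, 4)))  # [(5, 2), (7, 2), (9, 5)]
```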
Q49. The relation book (title, price) contains the titles and prices of different books.
Assuming that no two books have the same price, what does the following SQL query list?
select title
from book as B
where (select count(*)
from book as T
where T.price > B.price) < 5
(a) Titles of the four most expensive books
(b) Title of the fifth most inexpensive book
Ans. d
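The behaviour of the Q49 query can be tried out with Python's built-in sqlite3 module. The table contents below are hypothetical (eight books with distinct prices, titled after their price), and the column types are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table book (title text, price integer)")
# Hypothetical data: eight books, no two with the same price.
conn.executemany(
    "insert into book values (?, ?)",
    [(f"book{price}", price) for price in (10, 20, 30, 40, 50, 60, 70, 80)],
)

rows = conn.execute(
    """
    select title
    from book as B
    where (select count(*)
           from book as T
           where T.price > B.price) < 5
    """
).fetchall()

titles = sorted(title for (title,) in rows)
print(titles)  # ['book40', 'book50', 'book60', 'book70', 'book80']
```

A book is kept when fewer than five books are more expensive than it, which on this data selects the five most expensive titles.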
Ans. d
Explanation: if we take closure AE, BE, DE , we will get all attributes appearing in
functional dependencies as:
AE+={ABCDE} BE+={ABCDE} DE+={ABCDE}
Only H is not in any of the FDs, so add H to AE, DE, and BE to generate candidate key.
Q51. Consider the following log sequence of two transactions on a bank account, with
initial balance 12000, that transfer 2000 to a mortgage payment and then apply a 5% interest.
1. T1 start
2. T1 B old=12000 new=10000
3. T1 M old=0 new=2000
4. T1 commit
5. T2 start
6. T2 B old=10000 new=10500
7. T2 commit
Suppose the database system crashes just before log record 7 is written. When the system is
restarted, which one statement is true of the recovery procedure?
(A) We must redo log record 6 to set B to 10500
(B) We must undo log record 6 to set B to 10000 and then redo log records 2 and 3
(C) We need not redo log records 2 and 3 because transaction T1 has committed
(D) We can apply redo and undo operations in arbitrary order because they are idempotent.
CS2006
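Ans. B
Explanation: T2 has no commit record in the log, so its update (record 6) must be undone. T1 has
committed, so its updates (records 2 and 3) are redone, because a commit record in the log does not
guarantee that the changes reached the database on disk.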
Q52. Consider the relation account (customer, balance) where customer is a primary key
and there are no null values. We would like to rank customers according to decreasing
balance. The customer with the largest balance gets rank 1. Ties are not broken but ranks are
skipped: if exactly two customers have the largest balance they each get rank 1 and rank 2 is
not assigned.
Query1: select A.customer, count(B.customer)
from account A, account B
where A.balance <= B.balance
group by A.customer
Ans. C
Explanation: solve these queries by taking example. Suppose content of table is
Account Query1: Query2:
Q53. Consider the relation enrolled (student, course) in which (student, course) is the
primary key, and the relation paid (student, amount) where student is the primary key. Assume
no null values and no foreign keys or integrity constraints. Given the following four queries:
Query1: select student
from enrolled
where student in
(select student from paid)
Query2: select student
from paid
where student in
(select student from enrolled)
Query3: select E.student
from enrolled E, paid P
where E.student = P.student
Query4: select student
from paid
where exists
(select * from enrolled where enrolled.student = paid.student)
Ans. C,D
Explanation: in option C closure of AF contains C , so it is wrong.
In option D , closure of AB contains F, so it is also wrong.
Ans. b
Explanation: the expression Π_studId(σ_sex="female"(studInfo)) returns the studId of all female students.
It is combined by Cartesian product with the courseId values of enroll, so the expression
(Π_studId(σ_sex="female"(studInfo)) × Π_courseId(enroll)) returns the Cartesian product of all female students'
studId with all available courses in enroll. Next, enroll is subtracted from this Cartesian
product, which means the actual entries of female students enrolled in the enroll relation are removed from
the result of the Cartesian product; if all female students are enrolled in a course, then it is
removed completely. The final result contains female students with the courses in which they
have not enrolled, though other female students may be enrolled in these courses. Π_courseId() then
selects the courses in which a proper subset of the female students is enrolled.
Example:
studInfo
Studid Name Sex
S1 A Female
S2 B Female
S3 C Male
S4 D Male
enroll
Studid courseid
S1 1
S1 2
S2 1
S3 1
S4 2
Result after subtracting enroll from the Cartesian product
Studid courseid
S2 2
Result of Π_courseId
courseid
2
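The steps of the explanation can be replayed with Python sets, using the studInfo and enroll rows of the example. The variable names are chosen for the illustration; each set operation stands for the relational algebra operator named in its comment.

```python
stud_info = {
    ("S1", "A", "Female"),
    ("S2", "B", "Female"),
    ("S3", "C", "Male"),
    ("S4", "D", "Male"),
}
enroll = {("S1", 1), ("S1", 2), ("S2", 1), ("S3", 1), ("S4", 2)}

# Projection on studId of the selection sex = "female"
female_ids = {sid for (sid, _name, sex) in stud_info if sex == "Female"}
# Projection of enroll on courseId
course_ids = {cid for (_sid, cid) in enroll}
# Cartesian product of female students and courses
all_pairs = {(sid, cid) for sid in female_ids for cid in course_ids}
# Set difference: (female student, course) pairs that are not actual enrollments
missing = all_pairs - enroll
# Final projection on courseId
result = {cid for (_sid, cid) in missing}

print(sorted(missing))  # [('S2', 2)]
print(sorted(result))   # [2]
```

Course 1 disappears because both female students are enrolled in it; course 2 remains because only S1 is.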
Q56. Consider the relation employee(name, sex, supervisorName) with name as the key.
supervisorName gives the name of the supervisor of the employee under consideration. What
does the following Tuple Relational Calculus query produce?
{e.name | employee(e) ∧
(∀x)[¬employee(x) ∨ x.supervisorName ≠ e.name ∨ x.sex = "male"]}
(A) Names of employees with a male supervisor.
(B) Names of employees with no immediate male subordinates.
(C) Names of employees with no immediate female subordinates.
(D) Names of employees with a female supervisor. CS2007
Ans. C
Q57. Consider the table employee(empId, name, department, salary) and the two queries
Q1, Q2 below. Assuming that department 5 has more than one employee, and we want to find
the employees who get higher salary than anyone in the department 5, which one of the
statements is TRUE for any arbitrary employee table?
Q1 : Select e.empId
From employee e
Where not exists
(Select * From employee s where s.department = “5” and s.salary >= e.salary)
Q2 : Select e.empId
From employee e
Where e.salary > Any
(Select distinct salary From employee s Where s.department = “5”)
(A) Q1 is the correct query
(B) Q2 is the correct query
(C) Both Q1 and Q2 produce the same answer.
(D) Neither Q1 nor Q2 is the correct query CS2007
Ans. b
Ans. C
Explanation: to find conflict serializability first find conflict statements in schedules.
S1          S2
r1(X)       r1(X)
r1(Y)       r2(X)
r2(X)       r2(Y)
r2(Y)       w2(Y)
w2(Y)       r1(Y)
w1(X)       w1(X)
Dependency graph for S1: T1 → T2 (on Y) and T2 → T1 (on X). A cycle exists, so S1 is not conflict
serializable.
Dependency graph for S2: T2 → T1 (on X and on Y). No cycle exists, so S2 is conflict
serializable.
Q60. Which of the following tuple relational calculus expression(s) is/are equivalent to ∀t ∈ r
(P(t))?
I. ¬∃t ∈ r (P(t))
II. ∃t ∉ r (P(t))
III. ¬∃t ∈ r (¬P(t))
IV. ∃t ∉ r (¬P(t))
(A) I only (B) II only (C) III only (D) III and IV only CS2008
Ans. C
Explanation: in this question some rules of predicate calculus are used.
∀ can be replaced by ¬∃¬, and ∃ can be replaced by ¬∀¬.
∀t ∈ r (P(t)) = ¬(∃t ∈ r (¬P(t)))
which is expression III.
Alternatively, you can take an example for the symbols and then compare each predicate.
Suppose r is the student relation and P is the predicate Sick.
P(t) = Sick(t) means student t is sick
∀t ∈ r (P(t)) means all students are sick
I. ¬∃t ∈ r (P(t)) means there exists no student who is sick
Ans. C
Q62. The keys 12, 18, 13, 2, 3, 23, 5 and 15 are inserted into an initially empty hash table of
length 10 using open addressing with hash function h(k) = k mod 10 and linear probing. What
is the resultant hash table?
Index  A    B    C    D
0
1
2      2    12   12   12,2
3      23   13   13   13,3,23
4                2
5      15   5    3    5,15
6                23
7                5
8      18   18   18   18
9                15
Ans. C
Q63. Let R and S be relational schemes such that R={a,b,c} and S={c}. Now consider the
following queries on the database:
I. Π_{R-S}(r) - Π_{R-S}(Π_{R-S}(r) × s - Π_{R-S,S}(r))
II. {t | t ∈ Π_{R-S}(r) ∧ ∀u ∈ s (∃v ∈ r (u = v[S] ∧ t = v[R-S]))}
III. {t | t ∈ Π_{R-S}(r) ∧ ∀v ∈ r (∃u ∈ s (u = v[S] ∧ t = v[R-S]))}
IV. Select R.a, R.b
From R,S
Where R.c=S.c
Which of the above queries are equivalent?
(A) I and II (B) I and III (C) II and IV (D) III and IV CS2009
Ans. C
Hint: here R-S means a,b and S means c
Query I. Π_{R-S}(r) - Π_{R-S}(Π_{R-S}(r) × s - Π_{R-S,S}(r))
This query is equivalent to
Π_{a,b}(r) - Π_{a,b}(Π_{a,b}(r) × s - Π_{a,b,c}(r))
It returns the combinations of a,b in r that are paired with every c in s.
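Query I can be followed step by step with Python sets. The relations r and s below are hypothetical data invented for the illustration, and the function name divide is chosen for the example; r holds (a, b, c) tuples and s holds c values.

```python
def divide(r, s):
    """Return the (a, b) pairs of r that occur together with every c in s."""
    ab = {(a, b) for (a, b, _c) in r}                      # projection of r on a, b
    candidates = {(a, b, c) for (a, b) in ab for c in s}   # that projection times s
    missing = candidates - r                               # combinations absent from r
    disqualified = {(a, b) for (a, b, _c) in missing}      # projection on a, b again
    return ab - disqualified


# Hypothetical instance: (1, 1) appears with both c values, (2, 2) only with "x".
r = {(1, 1, "x"), (1, 1, "y"), (2, 2, "x")}
s = {"x", "y"}
print(divide(r, s))  # {(1, 1)}
```

The pair (2, 2) is removed because the combination (2, 2, "y") is missing from r.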
Assume that relations corresponding to the above schema are not empty. Whichone of the
following is the correct interpretation of the above query?
(A) Find the names of all suppliers who have supplied a non-blue part.
(B) Find the names of all suppliers who have not supplied a non-blue part.
(C) Find the names of all suppliers who have supplied only blue parts.
(D) Find the names of all suppliers who have not supplied only blue parts. CS2009
Ans.B
Explanation: (SELECT P.pid FROM Parts P WHERE P.color <> 'blue') returns the pid of parts
that are not blue.
(SELECT C.sid FROM Catalog C WHERE C.pid NOT IN (SELECT P.pid FROM Parts P
WHERE P.color <> 'blue')) returns the sid of suppliers who have supplied at least one part that is
not in that set, i.e. at least one blue part.
Finally, the outer query selects the suppliers who are not in this set, i.e. suppliers who have not
supplied any blue part.
Q65. Assume that, in the suppliers relation above, each supplier and each street within a city
has a unique name, and (sname, city) forms a candidate key. No other functional dependencies
are implied other than those implied by primary and candidate keys. Which one of the
following is TRUE about the above schema?
(A) The schema is in BCNF
(B) The schema is in 3NF but not in BCNF
(C) The schema is in 2NF but not in 3NF
(D) The schema is not in 2NF CS2009
Ans.A
Explanation: in this relation FDs only depend on primary key and candidate key, so relation
is in BCNF.
Ans. B
Explanation: it is correlated query ,so for every row select by outer query inner query will
run
Pid Class Tid Inner query returns Exists returns Outer query returns
0 AC 8200 Null False
1 AC 8201 {1,Rahul,66} True 1
2 AC 8201 {2,Sourav,67} True 2
5 AC 8203 Null False
1 AC 8204 {1,Rahul,66} True 1
3 AC 8202 Null False
Q67. Which of the following concurrency control protocols ensure both conflict serializability
and freedom from deadlock?
I. 2-phase locking
II. Time-stamp ordering
(A) I only (B) II only (C) Both I and II (D) Neither I nor II CS2010
Ans. B
Q68. Consider the following schedule for transactions T1, T2 and T3:
T1 T2 T3
Read(X)
Read(Y)
Read(Y)
Write(Y)
Write(X)
Write(X)
Read(X)
Write(X)
Which one of the schedules below is the correct serialization of the above?
(A) T1 → T3 → T2 (B) T2 → T1 → T3
(C) T2 → T3 → T1 (D) T3 → T1 → T2 CS2010
Ans. A
Explanation: first find all conflicts
T1 T2 T3
Read(X)
Read(Y)
Read(Y)
Write(Y)
Write(X)
Write(X)
Read(X)
Write(X)
Now make the dependency graph.
Figure: dependency graph over the nodes T1, T3 and T2
By topological sorting of this graph we can find the order of serialization, which is
T1 → T3 → T2
Q69. The following functional dependencies hold for relations R(A, B, C) and S(B, D, E):
B → A,
A → C
The relation R contains 200 tuples and the relation S contains 100 tuples. What is the maximum
number of tuples possible in the natural join of R and S?
(A) 100 (B) 200 (C) 300 (D) 2000
CS2010
Ans. A
Explanation: from the set of functional dependencies, we can find that B is the primary key of R.
So in R, the 200 tuples contain unique values of B. In S there can be two extreme conditions:
1. All 100 values of B in S are the same (and this B is present in R)
2. All 100 values of B in S are unique (and every B in S is present in R)
In both cases the natural join would pick at most 100 tuples.
Q70. Which one of the following choices gives a possible order in which the key values could
have been inserted in the table?
(A) 46, 42, 34, 52, 23, 33 (B) 34, 42, 23, 52, 33, 46
(C) 46, 34, 42, 23, 52, 33 (D) 42, 46, 33, 23, 34, 52 CS2010
Ans. C
Explanation: for all option create hash table
Q71. How many different insertion sequences of the key values using the same hash function
and linear probing will result in the hash table shown above?
(A) 10 (B) 20 (C) 30 (D) 40
CS2010
Ans. C
Q72. Consider the following entity relationship diagram (ERD), where two entities E1 and
E2 have a relation R of cardinality 1:m.
Figure: E1 (1) --- R --- (m) E2
The attributes of E1 are A11, A12 and A13 where A11 is the key attribute. The attributes of E2 are
A21, A22 and A23 where A21 is the key attribute and A23 is a multi-valued attribute. Relation R
does not have any attribute. A relational database containing the minimum number of tables with
each table satisfying the requirements of the third normal form (3NF) is designed from the
above ERD. The number of tables in the database is:
(A) 2 (B) 3 (C) 5 (D) 4 IT2004
Ans. B
Explanation: tables created using ERD
E1(A11, A12 , A13) , E2(A21, A22) and A23(A21, A23)
Q73. A relational database contains two tables student and department in which the student
table has columns roll_no, name and dept_id and the department table has columns dept_id
and dept_name. The following insert statements were executed successfully to populate the
empty tables:
Insert into department values (1, ‘Mathematics’)
Insert into department values (2, ‘Physics’)
Insert into student values (1, ‘Navin’, 1)
Insert into student values (2, ‘Mukesh’, 2)
Insert into student values (3, ‘Gita’, 1)
How many rows and columns will be retrieved by the following SQL statement?
Select * from student, department
(A) 0 row and 4 columns
(B) 3 rows and 4 columns
(C) 3 rows and 5 columns
(D) 6 rows and 5 columns
IT2004
Ans. D
Explanation: the query is a Cartesian product of student and department, which returns 3*2 = 6 rows
and 5 columns (3 from student and 2 from department).
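The row and column counts can be reproduced with Python's built-in sqlite3 module. This is a small sketch: the column types are assumptions, since the question gives only the column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER, dept_name TEXT);
    CREATE TABLE student (roll_no INTEGER, name TEXT, dept_id INTEGER);
    INSERT INTO department VALUES (1, 'Mathematics');
    INSERT INTO department VALUES (2, 'Physics');
    INSERT INTO student VALUES (1, 'Navin', 1);
    INSERT INTO student VALUES (2, 'Mukesh', 2);
    INSERT INTO student VALUES (3, 'Gita', 1);
""")

# No join condition: every student row is paired with every department row.
cursor = conn.execute("SELECT * FROM student, department")
rows = cursor.fetchall()
num_columns = len(cursor.description)  # one entry per result column
print(len(rows), num_columns)  # 6 5
```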
Q74. A relation Empdtl is defined with attributes empcode (unique), name, street, city,
state and pincode. For any pincode, there is only one city and state. Also, for any given street,
city and state, there is just one pincode. In normalization terms, Empdtl is a relation in
(A) 1 NF only
(B) 2 NF and hence also in 1 NF
(C) 3 NF and hence also in 2 NF and 1 NF
(D) BCNF and hence also in 3 NF, 2NF and 1NF IT2004
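Ans. B
Explanation: the functional dependencies given are
empcode → name, street, city, state, pincode (empcode is unique)
pincode → city
pincode → state
street, city, state → pincode
The only candidate key of Empdtl is {empcode}, so every other attribute is non-prime.
Check for 2NF: a partial dependency needs a proper subset of a candidate key on its left side.
The key has a single attribute, so no partial dependency can exist and Empdtl is in 2NF.
Check for 3NF: pincode → city and pincode → state have a non-key left side and a non-prime
right side, so city and state depend on empcode transitively. Empdtl is therefore not in 3NF.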
Q75. A table T1 in a relational database has the following rows and columns:
Roll No Marks
1 10
2 20
3 30
4 Null
The following sequence of SQL statements was successfully executed on table T1.
Ans. C
Explanation: the query “Update T1 set marks = marks + 5” updates the table as follows (Null + 5 is still Null):
Roll No Marks
1 15
2 25
3 35
4 Null
The query “Select avg(marks) from T1” ignores the Null value and returns
(15+25+35)/3 = 25
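The Null handling can be observed directly with Python's sqlite3 module. This is a minimal sketch that runs only the two statements quoted in the explanation. The column name `roll_no` and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (roll_no INTEGER, marks INTEGER);
    INSERT INTO T1 VALUES (1, 10), (2, 20), (3, 30), (4, NULL);
    UPDATE T1 SET marks = marks + 5;  -- NULL + 5 stays NULL
""")

marks = [row[0] for row in
         conn.execute("SELECT marks FROM T1 ORDER BY roll_no")]
# avg() skips NULL rows, so the divisor is 3, not 4.
average = conn.execute("SELECT avg(marks) FROM T1").fetchone()[0]
print(marks, average)  # [15, 25, 35, None] 25.0
```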
Q76. Consider the following schedule S of transactions T1 and T2:
T1              T2
Read(A)
A=A-10
                Read(A)
                Temp=0.2*A
                Write(A)
                Read(B)
Write(A)
Read(B)
B=B+10
Write(B)
                B=B+temp
                Write(B)
Which of the following is TRUE about the schedule S?
(A) S is serializable only as T1, T2
(B) S is serializable only as T2, T1
(C) S is serializable both as T1, T2 and T2, T1
(D) S is serializable either as T1 or as T2 IT2004
Ans. D
Explanation: find all conflicts in the schedule
T1              T2
Read(A)
A=A-10
                Read(A)
                Temp=0.2*A
                Write(A)
                Read(B)
Write(A)
Read(B)
B=B+10
This schedule is not conflict serializable, but it may still be view serializable.
If a schedule is not conflict serializable, it may or may not be
serializable, so we have to check a less strict definition of serializability, i.e.
view serializability. View serializability does not consider conflicts for blind writes. For
example:
T1 T2
R(A)
W(A)
W(A)
W(A)
This schedule is not conflict serializable. But the final result of this schedule is the value stored for
data item A, and the last write operation determines that value. The result of this schedule is the same as
running only T2. If we swap the write statements then the schedule is
T1 T2
R(A)
W(A)
W(A)
W(A)
The result of this schedule is the same as running only T1. Such writes are called blind writes.
These write operations do not change the serializability of the schedule.
Now, in the question, there is only a write-write conflict forming a cycle. We can swap these write
instructions in the following two ways to remove the cycle:
T1 T2 T1 T2
Read(A) Read(A)
A=A-10 A=A-10
Read(A) Read(A)
Temp=0.2*A Temp=0.2*A
Write(A) Write(A)
Write(A) Read(B)
Read(B) Write(A)
Read(B) Read(B)
B=B+10 B=B+10
Write(B) B=B+temp
B=B+temp Write(B)
Write(B) Write(B)
Output of this schedule is Output of this schedule is
same as output of running same as output of running
only T2 only T1
So option D is correct.
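The conflict-serializability part of the explanation can be checked with a precedence graph. This is a minimal sketch: the assignment of each step to T1 or T2 is an assumption read off the operations themselves (for example, Temp=0.2*A can only belong to the transaction that later executes B=B+temp), and the helper names are illustrative.

```python
# Each step is (transaction, action, data item); arithmetic steps are omitted
# because only reads and writes can conflict.
SCHEDULE = [
    ("T1", "R", "A"),
    ("T2", "R", "A"),
    ("T2", "W", "A"),
    ("T2", "R", "B"),
    ("T1", "W", "A"),
    ("T1", "R", "B"),
    ("T1", "W", "B"),
    ("T2", "W", "B"),
]

def precedence_edges(schedule):
    """Edge (Ti, Tj): an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (t1, a1, x1) in enumerate(schedule):
        for t2, a2, x2 in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "W" in (a1, a2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the directed precedence graph."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # back edge found
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))

edges = precedence_edges(SCHEDULE)
conflict_serializable = not has_cycle(edges)
print(sorted(edges), conflict_serializable)
```

Under this reading the graph contains both T1 → T2 and T2 → T1, so it has a cycle and the schedule is not conflict serializable.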
Q77. Consider two tables in a relational database with columns and rows as follows:
Table: Student Table: Department
Roll_no Name Dept_id Dept_id Dept_Name
Roll_no is the primary key of the Student table, Dept_id is the primary key of the Department
table and Student.Dept_id is a foreign key from Department.Dept_id. What will happen if we
try to execute the following two SQL statements?
(i) update Student set Dept_id= Null where Roll_no =1
(ii) update Department set Dept_id = Null where Dept_id =1
(A) Both (i) and (ii) will fail (B) (i) will fail but (ii) will succeed
(C) (i) will succeed but (ii) will fail (D) Both (i) and (ii) will succeed IT2004
Ans. C
Explanation:
Query (i) succeeds because a foreign key (Student.Dept_id) may hold either Null or a value
present in the referenced column (Department.Dept_id).
Query (ii) fails because it tries to set Dept_id, the primary key, to Null, and a primary
key implicitly carries two constraints: 1. Unique and 2. Not null.
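Both outcomes can be tried with Python's sqlite3 module. This is a sketch under stated assumptions: the sample rows are hypothetical (the question's table contents are not reproduced in this section), and NOT NULL is spelled out on the primary keys because SQLite, unlike standard SQL, does not always imply it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Department (
        Dept_id   INT NOT NULL PRIMARY KEY,
        Dept_Name TEXT
    );
    CREATE TABLE Student (
        Roll_no INT NOT NULL PRIMARY KEY,
        Name    TEXT,
        Dept_id INT REFERENCES Department(Dept_id)
    );
    -- Hypothetical sample rows, for illustration only.
    INSERT INTO Department VALUES (1, 'Mathematics');
    INSERT INTO Student VALUES (1, 'Navin', 1);
""")

def try_update(sql):
    """Return True if the statement succeeds, False on a constraint error."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

first_ok = try_update("UPDATE Student SET Dept_id = NULL WHERE Roll_no = 1")
second_ok = try_update("UPDATE Department SET Dept_id = NULL WHERE Dept_id = 1")
print(first_ok, second_ok)  # True False
```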
Q78. A hash table contains 10 buckets and uses linear probing to resolve collisions. The
key values are integers and the hash function used is key % 10. If the values 43, 165, 62,
123, 142 are inserted in the table, in what location would the key value 142 be inserted?
(A) 2 (B) 3 (C) 4 (D) 6 IT2005
Ans. D
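Explanation: create the hash table for the given data. 123 collides at slot 3 and moves to slot 4; 142 collides at slots 2, 3, 4 and 5 and lands in slot 6.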
0
1
2 62
3 43
4 123
5 165
6 142
7
8
9
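The table above can be rebuilt with a short linear-probing routine. This is a minimal sketch, and the function name `insert_all` is illustrative.

```python
def insert_all(keys, buckets=10):
    """Insert keys using key % buckets with linear probing.

    Returns a dict mapping each key to the slot where it was stored.
    """
    table = [None] * buckets
    positions = {}
    for key in keys:
        slot = key % buckets
        while table[slot] is not None:  # slot taken: probe the next one
            slot = (slot + 1) % buckets
        table[slot] = key
        positions[key] = slot
    return positions

positions = insert_all([43, 165, 62, 123, 142])
print(positions[142])  # 6
```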
Q79. Consider the entities ‘hotel room’ and ‘person’ with a many-to-many relationship
‘lodging’ as shown below:
Hotel Room --(m)-- lodging --(m)-- person
If we wish to store information about the rent payment to be made by person(s)
occupying different hotel rooms, then this information should appear as an attribute of
(A) Person
(B) Hotel Room
(C) Lodging
Ans. C
Explanation: for a many-to-many relationship a separate table is created. Here a separate table will be
created for lodging, which contains the primary keys of ‘hotel room’ and ‘person’ as its attributes,
so we can store information about the rent payment to be made by person(s)
occupying different hotel rooms in the lodging table.
Q80. A table has fields F1, F2, F3, F4, F5 with the following functional dependencies:
F1 → F3
F2 → F4
(F1, F2) → F5
In terms of Normalization, this table is in
(A) 1 NF
(B) 2 NF
(C) 3 NF
(D) None of these IT2005
Ans. A
Explanation: first find the candidate keys of the relation: {F1, F2}
Now check for 2NF: find partial dependencies
F1 → F3 - partial (part of key → non-prime)
F2 → F4 - partial (part of key → non-prime)
(F1, F2) → F5 - not partial (whole key → non-prime)
The relation is not in 2NF because it has partial dependencies, so it is in 1NF only.
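The 2NF check above can be sketched with an attribute-closure routine. This is a minimal sketch: the names `closure`, `FDS` and `KEY` are illustrative, and the functional dependencies are taken from the question.

```python
ATTRIBUTES = {"F1", "F2", "F3", "F4", "F5"}
FDS = [
    ({"F1"}, {"F3"}),
    ({"F2"}, {"F4"}),
    ({"F1", "F2"}, {"F5"}),
]
KEY = {"F1", "F2"}

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# {F1, F2} determines everything, and neither attribute alone does.
key_is_superkey = closure(KEY, FDS) == ATTRIBUTES
key_is_minimal = all(closure({a}, FDS) != ATTRIBUTES for a in KEY)

# Partial dependency: a proper subset of the key determines a non-prime attribute.
partial = [(sorted(lhs), sorted(rhs)) for lhs, rhs in FDS
           if lhs < KEY and not rhs <= KEY]
print(key_is_superkey, key_is_minimal, partial)
```

Two partial dependencies are reported (F1 → F3 and F2 → F4), which is why the table is not in 2NF.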
Q81. A B-tree used as an index for a large database table has four levels including the
root node. If a new key is inserted in this index, then the maximum number of nodes that
could be newly created in the process are
(A) 5 (B) 4 (C) 3 (D) 2 IT2005
Ans. A
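Explanation: in the worst case every node on the path from the leaf to the root is full, so the insertion splits one node at each of the four levels (4 new nodes) and the root split creates a new root (1 more node), giving 5 new nodes.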
Q82. Amongst the ACID properties of a transaction, the ‘Durability’ property requires that the
changes made to the database by a successful transaction persist
(A) except in case of an Operating System crash
(B) except in case of Disk crash
(C) except in case of a power failure
(D) always, even if there is a failure of any kind IT2005
Ans. D
Q83. A company maintains records of sales made by its salespersons and pays
them commission based on each individual’s total sales made in a year. This data is
maintained in a table with the following schema:
salesinfo = (salespersonid, totalsales, commission)
In a certain year, due to better business results, the company decides to further reward
its salespersons by enhancing the commission paid to them as per the following formula.
If commission < = 50000, enhance it by 2%
If 50000 < commission < = 100000, enhance it by 4%
If commission > 100000, enhance it by 6%
Ans. D
Explanation: suppose we run T1 first; then the commission of some salespersons will
become > 50000, and if we run T2 afterwards these salespersons will also get the 4% benefit. So
T2 must not run after T1 and, similarly, T3 must not run after T2. So option D is
correct.
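The ordering argument can be illustrated with plain Python. This is a rough sketch under stated assumptions: `t1`, `t2` and `t3` stand for updates that apply the 2%, 4% and 6% tiers of the formula respectively (the actual SQL statements are not reproduced in this section), and the starting commissions are hypothetical. Integer arithmetic is used to avoid rounding noise.

```python
def t1(c):
    """Assumed update T1: commission <= 50000 is enhanced by 2%."""
    return c * 102 // 100 if c <= 50000 else c

def t2(c):
    """Assumed update T2: 50000 < commission <= 100000 is enhanced by 4%."""
    return c * 104 // 100 if 50000 < c <= 100000 else c

def t3(c):
    """Assumed update T3: commission > 100000 is enhanced by 6%."""
    return c * 106 // 100 if c > 100000 else c

def run(updates, commissions):
    """Apply each update to the whole table, one update after another."""
    for update in updates:
        commissions = [update(c) for c in commissions]
    return commissions

start = [50000, 100000, 150000]        # hypothetical commissions
ascending = run([t1, t2, t3], start)   # low tier first: some rows get two raises
descending = run([t3, t2, t1], start)  # high tier first: each row gets one raise
print(ascending)   # [53040, 110240, 159000]
print(descending)  # [51000, 104000, 159000]
```

Running the highest tier first means a row that has just been raised can never qualify for a later update, because every later update targets a lower range.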
Q84. A table ‘student’ with schema (roll, name, hostel, marks) and another table ‘hobby’
with schema (roll, hobbyname) contain records as shown below.
Relations S and H with the same schema as those of these two tables respectively contain
the same information as tuples. A new relation S’ is obtained by the following relational
algebra operation:
S’ = Π_hostel(σ_{S.roll = H.roll}(σ_{marks > 75 and roll > 2000 and roll < 3000}(S) × H))
The difference between the number of rows output by the SQL statement and the number of
tuples in S’ is:
(A) 6 (B) 4 (C) 2 (D) IT2005
Ans. B
Explanation: Following table is created after joining tables with condition