
2 December 2005

Introduction to Databases
Relational Model and Relational Algebra

Prof. Beat Signer
Department of Computer Science
Vrije Universiteit Brussel
http://vub.academia.edu/BeatSigner

Beat Signer - Department of Computer Science - bsigner@vub.ac.be - March 3, 2011

Relational Model

- Theory for data management developed by Edgar F. Codd while working at IBM
  - Edgar F. Codd, A Relational Model of Data for Large Shared Data Banks, Communications of the ACM 13(6), June 1970
  - data independence: between logical and physical level
  - set-based query language: relational query language
  - normalisation: avoid redundancy
- IBM initially did not implement the relational model in order to "protect" their IMS/DB revenues

[Figure: Edgar F. Codd]

Relational Model (continued)

- IBM's System R (1974) was a DBMS prototype implementing Codd's relational model
  - first implementation of the Structured English Query Language (SEQUEL), later renamed to Structured Query Language (SQL)
- The System R prototype eventually led to the development of different commercial DBMSs, including IBM's DB2, Oracle and Microsoft SQL Server

Relational Database

- A relational database consists of a number of tables
  - each table row defines a relationship between a set of values
- There is an analogy between the concept of a table (collection of relationships) and the mathematical concept of a relation
  - in the following we therefore talk about relations instead of tables
  - hence a relational database consists of a collection of relations

| customerID | name | street | postcode | city |
|---|---|---|---|---|
| 1 | Max Frisch | Bahnhofstrasse 7 | 8001 | Zurich |
| 2 | Pieter Bruegel | Pleinlaan 25 | 1050 | Brussels |
| ... | ... | ... | ... | ... |

Relational Database (continued)

- Information is normally partitioned into different relations since the storage in a single relation would lead to
  - a replication of information (redundancy)
  - a large number of necessary null values
- While tables are used at the logical level, different storage structures can be used at the physical level
  - data independence

Relation

- The column headers of a table are called attributes and for each attribute $a_i$ there is a set of permitted values called the domain $D_i$ of $a_i$
- Given the domains $D_1, D_2, \ldots, D_n$, a relation $r$ is defined as a subset of the cartesian product $D_1 \times D_2 \times \cdots \times D_n$

[Figure: a relation r drawn as a table. The columns are the attributes a_1, a_2, ..., a_n (e.g. name, street, ..., city) over the domains D_1, D_2, ..., D_n, and their number is the degree. The rows are the tuples t_1, ..., t_m (e.g. Frisch, Bahnhofstrasse 7, ..., Zurich), and their number is the cardinality.]

Relation (continued)

- A relation $r$ is a set of $n$-ary tuples $t_i = (a_1, a_2, \ldots, a_n)$ where each $a_i \in D_i$
  - the number of attributes is called a relation's degree
  - the number of tuples is called a relation's cardinality
- Since a relation is a set of tuples, the order of the tuples is irrelevant
  - each tuple is distinct (no duplicate tuples)
- The order of attributes is irrelevant
- The domain of each attribute has to be atomic
  - all members of the domain have to be indivisible units
  - a relation with only atomic values is normalised (first normal form)

Relation (continued)

- Multiple attributes can have the same domain $D_i$
  - we will see later how the domain types can be defined (in SQL)
- A tuple variable is a variable that stands for a tuple
  - its domain is defined by the set of all tuples
- The special null value is part of any domain $D_i$
  - used to represent an unknown or non-existing value
  - null values are not easy to handle (not the same as an empty string): avoid null values whenever possible

Example of a Relation

- Let us assume that we have the following attributes with their domains
  - name = {Müller, De Meuter, Giacometti, ...}
  - street = {Bahnhofstrasse, Pleinlaan, Bergstrasse, ...}
  - city = {Zurich, Brussels, Geneva, Paris, ...}
- Then
  - r = {(Müller, Pleinlaan, Brussels), (De Meuter, Bahnhofstrasse, Zurich), (Giacometti, Bergstrasse, Geneva), (Giacometti, Bahnhofstrasse, Geneva)} forms a relation over $\text{name} \times \text{street} \times \text{city}$
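The definition of a relation as a subset of a cartesian product can be checked mechanically. The following is a minimal sketch in Python, assuming the three domains contain only the values listed in the example (the elided "..." values are left out) and that the garbled first name reads "Müller"; the variable names are illustrative.

```python
from itertools import product

# Domains from the example, truncated to the explicitly listed values (assumption).
name = {"Müller", "De Meuter", "Giacometti"}
street = {"Bahnhofstrasse", "Pleinlaan", "Bergstrasse"}
city = {"Zurich", "Brussels", "Geneva", "Paris"}

# A relation is a set of tuples: no duplicates, no ordering.
r = {
    ("Müller", "Pleinlaan", "Brussels"),
    ("De Meuter", "Bahnhofstrasse", "Zurich"),
    ("Giacometti", "Bergstrasse", "Geneva"),
    ("Giacometti", "Bahnhofstrasse", "Geneva"),
}

cartesian = set(product(name, street, city))  # name x street x city

is_relation = r <= cartesian       # subset test
degree = len(next(iter(r)))        # number of attributes per tuple
cardinality = len(r)               # number of tuples
```

Because `r` is a Python `set`, adding a tuple a second time leaves the cardinality unchanged, which mirrors the "no duplicate tuples" property.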

Relational Database Example

customer

| customerID | name | street | postcode | city |
|---|---|---|---|---|
| 1 | Max Frisch | Bahnhofstrasse 7 | 8001 | Zurich |
| 2 | Pieter Bruegel | Pleinlaan 25 | 1050 | Brussels |
| 5 | Claude Debussy | 12 Rue Louise | 75008 | Paris |
| 53 | Albert Einstein | Bergstrasse 18 | 8037 | Zurich |
| 8 | Max Frisch | ETH Zentrum | 8092 | Zurich |

cd

| cdID | name | duration | price | year |
|---|---|---|---|---|
| 1 | Falling into Place | 2007 | 17.90 | 2007 |
| 2 | Moudi | 3156 | 15.50 | 1996 |

Relational Database Example (continued)

order

| orderID | customerID | cdID | date | amount | status |
|---|---|---|---|---|---|
| 1 | 53 | 93 | 13.02.2010 | 2 | open |
| 2 | 2 | 117 | 15.02.2010 | 1 | delivered |

supplier

| supplierID | name | city |
|---|---|---|
| 5 | Max Frisch | Zurich |
| 2 | Franz Hohler | Aarau |

Relational database schema

- Customer (customerID, name, street, postcode, city)
- CD (cdID, name, duration, price, year)
- Order (orderID, customerID, cdID, date, amount, status)
- Supplier (supplierID, name, city)

Database Schema

- The logical design of a database is defined by the database schema
- A database instance is a snapshot of the data stored in a database at a given time
- A relation schema $R = (A_1, A_2, \ldots, A_n)$ is defined by a list of attributes
  - by convention, the name of a relation schema starts with an uppercase letter (in contrast to relations)
    - e.g. Customer = (customerID, name, street, postcode, city)
  - a relation can be defined based on the relation schema
    - e.g. customer = (Customer)
  - a relation instance contains the relation's actual values (tuples)

Keys

- For the keys we use the same terminology as introduced earlier for the ER model
- $K \subseteq R$ is a superkey of $R$ if the values of $K$ uniquely identify any tuple of possible relation instances $r(R)$
  - $t_1, t_2 \in r \text{ and } t_1 \neq t_2 \Rightarrow t_1[K] \neq t_2[K]$
  - e.g. {cdID} and {cdID, name} are both superkeys of CD
- $K$ is a candidate key if $K$ is minimal
  - e.g. {cdID} is a candidate key of CD
- The DB designer has to choose one of the candidate keys for each relation schema as primary key
  - if possible, the value of a primary key should not change or only in very rare cases
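The superkey condition can be tested against a concrete relation instance. The sketch below is illustrative only: the helper names `is_superkey` and `is_candidate_key` are hypothetical, and a single instance can only refute a key, never prove it, since the definition ranges over all possible instances r(R).

```python
def is_superkey(rows, key):
    """True if no two rows of this instance agree on all attributes in key."""
    seen = set()
    for row in rows:
        values = tuple(row[a] for a in key)
        if values in seen:  # two distinct tuples share the same key values
            return False
        seen.add(values)
    return True


def is_candidate_key(rows, key):
    """A candidate key is a superkey from which no attribute can be dropped."""
    if not is_superkey(rows, key):
        return False
    return not any(
        is_superkey(rows, [a for a in key if a != dropped]) for dropped in key
    )


cd = [
    {"cdID": 1, "name": "Falling into Place"},
    {"cdID": 2, "name": "Moudi"},
]
customer = [
    {"customerID": 1, "name": "Max Frisch"},
    {"customerID": 8, "name": "Max Frisch"},
]

cd_id_is_candidate = is_candidate_key(cd, ["cdID"])
cd_id_name_is_candidate = is_candidate_key(cd, ["cdID", "name"])  # superkey, not minimal
name_identifies_customer = is_superkey(customer, ["name"])        # two customers share a name
```

Checking only single-attribute removals is enough for minimality, because any superset of a superkey is again a superkey.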

Foreign Keys

- A relation schema may have one or multiple attributes that correspond to the primary key of another relation schema and are called foreign keys
  - e.g. customerID is a foreign key of the order relation schema
  - note that a foreign key of the referencing relation can only contain values that occur in the primary key of the referenced relation or it must be null (referential integrity)

[Figure: schema diagram of the relations customer (customerID, name, street, postcode, city), cd (cdID, name, duration, price, year) and order (orderID, customerID, cdID, date, amount, status)]
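Referential integrity as described above amounts to a simple membership test. The following sketch assumes rows are represented as Python dictionaries and null as `None`; the function name `dangling_references` and the order rows 3 and 4 are hypothetical additions for illustration.

```python
def dangling_references(referencing, fk, referenced, pk):
    """Return the rows whose foreign key is neither null nor a key of the referenced relation."""
    keys = {row[pk] for row in referenced}
    return [
        row for row in referencing
        if row[fk] is not None and row[fk] not in keys
    ]


customer = [{"customerID": cid} for cid in (1, 2, 5, 53, 8)]
order = [
    {"orderID": 1, "customerID": 53},
    {"orderID": 2, "customerID": 2},
    {"orderID": 3, "customerID": None},  # hypothetical row: a null foreign key is allowed
    {"orderID": 4, "customerID": 99},    # hypothetical row: no customer 99 exists
]

violations = dangling_references(order, "customerID", customer, "customerID")
```

A DBMS enforces this check itself when a foreign key constraint is declared; the sketch only makes the rule explicit.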

Query Language

- A query language is a language that is used to access (read) information stored in a database as well as to create, update and delete information (CRUD)
- There are different types of query languages
  - procedural: e.g. relational algebra
  - declarative: e.g. Structured Query Language (SQL)
- The relational algebra as well as the tuple and domain relational calculus form the basis for "higher level" query languages (e.g. SQL)

Relational Algebra

- The relational algebra consists of six fundamental operations
  - unary operations
    - selection: $\sigma$
    - projection: $\pi$
    - rename: $\rho$
  - binary operations
    - union: $\cup$
    - set difference: $-$
    - cartesian product: $\times$
- An operator takes one (unary) or two (binary) relations as input and returns a new relation as output

Relational Algebra (continued)

- Based on the six fundamental operations, we can define additional relational operations (no additional power)
  - set intersection: $\cap$
  - natural join: $\bowtie$
  - theta join: $\bowtie_\theta$
  - equijoin: $\bowtie_\theta$
  - semijoin: $\ltimes$
  - division: $\div$
  - assignment: $\leftarrow$
- Extended operations with additional expressiveness
  - generalised projection
  - aggregate function
  - outer join

Selection ($\sigma$)

- $\sigma_p(r) = \{t \mid t \in r \text{ and } p(t)\}$
  - $p$ is a selection predicate
  - $p$ consists of terms $p_i$ connected by an and ($\land$), or ($\lor$) or not ($\lnot$)
  - a term $p_i$ has the form $\text{attribute}_m$ operator $\text{attribute}_n$ or constant
  - the available operators are: $=, \neq, >, <, \geq, \leq$
- Examples
  - "Find all tuples in the customer relation that are in the city of Zurich and have a postcode greater than 8010."
  - $\sigma_{\text{city} = \text{"Zurich"} \land \text{postcode} > 8010}(\text{customer})$

| customerID | name | street | postcode | city |
|---|---|---|---|---|
| 53 | Albert Einstein | Bergstrasse 18 | 8037 | Zurich |
| 8 | Max Frisch | ETH Zentrum | 8092 | Zurich |
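The selection above can be mimicked in a few lines of Python. This is a minimal sketch, assuming tuples are modelled as dictionaries and the predicate as a Python function; `select` is a hypothetical helper name, not part of any library.

```python
def select(relation, predicate):
    """sigma_p(r): keep the tuples of r for which the predicate holds."""
    return [t for t in relation if predicate(t)]


customer = [
    {"customerID": 1, "name": "Max Frisch", "postcode": 8001, "city": "Zurich"},
    {"customerID": 2, "name": "Pieter Bruegel", "postcode": 1050, "city": "Brussels"},
    {"customerID": 5, "name": "Claude Debussy", "postcode": 75008, "city": "Paris"},
    {"customerID": 53, "name": "Albert Einstein", "postcode": 8037, "city": "Zurich"},
    {"customerID": 8, "name": "Max Frisch", "postcode": 8092, "city": "Zurich"},
]

# city = "Zurich" AND postcode > 8010
result = select(customer, lambda t: t["city"] == "Zurich" and t["postcode"] > 8010)
selected_ids = [t["customerID"] for t in result]
```

The street attribute is omitted from the sample rows for brevity; it plays no role in the predicate.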

Selection ($\sigma$) (continued)

- Examples (continued)
  - A selection predicate may contain a comparison between two attributes
  - "Find all tuples in the cd relation with a value for the duration attribute that is equal to the year of release."
  - $\sigma_{\text{duration} = \text{year}}(\text{cd})$

| cdID | name | duration | price | year |
|---|---|---|---|---|
| 1 | Falling into Place | 2007 | 17.90 | 2007 |

Projection ($\pi$)

- $\pi_{A_1, A_2, \ldots, A_m}(r)$
  - returns a relation instance that only contains the columns for which an attribute $A_i$ has been listed
  - note that duplicate tuples are removed from the result since the resulting relation is a set
- Example
  - "Return the name and city of all tuples in the customer relation."
  - $\pi_{\text{name}, \text{city}}(\text{customer})$

| name | city |
|---|---|
| Max Frisch | Zurich |
| Pieter Bruegel | Brussels |
| Claude Debussy | Paris |
| Albert Einstein | Zurich |
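The duplicate removal is the part of projection that is easy to overlook: the customer relation has five tuples, but two of them project to (Max Frisch, Zurich). A minimal Python sketch, where `project` is a hypothetical helper and tuples are modelled as dictionaries:

```python
def project(relation, attributes):
    """pi_{A1..Am}(r): keep only the listed attributes and drop duplicate tuples."""
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in result:  # the result is a set, so repeated rows are skipped
            result.append(row)
    return result


customer = [
    {"customerID": 1, "name": "Max Frisch", "city": "Zurich"},
    {"customerID": 2, "name": "Pieter Bruegel", "city": "Brussels"},
    {"customerID": 5, "name": "Claude Debussy", "city": "Paris"},
    {"customerID": 53, "name": "Albert Einstein", "city": "Zurich"},
    {"customerID": 8, "name": "Max Frisch", "city": "Zurich"},
]

name_city = project(customer, ["name", "city"])
```

A list is used instead of a `set` only to keep the output order stable for display; the order carries no meaning in the relational model.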

Composition of Relational Operations

- Since the result of a relational operation is a new relation instance, multiple operations can be combined
- Example
  - "Find the names of all customers who live in Zurich."
  - $\pi_{\text{name}}(\sigma_{\text{city} = \text{"Zurich"}}(\text{customer}))$

| name |
|---|
| Max Frisch |
| Albert Einstein |

Union ($\cup$)

- $r \cup s = \{t \mid t \in r \text{ or } t \in s\}$
  - $r$ and $s$ must have the same degree (same number of attributes)
  - the corresponding attribute domains must have a compatible type
- Example
  - "Find the names of persons who are either customers or suppliers."
  - $\pi_{\text{name}}(\text{customer}) \cup \pi_{\text{name}}(\text{supplier})$

| name |
|---|
| Max Frisch |
| Pieter Bruegel |
| Claude Debussy |
| Albert Einstein |
| Franz Hohler |

Set Difference ($-$)

- $r - s = \{t \mid t \in r \text{ and } t \notin s\}$
  - $r$ and $s$ must have the same degree
  - the corresponding attribute domains must have a compatible type
  - finds tuples that are in a relation $r$ but not in another relation $s$
- Example
  - "Find the names of suppliers who are not customers."
  - $\pi_{\text{name}}(\text{supplier}) - \pi_{\text{name}}(\text{customer})$

| name |
|---|
| Franz Hohler |
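Union and set difference map directly onto Python's built-in set operators. The sketch below models the two projected relations as sets of 1-tuples; the helper `check_compatible` is a hypothetical name and only checks the degree, not the attribute domains.

```python
def check_compatible(r, s):
    """Both operands need the same degree (number of attributes per tuple)."""
    degrees = {len(t) for t in r | s}
    if len(degrees) > 1:
        raise ValueError("relations have different degrees")


customer_names = {
    ("Max Frisch",), ("Pieter Bruegel",), ("Claude Debussy",), ("Albert Einstein",),
}
supplier_names = {("Max Frisch",), ("Franz Hohler",)}

check_compatible(customer_names, supplier_names)
customers_or_suppliers = customer_names | supplier_names  # union
suppliers_only = supplier_names - customer_names          # set difference
```

Max Frisch occurs in both inputs but only once in the union, since the result is again a set.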

Cartesian Product ($\times$)

- $r \times s = \{tu \mid t \in r \text{ and } u \in s\}$
  - the attribute names of $r(R)$ and $s(S)$ have to be distinct
  - if there are attributes with the same name in $r(R)$ and $s(S)$, the rename operator $\rho$ has to be used
- Example
  - $\pi_{\text{name}}(\text{customer}) \times \pi_{\text{city}}(\text{customer})$

| name | city |
|---|---|
| Max Frisch | Zurich |
| Max Frisch | Brussels |
| Max Frisch | Paris |
| Pieter Bruegel | Zurich |
| Pieter Bruegel | Brussels |
| ... | ... |

Cartesian Product ($\times$) (continued)

- Example (continued)
  - "List the names of all customers with at least one order."
  - $\pi_{\text{name}}(\sigma_{\text{customer.customerID} = \text{order.customerID}}(\text{customer} \times \text{order}))$
- Note that we will later see another operator (natural join) for the combination of two tables based on common attributes

| name |
|---|
| Pieter Bruegel |
| Albert Einstein |
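The "product, then select, then project" pattern of this query can be sketched with `itertools.product`. The tuple layouts below, (customerID, name) and (orderID, customerID), are a simplification chosen for this illustration.

```python
from itertools import product

customer = [
    (1, "Max Frisch"),
    (2, "Pieter Bruegel"),
    (5, "Claude Debussy"),
    (53, "Albert Einstein"),
    (8, "Max Frisch"),
]  # (customerID, name)
order = [(1, 53), (2, 2)]  # (orderID, customerID)

# customer x order: every customer tuple paired with every order tuple
pairs = list(product(customer, order))

# selection on customer.customerID = order.customerID, then projection on name
names = {c[1] for c, o in pairs if c[0] == o[1]}
```

The product materialises 5 x 2 = 10 pairs before the selection discards most of them, which is why a join operator is the more practical formulation.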

Rename ($\rho$)

- $\rho_x(E)$
  - renames the result of expression $E$ to $x$
- $\rho_{x(A_1, A_2, \ldots, A_n)}(E)$
  - renames the result of expression $E$ to $x$ and renames the attributes to $A_1, A_2, \ldots, A_n$
- Example
  - $\rho_{\text{person}(\text{name}, \text{location})}(\pi_{\text{name}, \text{city}}(\text{customer}))$

person

| name | location |
|---|---|
| Max Frisch | Zurich |
| Pieter Bruegel | Brussels |
| Claude Debussy | Paris |
| Albert Einstein | Zurich |

Rename ($\rho$) (continued)

- Example
  - "Find the price of the most expensive CD in the cd relation."
  - $\pi_{\text{price}}(\text{cd}) - \pi_{\text{cd.price}}(\sigma_{\text{cd.price} < \text{d.price}}(\text{cd} \times \rho_d(\text{cd})))$
- We will later see another operator (aggregate function) that can be used to simplify this type of query for a max value

| price |
|---|
| 17.90 |
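The maximum query works by first collecting every price that is smaller than some other price and then subtracting that set from all prices. A minimal Python sketch of the same idea, using only the price column of the cd relation:

```python
from itertools import product

prices = {17.90, 15.50}  # pi_price(cd)

# cd x rho_d(cd), restricted to pairs where cd.price < d.price:
# these are exactly the prices that are NOT the maximum.
not_max = {p for p, q in product(prices, prices) if p < q}

max_price = prices - not_max
```

The second copy of `prices` in `product` plays the role of the renamed relation d: without the rename, the two price attributes could not be told apart in the predicate.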

Formal Definition

- A basic relational algebra expression is either a relation from the database or a constant relation
- The set of all relational algebra expressions is defined by
  - $E_1 \cup E_2$
  - $E_1 - E_2$
  - $E_1 \times E_2$
  - $\sigma_p(E_1)$
  - $\pi_{A_1, A_2, \ldots, A_m}(E_1)$
  - $\rho_{x(A_1, A_2, \ldots, A_n)}(E_1)$

Set Intersection ($\cap$)

- $r \cap s = \{t \mid t \in r \text{ and } t \in s\}$
  - $r$ and $s$ must have the same degree
  - the corresponding attribute domains must have a compatible type
- Note that the set intersection can be implemented via a pair of set difference operations
  - $r \cap s = r - (r - s)$
- Example
  - "Find the names of people who are customers and suppliers."
  - $\pi_{\text{name}}(\text{customer}) \cap \pi_{\text{name}}(\text{supplier})$

| name |
|---|
| Max Frisch |
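The identity r ∩ s = r − (r − s) is easy to confirm on the example data. A short Python sketch, with the projected name sets written out by hand:

```python
r = {"Max Frisch", "Pieter Bruegel", "Claude Debussy", "Albert Einstein"}  # customer names
s = {"Max Frisch", "Franz Hohler"}                                         # supplier names

only_in_r = r - s                 # inner difference: customers who are not suppliers
via_difference = r - only_in_r    # outer difference: what remains is in both
built_in = r & s                  # Python's own intersection, for comparison
```

This is why intersection adds no expressive power to the six fundamental operations: it is only a shorthand.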

Natural Join ($\bowtie$)

- $r \bowtie s = \pi_{R \cup S}(\sigma_{r.A_1 = s.A_1 \land r.A_2 = s.A_2 \land \ldots \land r.A_n = s.A_n}(r \times s))$
  - where $R \cap S = \{A_1, A_2, \ldots, A_n\}$
- Keep all tuples of the cartesian product $r \times s$ that have the same value for the shared attributes of $r(R)$ and $s(S)$
  - the natural join is an associative operation
- Example
  - "List the name and street of customers whose order is still open."
  - $\pi_{\text{name}, \text{street}}(\sigma_{\text{status} = \text{"open"}}(\text{order} \bowtie \text{customer}))$
  - note that we could do the selection first (more efficient): we will review this later when discussing query optimisation

| name | street |
|---|---|
| Albert Einstein | Bergstrasse 18 |
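A natural join can be sketched as a nested loop that merges tuples agreeing on all shared attribute names. The code below is an illustrative sketch rather than an efficient implementation; `natural_join` is a hypothetical helper, tuples are modelled as dictionaries, and only three of the five customers are listed for brevity.

```python
def natural_join(r, s):
    """Merge every pair of tuples from r and s that agree on all shared attributes."""
    result = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()
            if all(t[a] == u[a] for a in shared):
                result.append({**t, **u})
    return result


order = [
    {"orderID": 1, "customerID": 53, "cdID": 93, "status": "open"},
    {"orderID": 2, "customerID": 2, "cdID": 117, "status": "delivered"},
]
customer = [
    {"customerID": 2, "name": "Pieter Bruegel", "street": "Pleinlaan 25"},
    {"customerID": 5, "name": "Claude Debussy", "street": "12 Rue Louise"},
    {"customerID": 53, "name": "Albert Einstein", "street": "Bergstrasse 18"},
]

# Join first, select afterwards (as written in the example).
joined = natural_join(order, customer)
late = {(t["name"], t["street"]) for t in joined if t["status"] == "open"}

# Select first, join afterwards: same result, fewer tuples to join.
open_orders = [o for o in order if o["status"] == "open"]
early = {(t["name"], t["street"]) for t in natural_join(open_orders, customer)}
```

Both evaluation orders return the same relation, which is the kind of equivalence a query optimiser relies on.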

Theta Join ($\bowtie_\theta$) and Equijoin

- $r \bowtie_p s = \sigma_p(r \times s)$
  - where the predicate $p$ is of the form $r.A_i \; \theta \; s.A_j$ and $\theta$ is a comparison operator ($=, \neq, >, <, \geq, \leq$)
- Example
  - $\text{customer} \bowtie_{\text{postcode} \leq \text{duration}} (\rho_{\text{cd}(\text{cdID}, \text{cdName}, \text{duration}, \text{price}, \text{year})}(\text{cd}))$
- An equijoin ($r \bowtie_p s$) is a special form of a theta join where only the equality operator (=) may be used
  - note that a natural join is an equijoin over all common attributes of the relations $r$ and $s$

| customerID | name | ... | cdID | cdName | ... |
|---|---|---|---|---|---|
| 2 | Pieter Bruegel | ... | 1 | Falling into Place | ... |
| 2 | Pieter Bruegel | ... | 2 | Moudi | ... |

Semijoin ($\ltimes$)

- $r \ltimes s = \pi_{A_1, A_2, \ldots, A_n}(r \bowtie s)$
  - where $A_1, A_2, \ldots, A_n$ are all the attributes of $r$
- Example
  - "List the customers whose order is still open."
  - $\text{customer} \ltimes \sigma_{\text{status} = \text{"open"}}(\text{order})$

| customerID | name | street | postcode | city |
|---|---|---|---|---|
| 53 | Albert Einstein | Bergstrasse 18 | 8037 | Zurich |

Division ($\div$)

- $r \div s = \{t \mid t \in \pi_{R-S}(r) \text{ and } \forall u \in s : tu \in r\}$
  - for the relations $r(R)$ and $s(S)$ and $S \subseteq R$
  - suited for queries that include the phrase "for all"
- Example
  - $r \div s$

r

| A | B | C | D |
|---|---|---|---|
| a | 3 | a | 1 |
| a | 1 | a | 3 |
| a | 3 | b | 3 |
| b | 1 | a | 1 |
| c | 4 | b | 3 |
| b | 2 | b | 2 |
| c | 4 | a | 1 |

s

| C | D |
|---|---|
| b | 3 |
| a | 1 |

$r \div s$

| A | B |
|---|---|
| a | 3 |
| c | 4 |
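The division example can be recomputed directly from the definition. In this sketch the tuples of r are 4-tuples (A, B, C, D) and the tuples of s are 2-tuples (C, D), an ordering assumed here so that a candidate (A, B) and a tuple of s can simply be concatenated.

```python
r = {
    ("a", 3, "a", 1),
    ("a", 1, "a", 3),
    ("a", 3, "b", 3),
    ("b", 1, "a", 1),
    ("c", 4, "b", 3),
    ("b", 2, "b", 2),
    ("c", 4, "a", 1),
}  # attributes (A, B, C, D)
s = {("b", 3), ("a", 1)}  # attributes (C, D)

# pi_{R-S}(r): the candidate (A, B) combinations
candidates = {t[:2] for t in r}

# keep a candidate t only if tu is in r FOR ALL u in s
quotient = {t for t in candidates if all((t + u) in r for u in s)}
```

The candidate (b, 1) is rejected because (b, 1, b, 3) is missing from r, even though (b, 1, a, 1) is present.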

Assignment ($\leftarrow$)

- $\text{variable} \leftarrow E$
- Works like an assignment in programming languages
- Assignments must always be made to temporary relation variables
  - no database modification
- Example
  - $\text{temp1} \leftarrow \pi_{\text{name}, \text{street}}(\sigma_{\text{status} = \text{"open"}}(\text{order} \bowtie \text{customer}))$
  - $\text{temp2} \leftarrow \pi_{\text{name}, \text{street}}(\sigma_{\text{city} = \text{"Brussels"}}(\text{customer}))$
  - $\text{result} \leftarrow \text{temp1} \cup \text{temp2}$

| name | street |
|---|---|
| Pieter Bruegel | Pleinlaan 25 |
| Albert Einstein | Bergstrasse 18 |

Generalised Projection ($\pi$)

- $\pi_{F_1, F_2, \ldots, F_m}(E)$
  - generalisation of the project operation that supports the projection to arithmetic expressions $F_1, F_2, \ldots, F_m$
- Example
  - "Show a list of the names of all CDs together with a reduced price (80% of the original price)."
  - $\pi_{\text{name}, \; \text{price} \cdot 0.8 \text{ as reducedPrice}}(\text{cd})$

| name | reducedPrice |
|---|---|
| Falling into Place | 14.32 |
| Moudi | 12.40 |

Aggregate Function

- ${}_{G_1, G_2, \ldots, G_m}\mathcal{G}_{F_1(A_1), F_2(A_2), \ldots, F_n(A_n)}(E)$
  - takes a collection of values as input and returns a single value based on the following operations
    - min: minimum value
    - max: maximum value
    - sum: sum of values
    - count: number of values
    - avg: average value
  - $G_1, G_2, \ldots, G_m$ lists the attributes on which to group
    - can be empty
  - $F_i$ are the aggregate functions
  - by default, duplicates are not eliminated before aggregating
    - can use the keyword distinct to eliminate duplicates (e.g. sum-distinct)

Aggregate Function (continued)

- Examples
  - "List the average amount of items in an order."
  - $\mathcal{G}_{\text{avg}(\text{amount})}(\text{order})$
  - "List the number of customers for each city."
  - ${}_{\text{city}}\mathcal{G}_{\text{count}(\text{customerID}) \text{ as customerNo}}(\text{customer})$

| avg(amount) |
|---|
| 1.5 |

| city | customerNo |
|---|---|
| Zurich | 3 |
| Brussels | 1 |
| Paris | 1 |
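Both aggregate examples can be reproduced with the Python standard library. This is a minimal sketch that works on the relevant columns only: the amount column of the order relation and the city column of the customer relation, one entry per tuple.

```python
from collections import Counter
from statistics import mean

order_amounts = [2, 1]  # amount column of order
customer_cities = ["Zurich", "Brussels", "Paris", "Zurich", "Zurich"]  # city column of customer

# Empty grouping list: one aggregate value over the whole relation.
avg_amount = mean(order_amounts)

# Grouping on city: one count per distinct city value.
customer_no = dict(Counter(customer_cities))
```

The columns are kept as lists rather than sets on purpose, since duplicates are not eliminated before aggregating.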

Outer Joins

- The left outer join (⟕), right outer join (⟖) and full outer join (⟗) are extensions of the natural join operation
- Computes the natural join and then adds the tuples from one relation that do not match the other relation
  - filled up with null values
- Example
  - customer ⟕ order

| customerID | name | ... | orderID | cdID | ... |
|---|---|---|---|---|---|
| 53 | Albert Einstein | ... | 1 | 93 | ... |
| 2 | Pieter Bruegel | ... | 2 | 117 | ... |
| 5 | Claude Debussy | ... | null | null | ... |
| ... | ... | ... | ... | ... | ... |
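A left outer join can be sketched as a natural join that additionally keeps the unmatched tuples of the left relation. In the code below, `left_outer_join` is a hypothetical helper, null is represented by `None`, and both inputs are assumed to be non-empty lists of dictionaries with uniform keys.

```python
def left_outer_join(r, s):
    """Natural join of r and s that keeps unmatched tuples of r, padded with None."""
    s_only = [a for a in s[0] if a not in r[0]]  # attributes that only s has
    result = []
    for t in r:
        matches = [
            u for u in s
            if all(t[a] == u[a] for a in t.keys() & u.keys())
        ]
        if matches:
            result.extend({**t, **u} for u in matches)
        else:
            result.append({**t, **{a: None for a in s_only}})
    return result


customer = [
    {"customerID": 53, "name": "Albert Einstein"},
    {"customerID": 2, "name": "Pieter Bruegel"},
    {"customerID": 5, "name": "Claude Debussy"},
]
order = [
    {"orderID": 1, "customerID": 53, "cdID": 93},
    {"orderID": 2, "customerID": 2, "cdID": 117},
]

joined = left_outer_join(customer, order)
```

A right outer join follows by swapping the arguments, and a full outer join by adding the unmatched tuples of both sides.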

Outer Joins (continued)

- left outer join (⟕)
  - keeps all tuples of the left relation
  - non-matching parts filled up with null values
- right outer join (⟖)
  - keeps all tuples of the right relation
  - non-matching parts filled up with null values
- full outer join (⟗)
  - keeps all tuples of both relations
  - non-matching parts filled up with null values

Null Values

- A null value means that the value is unknown or nonexistent
- Primary key attributes can never be null (entity integrity)
- The result of an arithmetic operation that involves a null value is null
- For grouping and duplicate elimination null values are treated like other values (two null values are the same)
- Null values are ignored in aggregate functions

Database Modifications

- A database may be modified by using one of the following operations
  - insert
  - update
  - delete
- These operations are defined via the assignment operator

Insert

- $r \leftarrow r \cup E$
- To insert new data into a relation we can
  - explicitly specify the tuples to be inserted
  - write a query and insert the result tuple set
- Example
  - $\text{cd} \leftarrow \text{cd} \cup \{(3, \text{"Chromatic"}, 3012, 16.50, 1996)\}$

| cdID | name | duration | price | year |
|---|---|---|---|---|
| 1 | Falling into Place | 2007 | 17.90 | 2007 |
| 2 | Moudi | 3156 | 15.50 | 1996 |
| 3 | Chromatic | 3012 | 16.50 | 1996 |

Update

- $r \leftarrow \pi_{F_1, F_2, \ldots, F_n}(r)$
- Updates specific values in the tuples of $r$
  - $F_i$ is either the unchanged attribute of $r$ or an expression for the new value if the attribute has to be updated
- Example
  - "Increase the price of all CDs by 10%."
  - $\text{cd} \leftarrow \pi_{\text{cdID}, \text{name}, \text{duration}, \; \text{price} \cdot 1.1, \; \text{year}}(\text{cd})$

| cdID | name | duration | price | year |
|---|---|---|---|---|
| 1 | Falling into Place | 2007 | 19.69 | 2007 |
| 2 | Moudi | 3156 | 17.05 | 1996 |
| 3 | Chromatic | 3012 | 18.15 | 1996 |

Delete

- $r \leftarrow r - E$
- A delete is expressed similarly to a query, except that the result is removed from the database
- Cannot remove single attributes
- Example
  - "Remove the CD with the name 'Moudi' from the database."
  - $\text{cd} \leftarrow \text{cd} - \sigma_{\text{name} = \text{"Moudi"}}(\text{cd})$

| cdID | name | duration | price | year |
|---|---|---|---|---|
| 1 | Falling into Place | 2007 | 19.69 | 2007 |
| 3 | Chromatic | 3012 | 18.15 | 1996 |
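The three modification operations can be replayed as successive assignments to one variable. The sketch below models the cd relation as a set of tuples (cdID, name, duration, price, year) and uses `Decimal` for the prices, a choice made here so that the 10% increase gives exact results.

```python
from decimal import Decimal

cd = {
    (1, "Falling into Place", 2007, Decimal("17.90"), 2007),
    (2, "Moudi", 3156, Decimal("15.50"), 1996),
}

# insert: cd <- cd UNION {new tuple}
cd = cd | {(3, "Chromatic", 3012, Decimal("16.50"), 1996)}

# update: cd <- generalised projection with price * 1.1, other attributes unchanged
cd = {(i, n, d, p * Decimal("1.1"), y) for (i, n, d, p, y) in cd}

# delete: cd <- cd MINUS the tuples selected by name = "Moudi"
cd = cd - {t for t in cd if t[1] == "Moudi"}

prices_by_id = {t[0]: t[3] for t in cd}
```

Each step builds a new set and rebinds `cd`, which matches the reading of insert, update and delete as plain assignments of a query result.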

Homework

- Study the following chapters of the Database System Concepts book
  - chapter 2: Relational Model
  - chapter 6.1: Relational Algebra

Exercise 3

- Relational model
- Relational algebra

References

- A. Silberschatz, H. Korth and S. Sudarshan, Database System Concepts (Sixth Edition), McGraw-Hill, 2010
- E. F. Codd, A Relational Model of Data for Large Shared Data Banks, Communications of the ACM 13(6), June 1970

Next Week

Relational Database Design
