Database
Database Environment
• A database environment is a system of
components that regulate the collection,
management, and use of data. It includes
– software
– hardware
– people
– procedures
– data
Prabin Babu Dhakal, CDPA, TU
Database Architecture
• The design of a DBMS depends on its architecture.
• A DBMS can be centralized, decentralized, or
hierarchical; parallel or distributed.
• The architecture of a DBMS can be seen as either
single-tier (level) or multi-tier (level).
• An n-tier architecture divides the whole system
into n related but independent modules, which
can be modified or replaced independently of
one another.
• This brings data independence.
Figure: Application Tier and Database Tier
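The separation into an application tier and a database tier can be sketched in a few lines. This is a minimal illustration, not a reference design: the class names `DatabaseTier` and `ApplicationTier`, the `item` table, and the `restock` operation are all hypothetical, and SQLite from the Python standard library stands in for a real DBMS.

```python
import sqlite3


class DatabaseTier:
    """Owns storage; nothing outside this class issues SQL."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE item (name TEXT PRIMARY KEY, qty INTEGER)")

    def save(self, name, qty):
        with self._conn:  # commit on success, roll back on error
            self._conn.execute("INSERT OR REPLACE INTO item VALUES (?, ?)", (name, qty))

    def load(self, name):
        row = self._conn.execute("SELECT qty FROM item WHERE name = ?", (name,)).fetchone()
        return row[0] if row else None


class ApplicationTier:
    """Business logic; talks to the database tier only through save/load."""

    def __init__(self, db):
        self.db = db

    def restock(self, name, amount):
        current = self.db.load(name) or 0
        self.db.save(name, current + amount)
        return current + amount


app = ApplicationTier(DatabaseTier())
first = app.restock("pen", 5)
second = app.restock("pen", 3)
print(first, second)
```

Because `ApplicationTier` depends only on `save` and `load`, the storage tier could be replaced by any object offering those two methods without touching the application logic.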
Data Model
• A data model defines the logical design of data in a database.
• The model describes the relationships between different
parts of the data.
• Historically, three models have been most commonly used
in database design. These are record-based logical models:
– Hierarchical Model
– Network Model
– Relational Model
• Object-based logical models are:
– ER Model
– Object-oriented model
Hierarchical Model
• In this model each entity has only one parent
but can have several children.
• At the top of the hierarchy there is only one entity,
which is called the root.
• Data is defined at different levels.
• The most important entity is modeled as the root,
the second most important entity is modeled under
the root, and so on.
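The one-parent, many-children rule can be sketched as a small tree structure. The `Node` class and the College/Department/Course names are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One entity in a hierarchy: exactly one parent, any number of children."""
    name: str
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: list["Node"] = field(default_factory=list)

    def add_child(self, name: str) -> "Node":
        child = Node(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> list[str]:
        """Walk up the single parent chain and return the names from the root down."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


root = Node("College")               # the single root entity
dept = root.add_child("Department")
course = dept.add_child("Course")
student = dept.add_child("Student")
print(course.path())                 # ['College', 'Department', 'Course']
```

Every entity has exactly one path back to the root, which is what makes access simple but rigid in this model.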
Network Model
• In the network model, entities are organized
in a graph, in which some entities can be
accessed through several paths.
• Highly accessible, but difficult to model and
implement.
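A rough sketch of "several paths to the same entity", using a plain adjacency dictionary. The entity names and links are hypothetical; the point is only that `Project` can be reached by more than one route.

```python
# Hypothetical owner -> member links; "Project" is reachable from two owners.
links = {
    "Department": ["Employee", "Project"],
    "Employee": ["Project"],
    "Project": [],
}


def all_paths(graph, start, goal, path=None):
    """Return every cycle-free path from start to goal (depth-first search)."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    found = []
    for nxt in graph[start]:
        if nxt not in path:          # do not revisit an entity on the same path
            found.extend(all_paths(graph, nxt, goal, path))
    return found


paths = all_paths(links, "Department", "Project")
for p in paths:
    print(" -> ".join(p))
```

Two paths are found here, whereas a strict hierarchy would allow only one.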
Relational Model
• In this model, data is organized in two-dimensional
tables called relations.
• The tables (relations) are related to each
other through special fields (keys).
• It is the most widely used model and is highly effective.
• The RDBMS gets its name from the relational model.
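As a minimal sketch of two tables related through a special field, the example below uses SQLite from the Python standard library. The `department` and `student` tables and their rows are invented for illustration; `dept_id` plays the role of the linking field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER REFERENCES department(dept_id)  -- the linking field
    );
    INSERT INTO department VALUES (1, 'Physics'), (2, 'History');
    INSERT INTO student VALUES (101, 'Asha', 1), (102, 'Bikash', 1), (103, 'Chandra', 2);
""")

# Follow the linking field from each student row to its department row.
rows = conn.execute("""
    SELECT s.name, d.name
    FROM student AS s
    JOIN department AS d ON d.dept_id = s.dept_id
    ORDER BY s.roll_no
""").fetchall()
print(rows)
```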
Codd's Rule
• E. F. Codd was a computer scientist who invented the relational
model for database management.
• Relational databases were created based on the relational model.
• Codd proposed rules, popularly known as Codd's 12 rules, to
test a DBMS against his relational model.
• Codd's rules define the qualities a DBMS requires in
order to be a Relational Database Management
System (RDBMS).
• To date, hardly any commercial product follows all of
Codd's rules (thirteen in total, counting Rule zero). Even Oracle
follows only eight and a half (8.5) out of 12.
• Rule zero: For a system to qualify as an RDBMS, it must be
able to manage the database entirely through its relational capabilities.
• Rule 1 - Information rule: All information (including metadata) is to be represented
as stored data in cells of tables. The rows and columns have to be strictly
unordered.
• Rule 2 - Guaranteed access: Each unique piece of data (atomic value) should be
accessible by table name + primary key (row) + attribute (column). Note that
the ability to access data directly via a pointer is a violation of this rule.
• Rule 3 - Systematic treatment of NULL: Null has several meanings: it can mean
missing data, not applicable, or no value. It should be handled consistently. A primary
key must not be null. An expression on NULL must give NULL.
• Rule 4 - Active online catalog: The database dictionary (catalog) must hold a
description of the database. The catalog is governed by the same rules as the rest of the
database, and the same query language is used on the catalog as on the application
database.
• Rule 5 - Powerful language: One well-defined language must provide
all manners of access to data. Example: SQL. If a file supporting a table can be
accessed in any manner other than through the SQL interface, it is a violation of this rule.
• Rule 6 - View updating rule: All views that are theoretically updatable should be
updatable by the system.
• Rule 7 - Relational-level operations: Insert, delete, and update
operations must be supported at the level of relations. Set operations such as union, intersection, and
minus should also be supported.
• Rule 8 - Physical data independence: The physical storage of data should not
matter to the system. If, say, a file supporting a table is renamed or moved
from one disk to another, it should not affect the application.
• Rule 9 - Logical data independence: If the logical structure (table
structures) of the database changes, the user's view of the data should not change. Say, if a table
is split into two tables, a new view should give the result as the join of the two tables.
This rule is the most difficult to satisfy.
• Rule 10 - Integrity independence: The database should be able to enforce its own
integrity rather than relying on other programs. Key and check constraints, triggers, etc.
should be stored in the data dictionary. This also makes the RDBMS independent of the
front end.
• Rule 11 - Distribution independence: A database should work properly regardless
of its distribution across a network. This lays the foundation of distributed databases.
• Rule 12 - Nonsubversion rule: If low-level access to a system is allowed, it should
not be able to subvert or bypass the integrity rules to change data. This can be achieved
by some sort of locking or encryption.
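Three of the rules (3, 4 and 10) are easy to observe in a running system. The sketch below uses SQLite purely as a convenient example, not as a claim about how fully SQLite satisfies Codd's rules; the `course` and `enrolment` tables are hypothetical, and `sqlite_master` is SQLite's own catalog table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE enrolment (
        student   TEXT NOT NULL,
        course_id INTEGER NOT NULL REFERENCES course(course_id),
        marks     INTEGER CHECK (marks BETWEEN 0 AND 100)
    );
    INSERT INTO course VALUES (1, 'DBMS');
    INSERT INTO enrolment VALUES ('Asha', 1, NULL);
""")

# Rule 3: an arithmetic expression on NULL yields NULL (None in Python).
null_result = conn.execute("SELECT marks + 10 FROM enrolment").fetchone()[0]

# Rule 4: the catalog is itself a table, queried with ordinary SQL.
catalog = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]


# Rule 10: the database itself rejects rows that break stored constraints.
def try_insert(row):
    try:
        conn.execute("INSERT INTO enrolment VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


check_violation = try_insert(("Bikash", 1, 150))  # marks out of range
fk_violation = try_insert(("Bikash", 99, 50))     # no such course
valid_row = try_insert(("Bikash", 1, 50))
print(null_result, catalog, check_violation, fk_violation, valid_row)
```

The constraints live in the schema, so any front end that connects to this database is subject to the same checks.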
Relational model
• The relational data model is the primary data
model, used widely around the world
for data storage and processing.
• The model is simple and has all the
properties and capabilities required to process
data with storage efficiency.
• Introduced by E. F. Codd in 1970.
Relational Algebra
• Relational algebra is a procedural query language that takes
instances of relations as input and yields instances of relations
as output. It uses operators to perform queries. An operator can
be either unary or binary. Operations can be nested, because
intermediate results are also relations.
• The fundamental operations of relational algebra are as
follows:
– Select (σ)
– Project (π)
– Union (∪)
– Set difference (−)
– Cartesian product (×)
– Rename (ρ)
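The six fundamental operations can be sketched directly on Python sets. This is an illustrative model only: a relation is represented as a set of rows, each row a frozenset of (attribute, value) pairs, and the helper names (`relation`, `select`, `project`, and so on) are chosen here, not taken from any library.

```python
def relation(rows):
    """Build a relation: a set of rows, each a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}


def select(r, predicate):        # σ: keep rows that satisfy the predicate
    return {row for row in r if predicate(dict(row))}


def project(r, attrs):           # π: keep only the named attributes
    return {frozenset((a, v) for a, v in row if a in attrs) for row in r}


def union(r, s):                 # ∪: rows in either relation
    return r | s


def difference(r, s):            # −: rows in r but not in s
    return r - s


def product(r, s):               # ×: every pairing (attribute names assumed disjoint)
    return {a | b for a in r for b in s}


def rename(r, mapping):          # ρ: rename attributes
    return {frozenset((mapping.get(a, a), v) for a, v in row) for row in r}


r = relation([{"name": "Asha", "dept": "Physics"},
              {"name": "Bikash", "dept": "History"}])
s = relation([{"name": "Bikash", "dept": "History"}])

physics = select(r, lambda t: t["dept"] == "Physics")
names = project(r, {"name"})
only_r = difference(r, s)
pairs = product(r, rename(s, {"name": "name2", "dept": "dept2"}))
print(len(physics), len(names), len(only_r), len(pairs))
```

Each function takes relations and returns a relation, so the calls can be nested freely, for example `project(select(r, ...), {"name"})`.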
Entity-Relationship Model
ER Model
• Real-world data is captured as entities and
relationships between the entities.
• Can be used for the ontological definition of entities
in a particular domain.
• Components of the ER model:
– Entity (rectangular box)
– Attribute (oval)
– Relationship (diamond)
• Entities are related by relationships, and entities
have attributes.
• An ER diagram is a visual representation of data that
describes how data items are related to each other.
Figure: ER diagram symbols and notations
ER Model - Entity
• An entity can be any object, place, person, or class. In
an ER diagram, an entity is represented by a rectangle.
Consider an organization: Employee,
Manager, Department, Product, and many more can be
taken as entities.
ER Model - Attribute
• An attribute describes a property or characteristic of an entity
and is represented by an ellipse. For example, a student can have the
attributes name, age, address, roll_no, enrolled_courses, marks, etc.
• Key attribute: represents the main characteristic of an entity and is
used to represent the primary key. It is drawn as an ellipse with the
name underlined, e.g. roll_no uniquely identifies a student in
a class.
• Composite attribute: an attribute that has its own
attributes, e.g. address can have city, district, zone,
country, etc.
• Multi-valued attribute: an attribute that can have multiple values,
e.g. a person can have multiple phone numbers.
• Derived attribute: an attribute that can be derived from another
attribute. It is represented by a dashed ellipse, e.g.
age is derived from date of birth.
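The attribute kinds above map naturally onto a small data structure. The `Student` and `Address` classes below are a hypothetical sketch; the field names follow the slide's examples, and the sample values are invented.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:                        # composite attribute: has its own parts
    city: str
    district: str
    country: str


@dataclass
class Student:
    roll_no: int                      # key attribute
    name: str                         # simple attribute
    date_of_birth: date
    address: Address                  # composite attribute
    phone_numbers: list[str] = field(default_factory=list)  # multi-valued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


asha = Student(
    roll_no=1,
    name="Asha",
    date_of_birth=date(2000, 6, 15),
    address=Address("Kathmandu", "Kathmandu", "Nepal"),
    phone_numbers=["555-0100", "555-0101"],
)
print(asha.age(date(2024, 6, 14)), asha.age(date(2024, 6, 15)))
```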
Figure: Composite Attribute
ER Model - Relationship
• A relationship describes an association between entities. A relationship is
represented by a diamond.
• Binary relationship: a relationship between two entities
– One to One: reflects a business rule that one entity is associated with only
one of the other entity, e.g. one man can have only one wife and one woman
can have only one husband.
– One to Many: one entity is associated with many of the other entity, e.g.
one father can have many children, but each child has only one father.
– Many to One: many entities are associated with just one entity, e.g. a
student enrolls in only one course, but a course can have many students.
– Many to Many: many entities are associated with many entities, e.g. one
student can have many teachers and one teacher can have many students.
• Recursive relationship: when an entity is related to itself, it is known as a
recursive relationship, e.g. a student is a friend of another student.
• Ternary relationship: a relationship of degree three is called a ternary
relationship, e.g. staff manages teachers and students.
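When an ER design is turned into tables, a many-to-many relationship is commonly stored in a separate linking table. The sketch below shows the student/teacher example that way; the table names, column names, and rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE teacher (teacher_id INTEGER PRIMARY KEY, name TEXT);
    -- Many-to-many: one row per (student, teacher) pair.
    CREATE TABLE teaches (
        student_id INTEGER REFERENCES student(student_id),
        teacher_id INTEGER REFERENCES teacher(teacher_id),
        PRIMARY KEY (student_id, teacher_id)
    );
    INSERT INTO student VALUES (1, 'Asha'), (2, 'Bikash');
    INSERT INTO teacher VALUES (10, 'Gita'), (20, 'Hari');
    INSERT INTO teaches VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many teachers.
teachers_of_asha = [r[0] for r in conn.execute("""
    SELECT t.name FROM teaches JOIN teacher AS t USING (teacher_id)
    WHERE student_id = 1 ORDER BY t.name""")]

# One teacher, many students.
students_of_gita = [r[0] for r in conn.execute("""
    SELECT s.name FROM teaches JOIN student AS s USING (student_id)
    WHERE teacher_id = 10 ORDER BY s.name""")]

print(teachers_of_asha, students_of_gita)
```

A many-to-one relationship needs no linking table: a single foreign-key column on the "many" side is enough.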
Generalization and specialization
• Generalization is a bottom-up approach in
which two or more lower-level entities combine to form
a higher-level entity. The higher-level entity
can in turn combine with other
lower-level entities to form a still higher-level
entity.
• Specialization is the opposite of generalization. It is
a top-down approach in which one higher-level
entity is broken down into two or more lower-level
entities. In specialization, some higher-level
entities may not have lower-level entity sets at
all.
• Aggregation treats a relationship between
two entities as a single entity. For example, the
relationship between Center and Course can act as
an entity in a relationship with Visitor.
Transaction
• A transaction can be defined as a group of tasks. A single
task is the minimum processing unit, which cannot
be divided further.
• Let's take an example of a simple transaction. Suppose
a bank employee transfers Rs 500 from A's account to
B's account. This very simple and small transaction
involves several low-level tasks.
A's Account
• Open_Account(A)
• Old_Balance = A.balance
• New_Balance = Old_Balance - 500
• A.balance = New_Balance
• Close_Account(A)
B's Account
• Open_Account(B)
• Old_Balance = B.balance
• New_Balance = Old_Balance + 500
• B.balance = New_Balance
• Close_Account(B)
• Either all of the tasks should be completed or none. If
only some of them complete, the result is an error.
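The all-or-nothing behaviour of the transfer can be sketched with SQLite. The `account` table, the starting balances, and the `transfer` helper are assumptions for illustration. The credit is deliberately applied before the debit so that a failed debit leaves a half-finished change for the rollback to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 200)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one unit; undo both if either step fails."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer(conn, "A", "B", 500)        # succeeds
after_ok = balances(conn)
failed = transfer(conn, "A", "B", 5000)   # debit would make A negative
after_failed = balances(conn)
print(ok, after_ok, failed, after_failed)
```

After the failed transfer, B's balance is unchanged even though the credit statement had already run inside the transaction.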
Example of concurrency
• Hemant's account balance = 100000
• Three cheques drawn on Hemant's account are all processed at the same time.
Cheque to Rajan
• Rajan's balance = 0
• Rajan's amount = 10000
• read Hemant's balance = 100000
• new Hemant's balance = 90000
• new Rajan's balance = 10000
• update Hemant's balance
• update Rajan's balance
Cheque to Nisha
• Nisha's balance = 0
• Nisha's amount = 5000
• read Hemant's balance = 100000
• new Hemant's balance = 95000
• new Nisha's balance = 5000
• update Hemant's balance
• update Nisha's balance
Cheque to Gagan
• Gagan's balance = 0
• Gagan's amount = 2000
• read Hemant's balance = 100000
• new Hemant's balance = 98000
• new Gagan's balance = 2000
• update Hemant's balance
• update Gagan's balance
• Because every transaction read the same starting balance, each update
overwrites the previous one. Hemant's final balance is whichever value is
written last, not the correct 83000.
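The problem in this example (a lost update) can be reproduced with a deterministic simulation, with no threads needed. The sketch uses the cheque amounts implied by the new balances on the slide (10000, 5000 and 2000); the function names are hypothetical.

```python
def run_interleaved(balance, cheques):
    """Every transaction reads the balance before any of them writes it back."""
    reads = {payee: balance for payee in cheques}   # all three read the same value
    for payee, amount in cheques.items():
        balance = reads[payee] - amount             # overwrites the earlier updates
    return balance


def run_serial(balance, cheques):
    """Each transaction reads the balance left by the previous one."""
    for amount in cheques.values():
        balance -= amount
    return balance


cheques = {"Rajan": 10000, "Nisha": 5000, "Gagan": 2000}
interleaved = run_interleaved(100000, cheques)
serial = run_serial(100000, cheques)
print(interleaved)  # 98000: only the last write survives
print(serial)       # 83000: the correct balance
```

The payees together receive 17000, but in the interleaved run only 2000 leaves Hemant's account.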
ACID
• Atomicity. In a transaction involving two or more discrete pieces of
information, either all of the pieces are committed or none are.
• Consistency. A transaction either creates a new and valid state of
data, or, if any failure occurs, returns all data to its state before the
transaction was started.
• Isolation. A transaction in process and not yet committed must
remain isolated from any other transaction.
• Durability. Committed data is saved by the system such that, even
in the event of a failure and system restart, the data is available in
its correct state.
• Each of these attributes can be measured against a benchmark. In
general, however, a transaction manager or monitor is designed to
realize the ACID concept. It ensures that either every part of a transaction
commits or none does, in which case the transaction is rolled back.
Authentication
Authorization
Audit
Database Security
• A secure database system should satisfy three
basic requirements on data protection:
• Secrecy (confidentiality) - preventing, detecting, and deterring
improper disclosure of information. This is
especially important in strongly protected
environments (e.g. the army).
• Integrity - preventing, detecting, and deterring
improper changes to information. The proper
functioning of any organization depends on proper
operations on proper data.
• Availability - preventing improper
denial of the services that the DBMS provides.
Security Threat
• A security threat comes from any agent that can obtain or
change information, whether at random or intentionally.
• Random (accidental) security threats
– Natural or accidental disasters - data or hardware is
damaged, which leads to integrity violations and denial of
service.
– Errors, design flaws, and bugs in hardware and software -
cause improper application of security policies.
– Human errors - unintentional violations such as incorrect
input or wrong use of applications.
– Overloads, performance constraints, and capability issues
• Intentional security threats
– Authorized users - abuse their privileges
– Hostile agents - various hostile programs: viruses, Trojan
horses, back doors
Requirements of DB security
• Protection from improper access - only authorized users should be
granted access
• Protection from inference - inference of confidential information
from available data should be prevented
• Database integrity - integrity of data during and after
transactions; ensured with transactions and various back-up and
recovery procedures
• Semantic data integrity - ensured with integrity constraints
• Accountability and auditing - log data accesses
• User authentication - unambiguous identification of each user
• Management and protection of sensitive data - access should be
granted only to a narrow circle of users
• Multilevel security - data may be classified, and access rights given
according to its sensitivity
• Confinement (subject isolation) - isolate subjects to avoid
uncontrolled data flow between programs
Views
• A view is the result set of a stored query in a database.
• A view is a virtual table based on the result set of an
SQL statement.
• This predefined query is kept in the
database dictionary.
• A view contains rows and columns, just like a real
table. The fields in a view are fields from one or more
real tables in the database.
• You can add SQL clauses and functions to a view's query and present the
data as if it came from a single table.
• A view can be queried like a real table.
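A minimal sketch of these points in SQLite: the view is stored as a query in the catalog, it is queried like a table, and it reflects later changes to the base table. The `employee` table, its rows, and the `it_staff` view are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employee VALUES
        (1, 'Asha', 'Sales', 50000),
        (2, 'Bikash', 'IT', 65000),
        (3, 'Chandra', 'IT', 70000);
    -- The view stores only the query; it leaves out the salary column.
    CREATE VIEW it_staff AS
        SELECT emp_id, name FROM employee WHERE dept = 'IT';
""")

# Query the view exactly like a table.
it_rows = conn.execute("SELECT * FROM it_staff ORDER BY emp_id").fetchall()

# The view's definition is kept in the catalog (sqlite_master in SQLite).
view_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'view' AND name = 'it_staff'"
).fetchone()[0]

# A view holds no data of its own, so it reflects later changes to the base table.
conn.execute("INSERT INTO employee VALUES (4, 'Dipak', 'IT', 60000)")
count_after = conn.execute("SELECT COUNT(*) FROM it_staff").fetchone()[0]

print(it_rows, count_after)
```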
Benefits of views
• Hide complexity: If you have a query that requires joining several
tables, or has complex logic or calculations, you can code all that
logic into a view, then select from the view just like you would a
table.
• Can be used as a security mechanism: A view can select certain
columns and/or rows from a table, and permissions can be set on the view
instead of the underlying tables. This exposes only the data
that a user needs to see.
• Aggregate or de-normalize data: If you have split a table into smaller
parts, views can present them as one joined table without changing the underlying
model. This is frequently used for reporting.
• Simplify supporting legacy code: If you need to refactor a table that
would break a lot of code, you can replace the table with a view of
the same name. The view provides the exact same schema as the
original table, while the actual schema has changed. This keeps the
legacy code that references the table from breaking, allowing you to
change the legacy code at your leisure.
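The legacy-code benefit can be sketched as follows. The `customer` table, its replacement `customer_v2`, and the sample row are hypothetical; the point is that the old query runs unchanged before and after the refactoring.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, full_name TEXT);
    INSERT INTO customer VALUES (1, 'Asha Rai');
""")

# A query embedded in legacy code that we do not want to touch.
legacy_query = "SELECT id, full_name FROM customer ORDER BY id"
before = conn.execute(legacy_query).fetchall()

# Refactor: split the name into two columns in a new table,
# then put a view with the old name and old columns in its place.
conn.executescript("""
    CREATE TABLE customer_v2 (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    INSERT INTO customer_v2 VALUES (1, 'Asha', 'Rai');
    DROP TABLE customer;
    CREATE VIEW customer AS
        SELECT id, first_name || ' ' || last_name AS full_name FROM customer_v2;
""")

after = conn.execute(legacy_query).fetchall()
print(before, after)
```

This covers legacy code that reads. In SQLite, views are read-only, so legacy code that writes to the old table would additionally need INSTEAD OF triggers on the view.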
RDBMS
• A relational database management system (RDBMS) is a
DBMS that is based on the relational model as invented by
E. F. Codd.
• RDBMSs have often replaced legacy hierarchical databases
and network databases because they are easier to
understand and use.
• Other DBMSs, such as object database management
systems and XML database management systems, have tried
and failed to replace RDBMSs.
• Despite such attempts, RDBMSs keep most of the market
share, which has also grown over the years.
• Today, an RDBMS is the most common choice for the storage
of information in new databases used for financial records,
manufacturing and logistical information, personnel data,
and many other applications.
What is NoSQL?
• NoSQL (Not Only SQL) represents a completely different framework
of databases that allows for high-performance, agile processing of
information at massive scale. It is well adapted to the heavy
demands of big data.
• NoSQL achieves this efficiency because, unlike relational
databases, which are highly structured, NoSQL databases are
unstructured in nature, trading off stringent consistency
requirements for speed and agility.
• NoSQL centers around the concept of distributed databases, where
unstructured data may be stored across multiple processing nodes,
and often across multiple servers.
• The distributed architecture makes it horizontally scalable: as
data continues to grow, more hardware can be added to keep up,
ideally with little loss of performance.
• The NoSQL distributed database infrastructure has been the
solution for handling some of the biggest data warehouses on the
planet, e.g. those of Google, Amazon, and the CIA.
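Spreading data across nodes can be sketched with simple hash partitioning. This is a toy model, not how any particular NoSQL product works: `ShardedStore` and `node_for` are hypothetical names, and each Python dictionary stands in for one server.

```python
import hashlib


def node_for(key: str, node_count: int) -> int:
    """Pick a node from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


class ShardedStore:
    """Key-value store whose data is spread over several nodes."""

    def __init__(self, node_count: int):
        self.nodes = [{} for _ in range(node_count)]  # each dict stands in for one server

    def put(self, key: str, value) -> None:
        self.nodes[node_for(key, len(self.nodes))][key] = value

    def get(self, key: str):
        return self.nodes[node_for(key, len(self.nodes))].get(key)


store = ShardedStore(node_count=4)
# Values need not share a schema: each document carries its own fields.
store.put("user:1", {"name": "Asha", "phones": ["555-0100"]})
store.put("user:2", {"name": "Bikash", "city": "Pokhara"})
store.put("order:9", {"items": 3})

print(store.get("user:2"))
print([len(node) for node in store.nodes])
```

With plain modulo hashing, changing the number of nodes moves most keys to a different node; consistent hashing is a common way to reduce that movement.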
NoSQL vs RDBMS
• NoSQL – suitable for non-transactional data
• Advantages
– Highly scalable
– Highly fault tolerant
– Inexpensive
– Easy to set up and use
• Disadvantages
– Weak functionalities
• SQL, schemas, indexes, query optimization, transactions
• Parallel RDBMS – suitable for transactional data
• Advantages
– Strong functionalities
• SQL, schemas, indexes, query optimization, transactions
• Disadvantages
– Difficult to scale
– Expensive
– Not suitable where faults occur frequently
– Harder to set up and use
Figure: NoSQL vs Relational
a) Entity-relationship diagram
b) Entity diagram
c) Database diagram
d) Architectural representation
Answer: a
Explanation: E-R diagrams are simple and clear,
qualities that may well account in large part for the
widespread use of the E-R model.
a) Rectangle
b) Oval
c) Circle
d) Diamond
Answer: d
Explanation: In an ER diagram, a relationship is represented
by a diamond.
Authentication is
Answer: b
The select operation selects tuples that satisfy a given
predicate.
Answer: a
The predicate or condition appears as a subscript and
the relation or table appears in brackets, as follows:
σ_condition(relation)
Answer: b
The expression r − s produces a relation containing
those tuples in r but not in s.
Answer: b
Any relation that is not part of the logical model,
but is made visible to a user as a virtual relation, is
called a view.
Answer: b
Here one entity in one set is related to one entity
in the other set.
Answer: c
The data entered will be in a particular cell (i.e., table column).
Answer: b
Answer: b
A primary key enforces both the NOT NULL and the uniqueness
constraints.
Answer: a
Constraints are specified to restrict entries in the
relation.