Prabin Babu Dhakal, CDPA, TU

Database System, Concept and Architecture

Database Systems, Concepts and Architecture
•  Database environment
•  DBMS architecture and data independence
•  Data models
•  E-R model: entity types, attributes, keys and relationship types
•  Codd's rules
•  Relational model: introduction to relational databases; relational algebra and kinds of relation
•  Integrity constraints and integrity violations
•  Transactions and ACID properties
•  Access control and authorization; security and views
•  Parallel processing in RDBMS and NoSQL

Database Environment
•  A database environment is a system of components that regulate the collection, management and use of data. It includes
–  software
–  hardware
–  people
–  procedures
–  data

Components of Database Environment
•  CASE tools: computer-aided software engineering tools
•  Repository: centralized storehouse of metadata
•  Database Management System (DBMS): software for managing the database
•  Database: storehouse of the data
•  Application programs: software using the data
•  User interface: text and graphical displays presented to users
•  Data/database administrators: personnel responsible for maintaining the database
•  System developers: personnel responsible for designing databases and software
•  End users: people who use the applications and databases

Database Architecture
•  The design of a DBMS depends on its architecture.
•  A DBMS can be centralized, decentralized or hierarchical; parallel or distributed.
•  The architecture of a DBMS can be seen as either single-tier (level) or multi-tier (level).
•  An n-tier architecture divides the whole system into n related but independent modules, which can be independently modified, altered, changed, or replaced.
•  This separation is what makes data independence possible.

One-tier architecture
•  All data, application and user interface reside in the same place.
•  The application program and the data in the database cannot be separated.
•  Data independence is harder to obtain.

Two-tier client/server architecture
•  The user interface program and the application programs run on the client side.
•  An interface called ODBC (Open Database Connectivity) provides an API that allows client-side programs to call the DBMS.
•  Most DBMS vendors provide ODBC drivers.
•  A client program may connect to several DBMSs.
•  Variations of this architecture are possible. In some DBMSs more functionality is transferred to the client, including the data dictionary, optimization, etc. In such cases the server is called a data server.

Three-tier client-server architecture

[Figure: three stacked tiers: Presentation Tier, Application Tier, Database Tier]

•  Commonly used architecture for web applications.
•  An intermediate layer, called the application server or web server, holds the web connectivity software and the business logic (constraints) of the application, which is used to fetch the right data from the database server.
•  This layer acts as a medium for sending partially processed data between the database server and the client.

Levels in 3-tier architecture
•  Database (Data) Tier − At this tier, the database resides along with its query processing languages. The relations that define the data and their constraints also live at this level.
•  Application (Middle) Tier − At this tier reside the application server and the programs that access the database. The application holds the database connectivity and business logic for the whole system. It presents an abstracted view of the database to the users, who are unaware of any existence of the database beyond the application. At the other end, the database tier is not aware of any user beyond the application tier. Hence, the application layer sits in the middle and acts as a mediator between the end user and the database.
•  User (Presentation) Tier − End users operate on this tier and know nothing about any existence of the database beyond this layer. At this layer, multiple views of the database can be provided by the application. All views are generated by applications that reside in the application tier.
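
The three tiers described above can be sketched in a few lines. This is only an illustration, assuming Python's built-in sqlite3 module as the database tier; the student table, the StudentService class and the render function are hypothetical names that do not come from the slides.

```python
import sqlite3

# Database tier: holds the relations and their constraints.
def make_database():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO student VALUES (?, ?)", [(1, "Ram"), (2, "Sita")]
    )
    return conn

# Application tier: business logic; the only code that talks to the database.
class StudentService:
    def __init__(self, conn):
        self._conn = conn

    def student_name(self, roll_no):
        row = self._conn.execute(
            "SELECT name FROM student WHERE roll_no = ?", (roll_no,)
        ).fetchone()
        return row[0] if row else None

# Presentation tier: formats results for the end user and contains no SQL.
def render(service, roll_no):
    name = service.student_name(roll_no)
    if name is None:
        return f"Student {roll_no}: not found"
    return f"Student {roll_no}: {name}"

service = StudentService(make_database())
print(render(service, 1))
```

Because the presentation code only calls the service, the database tier could in principle be swapped without touching render, which is the independence the n-tier design aims for.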

Data Model
•  A data model defines the logical design of data in a database.
•  The model describes the relationships between different parts of the data.
•  Historically, three models have been most commonly used in database design. These are record-based logical models:
–  Hierarchical model
–  Network model
–  Relational model
•  Object-based logical models are:
–  ER model
–  Object-oriented model

Hierarchical Model
•  In this model each entity has only one parent but can have several children.
•  At the top of the hierarchy there is only one entity, which is called the root.
•  Different levels of data are defined.
•  The most important entity is modeled as the root, the second most important entity is modeled under the root, and so on.

Network Model
•  In the network model, entities are organized in a graph, in which some entities can be reached through several paths.
•  Data is highly accessible, but the model is difficult to design and build.

Relational Model
•  In this model, data is organized in two-dimensional tables called relations.
•  The tables (relations) are related to each other through special fields.
•  It is the most widely used model and is highly effective.
•  RDBMS gets its name from the relational model.

Codd's Rule
•  E. F. Codd was a computer scientist who invented the relational model for database management.
•  Relational databases were created on the basis of the relational model.
•  Codd proposed a set of rules, popularly known as Codd's 12 rules, to test a DBMS against his relational model.
•  Codd's rules define the qualities a DBMS requires in order to count as a Relational Database Management System (RDBMS).
•  Counting rule zero there are 13 rules, and so far hardly any commercial product follows all of them. Even Oracle is said to follow only eight and a half (8.5) of the 12 numbered rules.

Codd's Rule
•  Rule zero: For a system to qualify as an RDBMS, it must be able to manage the database entirely through its relational capabilities.
•  Rule 1 - Information rule: All information (including metadata) is to be represented as stored data in cells of tables. The rows and columns have to be strictly unordered.
•  Rule 2 - Guaranteed access: Each unique piece of data (atomic value) should be accessible by: table name + primary key (row) + attribute (column). Note that the ability to access data directly via a pointer is a violation of this rule.
•  Rule 3 - Systematic treatment of NULL: Null has several meanings; it can mean missing data, not applicable or no value. It should be handled consistently. A primary key must not be null. An expression on NULL must give NULL.
•  Rule 4 - Active online catalog: The database dictionary (catalog) must hold the description of the database. The catalog is governed by the same rules as the rest of the database, and the same query language is used on the catalog as on the application database.
•  Rule 5 - Powerful language: There must be one well-defined language that provides all manners of access to data. Example: SQL. If a file supporting a table can be accessed by any means other than the SQL interface, it is a violation of this rule.
•  Rule 6 - View updating rule: All views that are theoretically updatable should be updatable by the system.

Codd's Rule
•  Rule 7 - Relational level operation: Insert, delete and update operations must be supported at the level of whole relations. Set operations such as union, intersection and minus should also be supported.
•  Rule 8 - Physical data independence: The physical storage of data should not matter to the system. If, say, a file supporting a table is renamed or moved from one disk to another, it should not affect the application.
•  Rule 9 - Logical data independence: If the logical structure (table structures) of the database changes, the user's view of the data should not change. Say a table is split into two tables; a new view should give the result as the join of the two tables. This rule is the most difficult to satisfy.
•  Rule 10 - Integrity independence: The database should be able to enforce its own integrity rather than rely on other programs. Key and check constraints, triggers, etc. should be stored in the data dictionary. This also makes the RDBMS independent of the front end.
•  Rule 11 - Distribution independence: A database should work properly regardless of its distribution across a network. This lays the foundation for distributed databases.
•  Rule 12 - Nonsubversion rule: If low-level access to a system is allowed, it should not be able to subvert or bypass the integrity rules to change data. This can be achieved by some sort of locking or encryption.
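
A few of these rules can be observed directly in a small relational engine. The sketch below assumes Python's built-in sqlite3 module and a hypothetical citizen table; it touches only rules 2, 3 and 4 and is not a conformance test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE citizen (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.execute("INSERT INTO citizen VALUES (1, 'Ram', NULL)")

# Rule 2: a value is reached by table name + primary key + column name.
name = conn.execute("SELECT name FROM citizen WHERE id = ?", (1,)).fetchone()[0]

# Rule 3: an arithmetic expression involving NULL yields NULL (None in Python).
null_sum = conn.execute("SELECT NULL + 1").fetchone()[0]

# Rule 4: the catalog is itself a table, queried with the same language.
# sqlite_master is SQLite's catalog table.
catalog = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(name, null_sum, catalog)
```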

Relational model
•  The relational data model is the primary data model, used widely around the world for data storage and processing.
•  The model is simple and has all the properties and capabilities required to process data with storage efficiency.
•  Developed by E. F. Codd in the 1970s.

Definitions of terms in relational models
•  Tables − In the relational data model, relations are saved in the format of tables. This format stores the relation among entities. A table has rows and columns, where rows represent records and columns represent attributes.
•  Tuple − A single row of a table, which contains a single record for that relation, is called a tuple.
•  Relation instance − A finite set of tuples in the relational database system represents a relation instance. Relation instances do not have duplicate tuples.
•  Relation schema − A relation schema describes the relation name (table name), its attributes, and their names.
•  Relation key − Each row has one or more attributes, known as the relation key, which can identify the row in the relation (table) uniquely.
•  Attribute domain − Every attribute has some predefined value scope, known as the attribute domain.

Concepts in Relational Model: Constraints
•  Every relation has some conditions that must hold for it to be a valid relation. There are three main integrity constraints (integrity meaning the soundness or authenticity of the data; a constraint being an obligation or restriction).
•  Key Constraints
–  There must be at least one minimal subset of attributes in the relation that can identify a tuple uniquely. This minimal subset of attributes is called the key for that relation. If there is more than one such minimal subset, these are called candidate keys.
–  Key constraints enforce that
•  in a relation with a key attribute, no two tuples can have identical values for the key attributes.
•  a key attribute cannot have NULL values.
•  Key constraints are also referred to as entity constraints.
•  Domain Constraints
–  Attributes have specific values in real-world scenarios. For example, age can only be a positive integer. The same constraints are applied to the attributes of a relation. Every attribute is bound to a specific range of values. For example, age cannot be less than zero and telephone numbers cannot contain a digit outside 0-9.
•  Referential Integrity Constraints
–  Referential integrity constraints work on the concept of foreign keys. A foreign key is an attribute of one relation that refers to a key attribute of another (or the same) relation.
–  The referential integrity constraint states that if a relation refers to a key attribute of a different or the same relation, then that key value must exist.
•  Integrity Violations: if these rules are not clearly defined, or the data violates the integrity constraints, it is called an integrity violation (that is, integrity has been violated, encroached on, or broken). Queries on such data will not give good results, and the data may become junk and unusable. So integrity constraints must be followed.
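
The three kinds of constraint can be declared and violated on purpose to see how an RDBMS reacts. This is a minimal sketch assuming Python's built-in sqlite3 module; the department and employee tables and the violates helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,                     -- key constraint
        age     INTEGER CHECK (age >= 0),                -- domain constraint
        dept_id INTEGER REFERENCES department(dept_id)   -- referential integrity
    )
    """
)
conn.execute("INSERT INTO department VALUES (10, 'Accounts')")
conn.execute("INSERT INTO employee VALUES (1, 30, 10)")

def violates(sql):
    """Return True if the statement is rejected as an integrity violation."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

key_violation = violates("INSERT INTO employee VALUES (1, 25, 10)")          # duplicate key
domain_violation = violates("INSERT INTO employee VALUES (2, -5, 10)")       # negative age
referential_violation = violates("INSERT INTO employee VALUES (3, 40, 99)")  # no such department
print(key_violation, domain_violation, referential_violation)
```

Each rejected insert leaves the table unchanged, so the bad row never reaches later queries.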

Relational Algebra
•  Relational Algebra
–  Relational algebra is a procedural query language, which takes instances of relations as input and yields instances of relations as output. It uses operators to perform queries. An operator can be either unary or binary. Operators accept relations as their input and yield relations as their output. Relational algebra is applied recursively on a relation, and intermediate results are also considered relations.
•  The fundamental operations of relational algebra are as follows −
–  Select (σ)
–  Project (π)
–  Union (∪)
–  Set difference (−)
–  Cartesian product (×)
–  Rename (ρ)

Example: Relational Algebra with table A

No  Name   Address     Phone  Income
1   Ram    Ayodhya     45455  50000
2   Hari   Baikuntha   67656  60000
3   Shyam  Brindaban   56765  40000
4   Radha  Brindaban   45465  50000
5   Sita   Ayodhya     87987  30000
6   Gauri  Kailash     23435  40000
7   Shiva  Everywhere  78678  10000
8   Indra  Sworga      54654  30000
9   Laxmi  Baikuntha   13435  90000
10  Kali   Everywhere  98796  10000
11  Maya   Everywhere  56246  10000

•  Project name and phone of all tuples
–  π_{name, phone}(A)
•  Select tuples whose address is Brindaban
–  σ_{address = Brindaban}(A)
•  Select tuples whose address is Baikuntha or Brindaban or Ayodhya
–  σ_{address = Baikuntha ∨ address = Brindaban ∨ address = Ayodhya}(A)
•  Select name and address of tuples whose income is greater than 30000, not including those from Ayodhya
–  π_{name, address}(σ_{income > 30000}(A) − σ_{address = Ayodhya}(A))
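
The select, project and set-difference operators can be tried on table A with plain Python lists. This is a sketch for intuition only: the function names are hypothetical, and a real DBMS evaluates such expressions very differently.

```python
COLUMNS = ("no", "name", "address", "phone", "income")
ROWS = [
    (1, "Ram", "Ayodhya", 45455, 50000),
    (2, "Hari", "Baikuntha", 67656, 60000),
    (3, "Shyam", "Brindaban", 56765, 40000),
    (4, "Radha", "Brindaban", 45465, 50000),
    (5, "Sita", "Ayodhya", 87987, 30000),
    (6, "Gauri", "Kailash", 23435, 40000),
    (7, "Shiva", "Everywhere", 78678, 10000),
    (8, "Indra", "Sworga", 54654, 30000),
    (9, "Laxmi", "Baikuntha", 13435, 90000),
    (10, "Kali", "Everywhere", 98796, 10000),
    (11, "Maya", "Everywhere", 56246, 10000),
]
# Relation A: one dict per tuple.
A = [dict(zip(COLUMNS, row)) for row in ROWS]

def select(relation, predicate):
    """sigma: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """pi: keep only the named attributes and drop duplicate rows."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attributes}
        if row not in result:
            result.append(row)
    return result

def difference(r, s):
    """Set difference: tuples of r that do not appear in s."""
    return [t for t in r if t not in s]

brindaban = select(A, lambda t: t["address"] == "Brindaban")
well_paid_not_ayodhya = project(
    difference(
        select(A, lambda t: t["income"] > 30000),
        select(A, lambda t: t["address"] == "Ayodhya"),
    ),
    ["name", "address"],
)
print([t["name"] for t in brindaban])
print([t["name"] for t in well_paid_not_ayodhya])
```

Every function takes relations and returns a relation, which is why the operators can be nested as in the last expression.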

Other Relational Algebra operators
•  Rename: change the alias of a relation name or attribute name A to z
–  A ρ z
•  Update attribute A with the value of expression E in relation R
–  δ_{A ← E}(R)
–  If a condition c has to be supplied, select from the relation first
–  δ_{A ← E}(σ_c(R))
–  Increase all balances by 10% in relation deposit
–  δ_{balance ← balance * 1.1}(deposit)
•  Delete the tuples in relation R that satisfy a condition
–  R ← R − σ_{condition}(R)
–  Delete all records of account number 33221
–  deposit ← deposit − σ_{accountno = 33221}(deposit)
•  Insert the tuples denoted by expression E into relation R
–  R ← R ∪ E
–  E.g. to insert a tuple for Ram, who has a balance of 5000 in account 33221
–  deposit ← deposit ∪ {(33221, "Ram", 5000)}
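
The delete, insert and update assignments map closely onto Python set operations. The sketch below assumes a deposit relation with the attribute order (accountno, name, balance); the starting tuples and function names are made up for illustration.

```python
# deposit as a set of (accountno, name, balance) tuples.
deposit = {(33221, "Hari", 1000), (11111, "Sita", 2000)}

def delete(relation, condition):
    """R <- R - sigma_condition(R)"""
    return relation - {t for t in relation if condition(t)}

def insert(relation, new_tuples):
    """R <- R U E"""
    return relation | set(new_tuples)

def update(relation, expression, condition=lambda t: True):
    """Apply expression to every tuple that satisfies the condition."""
    return {expression(t) if condition(t) else t for t in relation}

# Delete all records of account number 33221.
deposit = delete(deposit, lambda t: t[0] == 33221)
# Insert the tuple for Ram, who has 5000 in account 33221.
deposit = insert(deposit, [(33221, "Ram", 5000)])
# Increase all balances by 10% (rounded to avoid floating-point noise).
deposit = update(deposit, lambda t: (t[0], t[1], round(t[2] * 1.1, 2)))
print(sorted(deposit))
```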

Entity-Relationship Model

Ontology definition in a particular domain

ER Model
•  Real-world data is captured as entities and the relationships between the entities.
•  Can be used for the ontological definition of entities in a particular domain.
•  Components of the ER model
–  Entity (rectangular box)
–  Attribute (oval)
–  Relationship (diamond)
•  Entities are related by relationships, and entities have attributes.
•  An ER diagram is a visual representation of data that describes how data items are related to each other.

[Figure: ER diagram symbols and notations]

ER Model - Entity
•  An entity can be any object, place, person or class. In an E-R diagram, an entity is represented by a rectangle. Consider an organization: Employee, Manager, Department, Product and many more can be taken as its entities.
•  A weak entity is an entity that depends on another entity. A weak entity does not have a key attribute of its own. A double rectangle represents a weak entity.

ER Model - Attribute
•  An attribute describes a property or characteristic of an entity. For example, Name, Age, Address, etc. can be attributes of a Student. An attribute is represented by an ellipse. E.g. a student can have the attributes name, address, roll_no, enrolled_courses, marks, etc.
•  Key attribute: represents the main characteristic of an entity. It is used to represent the primary key. An ellipse with the name underlined represents a key attribute. E.g. roll_no uniquely identifies a student in a class.
•  Composite attribute: an attribute that has attributes of its own. E.g. Address can have city, district, zone, country, etc.
•  Multi-valued attribute: an attribute that can have multiple values. E.g. a person can have multiple phone numbers.
•  Derived attribute: an attribute that can be derived from another attribute. It is represented by a dashed ellipse. E.g. age is derived from date of birth.

[Figure: entity with attributes and key; composite attribute]

ER Model - Relationship
•  A relationship describes relations between entities. A relationship is represented by a diamond.
•  Binary relationship: a relation between two entities
–  One to One: reflects a business rule that one entity is associated with only one of the other entity. E.g. one man can have only one wife and one woman can have only one husband.
–  One to Many: one entity is associated with many instances of another entity. E.g. one father can have many children, but each child has only one father.
–  Many to One: many entities can be associated with just one entity. E.g. a student enrolls in only one course, but a course can have many students.
–  Many to Many: many entities can be associated with many entities. E.g. one student can have many teachers and one teacher can have many students.
•  Recursive relationship: when an entity is related to itself. E.g. a student is a friend of another student.
•  Ternary relationship: a relationship of degree three. E.g. staff manages teachers and students.

Generalization and specialization
•  Generalization is a bottom-up approach in which two lower-level entities combine to form a higher-level entity. The higher-level entity can in turn combine with other lower-level entities to form a still higher-level entity.
•  Specialization is the opposite of generalization. It is a top-down approach in which one higher-level entity is broken down into two lower-level entities. In specialization, some higher-level entities may not have lower-level entity sets at all.
•  Aggregation treats the relationship between two entities as a single entity. For example, the relationship between Center and Course can act as an entity in a relationship with Visitor.

E-R diagram to relational table
•  Only the general rules are described here; details are not covered.
•  Each entity is modeled as a table in the relational database management system.
•  Each attribute of the entity is modeled as a field of the table.
•  One or more of the attributes are set as the primary key.
•  Map each relationship to a table: put the primary key fields of the associated entities in the relationship table, declare foreign key constraints, and add the attributes of the relationship, if any.
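
Applied to a small example, these rules give one table per entity and one table per relationship. The sketch assumes Python's built-in sqlite3 module and a hypothetical ER diagram in which Student and Course are entities and Enrolls is a many-to-many relationship with a marks attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Each entity becomes a table; its key attribute becomes the primary key.
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- The relationship becomes a table holding the keys of both entities
    -- as foreign keys, plus the relationship's own attribute (marks).
    CREATE TABLE enrolls (
        roll_no   INTEGER NOT NULL REFERENCES student(roll_no),
        course_id INTEGER NOT NULL REFERENCES course(course_id),
        marks     INTEGER,
        PRIMARY KEY (roll_no, course_id)
    );

    INSERT INTO student VALUES (1, 'Ram'), (2, 'Sita');
    INSERT INTO course  VALUES (100, 'Databases');
    INSERT INTO enrolls VALUES (1, 100, 80), (2, 100, 90);
    """
)

rows = conn.execute(
    """
    SELECT s.name, c.title, e.marks
    FROM enrolls e
    JOIN student s ON s.roll_no = e.roll_no
    JOIN course  c ON c.course_id = e.course_id
    ORDER BY s.roll_no
    """
).fetchall()
print(rows)
```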

How to design a data model?
•  List the main fields or columns.
•  Start with the most important among them.
•  If you cannot identify the most important one at first, start anyway; the design will settle as you iterate.
•  In our example start with citizen, which is the most important entity in a citizenship management system.
•  Once you have built the citizen table, decide which of the other columns are attributes of citizen and which are not.
•  The columns that are attributes are put in the citizen table.
•  From the remaining fields, create the second most important table, and repeat the process until all attributes are placed.

Transactions and ACID

Transaction
•  A transaction can be defined as a group of tasks. A single task is the minimum processing unit, which cannot be divided further.
•  Let's take an example of a simple transaction. Suppose a bank employee transfers Rs 500 from A's account to B's account. This very simple and small transaction involves several low-level tasks.
•  Either all of the tasks should be completed or none. If some don't complete, ERROR!

A's Account
•  Open_Account(A)
•  Old_Balance = A.balance
•  New_Balance = Old_Balance - 500
•  A.balance = New_Balance
•  Close_Account(A)

B's Account
•  Open_Account(B)
•  Old_Balance = B.balance
•  New_Balance = Old_Balance + 500
•  B.balance = New_Balance
•  Close_Account(B)
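
The all-or-nothing behaviour of the transfer can be sketched with Python's built-in sqlite3 module. The account table, the CHECK constraint on the balance and the transfer function are assumptions made for this illustration; they are not part of the slide's pseudocode.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 200)])
conn.commit()

def transfer(conn, source, target, amount):
    """Move amount from source to target; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

ok = transfer(conn, "A", "B", 500)       # succeeds
failed = transfer(conn, "A", "B", 5000)  # debit would go negative, so it is rolled back
print(ok, failed, balances(conn))
```

In the failed call the credit to B has already been executed when the debit is rejected; the rollback undoes it, so no money appears from nowhere.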

Example of concurrency
•  Hemant's account: balance = 100000
•  Cheques on Hemant's account are all processed at the same time

Cheque to Rajan
•  Rajan balance = 0
•  Rajan amount = 10000
•  read Hemant's balance = 100000
•  new Hemant's balance = 90000
•  new Rajan's balance = 10000
•  update Hemant's balance
•  update Rajan's balance

Cheque to Nisha
•  Nisha balance = 0
•  Nisha amount = 5000
•  read Hemant's balance = 100000
•  new Hemant's balance = 95000
•  new Nisha's balance = 5000
•  update Hemant's balance
•  update Nisha's balance

Cheque to Gagan
•  Gagan balance = 0
•  Gagan amount = 2000
•  read Hemant's balance = 100000
•  new Hemant's balance = 98000
•  new Gagan's balance = 2000
•  update Hemant's balance
•  update Gagan's balance

•  All three transactions read the same starting balance, so whichever update is written last wins. Hemant ends with 90000, 95000 or 98000 instead of the correct 83000.
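
The lost update in this example can be reproduced deterministically, without threads, by making every cheque read the balance before any of them writes. The sketch assumes a starting balance of 100000 and cheques of 10000, 5000 and 2000; the variable names are illustrative.

```python
hemant = {"balance": 100000}
cheques = {"Rajan": 10000, "Nisha": 5000, "Gagan": 2000}

# Uncontrolled interleaving: every cheque reads the balance first ...
reads = {payee: hemant["balance"] for payee in cheques}
# ... and only then writes back its own result, overwriting the others.
for payee, amount in cheques.items():
    hemant["balance"] = reads[payee] - amount
lost_update_balance = hemant["balance"]

# Serial execution: each cheque sees the result of the previous one.
serial_balance = 100000
for amount in cheques.values():
    serial_balance -= amount

print(lost_update_balance, serial_balance)
```

The isolation property of ACID exists to guarantee the serial outcome even when transactions actually run at the same time.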

ACID
•  Atomicity. In a transaction involving two or more discrete pieces of information, either all of the pieces are committed or none are.
•  Consistency. A transaction either creates a new and valid state of data or, if any failure occurs, returns all data to its state before the transaction was started.
•  Isolation. A transaction in process and not yet committed must remain isolated from any other transaction.
•  Durability. Committed data is saved by the system such that, even in the event of a failure and system restart, the data is available in its correct state.
•  Each of these attributes can be measured against a benchmark. In general, however, a transaction manager or monitor is designed to realize the ACID concept. It ensures that either all parts of a transaction commit or none do, in which case the transaction is rolled back.

Access control and authorization
•  Access control is the selective restriction of access to some resource. It is responsible for applying the rules determined by security policies to all direct accesses to the system.
•  Authentication - the user's identity is verified. This process is based on knowledge of something, ownership of an object, or physical characteristics of the user, e.g. username and password.
•  Authorization - permission to access a resource is called authorization. The system answers only those queries that the user is authorized for (access control).
•  Audit - composed of two phases: logging of actions in the system and reporting of the logged information.
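
The three steps can be sketched as three small functions. This is an illustration only, assuming an in-memory user store and permission table; the user clerk, the password and the table name account are hypothetical, and a real DBMS provides its own mechanisms for all three steps.

```python
import hashlib
import hmac
import os

def hash_password(password, salt):
    # Salted, slow hash so that stored passwords are not kept in plain text.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

_salt = os.urandom(16)
USERS = {"clerk": (_salt, hash_password("s3cret", _salt))}
PERMISSIONS = {"clerk": {("account", "SELECT")}}
AUDIT_LOG = []

def authenticate(user, password):
    """Authentication: verify the identity from something the user knows."""
    record = USERS.get(user)
    if record is None:
        return False
    salt, expected = record
    return hmac.compare_digest(hash_password(password, salt), expected)

def authorize(user, table, action):
    """Authorization: answer only requests the user has permission for."""
    allowed = (table, action) in PERMISSIONS.get(user, set())
    # Audit, phase one: log every decision for later reporting.
    AUDIT_LOG.append((user, table, action, allowed))
    return allowed

def audit_report():
    """Audit, phase two: report the denied requests."""
    return [entry for entry in AUDIT_LOG if not entry[3]]

print(authenticate("clerk", "s3cret"), authorize("clerk", "account", "DELETE"))
```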

Database Security
•  A secure database system should satisfy three basic requirements on data protection:
•  Secrecy (confidentiality) - preventing, detecting and deterring improper disclosure of information. This is especially important in strongly protected environments (e.g. the army).
•  Integrity - preventing, detecting and deterring improper changes of information. The proper functioning of any organization depends on proper operations on proper data.
•  Availability - preventing improper denial of the service that the DBMS provides.

Security Threat
•  A security threat can come from any agent that can obtain or change information, either randomly or with some intention.
•  Random security threats
–  Natural or accidental disasters - data or hardware is damaged, which leads to integrity violation and denial of service.
–  Errors, design flaws and bugs in hardware and software - cause improper application of security policies.
–  Human errors - unintentional violations such as incorrect input or wrong use of applications.
–  Overloads, performance constraints and capability issues
•  Intended security threats
–  Authorized users - abuse their privileges
–  Hostile agents - various hostile programs: viruses, Trojan horses, back doors

Requirements of DB security
•  Protection from improper access - only authorized users should be granted access
•  Protection from inference - inference of confidential information from available data should be avoided
•  Database integrity - integrity of data during and even after transactions; ensured with transactions and various backup and recovery procedures
•  Semantic data integrity - ensured with integrity constraints
•  Accountability and auditing - log data accesses
•  User authentication - unambiguous identification of each user
•  Management and protection of sensitive data - access should be granted only to a narrow circle of users
•  Multilevel security - data may be classified and access rights given according to its sensitivity
•  Confinement (subject isolation) - isolate subjects to avoid uncontrolled data flow between programs

Views
•  A view is the result set of a stored query in the database.
•  A view is a virtual table based on the result set of an SQL statement.
•  This pre-established query is kept in the database dictionary.
•  A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables in the database.
•  A view's defining query can use joins, conditions and functions, so the data is presented as if it came from a single table.
•  We treat a view like a real table.

Benefits of views
•  Hide complexity: If you have a query that requires joining several tables, or has complex logic or calculations, you can code all that logic into a view, then select from the view just as you would from a table.
•  Can be used as a security mechanism: A view can select certain columns and/or rows from a table, and permissions can be set on the view instead of the underlying tables. This allows surfacing only the data that a user needs to see.
•  Aggregate or de-normalize data: If you have broken a table into smaller parts, views can be used as the joined table without changing the underlying model. This is frequently used for reporting purposes.
•  Simplify supporting legacy code: If you need to refactor a table in a way that would break a lot of code, you can replace the table with a view of the same name. The view provides the exact same schema as the original table, while the actual schema has changed. This keeps the legacy code that references the table from breaking, allowing you to change the legacy code at your leisure.
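
A view that exposes only some columns can be sketched as follows, assuming Python's built-in sqlite3 module and a hypothetical employee table. SQLite has no user permissions, so the sketch shows only the column hiding and the stored definition, not the GRANT step a multi-user DBMS would add.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, dept TEXT
    );
    INSERT INTO employee VALUES (1, 'Ram', 50000, 'Accounts'),
                                (2, 'Sita', 60000, 'IT');

    -- The view hides the salary column from whoever queries it.
    CREATE VIEW employee_public AS
        SELECT emp_id, name, dept FROM employee;
    """
)

# The view is queried exactly like a table.
rows = conn.execute("SELECT * FROM employee_public ORDER BY emp_id").fetchall()

# Its defining query is stored in the catalog (the database dictionary).
view_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'view' AND name = 'employee_public'"
).fetchone()[0]
print(rows)
```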

Parallel and distributed databases
RDBMS and NoSQL

Definition: Parallel vs distributed

Parallel Databases
•  Machines are physically close to each other, e.g., in the same server room
•  Machines are connected by dedicated high-speed LANs and switches
•  Communication cost is assumed to be small
•  Can be shared-memory, shared-disk, or shared-nothing architecture
•  Improve performance through parallel implementation
•  Generally used for speedup

Distributed Databases
•  Machines can be far from each other, e.g., on different continents
•  Can be connected using a public-purpose network, e.g., the Internet
•  Communication cost and problems cannot be ignored
•  Usually shared-nothing architecture
•  Data is stored across several sites, each site managed by a DBMS capable of running independently
•  Generally used for scale up

Definition: Transactional and Non-Transactional Data

Transactional data
–  Characteristics: we assume a relationship exists among items; an operation involves multiple data items
–  Scale and technology: parallel DBMS with a small number of machines

Non-transactional data
–  Characteristics: we assume no relationship among data items
–  Scale and technology: NoSQL system with a large number of machines

RDBMS
•  A relational database management system (RDBMS) is a DBMS that is based on the relational model as invented by E. F. Codd.
•  RDBMSs have often replaced legacy hierarchical and network databases because they are easier to understand and use.
•  Other kinds of DBMS, such as object database management systems and XML database management systems, have tried and failed to displace the RDBMS.
•  Despite such attempts, RDBMSs keep most of the market share, which has also grown over the years.
•  Today, the RDBMS is the most common choice for storing information in new databases used for financial records, manufacturing and logistical information, personnel data, and many other applications.

What is NoSQL?
•  NoSQL (Not Only SQL) represents a completely different framework of databases that allows for high-performance, agile processing of information at massive scale. It is very well adapted to the heavy demands of big data.
•  NoSQL can be efficient because, unlike relational databases, which are highly structured, NoSQL databases are unstructured in nature, trading off stringent consistency requirements for speed and agility.
•  NoSQL centers on the concept of distributed databases, where unstructured data may be stored across multiple processing nodes, and often across multiple servers.
•  The distributed architecture makes it horizontally scalable: as data continues to grow, more hardware is added to keep up, ideally without a slowdown in performance.
•  The NoSQL distributed database infrastructure has been the solution for handling some of the biggest data warehouses on the planet, for example those of Google, Amazon, and the CIA.
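
The idea of spreading data across nodes can be sketched with a toy key-value store that picks a node by hashing the key. ShardedStore is a hypothetical class written for this illustration and does not correspond to any real NoSQL product; real systems add replication and usually consistent hashing, because the simple modulo placement used here forces most keys to move when the node count changes.

```python
import hashlib

class ShardedStore:
    """Toy key-value store that spreads keys over several in-memory nodes."""

    def __init__(self, node_count):
        # Each dict stands in for one machine.
        self.nodes = [dict() for _ in range(node_count)]

    def _node_for(self, key):
        # A stable hash of the key decides which node owns it.
        digest = hashlib.sha256(key.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)

store = ShardedStore(node_count=4)
for i in range(100):
    store.put(f"user:{i}", {"visits": i})

sizes = [len(node) for node in store.nodes]
print(store.get("user:42"), sizes)
```

Each node holds only part of the data, so reads and writes for different keys can be served by different machines in parallel.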

NoSQL and Parallel DBMS
•  NoSQL systems
–  "Non-relational, distributed data stores that often did not attempt to provide ACID guarantees" [Wik11]
–  e.g., GFS, BigTable, MapReduce
•  Parallel DBMSs
–  "Systems attempt to exploit recent multiprocessor computer architectures in order to build a high-performance and high-availability database server" [Val93]

Emphasis: NoSQL vs RDBMS
•  The emphasis of NoSQL databases is more on availability, scalability and eventual consistency
–  very large unstructured datasets
–  highly parallel/distributed architecture
–  large number of machines
•  The emphasis of an RDBMS is on the ACID (Atomicity, Consistency, Isolation, Durability) properties
–  smaller and structured datasets
–  transactional data
–  small number of machines

When to choose NoSQL over RDBMS
•  Dealing with unstructured and non-traditional data
–  Medical health records processing, media-level content management, geospatial mapping, etc.
–  Limited primary entities, but abundant activities and relationships on those objects: social networking, messaging, cloud-based repositories, fire-hose feeds, etc.
–  Dealing with massive data sets from a variety of sources: big data applications, trading, applications that need on-the-fly horizontal scaling, etc.
–  Real-time decision making based on dynamic events: trends, fraud detection, enterprise security, inventory controls, etc.
•  Need for aggregated summaries:
–  Data visualization of large data sets, sentiment analysis, log analysis, etc.
–  Real-time analytics: online gaming, ad targeting, stock prices, enterprise dashboards, etc.
•  Complex parallel programming:
–  MapReduce implementations, statistical programming, network routing, etc.

NoSQL vs RDBMS

NoSQL: suitable for non-transactional data
•  Advantages
–  Highly scalable
–  Highly fault tolerant
–  Inexpensive
–  Easy to set up and use
•  Disadvantages
–  Weak functionalities: SQL, schemas, indexes, query optimization, transactions

Parallel RDBMS: suitable for transactional data
•  Advantages
–  Strong functionalities: SQL, schemas, indexes, query optimization, transactions
•  Disadvantages
–  Difficult to scale
–  Expensive
–  Not suitable where faults occur frequently
–  Harder to set up and use

Which to choose? NoSQL vs RDBMS

•  NoSQL databases and RDBMS can and should
work together in modern applications.
•  Enterprises such as Facebook, Forbes, Disney, etc.
deploy both and decide case by case when to use NoSQL
databases and when to use RDBMS technologies, to provide
the best experience for their users.
•  Most well-rounded modern software programs
will employ both RDBMS and NoSQL
technologies, working alongside each other for optimal
outcomes.

Which of the following gives a logical structure of the database graphically?

a) Entity-relationship diagram
b) Entity diagram
c) Database diagram
d) Architectural representation
Answer: a
Explanation: E-R diagrams are simple and clear, qualities
that may well account in large part for the
widespread use of the E-R model.

Which of the following is used to represent an entity?
a) Rectangle
b) Oval
c) Triangle
d) Diamond
Answer: a
Explanation: In an ER diagram, an entity is represented by a
rectangle.

Which of the following is used to represent a relationship in an ER diagram?

a) Rectangle
b) Oval
c) Circle
d) Diamond
Answer: d
Explanation: In an ER diagram, a relationship is represented
by a diamond.

Which of the following is used to represent an attribute in an ER diagram?
a) Rectangle
b) Oval
c) Circle
d) Diamond
Answer: b
Explanation: In an ER diagram, an attribute is represented by an
oval.

Authentication is

a) Permission to access certain content in the database
b) Logging the actions of users and using the logs for recovery
c) Verifying user identity
d) Preventing, detecting, and deterring improper changes
Answer: c


Which of the following is used to denote the selection operation in
relational algebra?
a) Pi (Greek)
b) Sigma (Greek)
c) Lambda (Greek)
d) Omega (Greek)

Answer: b
Explanation: The select operation selects tuples that satisfy a given
predicate.


For the select operation, the __ appear in the subscript and the __ argument
appears in the parentheses after the sigma.
a) Predicates, relation
b) Relation, Predicates
c) Operation, Predicates
d) Relation, Operation

Answer: a
Explanation: The predicates (conditions) appear in the subscript and
the relation (table) appears in the parentheses, as in
σ_condition(relation)
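
The select operation can be sketched in a few lines of Python, treating a relation as a list of dictionaries and the predicate as a function. The `select` helper and the `instructor` sample data are hypothetical and serve only to mirror the σ notation.

```python
def select(relation, predicate):
    """sigma_predicate(relation): keep only the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

# A tiny relation; each dictionary is one tuple.
instructor = [
    {"name": "Asha", "dept": "Physics", "salary": 90000},
    {"name": "Bikram", "dept": "Music", "salary": 40000},
    {"name": "Chen", "dept": "Physics", "salary": 75000},
]

# sigma_{dept = "Physics"}(instructor)
physics = select(instructor, lambda row: row["dept"] == "Physics")

# sigma_{dept = "Physics" and salary > 80000}(instructor)
well_paid_physics = select(
    instructor,
    lambda row: row["dept"] == "Physics" and row["salary"] > 80000,
)
```

The predicate plays the role of the subscript and the relation is the argument in parentheses; the result has the same attributes as the input, only fewer tuples.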


The ___ operation, denoted by −, allows us to find tuples that are in one
relation but are not in another.
a) Union
b) Set-difference
c) Difference
d) Intersection

Answer: b
Explanation: The expression r − s produces a relation containing
those tuples in r but not in s.
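
Since a relation is a set of tuples, the set-difference operation maps directly onto Python's built-in set subtraction. The relations `r` and `s` below are made-up sample data with a single attribute each.

```python
# Two union-compatible relations (same attributes), modelled as sets of tuples.
r = {("CS-101",), ("CS-315",), ("PHY-101",)}
s = {("CS-101",), ("BIO-301",)}

# r − s: tuples that are in r but not in s.
r_minus_s = r - s

# Set difference is not symmetric: s − r is a different relation.
s_minus_r = s - r
```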


Which of the following creates a virtual relation for storing the
query?
a) Function
b) View
c) Procedure
d) None of the mentioned

Answer: b
Explanation: Any such relation that is not part of the logical model,
but is made visible to a user as a virtual relation, is
called a view.
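
A minimal sketch of a view, using Python's built-in sqlite3 module. The `instructor` table, the `faculty` view, and the sample rows are assumptions made for this example; the view stores the query, not the data, and hides the salary column from its users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor ("
    "id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Physics", 90000), (2, "Bikram", "Music", 40000)],
)

# The view is a stored query over the base table; salary is not exposed.
conn.execute("CREATE VIEW faculty AS SELECT id, name, dept FROM instructor")

rows = conn.execute("SELECT name, dept FROM faculty ORDER BY id").fetchall()
view_columns = [col[0] for col in conn.execute("SELECT * FROM faculty").description]

# A new row in the base table is visible through the view immediately.
conn.execute("INSERT INTO instructor VALUES (3, 'Chen', 'History', 70000)")
count_after = conn.execute("SELECT COUNT(*) FROM faculty").fetchone()[0]
```

Because the view is evaluated against the base table each time it is queried, it never goes stale, and restricting users to the view is one simple way to limit what they can see.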


An entity in A is associated with at most one entity in B, and an entity in B
is associated with at most one entity in
A. This is called
a) One-to-many
b) One-to-one
c) Many-to-many
d) Many-to-one

Answer: b
Explanation: Here each entity in one set is related to at most one entity
in the other set.
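
One common way to realize a one-to-one relationship in a relational schema is a foreign key that is also declared UNIQUE. The sketch below uses Python's built-in sqlite3 module; the `person` and `passport` tables are hypothetical examples, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE passport ("
    "number TEXT PRIMARY KEY NOT NULL, "
    # UNIQUE means each person can appear in at most one passport row.
    "person_id INTEGER NOT NULL UNIQUE REFERENCES person(id))"
)

conn.execute("INSERT INTO person VALUES (1, 'Asha')")
conn.execute("INSERT INTO passport VALUES ('P100', 1)")

try:
    # A second passport for the same person would make the relationship one-to-many.
    conn.execute("INSERT INTO passport VALUES ('P200', 1)")
    second_allowed = True
except sqlite3.IntegrityError:
    second_allowed = False
```

Dropping the UNIQUE keyword would turn the same schema into a one-to-many relationship from person to passport.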

Data integrity constraints are used to:
a) Control who is allowed access to the data
b) Ensure that duplicate records are not entered into the table
c) Improve the quality of data entered for a specific property
d) Prevent users from changing the values stored in the table

Answer: c
Explanation: Integrity constraints restrict the values that may be entered for a particular property (i.e., a table column).
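
As a small illustration of how constraints improve the quality of the data entered in a column, the sketch below declares NOT NULL and CHECK constraints with Python's built-in sqlite3 module. The `student` table, its age range, and the `try_insert` helper are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "age INTEGER CHECK (age BETWEEN 16 AND 100))"
)

def try_insert(row):
    """Return True if the row satisfies every constraint, False otherwise."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert((1, "Asha", 20))         # valid row
rejected_age = try_insert((2, "Bikram", 7))    # violates the CHECK on age
rejected_name = try_insert((3, None, 30))      # violates NOT NULL on name
```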

__ is a special type of integrity constraint that relates two relations and
maintains consistency across the
relations.
a) Entity Integrity Constraints
b) Referential Integrity Constraints
c) Domain Integrity Constraints
d) Domain Constraints

Answer: b
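
A referential integrity constraint is typically declared as a foreign key. The following sketch, using Python's built-in sqlite3 module with hypothetical `department` and `instructor` tables, shows the two violations it is meant to prevent: inserting a row that references a missing parent, and deleting a parent that is still referenced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.execute("CREATE TABLE department (name TEXT PRIMARY KEY NOT NULL)")
conn.execute(
    "CREATE TABLE instructor ("
    "id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "dept TEXT NOT NULL REFERENCES department(name))"
)
conn.execute("INSERT INTO department VALUES ('Physics')")
conn.execute("INSERT INTO instructor VALUES (1, 'Asha', 'Physics')")

def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# 'Music' does not exist in department, so this row would be an orphan.
orphan_insert = violates("INSERT INTO instructor VALUES (2, 'Bikram', 'Music')")

# 'Physics' is still referenced by an instructor, so it cannot be deleted.
dangling_delete = violates("DELETE FROM department WHERE name = 'Physics'")
```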

Which one of the following uniquely identifies the elements in
the relation?
a) Secondary Key
b) Primary key
c) Foreign key
d) Composite key

Answer: b
Explanation: A primary key enforces both the NOT NULL and the uniqueness
constraints.
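
The two properties of a primary key can be checked with a short sketch in Python's built-in sqlite3 module. The `student` table and the `try_insert` helper are hypothetical. Note one SQLite-specific assumption: NOT NULL is written out explicitly, because SQLite does not reject NULLs in a non-INTEGER primary key column of an ordinary table unless told to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is stated explicitly; see the SQLite caveat in the text above.
conn.execute("CREATE TABLE student (roll_no TEXT PRIMARY KEY NOT NULL, name TEXT)")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

first = try_insert(("R1", "Asha"))        # accepted
duplicate = try_insert(("R1", "Bikram"))  # rejected: roll_no must be unique
missing = try_insert((None, "Chen"))      # rejected: roll_no must not be NULL
```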

______ is the preferred method for enforcing data integrity
a) Constraints
b) Stored Procedure
c) Triggers
d) Cursors

Answer: a
Explanation: Constraints are specified to restrict entries in the
relation.
