
Subject Code: IMT-37

Subject Name: DBMS

(ORACLE)

ASSIGNMENTS
PART A
Q1. a. Describe what metadata are and what value they provide to the database system.
b. What are the advantages of having the DBMS between the end users' applications and the database?
c. Discuss some considerations when designing a database.

ANS a) Metadata is "data about data". The term is ambiguous, as it is used for two
fundamentally different concepts (types). Structural metadata is about the design and
specification of data structures and is more properly called "data about the containers of
data"; descriptive metadata, on the other hand, is about individual instances of
application data, the data content. Metadata are traditionally found in the card
catalogs of libraries. As information has become increasingly digital, metadata are also
used to describe digital data using metadata standards specific to a particular discipline.
By describing the contents and context of data files, the quality of the original data/files is
greatly increased. For example, a webpage may include metadata specifying what
language it is written in, what tools were used to create it, and where to go for more on
the subject, allowing browsers to automatically improve the experience of users.
Metadata (metacontent) are defined as the data providing information about one or more
aspects of the data, such as:

Means of creation of the data

Purpose of the data

Time and date of creation

Creator or author of the data

Location on a computer network where the data were created

Standards used

For example, a digital image may include metadata that describe how large the picture
is, the color depth, the image resolution, when the image was created, and other data.[1]
A text document's metadata may contain information about how long the document is,
who the author is, when the document was written, and a short summary of the
document.

Metadata are data. As such, metadata can be stored and managed in a database, often
called a metadata registry or metadata repository.[2] However, without context and a
point of reference, it might be impossible to identify metadata just by looking at them.[3]
For example: by itself, a database containing several numbers, all 13 digits long, could
be the results of calculations or a list of numbers to plug into an equation; without any
other context, the numbers themselves can be perceived as the data. But if given the
context that this database is a log of a book collection, those 13-digit numbers may now
be identified as ISBNs: information that refers to the book, but is not itself the
information within the book.
The term "metadata" was coined in 1968 by Philip Bagley, in his book "Extension of
programming language concepts" where it is clear that he uses the term in the ISO
11179 "traditional" sense, which is "structural metadata" i.e. "data about the containers of
data"; rather than the alternate sense "content about individual instances of data
content" or metacontent, the type of data usually found in library catalogues. [4][5] Since
then the fields of information management, information science, information technology,
librarianship and GIS have widely adopted the term. In these fields the
word metadata is defined as "data about data".[6] While this is the generally accepted
definition, various disciplines have adopted their own more specific explanation and uses
of the term.
b.) A database management system (DBMS) is a collection of programs
which allow end-users to create, maintain and control records in a database.
DBMS features primarily address the creation of databases for data
extraction, queries and record interrogation. The differences between a
DBMS and an application development environment range from data usage
to personnel.

Records Interrogation

Records interrogation programs are designed to provide the end-user with
information through several kinds of programs: query, report generator or
general inquiry programs. The most popular is the query program, which
allows the end-user to develop basic programming skills by constructing
simple data programs using a query language processor to extract data.
Query programs are powerful utilities for records interrogation.

Personnel Advantages

A DBMS system consists of data managers and database administrators
who oversee the entire DBMS operation. Their primary duties are database
records maintenance, loading program releases and making sure primary
scheduling is run daily. Applications development consists of programmers,
computer technicians and systems analysts whose job includes finding
software errors during testing.

c.) Designing a database requires an understanding of both the business
functions you want to model and the database concepts and features used to
represent those business functions.
It is important to design a database accurately to model the business,
because it can be time-consuming to change the design of a database
significantly once implemented. A well-designed database also performs
better.
When designing a database, consider:

The purpose of the database and how it affects the design. Create a
database plan to fit your purpose.

Database normalization rules that prevent mistakes in the database design.

Protection of your data integrity.

Security requirements of the database and user permissions.

Performance needs of the application. You must ensure that the database
design takes advantage of Microsoft SQL Server 2000 features that improve
performance. Achieving a balance between the size of the database and the
hardware configuration is also important for performance.

Maintenance.

Estimating the size of a database.

Q2. a. List and briefly describe the different types of database maintenance activities.
b. Database backups can be performed at different levels. List and describe these.
c. What are the classical approaches to database design?

Ans a.) Database maintenance is an activity designed to keep a database running
smoothly. A number of different systems can be used to build and maintain
databases, one popular example being MySQL. The maintenance of databases
is generally performed by people who are comfortable and familiar with the database
system and the specifics of the particular database, although some routine
tasks can be performed by less experienced people.

Databases are used to maintain a library of information in a well organized,
accessible format. They usually are not static, however, because changes are
constantly being made as material is added, removed, and moved around. People
may also change parameters within the database, decide to use different indexing
systems, and so forth. Over time, this can cause the database to start to malfunction.
Database maintenance is used to keep the database clean and well organized so
that it will not lose functionality.
One important aspect of maintaining a database is simply backing up the data so
that, if anything happens, there will be another copy available. Some database
systems do this automatically, sending a backup to another location every
day, every week, or within any other set period of time. Backups are usually not
enough, however. Database maintenance also includes checking for signs of corruption in
the database, looking for problem areas, rebuilding indexes, removing duplicate
records, and checking for any abnormalities in the database that might signal a
problem. The goal is to keep the database operating smoothly for users, so that
ideally they never need to think about maintenance issues. A database that is not
maintained can become sluggish, and people may start to experience problems
when trying to access records.
Many servers have extensive databases that are used to serve up content to users
on an internal network or on the Internet. An important part of server
maintenance involves confirming that databases are working properly. This also
includes checks for security flaws and other issues that could threaten the integrity of
the database, ranging from viruses to records which are entered improperly.
Numerous textbooks are available with information about database management,
including how to maintain databases properly. It is also possible to take courses to
learn about different database systems and how to care for databases, whether
they are being built from scratch or taken over. People can also earn certifications
in specific systems, which indicate a high level of competence.

b.) The focus in Oracle backup and recovery is generally on the physical backup
of database files, which permit the full reconstruction of your database. The files
protected by the backup and recovery facilities built into Enterprise Manager
include datafiles, control files, server parameter files (SPFILEs), and archived redo
log files. With these your database can be reconstructed. The backup mechanisms
that work at the physical level protect against damage at the file level, such as the
accidental deletion of a datafile or the failure of a disk drive.

Logical-level backups, such as exporting database objects like tables or
tablespaces, are a useful supplement to physical backups for some purposes but
cannot protect your entire database. Your approach to backing up your database
should therefore be based upon physical-level backups.
Oracle's flashback features provide a range of physical and logical data recovery
tools as efficient, easy-to-use alternatives to physical and logical backups. This
chapter will introduce two of the flashback features that operate at a logical level:
Oracle Flashback Table, which lets you revert a table to its contents at a time in the
recent past; and Oracle Flashback Drop, which lets you rescue dropped database
tables. Neither requires advance preparation such as creating logical-level exports
to allow for retrieval of your lost data, and both can be used while your database is
available. Oracle Database Backup and Recovery Advanced User's Guidediscusses
the flashback features of the Oracle database at greater length.
Oracle Enterprise Manager's physical backup and recovery features are built
on Oracle's Recovery Manager (RMAN) command-line client. Enterprise Manager
carries out its backup and recovery tasks by composing RMAN commands and
sending them to the RMAN client. Enterprise Manager makes available much of
the functionality of RMAN, as well as providing wizards and automatic strategies
to simplify and further automate implementing RMAN-based backup and recovery.
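As a small illustration of the RMAN command-line client that Enterprise Manager
drives, a whole-database physical backup might look like the following sketch (the
operating-system authentication shown is an assumption about the configuration):

$ rman TARGET /
RMAN> BACKUP DATABASE PLUS ARCHIVELOG;   -- physical backup of datafiles plus archived redo logs
RMAN> LIST BACKUP SUMMARY;               -- verify what was backed up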
After you are familiar with the basics of backup and recovery through Oracle
Enterprise Manager, refer to Oracle Database Backup and Recovery Basics,
and Oracle Database Backup and Recovery Advanced User's Guide, for more
details on the full range of Oracle's backup capabilities.

c.) There are two approaches for developing any database, the top-down method
and the bottom-up method. While these approaches appear radically different, they
share the common goal of uniting a system by describing all of the interaction
between the processes.
Top-down design method
The top-down design method starts from the general and moves to the specific. In
other words, you start with a general idea of what is needed for the system and then
work your way down to the more specific details of how the system will interact. This
process involves the identification of different entity types and the definition of each
entity's attributes.


Bottom-up design method
The bottom-up approach begins with the specific details and moves up to the
general. This is done by first identifying the data elements (items) and then grouping
them together in data sets. In other words, this method first identifies the attributes,
and then groups them to form entities.


The two general approaches (top-down and bottom-up) to the design of
databases can be heavily influenced by factors like scope, the size of the system, the
organization's management style, and the organization's structure. Depending on
such factors, the design of the database might use two very different approaches:
centralized design and decentralized design.
Centralized design
Centralized design is most productive when the data component is composed of a
moderately small number of objects and procedures. The design can be carried out
and represented in a somewhat simple database. Centralized design is typical of a
simple or small database and can be successfully done by a single database
administrator or by a small design team. This person or team will define the
problems, create the conceptual design, verify the conceptual design with the user
views, and define system processes and data constraints to ensure that the design
complies with the organization's goals. That being said, the centralized design is not
limited to small companies. Even large companies can operate within the simple
database environment.


Decentralized design
Decentralized design might best be used when the data component of the system
has a large number of entities and complex relations upon which complex operations
are performed. This is also likely to be used when the problem itself is spread across
many operational sites and the elements are a subset of the entire data set. In large
and complex projects, a team of carefully selected designers is employed to get the
job done. This is commonly accomplished by several teams that work on different
subsets or modules of the system. Conceptual models are created by these teams
and compared to the user views, processes, and constraints for each module. Once
all the teams have completed their modules, they are aggregated into one
large conceptual model.
Q3. a. Explain the differences between a centralized and decentralized approach to database design.
b. Explain how database designers design and normalize databases.
c. Explain the BCNF. How is it related to other normal forms?
Ans a.) The issue of centralization versus decentralization of computer resources is
not a new one; it has been widely discussed and hotly debated for at least two
decades now. The interest in this issue was originally motivated by the feeling that
the computer, a costly expense in terms of investment and operating budget, should
be used to the fullest possible potential. Interest also grew because it was felt that
within a corporation, a large measure of political power rested with whomever
controlled the data processing facility. Lately, advances in network technology and
the advent of efficient, low-cost mini and micro computers have initiated the era of
distributed data processing and in effect thrown new fuel onto the
centralization/decentralization fire.

Of the voluminous literature published on this subject, we first concentrate on key
articles relating to one aspect of the problem: the centralization/decentralization
decision. Management, faced with decisions regarding proper long-range directions
toward optimal configurations of hardware, software, and personnel, finds little by
way of guidelines to follow. There seems then to be a real need for a rigorous
decision model to provide management with an approach to solving this dilemma.

Ernest Dale (2) states: "the proper balance between centralization and
decentralization often is decided by necessity, intuition, and luck because of the
immense variety of possible human behavior and vast multiplicity of minute,
undiscoverable causes and effects that cannot be encompassed in any principle or
standard of evaluation." In addition, current solutions seem highly dependent on the
characteristics, philosophies, and objectives of the particular organization for which
the decision is to be made. According to George Glaser (3), "the organizational
approach to data processing should be consistent with the overall organizational
approach of the company in which it functions." The problem is not only of major
importance but of substantial complexity also.

(2) Dale, E. "Centralization versus Decentralization," Advanced Management, June 1955.

b). Database design is the process of producing a detailed data model of a database.
This logical data model contains all the needed logical and physical design choices and
physical storage parameters needed to generate a design in a Data Definition
Language, which can then be used to create a database. A fully attributed data model
contains detailed attributes for each entity.
The term database design can be used to describe many different parts of the design of
an overall database system. Principally, and most correctly, it can be thought of as the
logical design of the base data structures used to store the data. In the relational
model these are the tables and views. In an object database the entities and
relationships map directly to object classes and named relationships. However, the term
database design could also be used to apply to the overall process of designing, not just
the base data structures, but also the forms and queries used as part of the overall
database application within the database management system (DBMS).[1]
The process of doing database design generally consists of a number of steps which will
be carried out by the database designer. Usually, the designer must:

Determine the relationships between the different data elements.

Superimpose a logical structure upon the data on the basis of these relationships.

c.) Boyce-Codd normal form (BCNF or 3.5NF) is a normal form used in database
normalization. It is a slightly stronger version of the third normal form (3NF). BCNF was
developed in 1974 by Raymond F. Boyce and Edgar F. Codd to address certain types of
anomaly not dealt with by 3NF as originally defined.[1]
Chris Date has pointed out that a definition of what we now know as BCNF appeared in
a paper by Ian Heath in 1971.[2] Date writes:
"Since that definition predated Boyce and Codd's own definition by some three years, it
seems to me that BCNF ought by rights to be called Heath normal form. But it isn't."[3]
Edgar F. Codd released his original paper 'A Relational Model of Data for Large Shared
Data Banks' in June 1970. This was the first time the notion of a relational database was
published. All work after this, including the Boyce-Codd normal form method, was based
on this relational model.

If a relational schema is in BCNF then all redundancy based on functional dependency
has been removed, although other types of redundancy may still exist. A relational
schema R is in Boyce-Codd normal form if and only if for every one of
its dependencies X → Y, at least one of the following conditions holds:[4]

X → Y is a trivial functional dependency (Y ⊆ X)

X is a superkey for schema R
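A small hypothetical example (not from the text) may help. Suppose a schema
Enrolment(StudentID, Course, Instructor) has the dependencies (StudentID, Course) → Instructor
and Instructor → Course. The second dependency is not trivial and Instructor is not a
superkey, so the schema is not in BCNF; decomposing on that dependency restores it:

-- BCNF decomposition sketch; table and column names are illustrative.
CREATE TABLE instructor_course (
  instructor VARCHAR2(40) PRIMARY KEY,   -- Instructor -> Course: instructor is now a key
  course     VARCHAR2(40) NOT NULL
);
CREATE TABLE enrolment (
  student_id NUMBER,
  instructor VARCHAR2(40) REFERENCES instructor_course,
  PRIMARY KEY (student_id, instructor)
);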

Q4. a. What is a schema? How many schemas can be used in one database?
b. What command is used to save changes to the database? What is the syntax for this command? How do you delete a table from the database? Provide an example.
c. What is a subquery? When is it used? Does the RDBMS deal with subqueries any differently from normal queries?

Ans. a.) A database schema (/ˈskiː.mə/ SKEE-mə) of a database system is its
structure described in a formal language supported by the database management
system (DBMS) and refers to the organization of data as a blueprint of how a database
is constructed (divided into database tables in the case of relational databases). The formal
definition of database schema is a set of formulas (sentences) called integrity
constraints imposed on a database. These integrity constraints ensure compatibility
between parts of the schema. All constraints are expressible in the same language. A
database can be considered a structure in realization of the database language.[1] The
states of a created conceptual schema are transformed into an explicit mapping, the
database schema. This describes how real world entities are modeled in the database.
"A database schema specifies, based on the database administrator's knowledge of
possible applications, the facts that can enter the database, or those of interest to the
possible end-users."[2] The notion of a database schema plays the same role as the
notion of theory in predicate calculus. A model of this theory closely corresponds to a
database, which can be seen at any instant of time as a mathematical object. Thus a
schema can contain formulas representing integrity constraints specifically for an
application and the constraints specifically for a type of database, all expressed in the
same database language.[1] In a relational database, the schema defines
the tables, fields, relationships, views, indexes, packages, procedures, functions, queues,
triggers, types, sequences, materialized views, synonyms, database
links, directories, XML schemas, and other elements.

Schemas are generally stored in a data dictionary. Although a schema is defined in text
database language, the term is often used to refer to a graphical depiction of the
database structure. In other words, schema is the structure of the database that defines
the objects in the database.
In an Oracle Database system, the term "schema" has a slightly different connotation:
each database user owns exactly one schema, which has the same name as the user
and consists of all the objects owned by that user.
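A minimal sketch of that correspondence (the user name, password and tablespace
are illustrative assumptions):

-- Creating a user implicitly creates its (initially empty) schema:
CREATE USER hr IDENTIFIED BY some_password;
GRANT CREATE SESSION, CREATE TABLE TO hr;
ALTER USER hr QUOTA UNLIMITED ON users;   -- allow storage in the USERS tablespace

-- A table created with the HR qualifier belongs to the HR schema:
CREATE TABLE hr.employees (
  emp_id NUMBER PRIMARY KEY,
  name   VARCHAR2(40)
);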

b. ) A transaction is a unit of work that is performed against a database. Transactions are units
or sequences of work accomplished in a logical order, whether in a manual fashion by a user or
automatically by some sort of a database program.
A transaction is the propagation of one or more changes to the database. For example, if you are
creating, updating or deleting a record in a table, then you are performing a transaction
on that table. It is important to control transactions to ensure data integrity
and to handle database errors.
Practically, you will club many SQL queries into a group and you will execute all of them together
as a part of a transaction.

Properties of Transactions:
Transactions have the following four standard properties, usually referred to by the acronym
ACID:

Atomicity: ensures that all operations within the work unit are completed successfully;
otherwise, the transaction is aborted at the point of failure, and previous operations are rolled
back to their former state.

Consistency: ensures that the database properly changes states upon a successfully
committed transaction.

Isolation: enables transactions to operate independently of and transparently to each
other.

Durability: ensures that the result or effect of a committed transaction persists in case of
a system failure.

Transaction Control:
The following commands are used to control transactions:

COMMIT: to save the changes.

ROLLBACK: to roll back the changes.

SAVEPOINT: creates points within groups of transactions to which you can ROLLBACK.

SET TRANSACTION: places a name on a transaction.

Transactional control commands are only used with the DML commands INSERT, UPDATE and
DELETE. They cannot be used while creating tables or dropping them, because these
operations are automatically committed in the database.

The COMMIT Command:

The COMMIT command is the transactional command used to save changes invoked by a
transaction to the database. The COMMIT command saves all transactions to the database
since the last COMMIT or ROLLBACK command.
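Its syntax is simply COMMIT; (or COMMIT WORK;). A short sketch of saving changes,
and of deleting a table from the database, using an illustrative accounts table:

-- Two updates made permanent as one transaction:
UPDATE accounts SET balance = balance - 500 WHERE acct_no = 1001;
UPDATE accounts SET balance = balance + 500 WHERE acct_no = 1002;
COMMIT;                 -- saves both changes

-- Deleting a table from the database (DDL, committed implicitly):
DROP TABLE old_orders;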

c.) Oracle is a commercial relational database management system. As with
other large-scale RDBMSs, it uses the Structured Query Language (SQL) for database
maintenance, administration and programming. The SQL language lets you create
sophisticated database queries for reporting, adding and changing information in the
database. These queries can include correlated sub-queries, in which the data from the
sub-query depends on the main query. An SQL query is a statement which
examines data in the database and uses it to produce a report or update the
database. One of the simplest queries is a list of records in a database table. It
looks like the following SQL statement:
SELECT * FROM customers;
This query produces an unsorted list of all the information in the customers
table, record by record. By using the powerful WHERE clause, you can create
selective queries which evaluate the data and list only those records matching
the clause's criteria:
SELECT * FROM customers WHERE state = 'CA';
This query lists only customers from California. The WHERE clause
accommodates very complex conditions, including the results of correlated
sub-queries, for selecting only the data you want.

Sub-Queries

A sub-query is a query in which the WHERE clause itself has its own
query. This is a convenient way to combine information from different
database tables to produce more sophisticated results. The following query
produces a list of only those customers who have placed orders in 2011:
SELECT * FROM customers WHERE customer_code IN (SELECT
customer_code FROM orders WHERE order_date BETWEEN DATE '2011-01-01' AND
DATE '2011-12-31');
Notice that this is a query inside a query. The SELECT statement inside the
parentheses generates a list of customer codes from the orders table. The
outer query uses the customer codes to produce a list of customer names,
addresses and other information. This is a sub-query but not a correlated
sub-query; though the outer query depends on the inner one, a correlated
sub-query also has an inner query that depends on the outer one.


Correlated Sub-Queries

In a correlated sub-query, each query depends on the other. The
following Oracle SQL statement produces a list of customer codes, names, and
purchase totals for those customers whose purchase amounts fall below the
average for all customers in a state. You then have two mutually dependent
queries, one which lists the customers but needs the average sales figure
against which to compare, and the other which calculates the average but
needs the state. Note the use of the table aliases c1 and c2. The alias c1
refers to the customer table in the outer query, and c2 is the customer table
in the inner query.
Q5. a. Explain normalization and its different forms.
b. Describe the need for convenient and safe backup storage.
c. Explain user requirements gathering as part of the DBA's end-user support services.
Ans a) Normalization is the process of designing database tables to ensure that the
fields in each table do not repeat, are fully identified by a unique KEY, and are not
dependent on any non-key ATTRIBUTEs.
shr.aaas.org/DBStandards/glossary.html

Normalization helps in reducing data redundancy; as we move towards higher normal
forms, redundancy decreases.
1NF: This normal form states that there must not be any duplicates in the tables
that we use. In other words, all the tables used must have a primary key defined.
2NF: This normal form states that data redundancy can be reduced if attributes
that are dependent on only part of a composite primary key are isolated in a
separate table. Not only does this reduce data redundancy, it also prevents
unintended data loss when a delete is done. For example, consider a table that has the
following columns: Part Id, State, City, and Country. Here, assume Part Id and Country
form the composite primary key. The attributes State and City depend only on Country.
2NF states that in such a case the table should be split into two tables: one with Part Id
and Country as the columns, the other with Country, State and City as the columns. In
the original table, if a delete removed all the rows with Part Id = X, the country-related
data would be lost too; after the split, this cannot happen.
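A sketch of that split in SQL (names and types are illustrative):

-- Before: part_location(part_id, country, state, city) with key (part_id, country);
-- state and city depend only on country, violating 2NF. After the split:
CREATE TABLE country_location (
  country VARCHAR2(40) PRIMARY KEY,
  state   VARCHAR2(40),
  city    VARCHAR2(40)
);
CREATE TABLE part (
  part_id VARCHAR2(20),
  country VARCHAR2(40) REFERENCES country_location,
  PRIMARY KEY (part_id, country)
);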

b.) A backup, or the process of backing up, refers to the copying and archiving of
computer data so it may be used to restore the original after a data loss event. The verb
form is to back up, in two words, whereas the noun is backup.[1]

Backups have two distinct purposes. The primary purpose is to recover data after its
loss, be it by data deletion or corruption. Data loss can be a common experience of
computer users; a 2008 survey found that 66% of respondents had lost files on their
home PC.[2] The secondary purpose of backups is to recover data from an earlier time,
according to a user-defined data retention policy, typically configured within a backup
application, which specifies how long copies of data are required. Though backups
popularly represent a simple form of disaster recovery, and should be part of a disaster
recovery plan, backups by themselves should not be considered disaster recovery.[3]
One reason for this is that not all backup systems or backup applications are able to
reconstitute a computer system or other complex configuration, such as a computer
cluster, active directory server, or database server, by restoring only data from a backup.

Since a backup system contains at least one copy of all data worth saving,
the data storage requirements can be significant. Organizing this storage space and
managing the backup process can be a complicated undertaking. A data repository
model can be used to provide structure to the storage. Nowadays, there are many
different types of data storage devices that are useful for making backups. There are
also many different ways in which these devices can be arranged to provide geographic
redundancy, data security, and portability.

Before data is sent to its storage location, it is selected, extracted, and manipulated.
Many different techniques have been developed to optimize the backup procedure.
These include optimizations for dealing with open files and live data sources, as well as
compression, encryption, and de-duplication, among others. Every backup scheme
should include dry runs that validate the reliability of the data being backed up, and it is
important to recognize the limitations and human factors involved in any backup scheme.

c.) In a database environment such as Adabas, the same data is used by many
applications (users) in many departments of the organization. Ownership of and
responsibility for the data is shared by departments with diverse and often
conflicting needs. One task of the DBA is to resolve such differences.
Data security and integrity are no longer bound to a single individual or
department, but are inherent in systems such as Adabas; in fact, the DBA controls
and the customized security profiles offered by such systems usually improve security
and integrity.
In the past, application development teams have been largely responsible for
designing and maintaining application files, usually for their own convenience.
Other applications wishing to use the data had to either accept the original file
design or convert the information for their own use. This meant inconsistent data
integrity, varied recovery procedures, and questionable privacy safeguards. In
addition, little attention was given to overall system efficiency; changes introduced
in one system could adversely affect the performance of other systems.
With an integrated and shared database, such a lack of central control would soon
lead to chaos. Changes to file structure to benefit one project could adversely
influence data needs of other projects. Attempts to improve efficiency of one
project could be at the expense of another. The use of different security and
recovery procedures would, at best, be difficult to manage and at worst, result in
confusion and an unstable, insecure database.
Clearly, proper database management means that central control is needed to
ensure adherence to common standards and an installation-wide view of hardware
and software needs. This central control is the responsibility of the DBA. For these
and other reasons, it is important that the DBA function be set up at the very
beginning of the database development cycle.

PART B
Q1 a). Explain heterogeneous distributed database systems.
b. A fully distributed database management system must perform all of the functions of a centralized DBMS. Do you agree? Why or why not?
c. Describe the five types of users identified in a database system.

Ans a). A heterogeneous database system is an automated (or semi-automated)
system for the integration of heterogeneous, disparate database management
systems to present a user with a single, unified query interface.
Heterogeneous database systems (HDBs) are computational models and software
implementations that provide heterogeneous database integration.

Technical heterogeneity
Different file formats, access protocols, query languages etc. Often called syntactic
heterogeneity from the point of view of data.

Data model heterogeneity
Different ways of representing and storing the same data. Table decompositions may
vary, column names (data labels) may be different (but have the same semantics),
and data encoding schemes may vary (i.e., should a measurement scale be explicitly
included in a field or should it be implied elsewhere?). Also referred to as schematic
heterogeneity.

Semantic heterogeneity
Data across constituent databases may be related but different. Perhaps a database
system must be able to integrate genomic and proteomic data. They are related (a gene
may have several protein products) but the data are different (nucleotide sequences
and amino acid sequences, or hydrophilic or hydrophobic amino acid sequences and
positively or negatively charged amino acids). There may be many ways of looking at
semantically similar, but distinct, datasets.

b.) A distributed database is a database in which storage devices are not all attached
to a common processing unit such as the CPU,[1] controlled by a distributed database
management system (together sometimes called a distributed database system). It
may be stored in multiple computers located in the same physical location, or may be
dispersed over a network of interconnected computers. Unlike parallel systems, in which
the processors are tightly coupled and constitute a single database system, a distributed
database system consists of loosely coupled sites that share no physical components.
System administrators can distribute collections of data (e.g. in a database) across
multiple physical locations. A distributed database can reside on network servers on
the Internet, on corporate intranets or extranets, or on other company networks. Because
they store data across multiple computers, distributed databases can improve
performance at end-user worksites by allowing transactions to be processed on many
machines, instead of being limited to one.[2]
Two processes ensure that the distributed databases remain up-to-date and
current: replication and duplication.
1. Replication involves using specialized software that looks for changes in the
distributed database. Once the changes have been identified, the replication
process makes all the databases look the same. The replication process can be
complex and time-consuming, depending on the size and number of the
distributed databases. This process can also require a lot of time and computer
resources.
2. Duplication, on the other hand, has less complexity. It basically identifies one
database as a master and then duplicates that database. The duplication
process is normally done at a set time after hours. This is to ensure that each
distributed location has the same data. In the duplication process, users may
change only the master database. This ensures that local data will not be
overwritten.
Both replication and duplication can keep the data current in all distributed locations.[2]
Besides distributed database replication and fragmentation, there are many other
distributed database design technologies. For example, local autonomy, synchronous
and asynchronous distributed database technologies. These technologies'
implementation can and does depend on the needs of the business and the
sensitivity/confidentiality of the data stored in the database, and hence the price the
business is willing to spend on ensuring data security, consistency and integrity.
When discussing access to distributed databases, Microsoft favors the term distributed
query, which it defines in protocol-specific manner as "[a]ny SELECT, INSERT,
UPDATE, or DELETE statement that references tables and rowsets from one or more
external OLE DB data sources".[3] Oracle provides a more language-centric view in
which distributed queries and distributed transactions form part of distributed SQL
A database user accesses the distributed database through:

Local applications: applications which do not require data from other sites.

Global applications: applications which do require data from other sites.

A homogeneous distributed database has identical software and hardware
running all database instances, and may appear through a single interface as if
it were a single database. A heterogeneous distributed database may have
different hardware, operating systems, database management systems, and
even data models for different databases.

c.) Software refers to the collection of programs used within the database
system. It includes the operating system, DBMS software, and application
programs and utilities:
Operating System
DBMS Software
Application Programs and Utilities
The operating system manages all the hardware components and makes it
possible for all other software to run on the computers. UNIX, Linux,
Microsoft Windows, etc. are popular operating systems used in database
environments.
DBMS software manages the database within the database system. Oracle
Corporation's Oracle, IBM's DB2, Sun's MySQL, Microsoft's MS Access and
SQL Server, etc. are popular DBMS (RDBMS) products used in the
database environment.
Application programs and utilities are used to access and manipulate the
data in the database and to manage the operating environment of the
database.
People in a Database System Environment
The people component includes all users associated with the database system. On
the basis of primary job function we can identify five types of users in a
database system: System Administrators, Database Administrators, Data
Modelers, System Analysts and Programmers, and End Users.
System Administrators
Data Modelers
Database Administrators
System Analysts and Programmers
End Users
System Administrators oversee the database system's general operations.
Data Modelers (Architects) prepare the conceptual design from the
requirements. An ER model represents the conceptual design of an OLTP
application.
The Database Administrator (DBA) physically implements the database according
to the logical design. The DBA performs the physical implementation and
maintenance of a database system.
System Analysts and Programmers design and implement the application
programs. They create the input screens, reports, and procedures through
which end users access and manipulate the database.
End Users are the people who use the application. For example, in the case of a
banking system, the employees and the customers using an ATM or online banking
facility are end users.

Q2. a. Describe the DBA's managerial role.
b. What are the three basic techniques to control deadlocks?
c. How does a shared/exclusive lock schema increase the lock manager's overhead?

Ans. a.) The success of a database environment depends on central control of
database design, implementation, and use. This central control and coordination is
the role of the database administrator (DBA).
This part of the DBA documentation describes the roles of the DBA, the authority
and responsibility the DBA might have, the skills needed, the procedures,
standards, and contacts the DBA may need to create and maintain.
In the context of this documentation, the DBA is a single person; however, large
organizations may divide DBA responsibilities among a team of personnel, each
with specific skills and areas of responsibility such as database design, tuning, or
problem resolution. The ability of the database administrator (DBA) to work
effectively depends on the skill and knowledge the DBA brings to the task, and the
role the DBA has on the overall Information Systems (IS) operation. This section
describes how best to define the DBA role, discusses the relationship of the DBA
to the IS organization, and makes suggestions for taking advantage of that
relationship.
Position of the DBA in the Organization

The DBA should be placed high enough in the organization to exercise the
necessary degree of control over the use of the database and to communicate at the
appropriate level within user departments. However, the DBA should not be remote
from the day-to-day processes of monitoring database use, advising on and
selecting applications, and maintaining the required level of database integrity.
The appropriate position and reporting structure of the DBA depends solely on the
nature and size of the organization.
In most organizations, the DBA is best placed as a functional manager with a
status equivalent to the systems, programming, and operations managers. The DBA
should have direct responsibility for all aspects of the continued operation of the
database. It is also useful to give the DBA at least partial control over the
programming and IS operation standards, since the DBA must have the ability to
ensure that DBMS-compatible standards are understood and observed.

b.) The three basic techniques to control deadlocks are deadlock prevention,
deadlock detection (with recovery), and deadlock avoidance. A deadlock itself is a
situation in which two or more competing actions are each waiting for the other to
finish, and thus neither ever does. In computer science, a deadly embrace is a
deadlock involving exactly two competing actions; the term is more commonly used
in Europe.
In a transactional database, a deadlock happens when two processes, each within
its own transaction, update two rows of information in the opposite order. For
example, process A updates row 1 then row 2 in the exact timeframe that process B
updates row 2 then row 1. Process A cannot finish updating row 2 until process B is
finished, but process B cannot finish updating row 1 until process A is finished. No
matter how much time is allowed to pass, this situation will never resolve itself, and
because of this, database management systems will typically kill the transaction of
the process that has done the least amount of work.
In an operating system, a deadlock is a situation which occurs when
a process or thread enters a waiting state because a requested resource is being
held by another waiting process, which in turn is waiting for another resource. If a
process is unable to change its state indefinitely because the resources requested
by it are being used by another waiting process, then the system is said to be in a
deadlock.[1]
Deadlock is a common problem in multiprocessing systems, parallel
computing and distributed systems, where software and hardware locks are used to
handle shared resources and implement process synchronization.
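The row-update deadlock described above can be reproduced with two concurrent
sessions; the accounts table and row values here are illustrative assumptions:

-- Session A:
UPDATE accounts SET balance = balance - 10 WHERE acct_no = 1;   -- locks row 1
-- Session B:
UPDATE accounts SET balance = balance - 10 WHERE acct_no = 2;   -- locks row 2
-- Session A (now blocks, waiting for B's lock on row 2):
UPDATE accounts SET balance = balance + 10 WHERE acct_no = 2;
-- Session B (closes the cycle; Oracle detects it and raises ORA-00060 in one session):
UPDATE accounts SET balance = balance + 10 WHERE acct_no = 1;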

c.) In databases and transaction processing, two-phase locking (2PL) is a concurrency
control method that guarantees serializability.[1][2] It is also the name of the resulting set
of database transaction schedules (histories). The protocol utilizes locks, applied by a
transaction to data, which may block (interpreted as signals to stop) other transactions
from accessing the same data during the transaction's life.
By the 2PL protocol locks are applied and removed in two phases:
1. Expanding phase: locks are acquired and no locks are released.
2. Shrinking phase: locks are released and no locks are acquired.
Two types of locks are utilized by the basic protocol: Shared and Exclusive locks.
Refinements of the basic protocol may utilize more lock types. Using locks that block
processes, 2PL may be subject to deadlocks that result from the mutual blocking of two
or more transactions.
2PL is a superset of strong strict two-phase locking (SS2PL),[3] also
called rigorousness,[4] which has been widely utilized for concurrency control in
general-purpose database systems since the 1970s. SS2PL implementations have
many variants. SS2PL was called strict 2PL[1] but this name usage is not recommended
now. Now strict 2PL (S2PL) is the intersection of strictness and 2PL, which is different
from SS2PL. SS2PL is also a special case of commitment ordering,[3] and inherits many
of CO's useful properties. SS2PL actually comprises only one phase: phase-2 does not
exist, and all locks are released only after transaction end. Thus this useful 2PL type is
not two-phased at all.
Neither 2PL nor S2PL in their general forms are known to be used in practice. Thus 2PL
by itself does not seem to have much practical importance, and whenever 2PL or S2PL
utilization has been mentioned in the literature, the intention has been SS2PL. What has
made SS2PL so popular (probably the most utilized serializability mechanism) is the
effective and efficient locking-based combination of two ingredients (the first does not
exist in both general 2PL and S2PL; the second does not exist in general 2PL):
1. Commitment ordering, which provides both serializability, and effective distributed
serializability and global serializability, and
2. Strictness, which provides cascadelessness (ACA, cascade-less recoverability)
and (independently) allows efficient database recovery from failure.
Additionally, SS2PL is easier to implement, with less overhead, than both 2PL and S2PL;
it provides exactly the same locking, but sometimes releases locks later. However, in
practice (though not in the simplistic theoretical worst case) such later lock release
occurs only slightly later, and this apparent disadvantage is insignificant and disappears
next to the advantages of SS2PL.
Thus, the importance of the general Two-phase locking (2PL) is historic only,
while Strong strict two-phase locking (SS2PL) is practically the important mechanism
and resulting schedule property. A lock is a system object associated with a shared
resource such as a data item of an elementary type, a row in a database, or a page of
memory. In a database, a lock on a database object (a data-access lock) may need to
be acquired by a transaction before accessing the object. Correct use of locks prevents
undesired, incorrect or inconsistent operations on shared resources by other concurrent
transactions. When a database object with an existing lock acquired by one transaction
needs to be accessed by another transaction, the existing lock for the object and the
type of the intended access are checked by the system. If the existing lock type does not
allow this specific attempted concurrent access type, the transaction attempting access
is blocked (according to a predefined agreement/scheme). In practice a lock on an
object does not directly block a transaction's operation upon the object, but rather blocks
that transaction from acquiring another lock on the same object, needed to be
held/owned by the transaction before performing this operation. Thus, with a locking
mechanism, needed operation blocking is controlled by a proper lock blocking scheme,
which indicates which lock type blocks which lock type.
Two major types of locks are utilized:

A write-lock (exclusive lock) is associated with a database object by a
transaction (terminology: "the transaction locks the object," or "acquires a lock for it")
before writing (inserting/modifying/deleting) this object.

A read-lock (shared lock) is associated with a database object by a transaction
before reading (retrieving the state of) this object.

The common interactions between these lock types are defined by blocking behavior as
follows:

An existing write-lock on a database object blocks an intended write upon the
same object (already requested/issued) by another transaction by blocking a
respective write-lock from being acquired by the other transaction. The second
write-lock will be acquired, and the requested write of the object will take place
(materialize), after the existing write-lock is released.

A write-lock blocks an intended (already requested/issued) read by another
transaction by blocking the respective read-lock.

A read-lock blocks an intended write by another transaction by blocking the
respective write-lock.

A read-lock does not block an intended read by another transaction. The
respective read-lock for the intended read is acquired (shared with the previous
read) immediately after the intended read is requested, and then the intended read
itself takes place.

It is this compatibility checking, performed by the lock manager on every lock request
to decide which lock type blocks which, that increases the lock manager's overhead in
a shared/exclusive scheme compared with a scheme that has only a single lock type.
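In Oracle SQL these lock types surface, for example, as follows (the table name is
an illustrative assumption):

-- Row-level exclusive (write) locks on the selected rows:
SELECT * FROM orders WHERE order_id = 42 FOR UPDATE;

-- Explicit table-level locks:
LOCK TABLE orders IN SHARE MODE;       -- shared (read) lock
LOCK TABLE orders IN EXCLUSIVE MODE;   -- exclusive (write) lock

-- All locks are released at COMMIT or ROLLBACK:
COMMIT;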

Q3. a. Describe a conceptual model and its advantages. What is the most widely used conceptual model?
b. What is a key and why is it important in the relational model? Describe the use of nulls in a database.
c. Explain single-valued attributes and provide an example. Explain the difference between simple and composite attributes. Provide at least one example of each.

Ans. a.) In the most general sense, a model is anything used in any way to represent
anything else. Some models are physical objects, for instance a toy model which may
be assembled and may even be made to work like the object it represents. A conceptual
model, by contrast, is a model made of the composition of concepts, and thus exists
only in the mind. Conceptual models are used to help us know, understand,
or simulate the subject matter they represent.
The term conceptual model may be used to refer to models which are formed after a
conceptualization process in the mind. Conceptual models represent human intentions
or semantics. Conceptualization from observation of physical existence and
conceptual modeling are the necessary means humans employ to think and solve
problems. Concepts are used to convey semantics during natural-language-based
communication. Since a concept might map to multiple semantics by itself, an explicit
formalization is usually required to identify and locate the intended semantics among
several candidates and avoid misunderstanding and confusion in conceptual models.
The term "conceptual model" is ambiguous. It could mean a model of a concept or it
could mean a model that is conceptual. A distinction can be made between what models
are and what models are models of. With the exception of iconic models, such as a
scale model of Winchester Cathedral, most models are concepts. But they are, mostly,
intended to be models of real-world states of affairs. The value of a model is usually
directly proportional to how well it corresponds to a past, present, future, actual or
potential state of affairs. A model of a concept is quite different, because in order to be a
good model it need not have this real-world correspondence.[2]
Models of concepts are usually built by analysts who are not primarily concerned
about the truth or falsity of the concepts being modeled. For example, in management
problem structuring, conceptual models of human activity systems are used in soft
systems methodology to explore the viewpoints of stakeholders in the client
organization. In artificial intelligence, conceptual models and conceptual graphs are used
for building expert systems and knowledge-based systems; here the analysts are
concerned to represent expert opinion on what is true, not their own ideas on what is
true.

Type and scope of conceptual models

Conceptual models (models that are conceptual) range in type from the more concrete,
such as the mental image of a familiar physical object, to the formal generality and
abstractness of mathematical models which do not appear to the mind as an image.
Conceptual models also range in terms of the scope of the subject matter that they are
taken to represent. A model may, for instance, represent a single thing (e.g. the Statue
of Liberty), whole classes of things (e.g. the electron), and even very vast domains of
subject matter such as the physical universe. The variety and scope of conceptual
models is due to the variety of purposes had by the people using them. In database
design, the most widely used conceptual model is the entity-relationship (ER) model.

b.) The relational model for database management is a database model based
on first-order predicate logic, first formulated and proposed in 1969 by Edgar F.
Codd.[1][2] In the relational model of a database, all data is represented in terms of
tuples, grouped into relations. A database organized in terms of the relational model is
a relational database.

[Figure: diagram of an example database according to the relational model.[3]]

In the relational model, related records are linked together with a "key".
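A minimal sketch of such a key link between two relations (names are illustrative):

CREATE TABLE department (
  dept_id NUMBER PRIMARY KEY,          -- primary key: uniquely identifies each department
  name    VARCHAR2(40)
);
CREATE TABLE employee (
  emp_id  NUMBER PRIMARY KEY,
  dept_id NUMBER REFERENCES department -- foreign key: links each employee to a department
);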

The purpose of the relational model is to provide a declarative method for specifying
data and queries: users directly state what information the database contains and what
information they want from it, and let the database management system software take
care of describing data structures for storing the data and retrieval procedures for
answering queries.
Most relational databases use the SQL data definition and query language; these
systems implement what can be regarded as an engineering approximation to the
relational model. A table in an SQL database schema corresponds to a predicate
variable; the contents of a table to a relation; key constraints, other constraints, and SQL
queries correspond to predicates. However, SQL databases, including DB2, deviate from
the relational model in many details, and Codd fiercely argued against deviations that
compromise the original principles.

c.) For the next section we will look at a sample database called the
COMPANY database to illustrate the concepts of the Entity-Relationship
Model. This database contains information about employees, departments
and projects:

There are several departments in the company. Each department has a
unique identification, a name, the location of the office and a particular
employee who manages the department.

A department controls a number of projects, each of which has a unique
name, a unique number and a budget.

Each employee has a name, an identification number, address, salary and
birthdate. An employee is assigned to one department but can join several
projects. We need to record the start date of the employee in each project.
We also need to know the direct supervisor of each employee.

We want to keep track of the dependents of the employees. Each dependent
has a name, birthdate and relationship with the employee.

ENTITY, ENTITY SET AND ENTITY TYPE

An entity is an object in the real world with an independent existence that
can be differentiated from other objects. An entity might be:

an object with physical existence, e.g. a lecturer, a student, a car;

an object with conceptual existence, e.g. a course, a job, a position.

An Entity Type defines a collection of similar entities.
An Entity Set is a collection of entities of an entity type at a point of
time. In ER diagrams, an entity type is represented by a name in a box.

Source: http://cnx.org/content/m28250/latest/

TYPES OF ENTITIES

Independent entities, also referred to as kernels, are the backbone of the
database; they are what other tables are based on. Kernels have the
following characteristics:

they are the building blocks of a database
the primary key may be simple or composite
the primary key is not a foreign key
they do not depend on another entity for their existence

Example: Customer table, Employee table, Product table

Dependent entities, also referred to as derived entities, depend on other tables
for their meaning. They have the following characteristics:

Dependent entities are used to connect two kernels together.
They are said to be existence-dependent on two or more tables.
Many-to-many relationships become associative tables with at least two
foreign keys.
They may contain other attributes.
The foreign key identifies each associated table.
There are three options for the primary key:

use a composite of the foreign keys of the associated tables, if unique

use a composite of the foreign keys and a qualifying column

create a new simple primary key

Characteristic entities provide more information about another table.
These entities have the following characteristics:

They represent multi-valued attributes.
They describe other entities.
They typically have a one-to-many relationship.
The foreign key is used to further identify the characterized table.
Options for the primary key are as follows:

foreign key plus a qualifying column

or create a new simple primary key

Example:
Employee(EID, Name, Address, Age, Salary)
EmployeePhone(EID, Phone)
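A sketch of these two relations as tables (the column types are assumptions):

CREATE TABLE employee (
  eid     NUMBER PRIMARY KEY,
  name    VARCHAR2(40),
  address VARCHAR2(80),
  age     NUMBER,
  salary  NUMBER
);
CREATE TABLE employee_phone (
  eid   NUMBER REFERENCES employee,    -- foreign key into the kernel table
  phone VARCHAR2(20),
  PRIMARY KEY (eid, phone)             -- one row per phone: the multi-valued attribute
);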

ATTRIBUTES

Each entity is described by a set of attributes, e.g. Employee = (Name,
Address, Age, Salary). Each attribute has a name, is associated with an
entity, and is associated with a domain of legal values. However, information
about the attribute domain is not presented on the ER diagram. In the
diagram, each attribute is represented by an oval with a name inside.
An attribute such as Salary, which holds one value per entity, is single-valued;
Name and Age are simple (atomic) attributes, while Address can be treated
as a composite attribute made up of simple attributes such as Street and City.

Q4. a.) Describe specialization and generalization.
b.) Explain the different types of join operations. What are they and how do they work?
c.) What are SQL functions and when are they used? Provide a couple of examples of
situations in which they are necessary.

Ans a.) A specialization may be a subclass, and also have a subclass specified on it.

[Figure: a specialization lattice rooted at Employee, with subclasses Secretary,
Technician, Engineer, Manager, Salaried_Emp and Hourly_Emp, and Engineering_Manager
as a shared subclass of Engineer, Manager and Salaried_Emp.]

For example, in the figure above, Engineer is a subclass of Employee, but also
a superclass of Engineering_Manager. This means that every Engineering Manager
must also be an Engineer.
A specialization hierarchy has the constraint that every subclass participates as a
subclass in only one class/subclass relationship, i.e. each subclass has only one
parent. This results in a tree structure.
A specialization lattice has the constraint that a subclass can be a subclass in
more than one class/subclass relationship. The figure shown above is a
specialization lattice, because Engineering_Manager participates in more than one
parent class.
In a lattice or hierarchy, the subclass inherits the attributes not only of the
direct superclass, but also of all the predecessor superclasses all the way to
the root.
A subclass with more than one superclass is called a shared subclass. This leads
to multiple inheritance, where a subclass inherits attributes from multiple classes.
In a lattice, a subclass may inherit attributes from more than one superclass, and
some attributes may be inherited more than once via different paths (e.g. Engineer,
Manager and Salaried_Emp all inherit from Employee and are in turn inherited by
Engineering_Manager). In this situation, the attributes are included only once in
the subclass.

Modeling Union Types using Categories

In the subclass/superclass relationship types we have seen so far, each has a
single superclass. This means that in a particular subclass/superclass relationship
(e.g. Engineer, Engineering_Manager), even though the subclass has more than one
subclass/superclass relationship, each relationship is between a single superclass
and subclass.

There are situations when you would like to model a relationship where a single
subclass has more than one superclass, and where each superclass represents a
different entity type. The subclass will represent a collection of objects that is
a subset of the UNION of the distinct entity types. This type of subclass is a
union type, or category subclass. See Text Example, page 99.

A category has two or more superclasses that may be distinct entity types, whereas
other superclass/subclass relationships have only one superclass. If we compare the
Engineering_Manager subclass, it is a subclass of each of the three superclasses
Engineer, Manager and Salaried_Emp, and inherits the attributes of all three. An
entity that exists in Engineering_Manager exists in all three superclasses. This
represents the constraint that an Engineering Manager must be an Engineer, a
Manager, AND a Salaried Employee. It can be thought of as an AND condition.
By contrast, a category is a subset of the union of its superclasses. This means
that an entity that is a member of a category exists in ONLY ONE of the
superclasses. An owner may be a Company, OR a Bank, OR a Person, but not more than one.
A category can be partial or total. A total category holds the union of ALL its
superclasses, whereas a partial category can hold a subset of the union.
If a category is total, it can also be represented as a total specialization.

Comparison of Registered Vehicle Category to Vehicle

[Figure: (1) a disjoint specialization (the 'd' notation) of Vehicle into Car and
Truck; (2) a category Registered_Vehicle (the 'u' notation) defined as the union of
Car and Truck.]

The first example implies that every car and truck is also a vehicle. In the
second example, a registered vehicle can be a car or a truck, but not every car
and truck is a registered vehicle.

Other examples:
University (Researcher)

Example 4.21
Design a database to keep track of information for an art museum. Assume that
the following requirements were collected.

The museum has a collection of ART_OBJECTS. Each art object has a unique IDNo, an
Artist (if known), a Year (when created, if known), a Title and a Description. The
art objects are categorized in several ways, as discussed below.
ART_OBJECTS are categorized based on type. There are three main types, Painting,
Sculpture and Statue, plus an Other category for those that don't fit into one of
the categories above.
A PAINTING has a PaintType (oil, watercolor, etc.), a material on which it is
DrawnOn (paper, canvas, wood) and a Style (modern, abstract, etc.).
A SCULPTURE or a STATUE has a Material from which it was created (wood, stone,
etc.), Height, Weight and Style.
An art object in the OTHER category has a Type (print, photo, etc.) and Style.
ART_OBJECTS are also categorized as PERMANENT_COLLECTION, which are owned by the
museum (DateAcquired, whether it is OnDisplay or Stored, and Cost), or BORROWED,
which has information on the Collection (where it was borrowed from), DateBorrowed,
and DateReturned.
ART_OBJECTS also have information describing their country/culture, using
information on country/culture of Origin (Italian, Egyptian, American, Indian,
etc.) and Period (Renaissance, Modern, Ancient).
The museum keeps track of ARTISTs' information, if known: Name, DateBorn, DateDied,
CountryOfOrigin, Period, MainStyle and Description. The name is assumed unique.
Different EXHIBITIONS occur, each having a Name, StartDate and EndDate. EXHIBITIONS
are related to all the art objects that were on display during the exhibition.
Information is kept on other COLLECTIONS with which the museum interacts, including
Name (unique), Type (museum, personal, etc.), Description, Address, Phone and
ContactPerson.

b.) A composite table represents the result of accessing one or more tables in a query.
If a query contains a single table, only one composite table exists. If one or more joins
are involved, an outer composite table consists of the intermediate result rows from the
previous join step. This intermediate result might, or might not, be materialized into a
work file.
The new table (or inner table) in a join operation is the table that is newly accessed in
the step.
A join operation can involve more than two tables. In these cases, the operation is
carried out in a series of steps. For non-star joins, each step joins only two tables.
Sometimes DB2 has to materialize a result table when an outer join is used in
conjunction with other joins, views, or nested table expressions. You can tell when this
happens by looking at the TABLE_TYPE and TNAME columns of the plan table. When
materialization occurs, TABLE_TYPE contains a W, and TNAME shows the name of the
materialized table as DSNWFQB(xx), where xx is the number of the query block
(QBLOCKNO) that produced the work file.

Cartesian join with small tables first
A Cartesian join is a form of nested loop join in which no join predicates exist
between the two tables.

Nested loop join (METHOD=1)
In a nested loop join, DB2 scans the composite (outer) table. For each row in that
table that qualifies (by satisfying the predicates on that table), DB2 searches for
matching rows of the new (inner) table.

When a MERGE statement is used (QBLOCK_TYPE='MERGE')
You can determine whether a MERGE statement was used and how it was processed by
analyzing the QBLOCK_TYPE and PARENT_QBLOCKNO columns.

Merge scan join (METHOD=2)
Merge scan join is also known as merge join or sort merge join. For this method,
there must be one or more predicates of the form TABLE1.COL1 = TABLE2.COL2, where
the two columns have the same data type and length attribute.

Hybrid join (METHOD=4)
The hybrid join method applies only to an inner join, and requires an index on the
join column of the inner table.

Star schema access
DB2 can use special join methods, such as star join and pair-wise join, to
efficiently join tables that form a star schema.
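As a hedged illustration (the emp and dept tables and their columns are assumed
here, not taken from the text above): the following join qualifies for either a
nested loop or a merge scan join, and the EXPLAIN statement records which method
DB2 chose in the plan table's METHOD column.

EXPLAIN PLAN SET QUERYNO = 1 FOR
SELECT e.empno, e.lastname, d.deptname
FROM   emp e, dept d
WHERE  e.workdept = d.deptno;        -- join predicate of the form T1.C1 = T2.C2

-- METHOD = 1 (nested loop), 2 (merge scan) or 4 (hybrid) for the join step
SELECT queryno, qblockno, planno, method, tname
FROM   PLAN_TABLE
WHERE  queryno = 1
ORDER BY qblockno, planno;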

c.) Stored procedures are generally non-portable, meaning they are specific to a
particular RDBMS. As a matter of fact, stored procedures tend to be specific to a
particular VERSION of a particular RDBMS.
The development tools for the lifecycle of stored procedures tend to be very limited
compared to the tools available for general programming languages/platforms. The
tools are lacking in contextual help, in storage of the code, in debugging, in
refactoring, etc.
The languages for writing stored procedures tend to be very limited compared to
general programming languages/platforms. They tend to be procedural, lack many
operations, lack most common APIs, and lack many syntax advances (classes,
scope, etc.). This has changed somewhat with the introduction of Java into Oracle
and .NET into SQL Server.

So, as a general rule, avoid writing stored procedures; writing your code in a general
programming environment is more desirable. Use stored procedures when you need
their particular advantages, which mainly means high-performance and/or
tightly-isolated data processing. A typical system will then have maybe a stored
procedure or two, but definitely not dozens to hundreds.
EDIT: Clarification...
Please note that I am addressing enterprise-class development in-the-large. If you
have a tiny application and a few toy stored procedures, then you can probably
ignore everyone's advice. I am assuming that the question is being asked for
non-trivial scenarios.
I have dealt with every significant RDBMS over a period of nearly twenty years. I
have dealt with databases up to 138 TBs, and individual tables of 8 TBs. I have
worked with systems exceeding one thousand SPs. I have converted such
databases across major versions and across major vendors. I am an architect, DBA,
and just a programmer. If you want the benefit of such experience, then here it is. If
not, fair enough.
EDIT: Expounding...
Nearly everything done in a stored procedure can be done by issuing comparable
SQL statements from an application, particularly including anonymous procedure
blocks (the guts of an SP without the name and permanence). Doing it well can
avoid the problems and limitations of stored procedures while still retaining most of
the benefits.
However, don't forget that bad code can be written in any language, so it is just as
possible to write bad SP code as it is to write bad application code. Indeed, based on
history and reports of observations in the wild, it seems even more likely to write bad
SP code.
EDIT: @Chris Lively: regarding putting database code where the DBA can apply his
tools...
Crippling your application development by using the DBA's limited tools is not an
advantage or a step forward, nor is it even necessary.
Besides that, having been a senior DBA/architect for about twenty years, I am not
generally impressed with what most DBAs do with database code in the applications
that they support. I have mentored a lot of DBAs and programmers regarding
database code, so please let me describe what I encourage them to do.

Every DBA should know how to make the database engine show them every SQL
statement that is executed, regardless of source (inside or outside the engine), and
they should know how to analyze that SQL's performance characteristics. I
recommend that every programmer learn to do the same. If you can do this, then it
no longer matters where the SQL originated, so Chris' recommendation to put the
SQL in a SP is null and void.
If the performance of your system matters, such as when several million customers
depend on it every day, then you should be checking the performance of every piece
of SQL before it gets deployed to production. I recommend doing so as part of the
automated tests that can be run as a part of the automated build for the system.
For example, it is very easy to configure an Ant build script to issue each piece of
SQL to the database engine for an execution plan analysis. I like to save each
execution plan to a text file and commit it to source control, where I can readily see a
history of changes. I also make the build script check the execution plan against
some simple criteria to ensure that SQL changes have not altered or compromised
the performance.
Likewise, I check all my SQL into source control, and I make it easily available both
to my application (for execution) and to my build script (for verification). At a
minimum, my build script for the database can recreate the entire structure from
scratch, and I often make it capable of loading or transferring data as well.
Obviously, I can handle stored procedures, but they are just one tool among many. It
is a mistake (an antipattern) to treat SPs as a Golden Hammer.
On the other hand, when the performance really matters, a stored procedure can
often be the best and even the only option. For example, when I redesigned a
database recently for a major telecommunications provider, a stored procedure was
an essential part of the strategy. I was loading forty thousand data files per day,
totaling forty million rows, into a single database table (8 TB) that was growing past
two billion rows of current data. A public-facing web site accessed that data via a
web service, which required pulling a handful of rows from those two billion within
just a few seconds. This was done using Oracle 10g, a custom C application,
external tables, some bulk data loading, and a stored procedure. However, most of
the database code was still in the C application and the stored procedure handled
just one specific, performance-intensive piece.

Q5. a.) Describe the characteristics of an Oracle sequence.


b.)Triggers are critical to proper database operation and management.
What are some of the ways that triggers are used?
c.) Describe query optimization.

Ans . a.) A sequence is a database object that generates numbers in sequential


order. Applications most often use these numbers when they require a unique value
in a table such as primary key values. Some database management systems use an
"auto number" concept or "auto increment" setting on numeric column types. Both
the auto numbering columns and sequences provide a unique number in sequence
used for a unique identifier.

The quickest way to retrieve data from a table is to have a column in the table
whose data uniquely identifies a row. By using this column and a specific value in
the WHERE condition of a SELECT statement, the Oracle engine will be able to
identify and retrieve the row quickly.

To achieve this, a constraint is attached to a specific column in the table that
ensures that the column is never left blank and that the data in the column are
unique. Since data entry is done by human beings, it is quite likely that duplicate
values will be entered.

If the value to be entered is machine generated, it will always fulfill the
constraints and the row will always be accepted for storage. So sequences play an
important role in generating unique values.

Features

The following list describes the characteristics of sequences:

Sequences are available to all users of the database.
Sequences are created using SQL statements.
Sequences have a minimum and maximum value (the defaults are minimum = 0 and
maximum = 2^63 - 1); they can be dropped, but not reset.
Once a sequence returns a value, the sequence can never return that same value.
While sequence values are not tied to any particular table, a sequence is usually
used to generate values for only one table.
Sequences increment by an amount specified when created (the default is 1).
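A minimal Oracle sketch (the sequence and table names are illustrative):

CREATE SEQUENCE emp_seq
    START WITH 1
    INCREMENT BY 1;

-- NEXTVAL generates the next unique value; CURRVAL returns the last
-- value generated by this sequence in the current session.
INSERT INTO employee (eid, name)
VALUES (emp_seq.NEXTVAL, 'Smith');

SELECT emp_seq.CURRVAL FROM dual;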

b.) Oracle allows you to define procedures that are implicitly executed when an
INSERT, UPDATE, or DELETE statement is issued against the associated table. These
procedures are called database triggers.

Triggers are similar to stored procedures, discussed in Chapter 14, "Procedures and
Packages". A trigger can include SQL and PL/SQL statements to execute as a unit
and can invoke stored procedures. However, procedures and triggers differ in the
way that they are invoked. While a procedure is explicitly executed by a user,
application, or trigger, one or more triggers are implicitly fired (executed) by
Oracle when a triggering INSERT, UPDATE, or DELETE statement is issued, no
matter which user is connected or which application is being used.
For example, Figure 15 - 1 shows a database application with some SQL
statements that implicitly fire several triggers stored in the database.

Figure 15 - 1. Triggers
Notice that triggers are stored in the database separately from their associated
tables.
Triggers can be defined only on tables, not on views. However, triggers on the base
table(s) of a view are fired if an INSERT, UPDATE, or DELETE statement is
issued against a view.
How Triggers Are Used

In many cases, triggers supplement the standard capabilities of Oracle to provide a
highly customized database management system. For example, a trigger can permit DML
operations against a table only if they are issued during regular business hours.
The standard security features of Oracle, roles and privileges, govern which users
can submit DML statements against the table. In addition, the trigger further
restricts DML operations to occur only at certain times during weekdays. This is
just one way that you can use triggers to customize information management in an
Oracle database.
In addition, triggers are commonly used to:

automatically generate derived column values
prevent invalid transactions
enforce complex security authorizations
enforce referential integrity across nodes in a distributed database
enforce complex business rules
provide transparent event logging
provide sophisticated auditing
maintain synchronous table replicates
gather statistics on table access

Examples of many of these different trigger uses are included in
the Oracle7 Server Application Developer's Guide.
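A minimal PL/SQL sketch of the auditing use (the emp and emp_audit_log tables are
assumed for illustration):

CREATE OR REPLACE TRIGGER emp_audit_trg
AFTER INSERT OR UPDATE OR DELETE ON emp
FOR EACH ROW
BEGIN
    -- :NEW is populated for INSERT/UPDATE, :OLD for UPDATE/DELETE
    INSERT INTO emp_audit_log (empno, changed_by, changed_on)
    VALUES (NVL(:NEW.empno, :OLD.empno), USER, SYSDATE);
END;
/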
A Cautionary Note about Trigger Use

When a trigger is fired, a SQL statement within its trigger action potentially can
fire other triggers, as illustrated in Figure 15 - 2. When a statement in a trigger
body causes another trigger to be fired, the triggers are said to be cascading.

Figure 15 - 2. Cascading Triggers


While triggers are useful for customizing a database, you should only use triggers
when necessary. The excessive use of triggers can result in complex
interdependencies, which may be difficult to maintain in a large application.
Database Triggers vs. Oracle Forms Triggers

Oracle Forms can also define, store, and execute triggers. However, do not confuse
Oracle Forms triggers with the database triggers discussed in this chapter.
Database triggers are defined on a table, stored in the associated database, and
executed as a result of an INSERT, UPDATE, or DELETE statement being issued
against a table, no matter which user or application issues the statement.

Oracle Forms triggers are part of an Oracle Forms application and are fired only
when a specific trigger point is executed within a specific Oracle Forms
application. SQL statements within an Oracle Forms application, as with any
database application, can implicitly cause the firing of any associated database
trigger. For more information about Oracle Forms and Oracle Forms triggers, see
the Oracle Forms User's Guide.

c.) Query optimization is the overall process of choosing the most efficient means
of executing a SQL statement. The database optimizes each SQL statement based on
statistics collected about the actual data being accessed. The optimizer uses the
number of rows, the size of the data set, and other factors to generate possible
execution plans, assigning a numeric cost to each plan. The database uses the plan
with the lowest cost.
In Oracle Database optimization, cost represents the estimated resource usage for
an execution plan. The optimizer is sometimes called the cost-based optimizer (CBO)
to contrast with the legacy rule-based optimizer. The CBO bases the cost of access
paths and join methods on the estimated system resources, which include I/O, CPU,
and memory. The plan with the lowest cost is selected.
Note:
The optimizer may not make the same decisions from one version of Oracle Database
to the next. In recent versions, the optimizer might make different decisions
because better information is available and more optimizer transformations are
possible.

4.1.3 Execution Plans

An execution plan describes a recommended method of execution for a SQL statement.
The plan shows the combination of steps Oracle Database uses to execute a SQL
statement. Each step either retrieves rows of data physically from the database or
prepares them for the user issuing the statement.
In Figure 4-1, the optimizer generates two possible execution plans for an input
SQL statement, uses statistics to calculate their costs, compares their costs, and
chooses the plan with the lowest cost.

[Figure 4-1: Execution Plans]

4.1.3.1 Query Blocks

As shown in Figure 4-1, the input to the optimizer is a parsed representation of a
SQL statement. Each SELECT block in the original SQL statement is represented
internally by a query block. A query block can be a top-level statement, subquery,
or unmerged view (see "View Merging").
In Example 4-1, the SQL statement consists of two query blocks. The subquery in
parentheses is the inner query block. The outer query block, which is the rest of
the SQL statement, retrieves names of employees in the departments whose IDs were
supplied by the subquery.

Example 4-1 Query Blocks

SELECT first_name, last_name
FROM   hr.employees
WHERE  department_id IN
       (SELECT department_id
        FROM   hr.departments
        WHERE  location_id = 1800);

The query form determines how query blocks are interrelated.


See Also:
Oracle Database Concepts for an overview of SQL processing

4.1.3.2 Query Subplans


For each query block, the optimizer generates a query subplan. The database
optimizes query blocks separately from the bottom up. Thus, the database optimizes
the innermost query block first and generates a subplan for it, and then generates
the outer query block representing the entire query.
The number of possible plans for a query block is proportional to the number of
objects in the FROM clause. This number rises exponentially with the number of
objects. For example, the number of possible plans for a join of five tables is
significantly higher than the number of possible plans for a join of two tables.

Analogy for the Optimizer


One analogy for the optimizer is an online trip advisor. A cyclist wants to
know the most efficient bicycle route from point A to point B. A query is
like the directive "I need the most efficient route from point A to point B"
or "I need the most efficient route from point A to point B by way of point
C." The trip advisor uses an internal algorithm, which relies on factors
such as speed and difficulty, to determine the most efficient route. The
cyclist can influence the trip advisor's decision by using directives such as
"I want to arrive as fast as possible" or "I want the easiest ride possible."
In this analogy, an execution plan is a possible route generated by the trip
advisor. Internally, the advisor may divide the overall route into several
subroutes (subplans), and calculate the efficiency for each subroute
separately. For example, the trip advisor may estimate one subroute at 15
minutes with medium difficulty, an alternative subroute at 22 minutes
with minimal difficulty, and so on.
The advisor picks the most efficient (lowest cost) overall route based on
user-specified goals and the available statistics about roads and traffic
conditions. The more accurate the statistics, the better the advice. For
example, if the advisor is not frequently notified of traffic jams, road
closures, and poor road conditions, then the recommended route may turn
out to be inefficient (high cost).

PART C
Q1. a.) Explain ORDER BY and GROUP BY clause with an example.
b.) What is the difference between a nonprocedural language and a procedural
language? Give an example of each.

Ans a.) The GROUP BY clause will gather together all of the rows that contain data
in the specified column(s) and will allow aggregate functions to be performed on
one or more columns. This can best be explained by an example.

GROUP BY clause syntax:

SELECT column1, SUM(column2)
FROM   "list-of-tables"
GROUP BY "column-list";

Let's say you would like to retrieve a list of the highest paid salaries in
each dept:

SELECT MAX(salary), dept
FROM   employee
GROUP BY dept;

This statement will select the maximum salary for the people in each unique
department. Basically, the salary for the person who makes the most in each
department will be displayed. Their salary and their department will be returned.
Multiple Grouping Columns - What if I wanted to display their lastname
too?

Use these tables for the exercises: items_ordered, customers.

For example, take a look at the items_ordered table. Let's say you want to group
everything of quantity 1 together, everything of quantity 2 together, everything of
quantity 3 together, etc. If you would like to determine what the largest cost item
is for each grouped quantity (all quantity 1's, all quantity 2's, all quantity 3's,
etc.), you would enter:

SELECT quantity, MAX(price)
FROM   items_ordered
GROUP BY quantity;

Enter the statement above, and take a look at the results to see if it returned
what you were expecting. Verify that the maximum price in each quantity group is
really the maximum price.
Review Exercises
1. How many people are in each unique state in the customers table? Select
the state and display the number of people in each. Hint: count is used to
count rows in a column, sum works on numeric data only.
2. From the items_ordered table, select the item, maximum price, and
minimum price for each specific item in the table. Hint: The items will
need to be broken up into separate groups.
3. How many orders did each customer make? Use the items_ordered table. Select the
customerid, the number of orders they made, and the sum of their orders.
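The question also asks about the ORDER BY clause: it sorts the rows returned by a
query on one or more columns or expressions, ascending (ASC, the default) or
descending (DESC). A short sketch, reusing the employee table from above:

SELECT dept, MAX(salary) AS max_salary
FROM   employee
GROUP BY dept
ORDER BY max_salary DESC;

-- The grouped rows are returned sorted by the aggregated salary, highest first.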

b.) There are two types of Data Manipulation Language (DML): nonprocedural DML and
procedural DML.
Nonprocedural DML: It is also known as high-level Data Manipulation Language. It is
used to specify complex database operations concisely. We can enter these
high-level DML statements interactively from a display monitor or terminal with the
help of the Database Management System, or embed them in a general-purpose
programming language. Because a single statement can retrieve or process a whole
set of records, high-level DML is also called a set-at-a-time or set-oriented DML;
SQL is the standard example.
Procedural DML: It is also known as low-level DML. It is used to retrieve data or
objects from the database one record at a time, so it must be used together with
programming language constructs, such as loops, to process each record from a set
of records. Because of this property, low-level DML is also called a
record-at-a-time DML; the DL/I language of the hierarchical IMS system is an example.
Low-level and high-level DMLs are considered part of the query language because
both may be used interactively. Normally, casual database (end) users use a
nonprocedural language.

Q2. a.) What are views? Most database management systems support the creation of
views. Give reasons.
b.) How can you use the COMMIT, SAVEPOINT and ROLLBACK commands to support
transactions?

Ans a). A database is an organized collection of data. The data are typically
organized to model relevant aspects of reality in a way that supports processes requiring
this information. For example, modeling the availability of rooms in hotels in a way that
supports finding a hotel with vacancies.
Database management systems (DBMSs) are specially designed applications that
interact with the user, other applications, and the database itself to capture and analyze
data. A general-purpose database management system (DBMS) is a software system
designed to allow the definition, creation, querying, update, and administration of
databases. Well-known DBMSs
includeMySQL, MariaDB, PostgreSQL, SQLite, Microsoft SQL
Server, Oracle, SAP, dBASE, FoxPro, IBM DB2, LibreOffice Base and FileMaker Pro. A
database is not generally portable across different DBMS, but different DBMSs can
interoperate by using standards such as SQL and ODBC or JDBC to allow a single
application to work with more than one database
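Returning to the question itself: a view is a virtual table defined by a stored
query; it presents selected data without storing it separately, which supports
security (exposing only some columns or rows) and simplifies complex queries. A
minimal sketch (the table and column names are illustrative):

CREATE VIEW high_paid_emps AS
    SELECT eid, name, dept, salary
    FROM   employee
    WHERE  salary > 50000;

-- The view can then be queried like a table:
SELECT name, dept FROM high_paid_emps;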

b.) If a JDBC Connection is in auto-commit mode, which it is by default, then every
SQL statement is committed to the database upon its completion.
That may be fine for simple applications, but there are three reasons why you may
want to turn off auto-commit and manage your own transactions:

To increase performance
To maintain the integrity of business processes
To use distributed transactions


Transactions enable you to control if, and when, changes are applied to the
database. A transaction treats a single SQL statement or a group of SQL statements
as one logical unit, and if any statement fails, the whole transaction fails.
To enable manual transaction support instead of the auto-commit mode that the JDBC
driver uses by default, use the Connection object's setAutoCommit() method. If you
pass a boolean false to setAutoCommit(), you turn off auto-commit. You can pass a
boolean true to turn it back on again.
For example, if you have a Connection object named conn, code the following to turn
off auto-commit:
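conn.setAutoCommit(false);

To show COMMIT, SAVEPOINT and ROLLBACK themselves, here is a minimal SQL sketch
(the accounts table is illustrative):

UPDATE accounts SET balance = balance - 100 WHERE acct_id = 1;
SAVEPOINT before_fee;                      -- named marker inside the transaction
UPDATE accounts SET balance = balance - 5 WHERE acct_id = 1;
ROLLBACK TO SAVEPOINT before_fee;          -- undoes only the work done after the savepoint
COMMIT;                                    -- makes the remaining changes permanent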

Q3. a.) What is an index? What are the disadvantages of using an index?
b.) Describe the format for the UPDATE command.

Ans a.) There are tradeoffs to almost any feature in computer programming, and
indexes are no exception. While indexes provide a substantial performance benefit to
searches, there is also a downside to indexing. Let's talk about some of those
drawbacks now.
Indexes and Disk Space
Indexes are stored on the disk, and the amount of space required will depend on the
size of the table, and the number and types of columns used in the index. Disk
space is generally cheap enough to trade for application performance, particularly
when a database serves a large number of users. To see the space required for a
table, use the sp_spaceused system stored procedure in a query window.
EXEC sp_spaceused Orders
Given a table name (Orders), the procedure will return the amount of space used by
the data and all indexes associated with the table, like so:
name     rows   reserved   data     index_size   unused
-------  -----  ---------  -------  -----------  -------
Orders   830    504 KB     160 KB   320 KB       24 KB

According to the output above, the table data uses 160 kilobytes, while the table
indexes use twice as much, or 320 kilobytes. The ratio of index size to table size can
vary greatly, depending on the columns, data types, and number of indexes on a
table.
Indexes and Data Modification

Another downside to using an index is the performance implication on data
modification statements. Any time a query modifies the data in a table (INSERT,
UPDATE, or DELETE), the database needs to update all of the indexes where data has
changed. As we discussed earlier, indexing can help the database during data
modification statements by allowing the database to quickly locate the records to
modify. However, we now caveat the discussion with the understanding that providing
too many indexes to update can actually hurt the performance of data modifications.
This leads to a delicate balancing act when tuning the database for performance.
A Disadvantage to Clustered Indexes
If we update a record and change the value of an indexed column in a clustered
index, the database might need to move the entire row into a new position to keep
the rows in sorted order. This behavior essentially turns an update query into a
DELETE followed by an INSERT, with an obvious decrease in performance. A table's
clustered index can often be found on the primary key or a foreign key column,
because key values generally do not change once a record is inserted into the
database.

b. ) To create a PROC SQL table from a query result, use a CREATE TABLE statement,
and place it before the SELECT statement. When a table is created this way, its data is
derived from the table or view that is referenced in the query's FROM clause. The new
table's column names are as specified in the query's SELECT clause list. The column
attributes (the type, length, informat, and format) are the same as those of the selected
source columns.
The following CREATE TABLE statement creates the DENSITIES table from the
COUNTRIES table. The newly created table is not displayed in SAS output unless you
query the table. Note the use of the OUTOBS option, which limits the size of the
DENSITIES table to 10 rows.
proc sql outobs=10;
title 'Densities of Countries';
create table sql.densities as
select Name 'Country' format $15.,
Population format=comma10.0,
Area as SquareMiles,
Population/Area format=6.2 as Density
from sql.countries;

select * from sql.densities;

Table Created from a Query Result

Densities of Countries

Country            Population   SquareMiles   Density
---------------    ----------   -----------   -------
Afghanistan        17,070,323        251825     67.79
Albania             3,407,400         11100    306.97
Algeria            28,171,132        919595     30.63
Andorra                64,634           200    323.17
Angola              9,901,050        481300     20.57
Antigua and Bar        65,644           171    383.88
Argentina          34,248,705       1073518     31.90
Armenia             3,556,864         11500    309.29
Australia          18,255,944       2966200      6.15
Austria             8,033,746         32400    247.96

The following DESCRIBE TABLE statement writes a CREATE TABLE statement to the
SAS log:
proc sql;
describe table sql.densities;

SAS Log for DESCRIBE TABLE Statement for DENSITIES


NOTE: SQL table SQL.DENSITIES was created like:
create table SQL.DENSITIES( bufsize=8192 )
(
Name char(35) format=$15. informat=$35. label='Country',
Population num format=COMMA10. informat=BEST8. label='Population',
SquareMiles num format=BEST8. informat=BEST8. label='SquareMiles',
Density num format=6.2
);

In this form of the CREATE TABLE statement, assigning an alias to a column renames
the column, while assigning a label does not. In this example, the Area column has
been renamed to SquareMiles, and the calculated column has been named Density.
However, the Name column retains its name, and its display label is Country.
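To address the question directly: the general format of the UPDATE command changes
existing rows in a table, setting one or more columns, optionally restricted by a
WHERE clause (the emp table below is illustrative):

UPDATE table_name
SET    column1 = value1,
       column2 = value2
WHERE  condition;

-- Example: give department 10 a 10% raise
UPDATE emp
SET    sal = sal * 1.1
WHERE  deptno = 10;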
Q4. a.) Describe the format of the ALTER TABLE command to add a new column.
b.) There are three set operations: union, intersection, difference. Define each of
these operations. Which are supported by Oracle?

Ans a.) When declaring string or binary column types, the maximum size must be
specified. The following example declares a string column that can grow to a
maximum of 100 characters:

CREATE TABLE Table ( str_col VARCHAR(100) )

When handling strings the database will only allocate as much storage space as the
string uses up. If a 10 character string is stored in str_col then only space for 10
characters will be allocated in the database. So if you need a column that can store
a string of any size, use an arbitrarily large number when declaring the column.
Mckoi SQL Database does not use a fixed size storage mechanism when storing
variable length column data.
JAVA_OBJECT is a column type that can contain serializable Java objects. The
JAVA_OBJECT type has an optional Java class definition that is used for runtime
class constraint checking. The following example demonstrates creating a
JAVA_OBJECT column.

CREATE TABLE ObjectTable (
    obj_id NUMERIC, obj JAVA_OBJECT(java.awt.Point))

If the Java class is not specified the column defaults to java.lang.Object, which
effectively means any type of serializable Java object can be kept in the column.
String types may have a COLLATE clause that changes the collation ordering of the
string based on a language. For example, the following statement creates a string
column that can store and order Japanese text:

CREATE TABLE InternationalTable (
    japanese_text VARCHAR(4000) COLLATE 'jaJP')

The 'jaJP' is an ISO localization code for the Japanese language in Japan. Other
locale codes can be found in the documentation to java.text.Collator.
Unique, primary/foreign key and check integrity constraints can be defined in
the CREATE TABLE statement. The following is an example of defining a table with
integrity constraints.
CREATE TABLE Customer (
    number  VARCHAR(40)   NOT NULL,
    name    VARCHAR(100)  NOT NULL,
    ssn     VARCHAR(50)   NOT NULL,
    age     INTEGER       NOT NULL,

    CONSTRAINT cust_pk PRIMARY KEY (number),
    UNIQUE ( ssn ),    // (An anonymous constraint)
    CONSTRAINT age_check CHECK (age >= 0 AND age < 200)
)

3. ALTER TABLE syntax

ALTER TABLE table_name ADD [COLUMN] column_declare
ALTER TABLE table_name ADD constraint_declare
ALTER TABLE table_name DROP [COLUMN] column_name
ALTER TABLE table_name DROP CONSTRAINT constraint_name
ALTER TABLE table_name DROP PRIMARY KEY
ALTER TABLE table_name ALTER [COLUMN] column_name SET default_expr
ALTER TABLE table_name ALTER [COLUMN] column_name DROP DEFAULT
ALTER CREATE TABLE ....

ALTER is used to add / remove / modify the columns and integrity constraints of a
table. The ADD [COLUMN] form adds a new column definition to the table (using the
same column declaration syntax in the CREATE command). The DROP [COLUMN] form drops
the column with the name from the table. ALTER [COLUMN] column_name SET
default_expr alters the default value for the column. ALTER [COLUMN] column_name
DROP DEFAULT removes the default value set for the column.

The following example adds a new column to a table:
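The original text is cut off here; a minimal sketch of the add-column form, reusing
the Customer table from above (the email column is invented for illustration):

ALTER TABLE Customer ADD COLUMN email VARCHAR(100)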


b.) Precedence is the order in which Oracle evaluates different operators in the
same expression. When evaluating an expression containing multiple operators,
Oracle evaluates operators with higher precedence before evaluating those with
lower precedence. Oracle evaluates operators with equal precedence from left to
right within an expression.
Table 3-1 lists the levels of precedence among SQL operators from high to low.
Operators listed on the same line have the same precedence.
Table 3-1 SQL Operator Precedence

Operator                                           Operation
+, -                                               identity, negation
*, /                                               multiplication, division
+, -, ||                                           addition, subtraction, concatenation
=, !=, <, >, <=, >=, IS NULL, LIKE, BETWEEN, IN    comparison
NOT                                                exponentiation, logical negation
AND                                                conjunction
OR                                                 disjunction

Precedence Example
In the following expression, multiplication has a higher precedence than addition,
so Oracle first multiplies 2 by 3 and then adds the result to 1.

1 + 2 * 3

You can use parentheses in an expression to override operator precedence. Oracle
evaluates expressions inside parentheses before evaluating those outside.
SQL also supports set operators (UNION, UNION ALL, INTERSECT, and MINUS), which
combine sets of rows returned by queries, rather than individual data items. All
set operators have equal precedence.
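Relating this to the question: union returns the rows appearing in either query,
intersection returns the rows common to both, and difference returns the rows in
the first query but not the second. Oracle supports all three, as UNION (and UNION
ALL), INTERSECT, and MINUS (Oracle's name for difference). A short sketch, assuming
two illustrative tables with compatible columns:

SELECT ename FROM current_emps
UNION
SELECT ename FROM retired_emps;      -- union: rows from either table, duplicates removed

SELECT ename FROM current_emps
INTERSECT
SELECT ename FROM project_members;   -- intersection: rows present in both

SELECT ename FROM current_emps
MINUS
SELECT ename FROM project_members;   -- difference: rows in the first result but not the second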
Arithmetic Operators

You can use an arithmetic operator in an expression to negate, add, subtract,


multiply, and divide numeric values. The result of the operation is also a numeric
value. Some of these operators are also used in date arithmetic.Table 3-2 lists
arithmetic operators.
Table 3-2 Arithmetic Operators

Operator  Purpose                                                Example
+, -      When these denote a positive or negative expression,  SELECT * FROM orders
          they are unary operators.                              WHERE qtysold = -1;

                                                                 SELECT * FROM emp
                                                                 WHERE -sal < 0;

          When they add or subtract, they are binary operators. SELECT sal + comm FROM emp
                                                                 WHERE SYSDATE - hiredate > 365;

*, /      Multiply, divide. These are binary operators.         UPDATE emp
                                                                 SET sal = sal * 1.1;

Do not use two consecutive minus signs (--) in arithmetic expressions to indicate
double negation or the subtraction of a negative value. The characters -- are used to
begin comments within SQL statements. You should separate consecutive minus
signs with a space or a parenthesis.
See Also: "Comments" for more information on comments within SQL
statements

Concatenation Operator

The concatenation operator manipulates character strings. Table 3-3 describes the
concatenation operator.
Table 3-3 Concatenation Operator

Operator  Purpose                          Example
||        Concatenates character strings.  SELECT 'Name is ' || ename
                                           FROM emp;

The result of concatenating two character strings is another character string. If both
character strings are of datatype CHAR, the result has datatype CHAR and is limited to
2000 characters. If either string is of datatype VARCHAR2, the result has
datatype VARCHAR2 and is limited to 4000 characters. Trailing blanks in character
strings are preserved by concatenation, regardless of the strings' datatypes.

On most platforms, the concatenation operator is two solid vertical bars, as shown
in Table 3-3. However, some IBM platforms use broken vertical bars for this
operator. When moving SQL script files between systems having different
character sets, such as between ASCII and EBCDIC, vertical bars might not be
translated into the vertical bar required by the target Oracle environment. Oracle
provides the CONCAT character function as an alternative to the vertical bar operator
for cases when it is difficult or impossible to control translation performed by
operating system or network utilities. Use this function in applications that will be
moved between environments with differing character sets.
Although Oracle treats zero-length character strings as nulls, concatenating a
zero-length character string with another operand always results in the other
operand, so null can result only from the concatenation of two null strings.
However, this may not continue to be true in future versions of Oracle. To
concatenate an expression that might be null, use the NVL function to explicitly
convert the expression to a zero-length string.

Q5. a.) Should a user be allowed to enter null values for the primary key? Give
reasons for your answer.
b.) What is Data Independence? Explain Logical data independence and Physical data
independence.

Ans a.) No. A primary key value should never be null: the primary key uniquely
identifies each row, so every row must have a value in it, and that value must be
unique.
A primary key is a field or set of fields in your table that provide Microsoft
Office Access 2007 with a unique identifier for every row. In a relational
database, such as an Office Access 2007 database, you divide your information into
separate, subject-based tables. You then use table relationships and primary keys
to tell Access how to bring the information back together again. Access uses
primary key fields to quickly associate data from multiple tables and combine that
data in a meaningful way.
Often, a unique identification number, such as an ID number or a serial number or
code, serves as a primary key in a table. For example, you might have a Customers
table where each customer has a unique customer ID number. The customer ID field is
the primary key.
An example of a poor choice for a primary key would be a name or address. Both
contain information that might change over time.
Access ensures that every record has a value in the primary key field, and that the
value is always unique.
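A hedged Oracle illustration (the table is invented for the example): declaring a
column as PRIMARY KEY implicitly makes it NOT NULL, so an attempt to enter a null
key is rejected.

CREATE TABLE customers (
    cust_id NUMBER PRIMARY KEY,   -- PRIMARY KEY implies NOT NULL and UNIQUE
    name    VARCHAR2(100)
);

INSERT INTO customers (cust_id, name) VALUES (NULL, 'Smith');
-- fails with ORA-01400: cannot insert NULL into the CUST_ID column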

b.) Data independence is the type of data transparency that matters for a
centralized DBMS. It refers to the immunity of user applications to changes in the
definition and organization of data.
Physical data independence deals with hiding the details of the storage structure
from user applications. The application should not be involved with these issues,
since conceptually there is no difference in the operations carried out against the
data. Data independence and operation independence together give the feature of
data abstraction. The physical structure of the data is referred to as the
"physical data description". There are three types of data independence:
1. Logical data independence: The ability to change the logical (conceptual) schema
without changing the external schema (user view) is called logical data
independence. For example, the addition or removal of new entities, attributes, or
relationships to the conceptual schema should be possible without having to change
existing external schemas or having to rewrite existing application programs.
2. Physical data independence: The ability to change the physical schema without
changing the logical schema is called physical data independence. For example, a
change to the internal schema, such as using a different file organization or
storage structure, storage devices, or indexing strategy, should be possible
without having to change the conceptual or external schemas.
3. View level data independence: always independent, with no effect, because there
does not exist any other level above the view level.

Data Independence Types

Data independence has two types: physical independence and logical independence.
Data independence can be explained as follows: each higher level of the data
architecture is immune to changes of the next lower level of the architecture.

Physical Independence
The logical schema stays unchanged even though the storage space or type of some
data is changed for reasons of optimization or reorganization; the external schema
does not change, though internal schema changes may be required when the physical
schema is reorganized. Physical data independence is present in most databases and
file environments, in which hardware storage encoding, the exact location of data
on disk, the merging of records, and so on are hidden from the user.
One of the biggest advantages of a database is data independence. It means we can
change the schema at one level without affecting the data at another level, i.e. we
can change the structure of a database without affecting the data required by users
and programs. This feature was not available in the file-oriented approach.

CASE STUDY - I

Questions:
Q1. Design the database system for Laxmi Cycles.
Ans A dynamic and RESULTS oriented Talent Manager with over Four years of
recruiting experience across Information Technology. High energy with great
relationship, team building, leadership, and communication skills.

My focus is to understand both the client and candidate's unique requirements and
leverage their expertise and knowledge to align their search to specific corporate
cultures. I find the right professional talent for my client's business, as I understand,
anticipate and fulfill both sides of the business transaction.
Q2. Draw the corresponding ER Diagram for the above

Ans Draw the corresponding ERD for the following data structure:

[Figure: ERD first draft]

Products(ProID, Descrip, Cost, Price, CatID)
People(ID, FName, LName, Phone)
Clients(CID, CreditLimit)
Employees(EID, DOH, DOB, SupervisorID)
ClientAddress(CLID, CID, Street1, Street2, City, State, Zip)
PriceHistory(PrID, ProductID, Price, StartDate, EndDate)
InvDetail(InvNum, DetID, Qty, ProdID, UnitPrice, Discount)
InvHeader(InvNum, Date, ClientID, CliAddrID, SalesPersonID, Memo)
Categories(ID, Description)

Underline stands for primary key.
Italic stands for foreign key.
Q3. Write a brief description of how the database could be
enhanced to further improve the management of the business.
Ans The quality of data contained in an EHR is dependent on accurate information
at the point of capture: the data source. Clinical documentation also plays a key
role in data quality. Clinical documentation practices need to be developed and
standardized to facilitate accurate data capture and encoding. In an EHR, it is
imperative these content standards are built into the fiber of decision-making
screens, templates, drop-down lists and other tools for documentation.
Additionally, establishing consistent data models will assure the integrity and
quality of the data maintained in the EHR. Standardization of data definitions and
structure for clinical content (including smart text), and quality checkpoints
along with traditional auditing procedures, help ensure quality data is captured.
Productivity and effectiveness of new tools such as natural language processing
(NLP) and computer-assisted coding (CAC) can be enhanced when these controls are in
place.
CASE STUDY-II

Questions:
Q1. Design a database that stores the Cab service companys
information. Identify the entities
of interest and show their attributes.

Ans Data analysis is concerned with the NATURE and USE of data. It involves the
identification of the data elements which are needed to support the data processing
system of the organization, the placing of these elements into logical groups, and
the definition of the relationships between the resulting groups.
Other approaches, e.g. DFDs and flowcharts, have been concerned with the flow of
data (dataflow methodologies). Data analysis is one of several data-structure-based
methodologies; Jackson SP/D is another.

Systems analysts often, in practice, go directly from fact finding to
implementation-dependent data analysis. Their assumptions about the usage of,
properties of, and relationships between data elements are embodied directly in
record and file designs and computer procedure specifications. The introduction of
Database Management Systems (DBMS) has encouraged a higher level of analysis, where
the data elements are defined by a logical model or 'schema' (conceptual schema).
When discussing the schema in the context of a DBMS, the effect of alternative
designs on the efficiency or ease of implementation is considered, i.e. the
analysis is still somewhat implementation dependent. If we consider the data
relationships, usages and properties that are important to the business, without
regard to their representation in a particular computerised system using particular
software, we have what we are concerned with: implementation-independent data
analysis.
Q2.
What relationships exist among these entities? Explain. Draw the corres
ponding ER Diagram.
Ans A data entity is anything real or abstract about which we want to store data.
Entity types fall into five classes: roles, events, locations, tangible things or
concepts, e.g. employee, payment, campus, book. Specific examples of an entity are
called instances, e.g. the employee John Jones, Mary Smith's payment, etc.
Relationship
A data relationship is a natural association that exists between one or more
entities, e.g. employees process payments. Cardinality defines the number of
occurrences of one entity for a single occurrence of the related entity, e.g. an
employee may process many payments but might not process any payments, depending on
the nature of her job.
Attribute
A data attribute is a characteristic common to all or most instances of a
particular entity. Synonyms include property, data element and field, e.g. Name,
Address, Employee Number and pay rate are all attributes of the entity employee. An
attribute or combination of attributes that uniquely identifies one and only one
instance of an entity is called a primary key or identifier, e.g. Employee Number
is a primary key for Employee.

AN ENTITY RELATIONSHIP DIAGRAM METHODOLOGY: (One way of doing it)

1. Identify Entities: Identify the roles, events, locations, tangible things or
concepts about which the users want to store data.
2. Find Relationships: Find the natural associations between pairs of entities
using a relationship matrix.
3. Draw Rough ERD: Put entities in rectangles and relationships on line segments
connecting the entities.
4. Fill in Cardinality: Determine the number of occurrences of one entity for a
single occurrence of the related entity.
5. Define Primary Keys: Identify the data attribute(s) that uniquely identify one
and only one occurrence of each entity.
6. Draw Key-Based ERD: Eliminate many-to-many relationships and include primary and
foreign keys in each entity.
7. Identify Attributes: Name the information details (fields) which are essential
to the system under development.
8. Map Attributes: For each attribute, match it with exactly one entity that it
describes.
9. Draw Fully Attributed ERD: Adjust the ERD from step 6 to account for entities or
relationships discovered in step 8.
10. Check Results: Does the final entity relationship diagram accurately depict the
system data?

Q3. Write a brief description of how the database could be


enhanced to further improve management of the business.

Ans The ability to share electronic health information both internally and
externally with healthcare organizations has been accepted as a method to improve
the quality and delivery of care.[2] Data integrity is critical to meeting these
expectations. A single error in an electronic environment presents a risk that can
be magnified as the data transmits further downstream to data sets, interfaced
systems, and data warehouses.[3] Accurate data leads to quality information that is
required for quality decision making and patient care.

Q4. In your opinion, is it advisable for the Cab service company to go


for an in-house development of DBMS? Give reasons for your answer.
Ans A database system is a collection of interrelated data together with a set of
programs to access the data; it is often simply called a database. The primary goal
of such a system is to provide an environment that is both convenient and efficient
to use in retrieving and storing information.
A database management system (DBMS) is designed to manage a large body of
information. Data management involves both defining structures for storing
information and providing mechanisms for manipulating the information. In addition,
the database system must provide for the safety of the stored information, despite
system crashes or attempts at unauthorized access. If data are to be shared among
several users, the system must avoid possible anomalous results due to multiple
users concurrently accessing the same data.
Examples of the use of database systems include airline reservation systems,
company payroll and employee information systems, banking systems, credit card
processing systems, and sales and order tracking systems.
A major purpose of a database system is to provide users with an abstract view of
the data. That is, the system hides certain details of how the data are stored and
maintained. Thereby, data can be stored in complex data structures that permit
efficient retrieval, yet users see a simplified and easy-to-use view of the data.
The lowest level of abstraction, the physical level, describes how the data are
actually stored and details the data structures. The next-higher level of
abstraction, the logical level, describes what data are stored, and what
relationships exist among those data. The highest level of abstraction, the view
level, describes parts of the database that are relevant to each user; application
programs used to access a database form part of the view level.
