
MANILA ADVENTIST COLLEGE

College of Business
1975 Donada St., Pasay City

INDEPENDENT STUDY
In
Database Theory and Application

Submitted by:
EMMANUEL F. ESCULLAR

Submitted to:
JOEL P. ALABATA
1. History of Database
The sizes, capabilities, and performance of databases and their respective DBMSs have grown in
orders of magnitude. These performance increases were enabled by the technology progress in
the areas of processors, computer memory, computer storage, and computer networks. The
development of database technology can be divided into three eras based on data model or
structure: navigational,[8] SQL/relational, and post-relational.
The two main early navigational data models were the hierarchical model and
the CODASYL model (network model).
The relational model, first proposed in 1970 by Edgar F. Codd, departed from this tradition by
insisting that applications should search for data by content, rather than by following links. The
relational model employs sets of ledger-style tables, each used for a different type of entity. Only
in the mid-1980s did computing hardware become powerful enough to allow the wide
deployment of relational systems (DBMSs plus applications). By the early 1990s, however,
relational systems dominated in all large-scale processing applications, and as of 2018 they
remain dominant: IBM DB2, Oracle, MySQL, and Microsoft SQL Server are the most
searched DBMSs.[9] The dominant database language, standardized SQL for the relational model,
has influenced database languages for other data models.[citation needed]
Object databases were developed in the 1980s to overcome the inconvenience of object-
relational impedance mismatch, which led to the coining of the term "post-relational" and also
the development of hybrid object-relational databases.
The next generation of post-relational databases in the late 2000s became known
as NoSQL databases, introducing fast key-value stores and document-oriented databases. A
competing "next generation" known as NewSQL databases attempted new implementations that
retained the relational/SQL model while aiming to match the high performance of NoSQL
compared to commercially available relational DBMSs.

1960s, navigational DBMS
Further information: Navigational database


Basic structure of the navigational CODASYL database model
The introduction of the term database coincided with the availability of direct-access storage
(disks and drums) from the mid-1960s onwards. The term represented a contrast with the tape-
based systems of the past, allowing shared interactive use rather than daily batch processing.
The Oxford English Dictionary cites a 1962 report by the System Development Corporation of
California as the first to use the term "data-base" in a specific technical sense.[10]
As computers grew in speed and capability, a number of general-purpose database systems
emerged; by the mid-1960s a number of such systems had come into commercial use. Interest in
a standard began to grow, and Charles Bachman, author of one such product, the Integrated Data
Store (IDS), founded the "Database Task Group" within CODASYL, the group responsible for
the creation and standardization of COBOL. In 1971, the Database Task Group delivered their
standard, which generally became known as the "CODASYL approach", and soon a number of
commercial products based on this approach entered the market.
The CODASYL approach relied on the "manual" navigation of a linked data set which was
formed into a large network. Applications could find records by one of three methods:
1. Use of a primary key (known as a CALC key, typically implemented by hashing)
2. Navigating relationships (called sets) from one record to another
3. Scanning all the records in a sequential order
Later systems added B-trees to provide alternate access paths. Many CODASYL databases also
added a very straightforward query language. However, in the final tally, CODASYL was very
complex and required significant training and effort to produce useful applications.
IBM also had their own DBMS in 1966, known as Information Management System (IMS). IMS
was a development of software written for the Apollo program on the System/360. IMS was
generally similar in concept to CODASYL, but used a strict hierarchy for its model of data
navigation instead of CODASYL's network model. Both concepts later became known as
navigational databases due to the way data was accessed, and Bachman's 1973 Turing
Award presentation was The Programmer as Navigator. IMS is classified by IBM as
a hierarchical database. IDMS and Cincom Systems' TOTAL database are classified as network
databases. IMS remains in use as of 2014.[11]
1970s, relational DBMS
Edgar Codd worked at IBM in San Jose, California, in one of their offshoot offices that was
primarily involved in the development of hard disk systems. He was unhappy with the
navigational model of the CODASYL approach, notably the lack of a "search" facility. In 1970,
he wrote a number of papers that outlined a new approach to database construction that
eventually culminated in the groundbreaking A Relational Model of Data for Large Shared Data
Banks.[12]
In this paper, he described a new system for storing and working with large databases. Instead of
records being stored in some sort of linked list of free-form records as in CODASYL, Codd's
idea was to use a "table" of fixed-length records, with each table used for a different type of
entity. A linked-list system would be very inefficient when storing "sparse" databases where
some of the data for any one record could be left empty. The relational model solved this by
splitting the data into a series of normalized tables (or relations), with optional elements being
moved out of the main table to where they would take up room only if needed. Data may be
freely inserted, deleted and edited in these tables, with the DBMS doing whatever maintenance
needed to present a table view to the application/user.
In the relational model, records are "linked" using virtual keys not stored in the database but
defined as needed between the data contained in the records.
The relational model also allowed the content of the database to evolve without constant
rewriting of links and pointers. The relational part comes from entities referencing other entities
in what is known as one-to-many relationship, like a traditional hierarchical model, and many-to-
many relationship, like a navigational (network) model. Thus, a relational model can express
both hierarchical and navigational models, as well as its native tabular model, allowing for pure
or combined modeling in terms of these three models, as the application requires.
For instance, a common use of a database system is to track information about users, their name,
login information, various addresses and phone numbers. In the navigational approach, all of this
data would be placed in a single record, and unused items would simply not be placed in the
database. In the relational approach, the data would be normalized into a user table, an address
table and a phone number table (for instance). Records would be created in these optional tables
only if the address or phone numbers were actually provided.
Linking the information back together is the key to this system. In the relational model, some bit
of information was used as a "key", uniquely defining a particular record. When information was
being collected about a user, information stored in the optional tables would be found by
searching for this key. For instance, if the login name of a user is unique, addresses and phone
numbers for that user would be recorded with the login name as its key. This simple "re-linking"
of related data back into a single collection is something that traditional computer languages are
not designed for.
Just as the navigational approach would require programs to loop in order to collect records, the
relational approach would require loops to collect information about any one record. Codd's
suggestion was a set-oriented language, which would later spawn the ubiquitous SQL. Using a
branch of mathematics known as tuple calculus, he demonstrated that such a system could
support all the operations of normal databases (inserting, updating etc.) as well as providing a
simple system for finding and returning sets of data in a single operation.
Codd's paper was picked up by two people at Berkeley, Eugene Wong and Michael Stonebraker.
They started a project known as INGRES using funding that had already been allocated for a
geographical database project and student programmers to produce code. Beginning in 1973,
INGRES delivered its first test products which were generally ready for widespread use in 1979.
INGRES was similar to System R in a number of ways, including the use of a "language"
for data access, known as QUEL. Over time, INGRES moved to the emerging SQL standard.
IBM itself did one test implementation of the relational model, PRTV, and a production
one, Business System 12, both now discontinued. Honeywell wrote MRDS for Multics, and now
there are two new implementations: Alphora Dataphor and Rel. Most other DBMS
implementations usually called relational are actually SQL DBMSs.
In 1970, the University of Michigan began development of the MICRO Information
Management System[13] based on D.L. Childs' Set-Theoretic Data model.[14][15][16] MICRO was
used to manage very large data sets by the US Department of Labor, the U.S. Environmental
Protection Agency, and researchers from the University of Alberta, the University of Michigan,
and Wayne State University. It ran on IBM mainframe computers using the Michigan Terminal
System.[17] The system remained in production until 1998.
Integrated approach
Main article: Database machine

In the 1970s and 1980s, attempts were made to build database systems with integrated hardware
and software. The underlying philosophy was that such integration would provide higher
performance at lower cost. Examples were IBM System/38, the early offering of Teradata, and
the Britton Lee, Inc. database machine.
Another approach to hardware support for database management was ICL's CAFS accelerator, a
hardware disk controller with programmable search capabilities. In the long term, these efforts
were generally unsuccessful because specialized database machines could not keep pace with the
rapid development and progress of general-purpose computers. Thus most database systems
nowadays are software systems running on general-purpose hardware, using general-purpose
computer data storage. However, this idea is still pursued for certain applications by some
companies like Netezza and Oracle (Exadata).
Late 1970s, SQL DBMS
IBM started working on a prototype system loosely based on Codd's concepts as System R in the
early 1970s. The first version was ready in 1974/5, and work then started on multi-table systems
in which the data could be split so that all of the data for a record (some of which is optional) did
not have to be stored in a single large "chunk". Subsequent multi-user versions were tested by
customers in 1978 and 1979, by which time a standardized query language – SQL[citation needed] –
had been added. Codd's ideas were establishing themselves as both workable and superior to
CODASYL, pushing IBM to develop a true production version of System R, known as SQL/DS,
and, later, Database 2 (DB2).
Larry Ellison's Oracle Database (or more simply, Oracle) started from a different chain, based on
IBM's papers on System R. Though Oracle V1 implementations were completed in 1978, it
wasn't until Oracle Version 2, in 1979, that Ellison beat IBM to market.[18]
Stonebraker went on to apply the lessons from INGRES to develop a new database, Postgres,
which is now known as PostgreSQL. PostgreSQL is often used for global mission critical
applications (the .org and .info domain name registries use it as their primary data store, as do
many large companies and financial institutions).
In Sweden, Codd's paper was also read and Mimer SQL was developed from the mid-1970s
at Uppsala University. In 1984, this project was consolidated into an independent enterprise.
Another data model, the entity–relationship model, emerged in 1976 and gained popularity
for database design as it emphasized a more familiar description than the earlier relational model.
Later on, entity–relationship constructs were retrofitted as a data modeling construct for the
relational model, and the difference between the two has become irrelevant.[citation needed]
1980s, on the desktop
The 1980s ushered in the age of desktop computing. The new computers empowered their users
with spreadsheets like Lotus 1-2-3 and database software like dBASE. The dBASE product was
lightweight and easy for any computer user to understand out of the box. C. Wayne Ratliff, the
creator of dBASE, stated: "dBASE was different from programs like BASIC, C, FORTRAN, and
COBOL in that a lot of the dirty work had already been done. The data manipulation is done by
dBASE instead of by the user, so the user can concentrate on what he is doing, rather than having
to mess with the dirty details of opening, reading, and closing files, and managing space
allocation."[19] dBASE was one of the top selling software titles in the 1980s and early 1990s.
1990s, object-oriented
The 1990s, along with a rise in object-oriented programming, saw a growth in how data in
various databases were handled. Programmers and designers began to treat the data in their
databases as objects. That is to say that if a person's data were in a database, that person's
attributes, such as their address, phone number, and age, were now considered to belong to that
person instead of being extraneous data. This allows for relations between data to be relations to
objects and their attributes and not to individual fields.[20] The term "object-relational impedance
mismatch" described the inconvenience of translating between programmed objects and database
tables. Object databases and object-relational databases attempt to solve this problem by
providing an object-oriented language (sometimes as extensions to SQL) that programmers can
use as an alternative to purely relational SQL. On the programming side, libraries known as object-
relational mappings (ORMs) attempt to solve the same problem.
2000s, NoSQL and NewSQL
Main articles: NoSQL and NewSQL

XML databases are a type of structured document-oriented database that allows querying based
on XML document attributes. XML databases are mostly used in applications where the data is
conveniently viewed as a collection of documents, with a structure that can vary from the very
flexible to the highly rigid: examples include scientific articles, patents, tax filings, and personnel
records.
NoSQL databases are often very fast, do not require fixed table schemas, avoid join operations
by storing denormalized data, and are designed to scale horizontally.
In recent years, there has been a strong demand for massively distributed databases with high
partition tolerance, but according to the CAP theorem it is impossible for a distributed system to
simultaneously provide consistency, availability, and partition tolerance guarantees. A
distributed system can satisfy any two of these guarantees at the same time, but not all three. For
that reason, many NoSQL databases are using what is called eventual consistency to provide
both availability and partition tolerance guarantees with a reduced level of data consistency.
NewSQL is a class of modern relational databases that aims to provide the same scalable
performance of NoSQL systems for online transaction processing (read-write) workloads while
still using SQL and maintaining the ACID guarantees of a traditional database system.

2. Database Concepts
a. What is a database?
A database is an organized collection of data, generally stored and
accessed electronically from a computer system. Where databases are more
complex they are often developed using formal design and modeling techniques.

b. What are the uses of a database?


Databases are used to support internal operations of organizations and to
underpin online interactions with customers and suppliers (see Enterprise
software).
Databases are used to hold administrative information and more
specialized data, such as engineering data or economic models. Examples include
computerized library systems, flight reservation systems, computerized parts
inventory systems, and many content management systems that store websites as
collections of webpages in a database.

c. What is the purpose of a database?


The purpose of a database is to store and retrieve information in a way that
is accurate and effective.

d. Why do you need a database?
A database is an organized collection of various forms of data. It is also known as a structured set of data that is accessible in many ways through your computer.
The various reasons for which we require databases are:
To manage large chunks of data: You can store data in a spreadsheet, but if you add large chunks of data to the sheet, it will simply not work. For instance, if the size of your data increases into thousands of records, speed simply becomes a problem.
Accuracy: When doing data entry in a spreadsheet, it becomes difficult to maintain accuracy, as there are no validations present in it.

Ease of updating data: With the database, you can flexibly update the data
according to your convenience. Moreover, multiple people can also edit data at the same time.

Security of data: There is no denying the fact that your data is less secure in
spreadsheets. Anyone can easily get access to the file and make changes to it.
With databases you have security groups and privileges you set to restrict access.

Data integrity: Data integrity also becomes a question when storing data in
spreadsheets. In databases, you can be assured of accuracy and consistency of
data due to the built-in integrity checks and access controls.
e. Advantages of a database
Reduced data redundancy
Reduced updating errors and increased consistency
Greater data integrity and independence from applications programs
Improved data access to users through use of host and query languages
Improved data security
Reduced data entry, storage, and retrieval costs
Facilitated development of new application programs

f. Disadvantages of a database
Database systems are complex, difficult, and time-consuming to design
Substantial hardware and software start-up costs
Damage to database affects virtually all applications programs
Extensive conversion costs in moving from a file-based system
to a database system
Initial training required for all programmers and users

3. Tables
a. What is a table in a database?
A table is a collection of related data held in a table format within a database. It consists of columns and rows.

b. What are the uses of tables?


The main purpose of tables is to store data in an organized way that allows a company to achieve its objectives. One database almost always contains multiple tables, each representing an entity (for instance Customer, Product, Order), and these tables relate to each other: for instance, a Customer buys a Product and receives a unique Order Number.
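
As a small sketch of this idea (hypothetical names, using Python's built-in sqlite3), each entity becomes its own table, and the order number relates them:

```python
import sqlite3

# Hypothetical schema: one table per entity, related by a unique order number.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Product  (product_id  INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE "Order"  (order_id    INTEGER PRIMARY KEY,  -- the unique Order Number
                           customer_id INTEGER REFERENCES Customer,
                           product_id  INTEGER REFERENCES Product);
""")
conn.execute("INSERT INTO Customer VALUES (1, 'Acme Corp.')")
conn.execute("INSERT INTO Product  VALUES (7, 'Widget', 2.50)")
conn.execute('INSERT INTO "Order"  VALUES (1001, 1, 7)')  # order 1001 relates the two
```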

c. How important is the table in the database?


Database tables are like drawers in a kitchen. If well laid out, you will
have a place for everything and those drawers will be placed to make
cooking and cleaning up a pleasure. Typically, each drawer or cupboard
contains a single type of item: silverware, pots and pans, plates, glasses
etc. Databases are organized the same way. Each table contains only a
single type of info. And great databases, like kitchens, have plenty of
tables (‘drawers’) that are carefully designed for efficient workflow and a
great user experience.

4. Database relationships.

a. One-to-one: This type of relationship allows only one record on each side of the
relationship. The primary key relates to only one record – or none – in another
table. For example, in a marriage, each spouse has only one other spouse. This
kind of relationship can be implemented in a single table and therefore does not
use a foreign key.

b. One-to-many: A one-to-many relationship allows a single record in one table to be related to multiple records in another table. Consider a business with a database that has Customers and Orders tables. A single customer can purchase
database that has Customers and Orders tables. A single customer can purchase
multiple orders, but a single order could not be linked to multiple customers.
Therefore the Orders table would contain a foreign key that matched the primary
key of the Customers table, while the Customers table would have no foreign key
pointing to the Orders table.

c. Many-to-many: This is a complex relationship in which many records in a table
can link to many records in another table. For example, our business probably
needs not only Customers and Orders tables, but likely also needs a Products
table.
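
A common way to implement a many-to-many relationship, sketched here with Python's sqlite3 and hypothetical names, is a junction table whose rows pair a foreign key from each side:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Products  (id INTEGER PRIMARY KEY, name TEXT);

    -- Junction table: each row links one customer to one product, so many
    -- customers can be linked to many products and vice versa.
    CREATE TABLE Orders (
        customer_id INTEGER REFERENCES Customers(id),
        product_id  INTEGER REFERENCES Products(id),
        PRIMARY KEY (customer_id, product_id)
    );

    INSERT INTO Customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO Products  VALUES (10, 'Lamp'), (11, 'Desk');
    INSERT INTO Orders    VALUES (1, 10), (1, 11), (2, 10);  -- many-to-many
""")
```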

5. Database use

a. Database languages: Database languages are special-purpose languages, which allow one or more of the following tasks, sometimes distinguished as sublanguages:

 Data control language (DCL) – controls access to data;
 Data definition language (DDL) – defines data types and the relationships among them, through operations such as creating, altering, or dropping;
 Data manipulation language (DML) – performs tasks such as inserting, updating, or deleting data occurrences;
 Data query language (DQL) – allows searching for information and computing derived information.
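
The following sketch, again using Python's built-in sqlite3, shows an illustrative statement from each sublanguage that SQLite supports (SQLite has no user accounts, so DCL is only noted in a comment):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define a table, its data types, and a constraint.
conn.execute("""
    CREATE TABLE car (
        id     INTEGER PRIMARY KEY,
        engine TEXT CHECK (engine IN ('petrol', 'diesel', 'electric'))
    )
""")

# DML: insert and update data occurrences.
conn.execute("INSERT INTO car (engine) VALUES ('diesel')")
conn.execute("UPDATE car SET engine = 'electric' WHERE id = 1")

# DQL: search for information and compute derived information.
count, = conn.execute("SELECT COUNT(*) FROM car").fetchone()
print(count)  # 1

# DCL (e.g. GRANT/REVOKE) controls access; SQLite has no user accounts,
# so a server DBMS such as PostgreSQL would be needed to demonstrate it.
```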

Database languages are specific to a particular data model. Notable examples include:

SQL combines the roles of data definition, data manipulation, and query in a single language. It
was one of the first commercial languages for the relational model, although it departs in some
respects from the relational model as described by Codd (for example, the rows and columns of a
table can be ordered). SQL became a standard of the American National Standards
Institute (ANSI) in 1986, and of the International Organization for Standardization (ISO) in
1987. The standard has been regularly enhanced since then and is supported (with varying degrees of conformance) by all mainstream commercial relational DBMSs.[29][30]

 OQL is an object model language standard (from the Object Data Management Group). It
has influenced the design of some of the newer query languages like JDOQL and EJB
QL.
 XQuery is a standard XML query language implemented by XML database systems such
as MarkLogic and eXist, by relational databases with XML capability such as Oracle and
DB2, and also by in-memory XML processors such as Saxon.
 SQL/XML combines XQuery with SQL.[31]

A database language may also incorporate features like:

 DBMS-specific configuration and storage engine management
 Computations to modify query results, like counting, summing, averaging, sorting,
grouping, and cross-referencing
 Constraint enforcement (e.g. in an automotive database, only allowing one engine type
per car)
 Application programming interface version of the query language, for programmer
convenience

b. Database security: Deals with all the various aspects of protecting the database
content, its owners, and its users. It ranges from protection from intentional
unauthorized database uses to unintentional database accesses by unauthorized
entities (e.g., a person or a computer program).
Data security in general deals with protecting specific chunks of data, both physically (i.e., from corruption, destruction, or removal; e.g., see physical security) and in terms of their interpretation: preventing them, or parts of them, from being turned into meaningful information (e.g., by looking at the strings of bits that they comprise and concluding specific valid credit-card numbers; e.g., see data encryption).

c. Database design: is the organization of data according to a database model. The designer determines what data must be stored and how the data elements
interrelate. With this information, they can begin to fit the data to the database
model.
Database design involves classifying data and identifying
interrelationships. This theoretical representation of the data is called an ontology.
The ontology is the theory behind the database's design.

d. Database programming: is the heart of a business information system and provides file creation, data entry, update, query and reporting functions. The traditional term for database software is "database management system" (see DBMS). For more about database structures, see DBMS, field, record, file, database and database schema.
e. Database management: refers to the actions a business takes to manipulate and
control data to meet necessary conditions throughout the entire data lifecycle.

6. Data warehouse
Is a relational database that is designed for query and analysis rather than for
transaction processing. It usually contains historical data derived from transaction data,
but it can include data from other sources. It separates analysis workload from transaction
workload and enables an organization to consolidate data from several sources.
In addition to a relational database, a data warehouse environment includes an
extraction, transportation, transformation, and loading (ETL) solution, an online
analytical processing (OLAP) engine, client analysis tools, and other applications that
manage the process of gathering data and delivering it to business users.

a. Creating the data warehouse

Creating a data warehouse involves the following steps:
Step 1: Define the Processes
Step 2: Define the Data Sources
Step 3: Define the Dimensions
Step 4: Define the Grain
Step 5: Create a Star Schema
Step 6: Specify Data Points
Step 7: Select the Columns

b. Using the data warehouse


c. People
d. Product
7. Database management systems
DBMS is system software for creating and managing databases. The DBMS provides
users and programmers with a systematic way to create, retrieve, update and manage
data. A DBMS makes it possible for end users to create, read, update and delete data in a
database. The DBMS essentially serves as an interface between the database and end
users or application programs, ensuring that data is consistently organized and remains
easily accessible.
a. Concepts
Atomicity
Consistency
Isolation
Durability
b. Objectives
Data availability – Data availability concerns making the database available to users with acceptable cost and performance for queries and updates. Availability functions make the database available to users and help in defining and creating a database and getting the data in and out of it.

Data integrity – Data integrity provides protection for the existence of the database and maintains the quality of the data in it.

Data independence – A DBMS provides two types of data independence. The first is physical data independence: programs remain unaffected by changes in the storage structure or access method. The second is logical data independence: programs remain unaffected by changes in the schema.
c. Components

The database management system can be divided into five major components:
Hardware – When we say hardware, we mean the computer, hard disks, I/O channels for data, and any other physical component involved before any data is successfully stored in memory.

When we run Oracle or MySQL on our personal computer, our computer's hard disk, the keyboard with which we type in all the commands, and the computer's RAM and ROM all become part of the DBMS hardware.
Software
Data
Procedures
Database Access Language

d. Functions:
There are several functions that a DBMS performs to ensure data integrity
and consistency of data in the database. The ten functions in the DBMS are: data
dictionary management, data storage management, data transformation and
presentation, security management, multiuser access control, backup and recovery
management, data integrity management, database access languages and
application programming interfaces, database communication interfaces, and
transaction management.

Data Dictionary Management: is where the DBMS stores definitions of the data elements and their relationships (metadata). The DBMS uses this
function to look up the required data component structures and
relationships. When programs access data in a database they are basically
going through the DBMS. This function removes structural and data
dependency and provides the user with data abstraction. In turn, this
makes things a lot easier on the end user. The Data Dictionary is often
hidden from the user and is used by Database Administrators and
Programmers.

Data Storage Management: This particular function is used for the storage of data and any related data entry forms or screen definitions,
report definitions, data validation rules, procedural code, and structures
that can handle video and picture formats. Users do not need to know how
data is stored or manipulated. Also involved with this structure is a term
called performance tuning that relates to a database’s efficiency in relation
to storage and access speed.
Data Transformation and Presentation: This function exists to transform any data entered into required data structures. By using the data
transformation and presentation function the DBMS can determine the
difference between logical and physical data formats.

Security Management: This is one of the most important functions in the DBMS. Security management sets rules that determine which specific users are allowed to access the database. Users are given a username and password, or are sometimes authenticated biometrically (such as by a fingerprint or retina scan), but these types of authentication tend to be more costly. This function also sets restrictions on what specific data any user can see or manage.

Multiuser Access Control: Data integrity and data consistency are the
basis of this function. Multiuser access control is a very useful tool in a DBMS; it enables multiple users to access the database simultaneously
without affecting the integrity of the database.

Backup and Recovery Management: Backup and recovery come to mind whenever there are potential outside threats to a database. For example, if there is a power outage, recovery management determines how long it
takes to recover the database after the outage. Backup management refers
to the data safety and integrity; for example backing up all your mp3 files
on a disk.

Data Integrity Management: The DBMS enforces integrity rules to minimize things such as data redundancy, which is when data is stored in more than one place unnecessarily, and to maximize data consistency, making sure the database returns the correct, same answer each time for the same question asked.
Database Access Languages and Application Programming
Interfaces: A query language is a nonprocedural language. An example of
this is SQL (structured query language). SQL is the most common query
language supported by the majority of DBMS vendors. The use of this
language makes it easy for users to specify what they want done without
the headache of explaining how to specifically do it.

Database Communication Interfaces: This refers to how a DBMS can accept different end-user requests through different network environments.
An example of this can be easily related to the internet. A DBMS can
provide access to the database using the Internet through Web Browsers
(Mozilla Firefox, Internet Explorer, Netscape).

Transaction Management: This refers to how a DBMS must supply a method that will guarantee that all the updates in a given transaction are made or not made. All transactions must follow what is called the ACID properties.

A – Atomicity: states that a transaction is an indivisible unit that is either performed as a whole, not by its parts, or not performed at all. It is the responsibility of recovery management to make sure this takes place.

C – Consistency: A transaction must take the database from one consistent state to another consistent state.

I – Isolation: Transactions must be executed independently of one another. Part of a transaction in progress should not be visible to another transaction.

D – Durability: A successfully completed transaction is recorded permanently in the database and must not be lost due to failures.
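
As a minimal sketch of atomicity and durability with Python's sqlite3 (the account names and the simulated failure are invented): either both halves of a transfer are committed, or the rollback leaves the database exactly as it was.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(conn, amount, fail=False):
    """Move money from alice to bob as one indivisible transaction."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'alice'",
                     (amount,))
        if fail:
            raise RuntimeError("simulated crash mid-transaction")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'bob'",
                     (amount,))
        conn.commit()    # durability: a committed transfer is permanent
    except RuntimeError:
        conn.rollback()  # atomicity: the half-done transfer is undone

transfer(conn, 50, fail=True)
print(conn.execute("SELECT * FROM account ORDER BY name").fetchall())
# [('alice', 100), ('bob', 0)] -- unchanged, as if the transaction never ran
```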

e. Database products

8. Database models

Models:
a. Network Model: The popularity of the network data model coincided with the
popularity of the hierarchical data model. Some data were more naturally modeled
with more than one parent per child. So, the network model permitted the
modeling of many-to-many relationships in data. In 1971, the Conference on Data
Systems Languages (CODASYL) formally defined the network model. The basic
data modeling construct in the network model is the set construct. A set consists
of an owner record type, a set name, and a member record type. A member record
type can have that role in more than one set, hence the multiparent concept is
supported. An owner record type can also be a member or owner in another set.
The data model is a simple network, and link and intersection record types (called
junction records by IDMS) may exist, as well as sets between them. Thus, the
complete network of relationships is represented by several pairwise sets; in each
set some (one) record type is owner (at the tail of the network arrow) and one or
more record types are members (at the head of the relationship arrow).

Usually, a set defines a 1:M relationship, although 1:1 is permitted. The CODASYL network model is based on mathematical set theory.

b. Hierarchical Model: The hierarchical data model organizes data in a tree structure. There is a hierarchy of parent and child data segments. This structure implies that a record can have repeating information, generally in the child data segments. Data is stored in a series of records, each of which has a set of field values attached to it. All the instances of a specific record are collected together as a record type. These record types are the equivalent of tables in the relational model, with the individual records being the equivalent of rows. To create links between these record types, the hierarchical model uses parent-child relationships, which are 1:N mappings between record types. These are defined using trees, a structure "borrowed" from mathematics much as the relational model borrowed set theory. For example, an
organization might store information about an employee, such as name, employee
number, department, salary. The organization might also store information about
an employee's children, such as name and date of birth. The employee and
children data forms a hierarchy, where the employee data represents the parent
segment and the children data represents the child segment. If an employee has
three children, then there would be three child segments associated with one
employee segment. In a hierarchical database the parent-child relationship is one
to many. This restricts a child segment to having only one parent segment.
Hierarchical DBMSs were popular from the late 1960s, with the introduction of
IBM's Information Management System (IMS) DBMS, through the 1970s.
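
A toy sketch of that hierarchy in plain Python (names and values invented): the employee is the parent segment, and each child record hangs off exactly one parent.

```python
# One employee (parent segment) with repeating child segments underneath.
employee = {
    "name": "J. Smith",
    "employee_no": 1001,
    "department": "Sales",
    "salary": 52000,
    "children": [  # child segments: a 1:N relationship under one parent
        {"name": "Ann", "born": "2010-04-01"},
        {"name": "Ben", "born": "2012-09-15"},
        {"name": "Cal", "born": "2015-06-30"},
    ],
}

# Navigation is strictly top-down: start at the parent, then walk its children.
for child in employee["children"]:
    print(employee["name"], "->", child["name"])
```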

c. Relational Model: (RDBMS - relational database management system) A
database based on the relational model developed by E.F. Codd. A relational
database allows the definition of data structures, storage and retrieval operations
and integrity constraints. In such a database the data and relations between them
are organised in tables. A table is a collection of records and each record in a table
contains the same fields.

Properties of Relational Tables:


# Values Are Atomic
# Each Row is Unique
# Column Values Are of the Same Kind
# The Sequence of Columns is Insignificant
# The Sequence of Rows is Insignificant
# Each Column Has a Unique Name

Certain fields may be designated as keys, which means that searches for specific
values of that field will use indexing to speed them up. Where fields in two
different tables take values from the same set, a join operation can be performed
to select related records in the two tables by matching values in those fields.
Often, but not always, the fields will have the same name in both tables. For
example, an "orders" table might contain (customer-ID, product-code) pairs and a
"products" table might contain (product-code, price) pairs so to calculate a given
customer's bill you would sum the prices of all products ordered by that customer
by joining on the product-code fields of the two tables. This can be extended to
joining multiple tables on multiple fields. Because these relationships are only
specified at retrieval time, relational databases are classed as dynamic database management systems. The relational database model is based on relational algebra.
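
Here is the bill calculation from the example above, sketched with Python's sqlite3 (the customer IDs, product codes, and prices are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders   (customer_id TEXT, product_code TEXT);
    CREATE TABLE products (product_code TEXT, price REAL);
    INSERT INTO orders   VALUES ('c1', 'p1'), ('c1', 'p2');
    INSERT INTO products VALUES ('p1', 9.99), ('p2', 5.00);
""")

# Join the tables on their shared product-code field and sum the prices.
bill, = conn.execute("""
    SELECT SUM(p.price)
    FROM orders o
    JOIN products p ON p.product_code = o.product_code
    WHERE o.customer_id = 'c1'
""").fetchone()
print(bill)  # 14.99
```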

d. Object-Oriented Model: Object DBMSs add database functionality to object
programming languages. They bring much more than persistent storage of
programming language objects. Object DBMSs extend the semantics of the C++,
Smalltalk and Java object programming languages to provide full-featured
database programming capability, while retaining native language compatibility.
A major benefit of this approach is the unification of the application and database
development into a seamless data model and language environment. As a result,
applications require less code, use more natural data modeling, and code bases are
easier to maintain. Object developers can write complete database applications
with a modest amount of additional effort.
According to Rao (1994), "The object-oriented database (OODB)
paradigm is the combination of object-oriented programming language (OOPL)
systems and persistent systems. The power of the OODB comes from the
seamless treatment of both persistent data, as found in databases, and transient
data, as found in executing programs." In contrast to a relational DBMS where a
complex data structure must be flattened out to fit into tables or joined together
from those tables to form the in-memory structure, object DBMSs have no
performance overhead to store or retrieve a web or hierarchy of interrelated
objects. This one-to-one mapping of object programming language objects to
database objects has two benefits over other storage approaches: it provides
higher performance management of objects, and it enables better management of
the complex interrelationships between objects. This makes object DBMSs better
suited to support applications such as financial portfolio risk analysis systems,
telecommunications service applications, world wide web document structures,
design and manufacturing systems, and hospital patient record systems, which
have complex relationships between data.

Other models

a. Inverted file model: A database built with the inverted file structure is designed
to facilitate fast full text searches. In this model, data content is indexed as a
series of keys in a lookup table, with the values pointing to the location of the
associated files. This structure can provide nearly instantaneous reporting in big
data and analytics, for instance.
This model has been used by the ADABAS database management system
of Software AG since 1970, and it is still supported today.

b. Flat model: The flat model is the earliest, simplest data model. It simply lists all
the data in a single table, consisting of columns and rows. In order to access or
manipulate the data, the computer has to read the entire flat file into memory,
which makes this model inefficient for all but the smallest data sets.

c. Multidimensional model: This is a variation of the relational model designed to facilitate improved analytical processing. While the relational model is optimized
facilitate improved analytical processing. While the relational model is optimized
for online transaction processing (OLTP), this model is designed for online
analytical processing (OLAP).
Each cell in a dimensional database contains data about the dimensions
tracked by the database. Visually, it’s like a collection of cubes, rather than two-
dimensional tables.
d. Semistructured model: In this model, the structural data usually contained in the
database schema is embedded with the data itself. Here the distinction between
data and schema is vague at best. This model is useful for describing systems,
such as certain Web-based data sources, which we treat as databases but cannot
constrain with a schema. It’s also useful for describing interactions between
databases that don’t adhere to the same schema.

e. Context model: This model can incorporate elements from other database models
as needed. It cobbles together elements from object-oriented, semistructured, and
network models.

f. Associative model: This model divides all the data points based on whether they
describe an entity or an association. In this model, an entity is anything that exists
independently, whereas an association is something that only exists in relation to
something else.

The associative model structures the data into two sets:


A set of items, each with a unique identifier, a name, and a type
A set of links, each with a unique identifier and the unique identifiers of a
source, verb, and target. The stored fact has to do with the source, and
each of the three identifiers may refer either to a link or an item.
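
A rough Python sketch of these two sets (the identifiers and the "Alice lives in London" fact are invented for illustration):

```python
from dataclasses import dataclass

@dataclass
class Item:          # anything that exists independently
    id: int
    name: str
    type: str

@dataclass
class Link:          # a stored fact: source --verb--> target
    id: int
    source: int      # id of an item OR of another link
    verb: int
    target: int

items = {
    1: Item(1, "Alice", "person"),
    2: Item(2, "London", "city"),
    3: Item(3, "lives in", "verb"),
}
# "Alice lives in London" expressed as a link between items.
links = {10: Link(10, source=1, verb=3, target=2)}

# Because source/target may name a link rather than an item, facts about
# facts are possible, e.g. a further link could qualify link 10 with a date.
```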

Other, less common database models include:


Semantic model, which includes information about how the stored data
relates to the real world
XML database, which allows data to be specified and even stored in XML
format
Named graph
Triplestore
g. Implementation: An implementation of a given data model is a physical
realization on a real machine of the components of the abstract machine that
together constitute that model.

9. Types of database

a. Centralised database: The information is stored at a centralized location, and users from different locations can access this data. This type of database contains application procedures that help users access the data even from a remote location.
Various kinds of authentication procedures are applied for the verification and validation of end users; likewise, a registration number is provided by the application procedures, which keeps track of and records data usage. The local area office handles this.
b. Distributed database: In contrast to the centralized database concept, the distributed database has contributions from the common database as well as information captured by local computers. The data is not in one place and is distributed at various sites of an organization. These sites are connected to each other with the help of communication links, which help them access the distributed data easily.
You can imagine a distributed database as one in which various portions of a database are stored in multiple different physical locations, along with application procedures which are replicated and distributed among various points in a network.
There are two kinds of distributed database, viz. homogeneous and heterogeneous. Databases that have the same underlying hardware and run over the same operating systems and application procedures at all physical locations are known as homogeneous DDBs. Where the operating systems, underlying hardware, or application procedures differ at various sites, the DDB is known as heterogeneous.

c. Personal database: Data is collected and stored on personal computers; such a database is small and easily manageable. The data is generally used by the same department of an organization and accessed by a small group of people.

d. End user database: The end user is usually not concerned about the transactions or operations done at various levels and is only aware of the product, which may be software or an application. Therefore, this is a shared database which is specifically designed for the end user.

e. Commercial database: These are paid versions of huge databases designed uniquely for users who want to access the information for help. These databases are subject-specific, and one cannot afford to maintain such huge information oneself. Access to such databases is provided through commercial links.

f. NoSQL database: These are used for large sets of distributed data. Some big data performance issues are not handled effectively by relational databases; such issues are easily managed by NoSQL databases. They are very efficient at analyzing large unstructured data that may be stored at multiple virtual servers of the cloud.

g. Operational database: Information related to the operations of an enterprise is stored inside this database. Functional lines like marketing, employee relations, customer service, etc. require such databases.

h. Relational database: These databases are categorized by a set of tables where data fits into a pre-defined category. A table consists of rows and columns, where a column holds entries for data of a specific category and rows contain instances of that data defined according to the category. The Structured Query Language (SQL) is the standard user and application program interface for a relational database.
There are various simple operations that can be applied over the tables, which makes these databases easier to extend, to join two databases with a common relation, and to modify all existing applications.

i. Cloud database: Nowadays, data is increasingly stored in the cloud, also known as a virtual environment, whether a hybrid, public, or private cloud. A cloud database is a database that has been optimized or built for such a virtualized environment. There are various benefits of a cloud database, some of which are the ability to pay for storage capacity and bandwidth on a per-user basis, scalability on demand, and high availability.
A cloud database also gives enterprises the opportunity to support business
applications in a software-as-a-service deployment.

j. Object-oriented database: An object-oriented database combines object-oriented programming with relational database concepts. Various items created using object-oriented programming languages like C++ and Java can be stored in relational databases, but object-oriented databases are better suited for those items.
An object-oriented database is organized around objects rather than
actions, and data rather than logic. For example, a multimedia record in a
relational database can be a definable data object, as opposed to an alphanumeric
value.

k. Graph Database: The graph is a collection of nodes and edges where each node
is used to represent an entity and each edge describes the relationship between
entities. A graph-oriented database, or graph database, is a type of NoSQL
database that uses graph theory to store, map and query relationships.
Graph databases are basically used for analyzing interconnections. For
example, companies might use a graph database to mine data about customers
from social media.
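
A tiny illustration of the node-and-edge idea in plain Python (the social data is invented); real graph databases such as Neo4j provide query languages for such traversals:

```python
# Nodes are entities; edges are (source, relationship, target) triples.
nodes = {"alice", "bob", "carol"}
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob",   "FOLLOWS", "carol"),
    ("alice", "FOLLOWS", "carol"),
]

# "Analyzing interconnections": whom does alice reach in one hop?
follows = [dst for src, rel, dst in edges
           if src == "alice" and rel == "FOLLOWS"]
print(follows)  # ['bob', 'carol']
```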
