

To function, an organisation needs information, e.g. the list of books in a library, customer details in a retail business, or the specifications of cars and their components for a car manufacturer.
Information may be defined as data represented in a meaningful form. The same data shown in different ways will provide different information to different viewers.
A major requirement of any computer system is to store and retrieve data in a way that is meaningful to the end user. The core of any Information System is therefore data, which is transformed into information through data modelling.

Data: Meaningful facts, text, graphics, images, sound, video segments
Database: An organized collection of logically related data
Information: Data processed to be useful in decision making
Metadata: Data that describes data

Data in Context

Summarized data

File: A collection of records or documents dealing with one organization, person, area or subject. (Rowley)
- Manual (paper) files
- Computer files
Database: A collection of similar records with relationships between the records. (Rowley)
- bibliographic, statistical, business data, images, etc.

A Database is a collection of stored operational data used by the application systems of some particular enterprise
- Paper Databases
  Still contain a large portion of the world's knowledge
- File-Based Data Processing Systems
  Early batch processing of (primarily) business data
- Database Management Systems (DBMS)

Database Management System (DBMS)
- Software system used to define, create, maintain and provide controlled access to the database and repository
Repository
- Data Dictionary
- The place where all metadata for a particular database is stored
- May also include information on relationships between files or tables in a particular database

Metadata
- Data about data
- In a DBMS, means all of the characteristics describing the attributes of an entity, e.g.:
  - name of attribute
  - data type of attribute
  - size of the attribute
  - format or special characteristics
- Characteristics of files or relations
  - name, content, notes, etc.
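
As a minimal sketch of what such metadata looks like in practice, the following uses Python's built-in sqlite3 module to ask SQLite for the attribute descriptions of a table. The table name book and its columns are made up for illustration and do not come from the slides.

```python
import sqlite3

# In-memory database; the "book" table is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title VARCHAR(80) NOT NULL)")

# PRAGMA table_info returns one row per column:
# (cid, name, declared type, notnull flag, default value, pk position)
metadata = conn.execute("PRAGMA table_info(book)").fetchall()
for cid, name, col_type, notnull, default, pk in metadata:
    print(f"{name}: type={col_type}, not_null={bool(notnull)}, primary_key={bool(pk)}")
conn.close()
```

Each printed line is metadata in the sense above: the name, data type and special characteristics of one attribute, stored by the DBMS alongside the data itself.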

Data Independence
- Physical representation and location of data and the use of that data are separated
- The application doesn't need to know how or where the database has stored the data, but just how to ask for it
- Moving a database from one DBMS to another should not have a material effect on application programs
- Recoding, adding fields, etc. in the database should not affect applications
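
The last point can be illustrated with a small sketch, again using Python's sqlite3 module. The customer table, the customer_names function and the sample rows are assumptions made for the example. The application function is written once, and it keeps returning the same result after a field and an index are added to the stored table.

```python
import sqlite3

def customer_names(conn):
    # The application only states WHAT it wants (names, in id order),
    # not how or where the rows are physically stored.
    return [row[0] for row in conn.execute("SELECT name FROM customer ORDER BY id")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customer (id, name) VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
before = customer_names(conn)

# Change the stored structure: add a field and an index.
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")
after = customer_names(conn)

print(before == after)
conn.close()
```

Because the function names the columns it needs instead of relying on record layout, the structural change is invisible to it.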


Enterprise
- Organization
Entity
- Person, Place, Thing, Event, Concept...
Attributes
- Data elements (facts) about some entity
- Also sometimes called fields or items or domains
Data values
- Instances of a particular attribute for a particular entity


Records
- The set of values for all attributes of a particular entity
- Tuples or rows in a relational DBMS
File
- Collection of records
- Relation or table in a relational DBMS
Key
- An attribute or set of attributes used to identify or locate records in a file
Primary Key
- An attribute or set of attributes that uniquely identifies each record in a file
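
A short sketch of what "uniquely identifies" means in a relational DBMS, using Python's sqlite3 module. The author table and its column names are illustrative assumptions. The DBMS refuses a second record with an existing key value, and the key can then be used to locate exactly one record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# author_id is the primary key: it must uniquely identify each record (row).
conn.execute("CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO author VALUES (1, 'Smith')")

try:
    # A second record with the same key value is rejected by the DBMS.
    conn.execute("INSERT INTO author VALUES (1, 'Jones')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The key also locates a record: fetch the row whose key value is 1.
record = conn.execute("SELECT author_id, name FROM author WHERE author_id = 1").fetchone()
print(duplicate_rejected, record)
conn.close()
```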

Data Administrator (DA)
- Person responsible for the data administration function in an organization
- Sometimes may be the CIO (Chief Information Officer)
Database Administrator (DBA)
- Person responsible for the database administration function


Data Administration
- Responsibility for the overall management of data resources within an organization
Database Administration
- Responsibility for physical database design and technical issues in database management
Data Steward
- Responsibility for some subset of the organization's data, and all of the interactions (applications, user access, etc.) for that data


- Models
  Levels or views of the Database: Conceptual, Logical, Physical
- DBMS types
  Relational, Hierarchic, Network, Object-Oriented



Hierarchical Model (1960s and 1970s)
- Similar to data structures in programming languages

Books (id, title)

Authors (first, last)
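
One way to picture the hierarchical model is as nested data structures, sketched here in Python. The sketch assumes that Books are the parent records and Authors their children; the titles, names and helper functions are invented placeholders.

```python
# Each book record owns its author records, forming a tree (parent -> children).
books = [
    {"id": 1, "title": "Book A", "authors": [{"first": "Ann", "last": "Smith"}]},
    {"id": 2, "title": "Book B", "authors": [{"first": "Ann", "last": "Smith"},
                                              {"first": "Bob", "last": "Jones"}]},
]

def authors_of(book_id):
    # Access always starts at the root (the book) and walks down to its children.
    for book in books:
        if book["id"] == book_id:
            return [a["last"] for a in book["authors"]]
    return []

def books_by(last_name):
    # The reverse question has no direct path: every tree must be scanned.
    return [b["title"] for b in books if any(a["last"] == last_name for a in b["authors"])]

print(authors_of(2))
print(books_by("Smith"))
```

Note that the author Ann Smith is stored once under each book, which shows how a strict parent-child structure tends to duplicate data.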




Network Model (1970s)
- Provides for single entries of data and navigational links through chains of data

Authors, Subjects, Books, Publishers
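
A rough Python sketch of the idea: each record is stored once, and relationships are explicit links that a program follows from record to record. The Record class, the link names and the pairing of author, subject and book are assumptions made for illustration.

```python
# Each record is stored once; relationships are explicit links between records.
class Record:
    def __init__(self, name):
        self.name = name
        self.links = {}  # link name -> list of linked records

    def link(self, kind, other):
        self.links.setdefault(kind, []).append(other)

smith = Record("Smith")
history = Record("history")
book = Record("The history")

# One book record, reachable from both its author and its subject.
smith.link("wrote", book)
history.link("covers", book)
book.link("written_by", smith)

# Navigation: subject -> book -> author, following stored links.
found = [a.name for b in history.links["covers"] for a in b.links["written_by"]]
print(found)
```

The program has to know which chain of links to follow, which is what "navigational" means here.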


Relational Model (1980s)
- Provides a conceptually simple model for data as relations (typically considered tables) with all data visible

Publishers
pubid  pubname
1      Harper
2      Addison
3      Oxford
4      Que

Books
Book ID  Title          pubid
1        Introductio    2
2        The history    4
3        New stuff ab   3
4        Another title  2
5        And yet more   1

Authors
Author id  Author name
1          Smith
2          Wynar
3          Jones
4          Duncan
5          Applegate

Subjects
Subid  Subject
1      cataloging
2      history
3      stuff

Link tables: Book ID / Subid and Authorid / Book ID
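
A minimal sketch of how a shared pubid value links a book to its publisher, using Python's built-in sqlite3 module. It assumes the Book ID, Title and pubid values in the figure line up row by row in the order listed; the table names publisher and book are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE publisher (pubid INTEGER PRIMARY KEY, pubname TEXT)")
conn.execute("CREATE TABLE book (bookid INTEGER PRIMARY KEY, title TEXT, "
             "pubid INTEGER REFERENCES publisher (pubid))")
conn.executemany("INSERT INTO publisher VALUES (?, ?)",
                 [(1, "Harper"), (2, "Addison"), (3, "Oxford"), (4, "Que")])
# Titles are kept exactly as truncated in the figure.
conn.executemany("INSERT INTO book VALUES (?, ?, ?)",
                 [(1, "Introductio", 2), (2, "The history", 4), (3, "New stuff ab", 3),
                  (4, "Another title", 2), (5, "And yet more", 1)])

# The shared pubid value is the only thing that links a book to its publisher.
rows = conn.execute(
    "SELECT b.title, p.pubname FROM book AS b "
    "JOIN publisher AS p ON p.pubid = b.pubid ORDER BY b.bookid"
).fetchall()
for title, pubname in rows:
    print(title, "-", pubname)
conn.close()
```

Unlike the hierarchical and network models, no stored pointer or path is involved: the link exists only as matching values in two visible columns.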


Object-Oriented Data Model (1990s)
- Encapsulates data and operations as Objects

Books (id, title)
Authors (first, last)
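
Encapsulation can be sketched with two Python classes named after the figure. The methods full_name, add_author and citation are hypothetical operations added for illustration; the slides only give the attributes.

```python
class Author:
    def __init__(self, first, last):
        self.first = first
        self.last = last

    def full_name(self):
        return f"{self.first} {self.last}"


class Book:
    def __init__(self, book_id, title):
        self.id = book_id
        self.title = title
        self._authors = []  # data is held inside the object

    def add_author(self, author):
        # Operations on the data travel with the data.
        self._authors.append(author)

    def citation(self):
        names = ", ".join(a.full_name() for a in self._authors)
        return f"{names}: {self.title}"


book = Book(1, "Example Title")
book.add_author(Author("Ann", "Smith"))
print(book.citation())
```

An object-oriented DBMS stores objects like these directly, together with the references between them.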




Object-Relational Model (1990s)
- Combines the well-known properties of the Relational Model with such OO features as:
  - User-defined datatypes
  - User-defined functions
  - Inheritance and sub-classing


History
- 50s and 60s - all applications were custom built for particular needs
- File based
- Many similar/duplicative applications dealing with collections of business data
- Early DBMS were extensions of programming languages
- 1970 - E.F. Codd and the Relational Model
- 1979 - Ashton-Tate and the first microcomputer DBMS


File processing systems are still widely used today (e.g. for backup) but have the following problems:
- Program-Data Dependence (see Fig.): file descriptions are stored within each application that accesses the file, so a change to the file structure requires changes to all file descriptions in all programs
- Data Redundancy (duplication of data): wasteful, inconsistent, loss of metadata integrity (the same data has different names in different files, or the same name may be used for different data in different files)
- Limited Data Sharing: users have little opportunity to share data outside their own applications
- Lengthy Development Times: little opportunity to re-use previous development efforts
- Excessive Program Maintenance: the factors above combine to create a heavy maintenance load


Three file processing systems


A DBMS is a data storage and retrieval system which permits data to be stored non-redundantly while making it appear to the user as if the data is well-integrated


Advantages of a DBMS
- Minimal Data Redundancy/Improved Consistency
- Data Integration
- Data Independence/Reduced Maintenance
- Improved Data Sharing
- Increased Application Development Productivity
- Enforcement of Standards
- Improved Data Quality (Constraints)
- Better Data Accessibility/Responsiveness
- Reduced Program Maintenance

Costs and risks of a DBMS
- New, Specialized Personnel required
- Installation and Management Cost and Complexity
- Conversion Costs
- Need for Explicit Backup and Recovery
- Organizational Conflict


Relational databases view all data in the form of tables
- The following Figs. a and b show four tables: Customer, Product, Order and OrderLine (the 4 entities shown in the previous ER diagram)
- Each column represents an attribute, e.g. the Customer table has attributes ID, Name, Address etc.
- Relationships between entities are represented by values stored in columns of the corresponding tables, e.g. Customer_ID is an attribute of both the Customer table and the Order table. This makes it easy to link an order with its customer
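
The four tables can be sketched with Python's sqlite3 module to show how shared column values carry the relationships. Only the table names and the attributes Customer_ID, Name and Address come from the text above; every other column name and all sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Customer_ID INTEGER PRIMARY KEY, Name TEXT, Address TEXT);
CREATE TABLE Product (Product_ID INTEGER PRIMARY KEY, Description TEXT);
-- Order is a reserved word in SQL, so the table name is quoted.
CREATE TABLE "Order" (Order_ID INTEGER PRIMARY KEY,
                      Customer_ID INTEGER REFERENCES Customer (Customer_ID));
CREATE TABLE OrderLine (Order_ID INTEGER REFERENCES "Order" (Order_ID),
                        Product_ID INTEGER REFERENCES Product (Product_ID),
                        Quantity INTEGER,
                        PRIMARY KEY (Order_ID, Product_ID));
INSERT INTO Customer VALUES (1, 'Ann', '1 High St'), (2, 'Bob', '2 Low Rd');
INSERT INTO Product VALUES (10, 'Desk'), (11, 'Chair');
INSERT INTO "Order" VALUES (100, 1), (101, 2);
INSERT INTO OrderLine VALUES (100, 10, 1), (100, 11, 4), (101, 11, 2);
""")

# Follow the shared column values: Customer -> Order -> OrderLine -> Product.
ordered = conn.execute("""
    SELECT c.Name, p.Description, ol.Quantity
    FROM Customer AS c
    JOIN "Order" AS o ON o.Customer_ID = c.Customer_ID
    JOIN OrderLine AS ol ON ol.Order_ID = o.Order_ID
    JOIN Product AS p ON p.Product_ID = ol.Product_ID
    ORDER BY c.Name, p.Description
""").fetchall()
print(ordered)
conn.close()
```

OrderLine also shows a primary key made of a set of attributes: neither Order_ID nor Product_ID alone identifies a line, but the pair does.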

Product and Customer tables


Personal Database
- PCs/PDAs, cellphones
- OK in special situations where the need to share data amongst users is unlikely to arise
Workgroup Database
- Designed to support collaboration in a small team (fewer than 25 people)
Department Database
- Typically larger than a workgroup (25-100 people), with a more diverse range of functions, e.g. personnel database
Enterprise Database
- Scope of the whole organisation
- There may be more than one, as a single database for a large organisation may be impractical due to performance difficulties for large databases, diverse needs of user groups, and difficulty of achieving a common definition of data (metadata) for all users

Typical data from a personal computer database


Workgroup database with local area network


Departmental Database (Personnel Department)


An enterprise data warehouse


CASE Tools: automated tools used to design databases and applications
Repository: generalised knowledge for all data definitions, relationships, screen/report formats; an extended set of metadata for managing databases and other components of the information system
Database Management System (DBMS): software (sometimes specialised hardware) used to define, create, maintain and provide controlled access to the database and the repository
Database: an organised collection of logically related data occurrences
Application Programs: software used to create and maintain the database and provide information to users
User Interface: languages, menus etc. by which users interact with other system components
Data Administrators: people responsible for the overall information resources of an organization
Systems analysts/programmers and end users: people who add, delete and modify data in the database and who get information from it


Components of the database environment