
FUNDAMENTAL DATABASE CONCEPTS

Data Hierarchy in a Database (from lowest to the highest level)


Bits: 0/1; basically, the only language understood by a computer; represented by the strength of current, magnetic
polarity, or some other method.
Bytes (Characters): Group of bits (normally 7 or 8 bits to a byte) that represents alphanumeric characters (a, b, c,
1, 2, 3) and special symbols (#, @, $).
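As a small illustration of the bit and byte levels, the sketch below prints the 8-bit pattern behind a character. It assumes ASCII codes, which the notes do not specify, and the helper name char_to_bits is made up for this example.

```python
# Show how a single character is stored as a byte (a group of 8 bits).
# Assumption: ASCII character codes; char_to_bits is a hypothetical helper.
def char_to_bits(ch: str) -> str:
    """Return the 8-bit binary string for one ASCII character."""
    code = ord(ch)               # numeric code of the character
    return format(code, "08b")   # zero-padded to 8 binary digits

bits_a = char_to_bits("a")
bits_hash = char_to_bits("#")
print(bits_a, bits_hash)
```

Each character maps to a different pattern of 0s and 1s, and a field is simply a run of such bytes.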
Field (Attribute or Column): Group of characters that represent a field (name, social security number, address,
etc.).
** In a relational database management system, such as Access or Oracle, the term COLUMN is used
interchangeably with attribute or field.
Record: A group of related fields (e.g., a specific student's social security number, name, address, major, etc.).
File (Table): A group of related records (e.g., a file containing records of students, or a file containing records of
faculty). We do not keep students' and faculty's records in a single file because a file should not contain unrelated
records.
** A table is typically shown as: table_name(column_name, column_name). We also prefix each column name with an
abbreviation of the table name. For example, in a student table, we may name the student's SSN column std_ssn.
This way, we can tell what table a particular column appears in.
Database: A group of related files (a database of students, a database of administrators, a database of faculty,
etc.). Creating different databases helps effective administration and management of these databases. For
example, different database administrators (DBAs) may be assigned the responsibility of administering and
managing different databases.
Entity
Person, place, thing, event, etc., on which we keep information. For instance, we keep information on students,
faculty, courses and classes. For each entity in our database, we create at least one table. For example,
for faculty, we might create a table that includes information such as their social security numbers, names,
addresses, etc. We might then also create another file that lists faculty members' social security numbers, names, and
professional certifications. Consequently, we have two files for the entity faculty. Importantly, faculty members
collectively form what is normally called an entity set, represented by the entity (an entity set may consist of only
one member, though).
Attributes
Attributes are characteristics of an entity that may be of interest to us. If someone is enrolled at a university, then
that university is interested in that person's social security number, name, address, major, etc., but not in the height
or weight of that person. However, a physician will be interested in the height and weight of the same person, but
not in his/her major.
Attribute Domain: Set of possible values of an attribute. For example, the letter grade attribute domain will consist of
F, D-, D, D+, ..., A-, and A.
Simple Attribute: An attribute that can't be subdivided (e.g., social security number).
Composite Attribute: An attribute that can be subdivided into multiple attributes (e.g., a name may be divided into
last name, first name and middle initial).
Single-Valued Attribute: An attribute that can have only one value (for example, a person can have only one
social security number).

Multi-Valued Attribute: An attribute that can have many values (for example, a person may have multiple
telephone numbers).
Derived Attribute: An attribute that is based on other attributes. For example, if we want to determine years of
service for faculty, we will subtract date of employment from current or some other date; the resulting attribute will
be called a derived attribute.
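A derived attribute is computed rather than stored. The sketch below derives years of service from a date of employment and a reference date; the function name and the sample dates are illustrative assumptions, not part of the notes.

```python
from datetime import date

# Hypothetical helper: derive whole years of service from two dates.
def years_of_service(hired: date, as_of: date) -> int:
    """Whole years between the employment date and the reference date."""
    years = as_of.year - hired.year
    # Subtract one year if the anniversary has not yet occurred in as_of's year.
    if (as_of.month, as_of.day) < (hired.month, hired.day):
        years -= 1
    return years

service = years_of_service(date(2010, 8, 15), date(2024, 3, 1))
print(service)
```

Because the value changes as the reference date changes, it is usually computed when needed instead of being stored as a column.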
Primary Key
Attribute that uniquely identifies a record in a table (or file). For example, a social security number will retrieve a
unique record from a student table.
** Primary keys are based on the concept of determination. If each value of attribute A has one specific value of B
associated with it, we say A determines B. For example, each social security number is associated with one specific
name, whereas the same name may be associated with different social security numbers. Likewise, if I know the
overall average of a student in a class, I can tell the letter grade; therefore, average grade determines letter grade.
However, if I know a letter grade, I can't tell what the average grade is (a person with an A grade could have an
average of 100 or 96). Consequently, primary key attributes determine other attributes in the same record.
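The idea of determination can be checked mechanically on sample data. The sketch below is one possible way to do it; the function name determines and the sample rows are assumptions made for illustration.

```python
# Hypothetical check: does column a determine column b in the given rows?
def determines(rows, a, b):
    """Return True if each value of column a maps to exactly one value of column b."""
    seen = {}
    for row in rows:
        key, value = row[a], row[b]
        if key in seen and seen[key] != value:
            return False   # same A value paired with two different B values
        seen[key] = value
    return True

grades = [
    {"average": 100, "letter": "A"},
    {"average": 96, "letter": "A"},
    {"average": 85, "letter": "B"},
]
avg_determines_letter = determines(grades, "average", "letter")
letter_determines_avg = determines(grades, "letter", "average")
print(avg_determines_letter, letter_determines_avg)
```

On this sample, average determines letter grade, but letter grade does not determine average, which matches the reasoning above. A check on sample rows can only disprove a determination; whether it holds in general depends on the meaning of the data.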
Composite Key: Primary key that is based on multiple attributes. Suppose we have a table that includes the
columns course_number, section_number, instructor, etc. The table may include several sections of the same
course. In this table, a course number and a section number, together, will return a unique record; a course number
by itself may return multiple records and, likewise, a section number may also return multiple records.
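A composite key can be tried out with Python's built-in sqlite3 module, used here only as a convenient stand-in for a DBMS such as Access or Oracle. The table name class and the sample rows are assumptions for this sketch.

```python
import sqlite3

# In-memory SQLite database; table name and sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE class (
        course_number  TEXT,
        section_number INTEGER,
        instructor     TEXT,
        PRIMARY KEY (course_number, section_number)
    )
""")
conn.executemany(
    "INSERT INTO class VALUES (?, ?, ?)",
    [("ACC101", 1, "Smith"), ("ACC101", 2, "Jones"), ("MKT200", 1, "Lee")],
)

# A course number alone matches several records; both columns match one.
by_course = conn.execute(
    "SELECT COUNT(*) FROM class WHERE course_number = 'ACC101'"
).fetchone()[0]
by_both = conn.execute(
    "SELECT COUNT(*) FROM class "
    "WHERE course_number = 'ACC101' AND section_number = 2"
).fetchone()[0]

# Repeating an existing (course, section) pair violates the composite key.
try:
    conn.execute("INSERT INTO class VALUES ('ACC101', 1, 'Brown')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```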
Foreign Key
Attribute(s) in one table that match the primary key in another table. For instance, we might have two tables: (1) student
and (2) major. The major table may include columns such as maj_code (some abbreviation to uniquely identify
each major and serve as a primary key in the table), maj_desc (complete name of the major), sch_code (identifier
of the school that offers the major) and maj_coordinator (coordinator of that major or program). The student table
may include columns such as std_ssn, std_fname, std_lname and maj_code (the student's major, assuming every
student can major in one area only). The maj_code column is the primary key in the major table, and it is a foreign key
in the student table that refers to the primary key in the major table. Typically, the same name is used for a primary
key and the foreign key that references it. Since an abbreviation of the table name is part of the column name, one can
guess the table a foreign key column references.
Foreign keys allow us to create common fields between two tables and are used to join the two tables. For
example, the joining condition for the above two tables will look like this (The complete syntax for referencing a
column in a table is table_name.column_name):
SELECT ...
FROM ...
WHERE major.maj_code = student.maj_code
Of course, we can join multiple tables by joining the first table to the second table and then joining the second table
to the third table and so on. Consequently, if we need to join two or more tables, we must create common fields
between them.
By the way, we can also create common fields by using a non-key column.
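The joining condition can be exercised end to end with Python's sqlite3 module, used here as a stand-in for the DBMS. The column names follow the student and major tables described above; the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE major (
        maj_code TEXT PRIMARY KEY,
        maj_desc TEXT
    );
    CREATE TABLE student (
        std_ssn   TEXT PRIMARY KEY,
        std_fname TEXT,
        std_lname TEXT,
        maj_code  TEXT REFERENCES major (maj_code)  -- foreign key
    );
    INSERT INTO major VALUES ('ACC', 'Accounting'), ('MKT', 'Marketing');
    INSERT INTO student VALUES
        ('S1', 'Ann', 'Lee', 'ACC'),
        ('S2', 'Bob', 'Ray', 'MKT');
""")

# Join the two tables on the common field maj_code.
rows = conn.execute("""
    SELECT student.std_ssn, major.maj_desc
    FROM student, major
    WHERE major.maj_code = student.maj_code
    ORDER BY student.std_ssn
""").fetchall()
print(rows)
```

Each student row is matched with the major row that carries the same maj_code, so the description never has to be stored in the student table.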
Relationship
Association between specific instances of the same or different entities. For example, faculty teach classes and
classes are covered by faculty. Therefore, faculty and class entities are related. Actually, specific instances of an
entity are associated with specific instances of the other entity. In other words, a specific faculty teaches a specific
class and a particular class is covered by a certain faculty. Here we have a relation between two different entities.
A relation can also exist between different instances of the same entity. For example, suppose we have an employee
table that lists employees working for an organization. Here one employee may be related to another employee,
and therefore we have a relation between instances of the same entity.
Relation Participation: Participation in a relation may be mandatory or optional
for an entity.

Mandatory: Entity must participate in the relation.


Optional: Entity may participate in the relation.
For example, a class has to be covered by a faculty member, but a faculty member doesn't have to teach a class (that
is, a faculty member may not be scheduled to teach any class). This implies that in the faculty-class relation, the class
is optional but the faculty is mandatory. Knowing the nature of participation is important as it helps us determine
whether the foreign key can be allowed to have null (blank) values or not.
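The link between participation and null values can be sketched with Python's sqlite3 module as a stand-in DBMS. Because every class must have a faculty member, the foreign key in the class table is declared NOT NULL. The column names fac_id and cls_id and the sample rows are assumptions for this example.

```python
import sqlite3

# In-memory SQLite database; column names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE faculty (fac_id TEXT PRIMARY KEY, fac_name TEXT);
    CREATE TABLE class (
        cls_id TEXT PRIMARY KEY,
        fac_id TEXT NOT NULL REFERENCES faculty (fac_id)  -- faculty is mandatory
    );
    INSERT INTO faculty VALUES ('F1', 'Smith'), ('F2', 'Jones');
    INSERT INTO class VALUES ('C1', 'F1');
""")

# A class without a faculty member is rejected.
try:
    conn.execute("INSERT INTO class VALUES ('C2', NULL)")
    null_fk_rejected = False
except sqlite3.IntegrityError:
    null_fk_rejected = True

# A faculty member with no class is allowed (class participation is optional).
idle_faculty = conn.execute("""
    SELECT fac_id FROM faculty
    WHERE fac_id NOT IN (SELECT fac_id FROM class)
""").fetchall()
```

Had the faculty side been optional for a class, the fac_id column would simply be declared without NOT NULL.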
Variation in a Relation
Relationships vary in terms of degree, connectivity and cardinality.
Degree of a Relationship: Number of associated entities.
Unary Relationship: Association within a single entity. For example, an employee may supervise other employees.
So some records in the employee table may be related to some other records within the same table.
Binary Relationship: Association between two entities. For example, a faculty teaches a course.
Ternary Relationship: Association between three entities. For instance, a donor donates funds for a specific area
of research, and a researcher is awarded funds from the donations made by a specific donor.
N-entity Relationship: Association between n entities (for example, 4-entity relation).
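A unary relationship can be represented by a foreign key that points back to the same table. The sketch below uses Python's sqlite3 module as a stand-in DBMS; the column names (emp_id, emp_name, emp_super) and the rows are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database; column names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_id    TEXT PRIMARY KEY,
        emp_name  TEXT,
        emp_super TEXT REFERENCES employee (emp_id)  -- NULL if no supervisor
    );
    INSERT INTO employee VALUES
        ('E1', 'Avery', NULL),
        ('E2', 'Blake', 'E1'),
        ('E3', 'Casey', 'E1');
""")

# Join the table to itself: one copy plays the worker, the other the supervisor.
pairs = conn.execute("""
    SELECT worker.emp_name, boss.emp_name
    FROM employee AS worker, employee AS boss
    WHERE worker.emp_super = boss.emp_id
    ORDER BY worker.emp_name
""").fetchall()
print(pairs)
```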
Cardinality of a Relationship: Specific number of occurrences of an entity that are associated with one occurrence
of the related entity. For example, a faculty may teach zero courses at the minimum and five courses at the maximum.
Type or Connectivity of a Relationship: 1:1, 1:many, many:many
Representing Relation Types in a Database
We represent relations or association among entities by creating common fields.
1:1 -> Include primary key of one of the related entities as a foreign key with the
other entity. For example, a school will have only one Dean, and a faculty
can be Dean of only one school. We can take the primary key (PK) of one of
the entities and make it part of the table of the other entity as a foreign
key (FK). In this case it will make sense to take the faculty PK and
include it as a FK in the school table.
1:m -> Include primary key of the 1 entity as a foreign key with the m entity. For
example, a faculty may teach many classes and a class may be covered by
only one faculty. In this case, we always take the primary key of the 1
entity (faculty) and include it as a foreign key in the table of the m
entity (class).
m:n -> In this case, we always create a new entity, called a relationship entity, whose
primary key will be based on the primary keys of the related entities. We do
this because in case of an m:n relation, there are attributes that don't
belong to any one of the entities participating in the relation, but rather to
all of them together. For example, student and class entities have an m:n
relation. A student can take many classes, and a class can have many
students. Now, a letter grade doesn't belong to the student alone, or the
class alone, but to student and class together: a student achieves a specific
letter grade in a specific class. Therefore, we will create a relationship
entity between student and class and maintain the information on letter grades
in this table.
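The m:n case can be sketched with Python's sqlite3 module as a stand-in DBMS. The relationship entity is called enrollment here; that name, the column cls_id, and the sample rows are assumptions made for this example.

```python
import sqlite3

# In-memory SQLite database; table/column names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (std_ssn TEXT PRIMARY KEY);
    CREATE TABLE class   (cls_id  TEXT PRIMARY KEY);
    -- Relationship entity: its key combines the keys of both related entities.
    CREATE TABLE enrollment (
        std_ssn TEXT REFERENCES student (std_ssn),
        cls_id  TEXT REFERENCES class (cls_id),
        grade   TEXT,
        PRIMARY KEY (std_ssn, cls_id)
    );
    INSERT INTO student VALUES ('S1'), ('S2');
    INSERT INTO class VALUES ('C1'), ('C2');
    INSERT INTO enrollment VALUES
        ('S1', 'C1', 'A'), ('S1', 'C2', 'B'), ('S2', 'C1', 'C');
""")

classes_of_s1 = conn.execute(
    "SELECT cls_id FROM enrollment WHERE std_ssn = 'S1' ORDER BY cls_id"
).fetchall()
students_in_c1 = conn.execute(
    "SELECT std_ssn FROM enrollment WHERE cls_id = 'C1' ORDER BY std_ssn"
).fetchall()
grade_s1_c2 = conn.execute(
    "SELECT grade FROM enrollment WHERE std_ssn = 'S1' AND cls_id = 'C2'"
).fetchone()[0]
```

The grade lives in the enrollment table because it is identified only by the student and the class together.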

Database Models
Constructs and methods used to represent entities and relations.
Conceptual Models
Focus on What entities and relations are to be represented in the database.
Examples: E-R Model, Crow's Foot Model
Implementation Models
Focus on How entities and relations are to be represented in the database.
Examples: Hierarchical, Network, Relational, Object-Oriented
Database Management Systems
Software created based on a specific implementation model. DBMS are primarily used for record keeping.
Examples: Access, Adabas, IMS, Oracle, SQL Server
Normalization
Arrangement of attributes into different files. Normalizing tables is important; otherwise, we may have problems
inserting some records, may end up deleting more data than we intended to, and may have to unnecessarily update
multiple records instead of updating only one record.
Normal Forms
Normalization rules employed to minimize data anomalies.
First Normal Form: No repeating attributes or duplicate records. If we have duplicate records, then we can't
uniquely identify a record, and the ability to retrieve a unique record is imperative in a database. If we have
repeating attributes, then we may not be able to determine how many columns we need to create in the table for
these repeating attributes. For example, in a table called sales, we might maintain information on different items
that were ordered. If we define columns for these different items, then we have repeating attributes. Since we don't
know how many items a customer might order, we might end up defining too few or too many columns.
Second Normal Form: Every non-key attribute should be dependent on the whole of the key, not part of the key.
For example, suppose we have a table named std_act that lists the various activities students participate in and the
fees they pay for these activities. The table structure might look like this:
std_act(std_ssn, activity, fee)
** std_ssn and activity together form the primary key (composite key)
And the table data might look like this:
std_ssn    activity    fee
S1         Tennis      50
S1         Football    40
S2         Football    40

Obviously, std_ssn is not sufficient by itself to retrieve a unique record from this table. We need both std_ssn and
activity to retrieve a unique record. Now, suppose we offer a new activity, but no student has yet enrolled in it; then
we can't keep the information about this activity in this table because we have data only on part of the primary key.
Also, if the student S1 decides to drop Tennis and we delete the record, then we also lose the information
about the fee for Tennis. This table violates the second normal form rule and should be decomposed into two tables
as follows:
activities (activity, fee)
std_act (std_ssn, activity)
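The decomposition can be tried out with Python's sqlite3 module as a stand-in DBMS. The sketch loads the sample data from the table above and then repeats the two problem cases; the extra activity 'Chess' and its fee are invented for illustration.

```python
import sqlite3

# In-memory SQLite database holding the two decomposed tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activities (activity TEXT PRIMARY KEY, fee INTEGER);
    CREATE TABLE std_act (
        std_ssn  TEXT,
        activity TEXT REFERENCES activities (activity),
        PRIMARY KEY (std_ssn, activity)
    );
    INSERT INTO activities VALUES ('Tennis', 50), ('Football', 40);
    INSERT INTO std_act VALUES
        ('S1', 'Tennis'), ('S1', 'Football'), ('S2', 'Football');

    -- A new activity can be stored before any student enrolls (hypothetical row).
    INSERT INTO activities VALUES ('Chess', 10);
    -- S1 drops Tennis; the Tennis fee survives in the activities table.
    DELETE FROM std_act WHERE std_ssn = 'S1' AND activity = 'Tennis';
""")

tennis_fee = conn.execute(
    "SELECT fee FROM activities WHERE activity = 'Tennis'"
).fetchone()[0]
activity_count = conn.execute("SELECT COUNT(*) FROM activities").fetchone()[0]
s1_total = conn.execute("""
    SELECT SUM(activities.fee)
    FROM std_act, activities
    WHERE std_act.activity = activities.activity
      AND std_act.std_ssn = 'S1'
""").fetchone()[0]
```

Both the insertion problem and the deletion problem disappear because the fee now depends on the whole key of its own table.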

Third Normal Form: One non-key attribute should not be dependent on another non-key attribute. For instance, in
a student table, we may include maj_code, and maj_description columns. Since maj_code determines
maj_description, we should not include the maj_description column in this table. If we need to display description of
a major, we can always join the student table with the major table and retrieve the description of any major.
Now, this raises a database performance question. If we keep maj_description in the student table as well as in
major table, it can possibly result in inconsistent data; however, since we will not need to join the tables every time
we want to display the description of the major, it may improve the system performance.
When we discard the requirements of a normal form in a table, we refer to this action as de-normalization.
The Boyce-Codd Normal Form: A non-key attribute should not determine part of the key. We take care of this
problem by decomposing the table as in the previous normal form.
Fourth Normal Form: A table should not have multiple multi-valued attributes. This is primarily a semantics issue.
Suppose, we have a table that looks like this:
std_ssn    std_fname    std_lname    maj_code      activity
S1         SFN1         SFN1         Accounting    Tennis
S1         SFN1         SFN1         Marketing     Football

In this case, since a record is a group of related fields, someone may conclude that the student S1 plays Tennis as
an accounting major, but plays Football as a marketing major. We can take care of this problem by decomposing this
table so that there is at most one multi-valued column in a table.
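One possible decomposition is sketched below with Python's sqlite3 module as a stand-in DBMS. The table names std_major and std_activity are assumptions; each table holds only one multi-valued fact about the student.

```python
import sqlite3

# In-memory SQLite database; the two table names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE std_major (
        std_ssn  TEXT,
        maj_code TEXT,
        PRIMARY KEY (std_ssn, maj_code)
    );
    CREATE TABLE std_activity (
        std_ssn  TEXT,
        activity TEXT,
        PRIMARY KEY (std_ssn, activity)
    );
    INSERT INTO std_major VALUES ('S1', 'Accounting'), ('S1', 'Marketing');
    INSERT INTO std_activity VALUES ('S1', 'Tennis'), ('S1', 'Football');
""")

majors = [row[0] for row in conn.execute(
    "SELECT maj_code FROM std_major WHERE std_ssn = 'S1' ORDER BY maj_code")]
activities = [row[0] for row in conn.execute(
    "SELECT activity FROM std_activity WHERE std_ssn = 'S1' ORDER BY activity")]
print(majors, activities)
```

No row pairs a major with an activity any more, so the misleading reading of the combined table cannot arise.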
Methods For Handling Multi-Valued Attributes
1. Create multiple new attributes to replace the original multi-valued attribute.
For example, if the students are allowed to have multiple majors, then we can have a table with the following
structure:
std_ssn, std_fname, std_lname, maj_code1, maj_code2.......
This is appropriate when the multi-valued attribute can have a small number of possible values.
2. Create a new entity based on the original multi-valued attribute's components.
This approach would be used when, for example, a student is allowed an unlimited number of majors. In this case, it
would be more appropriate to create a
Creating Database Tables
We create and populate database tables in 2 steps. First, we create the structure of the database table; second, we
insert records into the table.
We provide the following information to the database management system concerning the structure of the table
being created:
1. Names of columns in the table
2. Column(s) that will serve as primary key
3. Column(s) that will serve as foreign key(s)
4. Data types (e.g., character, numeric, date) of columns
5. Size of each column (number of spaces)
6. Constraints on the column such as default value, valid values, etc.
7. Storage parameters of the table
8. Some additional information

Importantly, we can't incorporate an organization's policies or rules within the structures of the tables. For example,
we can't specify within the structure of a table that a graduate international student must take at least 9 hours
during long semesters unless it is the last semester of the student. Such rules (normally called business rules) are
incorporated through program modules that interact with the database tables and enforce such rules.
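A program module enforcing the 9-hour rule could be as small as the sketch below. The function name and its parameters are assumptions for illustration; in practice the values would come from the database tables before an enrollment is accepted.

```python
# Hypothetical business-rule check applied by a program module.
def meets_minimum_hours(hours, is_graduate, is_international,
                        is_long_semester, is_last_semester):
    """Return True if the enrollment satisfies the 9-hour rule from the notes."""
    rule_applies = (is_graduate and is_international
                    and is_long_semester and not is_last_semester)
    if not rule_applies:
        return True          # the rule does not restrict this student
    return hours >= 9

ok_full_load = meets_minimum_hours(9, True, True, True, False)
ok_final_term = meets_minimum_hours(3, True, True, True, True)
too_few_hours = meets_minimum_hours(6, True, True, True, False)
print(ok_full_load, ok_final_term, too_few_hours)
```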
In addition to tables, a database includes other objects such as a view (a logical table based on one or more tables
and/or views), sequences, indexes, etc.
In Oracle, object (e.g., table, sequence, procedure) names can be 1 - 30 alphanumeric characters long (this limit
applies to releases before 12.2; later releases allow longer names).
Integrity Constraints
Help maintain database integrity.
- Primary Key
- Foreign Key
Value Constraints
Help maintain the restrictions on the specific values for a column.
- Valid Value
- Default Value
- Not Null
Constraint Naming Convention
We will use the following convention in specifying the name of a constraint.
TableName_ColumnName_ConstraintAbbreviation
Constraint Abbreviations
PK: Primary Key
FK: Foreign Key
CC: Check Constraint
NN: Not Null
UK: Unique
CPK: Composite Primary Key
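The naming convention can be seen in a table definition. The sketch below uses Python's sqlite3 module as a stand-in DBMS (Oracle syntax differs in details), lowercases the constraint names, and invents the std_class column and its valid values for illustration.

```python
import sqlite3

# In-memory SQLite database; std_class and its values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE major (
        maj_code TEXT CONSTRAINT major_maj_code_pk PRIMARY KEY
    );
    CREATE TABLE student (
        std_ssn   TEXT CONSTRAINT student_std_ssn_pk PRIMARY KEY,
        std_lname TEXT CONSTRAINT student_std_lname_nn NOT NULL,
        std_class TEXT DEFAULT 'FR'
                  CONSTRAINT student_std_class_cc
                  CHECK (std_class IN ('FR', 'SO', 'JR', 'SR')),
        maj_code  TEXT CONSTRAINT student_maj_code_fk
                  REFERENCES major (maj_code)
    );
    INSERT INTO major VALUES ('ACC');
    INSERT INTO student (std_ssn, std_lname, maj_code)
        VALUES ('S1', 'Lee', 'ACC');
""")

def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

default_class = conn.execute(
    "SELECT std_class FROM student WHERE std_ssn = 'S1'").fetchone()[0]
bad_value = rejected("INSERT INTO student VALUES ('S2', 'Ray', 'XX', 'ACC')")
missing_name = rejected("INSERT INTO student VALUES ('S3', NULL, 'FR', 'ACC')")
bad_major = rejected("INSERT INTO student VALUES ('S4', 'Kim', 'FR', 'ZZZ')")
```

Named constraints make error messages and later maintenance easier to follow, since the name tells which table, column, and rule are involved.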
Primary Data Types
Character: Stores character data
Fixed Length: Char(n) (default 1, maximum 2000)
Variable Length: Varchar(n), Varchar2(n) (size must be specified, maximum 4000)
Number: Stores numeric data
Integer: Number(n)
Fixed Point: Number(p,s)
Floating Point: Number
Decimal
Others
Date: Stores date values
Date values components: Century, Year, Month, Day, Hour, Minutes, and Seconds
Timestamp: Date values that store fractional seconds, with precision up to billionths of a second.
Long: Stores character data up to 2 gigabytes.
Raw and Long Raw: Store binary data such as digitized sound and images
Accessing Database
A DBA grants specific rights to users, such as right to create a table, edit a table, or delete records from a table. A
DBA can also revoke these rights from the users. Usually, a DBA creates roles, grants specific rights on specific
database objects to these roles, and then simply assigns a role to a user. This way if a group of users need some
additional rights, the DBA can grant it to them by just granting that right to the role assigned to that group of users.
Likewise, if certain rights need to be revoked, the DBA just has to revoke those rights from the roles. This approach
makes it much easier to manage users' rights to specific objects.
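The benefit of roles can be shown with a tiny in-memory model. This is only a conceptual sketch in plain Python, not how a DBMS stores privileges; the role, user, and right names are assumptions.

```python
# Hypothetical model: rights are granted to roles, and users are assigned roles.
role_rights = {"registrar": {("student", "SELECT"), ("student", "INSERT")}}
user_roles = {"ann": "registrar", "bob": "registrar"}

def can(user, table, action):
    """Return True if the user's role holds the right on the table."""
    return (table, action) in role_rights[user_roles[user]]

before = can("bob", "student", "DELETE")

# One grant to the role reaches every user assigned to it.
role_rights["registrar"].add(("student", "DELETE"))
after = [can(user, "student", "DELETE") for user in ("ann", "bob")]

# One revoke from the role removes the right for all of them.
role_rights["registrar"].discard(("student", "INSERT"))
insert_allowed = can("ann", "student", "INSERT")
print(before, after, insert_allowed)
```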

E-R Diagram
A graphical tool to represent relations among entities.
E-R Diagram Symbols
Entity: Square or Rectangle
Attribute: Oval
Relationship: Diamond
Relationship Entity: Diamond within a square.
Example of an E-R Diagram

[Diagram: Faculty (rectangle) 1 --- Teaches (diamond) --- m Class (rectangle)]

This E-R diagram shows that there is a 1:m relation between faculty and class entities. We read E-R diagrams left
to right, top to bottom. An E-R Diagram provides us a blueprint of a database. From it, we can tell what entities are
related and how we will represent relations between them in the database. For example, from the diagram above, I
know I will need to include PK of the faculty as a FK in the class table.