
Normalization

Normalization of DB Tables
▪ Normalization
► Process for evaluating and correcting table structures
• determines the optimal assignment of attributes to entities
► Normalization provides a micro view of entities
• focuses on the characteristics of specific entities
• may yield additional entities
► Works through a series of stages called normal forms
• 1NF → 2NF → 3NF → 4NF (optional)
► The higher the normal form, the slower the database response tends to be
• more joins are required to answer end-user queries

▪ Why normalize?
► Reduce uncontrolled data redundancies
• Help eliminate data anomalies
► Produce controlled redundancies to link tables

Example: Need for Normalization
▪ PROJ_NUM is intended to be the primary key but contains nulls
▪ Table entries invite data inconsistencies
► e.g. “Elect. Engineer”, “Elect.Eng.”, “EE”
▪ Table displays data redundancies that can cause data anomalies
► Update anomalies
• Modifying JOB_CLASS could require many alterations (all the rows for the same EMP_NUM)
► Insertion anomalies
• A new employee must be assigned to a project before being entered
► Deletion anomalies
• If an employee quits and the row is deleted, other vital data may be lost

Database Systems: Design, Implementation, & Management: Rob & Coronel

Normalization: First Normal Form
▪ First Normal Form (1NF)
► All the primary key attributes are defined
► There are no repeating groups
► All attributes are dependent on the primary key

▪ Conversion to 1NF
► Objective
• Develop a proper primary key
► Steps
1. Eliminate repeating groups
▪ fill in the null cells with appropriate data values
2. Identify the primary key
▪ identify the attribute(s) that uniquely identify each row
3. Identify all dependencies
▪ make sure all attributes are dependent on the primary key
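
As a rough illustration of steps 1 and 2, the Python sketch below fills null cells by copying the value from the row above, then checks whether a candidate key is unique and free of nulls. The sample rows, their values, and the helper names fill_repeating_groups and is_unique_key are assumptions made for this sketch; they do not come from the original table.

```python
# Illustrative rows only; the values are made up for this sketch.
# None marks a cell left blank because the project repeats from the row above.
raw_rows = [
    {"PROJ_NUM": 1, "PROJ_NAME": "Alpha", "EMP_NUM": 101, "HOURS": 10.0},
    {"PROJ_NUM": None, "PROJ_NAME": None, "EMP_NUM": 102, "HOURS": 12.5},
    {"PROJ_NUM": 2, "PROJ_NAME": "Beta", "EMP_NUM": 101, "HOURS": 8.0},
    {"PROJ_NUM": None, "PROJ_NAME": None, "EMP_NUM": 103, "HOURS": 10.0},
]


def fill_repeating_groups(rows, group_columns):
    """Step 1: copy the last seen value into each null cell of group_columns."""
    last_seen = {}
    filled = []
    for row in rows:
        new_row = dict(row)
        for col in group_columns:
            if new_row[col] is None:
                new_row[col] = last_seen[col]  # KeyError if the first row is blank
            else:
                last_seen[col] = new_row[col]
        filled.append(new_row)
    return filled


def is_unique_key(rows, key_columns):
    """Step 2: a key must contain no nulls and identify each row uniquely."""
    keys = [tuple(row[c] for c in key_columns) for row in rows]
    has_null = any(value is None for key in keys for value in key)
    return not has_null and len(set(keys)) == len(keys)


rows_1nf = fill_repeating_groups(raw_rows, ["PROJ_NUM", "PROJ_NAME"])
composite_key_ok = is_unique_key(rows_1nf, ["PROJ_NUM", "EMP_NUM"])
proj_num_alone_ok = is_unique_key(rows_1nf, ["PROJ_NUM"])
```

In this sample, PROJ_NUM alone repeats across rows, so only the composite of PROJ_NUM and EMP_NUM qualifies as a key.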

Normalization: 1NF example
1. Eliminate repeating groups - Fill in the null cells to make each row define a single entity
2. Identify the primary key - Make sure all attributes are dependent on the primary key



Normalization: 1NF example
3. Identify all dependencies (in a Dependency Table)
► Desirable dependencies (arrows above)
• based on the primary key (functional dependency)
► Less desirable dependencies (arrows below)
• Partial dependency
▪ based on only part of a composite primary key
• Transitive dependency
▪ one nonprime attribute depends on another nonprime attribute
• Both are subject to data redundancies and anomalies
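
The dependency types above can be probed against sample data with a small Python sketch. Note the limits: a sample can only refute a functional dependency, never prove it, so a result of "holds" means "not contradicted by these rows". The rows and the helper names holds and partial_determinants are assumptions made for illustration.

```python
from itertools import combinations

COLUMNS = ("PROJ_NUM", "EMP_NUM", "PROJ_NAME", "EMP_NAME",
           "JOB_CLASS", "CHG_HOUR", "HOURS")

# Made-up 1NF rows; the primary key is (PROJ_NUM, EMP_NUM).
rows = [dict(zip(COLUMNS, values)) for values in [
    (1, 101, "Alpha", "Ann", "Programmer", 40.0, 10.0),
    (1, 102, "Alpha", "Bob", "Analyst", 55.0, 12.5),
    (2, 101, "Beta", "Ann", "Programmer", 40.0, 8.0),
    (2, 103, "Beta", "Cy", "Programmer", 40.0, 10.0),
]]


def holds(rows, determinant, dependent):
    """True if determinant -> dependent is not contradicted by any pair of rows."""
    seen = {}
    for row in rows:
        lhs = tuple(row[c] for c in determinant)
        rhs = row[dependent]
        # setdefault stores rhs the first time lhs appears, else returns the old value
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


def partial_determinants(rows, primary_key, attribute):
    """Proper subsets of the composite key that determine attribute in this sample."""
    found = []
    for size in range(1, len(primary_key)):
        for subset in combinations(primary_key, size):
            if holds(rows, subset, attribute):
                found.append(subset)
    return found


KEY = ("PROJ_NUM", "EMP_NUM")
partial_for_proj_name = partial_determinants(rows, KEY, "PROJ_NAME")
partial_for_hours = partial_determinants(rows, KEY, "HOURS")
transitive_candidate = holds(rows, ("JOB_CLASS",), "CHG_HOUR")
```

Here PROJ_NAME depends on PROJ_NUM alone (a partial dependency), HOURS needs the whole key, and CHG_HOUR is determined by the nonprime attribute JOB_CLASS (a transitive dependency).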



Normalization: Second Normal Form
▪ Second Normal Form (2NF)
► It is in 1NF
► There are no partial dependencies

▪ Conversion to 2NF
► Objective
• Eliminate partial dependencies
► Steps
1. Start with 1NF format
2. Write each key component (w/ partial dependency) on a separate line
3. Write the original (composite) key on the last line
4. Each component becomes a new table
5. Write the dependent attributes after each key

1NF (PROJ_NUM, EMP_NUM, PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS)
↓
PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
ASSIGN (PROJ_NUM, EMP_NUM, HOURS)
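
The split into PROJECT, EMPLOYEE, and ASSIGN amounts to three relational projections of the 1NF table. The Python sketch below shows the idea; the row values and the helper name project are assumptions made for this sketch.

```python
COLUMNS = ("PROJ_NUM", "EMP_NUM", "PROJ_NAME", "EMP_NAME",
           "JOB_CLASS", "CHG_HOUR", "HOURS")

# Made-up 1NF rows with the composite key (PROJ_NUM, EMP_NUM).
rows_1nf = [dict(zip(COLUMNS, values)) for values in [
    (1, 101, "Alpha", "Ann", "Programmer", 40.0, 10.0),
    (1, 102, "Alpha", "Bob", "Analyst", 55.0, 12.5),
    (2, 101, "Beta", "Ann", "Programmer", 40.0, 8.0),
    (2, 103, "Beta", "Cy", "Programmer", 40.0, 10.0),
]]


def project(rows, columns):
    """Relational projection: keep columns, drop duplicate rows, preserve order."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


# One table per key component with a partial dependency, plus the original key.
project_table = project(rows_1nf, ["PROJ_NUM", "PROJ_NAME"])
employee_table = project(rows_1nf, ["EMP_NUM", "EMP_NAME", "JOB_CLASS", "CHG_HOUR"])
assign_table = project(rows_1nf, ["PROJ_NUM", "EMP_NUM", "HOURS"])
```

Each project name and each employee's details are now stored once, while ASSIGN keeps one row per project and employee pair.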
Normalization: 2NF example

Normalization: Third Normal Form
▪ Third Normal Form (3NF)
► It is in 2NF
► There are no transitive dependencies

▪ Conversion to 3NF
► Objective
• Eliminate transitive dependencies (TP)
► Steps
1. Start with 2NF format
2. Break off the TP pieces and create separate tables

EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
↓
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)
JOB (JOB_CLASS, CHG_HOUR)
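
One possible way to express the 3NF result as tables is sketched below with Python's built-in sqlite3 module. The column types, the foreign-key clause, and the sample rows are assumptions; the slides give only the table and attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- CHG_HOUR now depends only on the key of JOB.
    CREATE TABLE JOB (
        JOB_CLASS TEXT PRIMARY KEY,
        CHG_HOUR  REAL NOT NULL
    );
    CREATE TABLE EMPLOYEE (
        EMP_NUM   INTEGER PRIMARY KEY,
        EMP_NAME  TEXT NOT NULL,
        JOB_CLASS TEXT NOT NULL REFERENCES JOB (JOB_CLASS)
    );
    -- Made-up sample data.
    INSERT INTO JOB VALUES ('Programmer', 40.0), ('Analyst', 55.0);
    INSERT INTO EMPLOYEE VALUES
        (101, 'Ann', 'Programmer'),
        (102, 'Bob', 'Analyst'),
        (103, 'Cy',  'Programmer');
""")

# A join rebuilds the 2NF EMPLOYEE rows, so no information was lost in the split.
employee_2nf_rows = conn.execute("""
    SELECT E.EMP_NUM, E.EMP_NAME, E.JOB_CLASS, J.CHG_HOUR
    FROM EMPLOYEE AS E
    JOIN JOB AS J ON J.JOB_CLASS = E.JOB_CLASS
    ORDER BY E.EMP_NUM
""").fetchall()
```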

Normalization: 3NF example

Normalization: Fourth Normal Form
▪ Fourth Normal Form (4NF)
► It is in 3NF
► There are no multiple sets of independent multi-valued dependencies
► Infrequently needed
• e.g. COURSE has multiple texts and multiple instructors
(texts for a course are not decided by the instructor)

▪ Conversion to 4NF
1. Identify multiple multi-valued attributes
2. Create separate tables, each containing one of the multi-valued attributes

COURSE  CRS_TEXT            CRS_INSTRUCTOR
S511    DB design           Jones
S511    DB design           Smith
S511    Inside Access 2007  Jones
S511    Inside Access 2007  Smith

COURSE  CRS_TEXT
S511    DB design
S511    Inside Access 2007

COURSE  CRS_INSTRUCTOR
S511    Jones
S511    Smith
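
The decomposition can be checked with a short Python sketch using the S511 rows from the example. Because texts and instructors are independent, joining the two smaller tables on COURSE should give back exactly the original rows. The variable names are chosen for this sketch.

```python
# The combined table: every text paired with every instructor for the course.
course_combined = {
    ("S511", "DB design", "Jones"),
    ("S511", "DB design", "Smith"),
    ("S511", "Inside Access 2007", "Jones"),
    ("S511", "Inside Access 2007", "Smith"),
}

# Step 2: one table per multi-valued attribute.
crs_text = {(course, text) for course, text, _ in course_combined}
crs_instructor = {(course, instructor) for course, _, instructor in course_combined}

# Joining on COURSE reproduces the combined table (a lossless decomposition).
rejoined = {
    (course, text, instructor)
    for (course, text) in crs_text
    for (other_course, instructor) in crs_instructor
    if course == other_course
}
```

The four combined rows shrink to two rows in each table, and adding a third text would add one row rather than one row per instructor.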

Additional Table Enhancement
▪ Adhere to naming conventions
▪ Use a transaction code instead of a composite primary key when appropriate
► e.g. ASG_NUM in ASSIGN
▪ Use simple attributes
► e.g. EMP_LNAME, EMP_FNAME, EMP_INIT in EMPLOYEE
▪ Add attributes to facilitate information extraction
► e.g. EMP_NUM in PROJECT to indicate project manager
► e.g. ASG_CHG_HR in ASSIGN for historical accuracy of data
▪ Allow controlled data redundancies
► e.g. ASG_CHG_AMOUNT in ASSIGN (derived attribute)

PROJECT (PROJ_NUM, PROJ_NAME)


JOB (JOB_CLASS, CHG_HOUR)
ASSIGN (PROJ_NUM, EMP_NUM, HOURS)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)

PROJECT (PROJ_NUM, PROJ_NAME, EMP_NUM)
JOB (JOB_CODE, JOB_DESCRIPTION, JOB_CHG_HR)
ASSIGN (ASG_NUM, ASG_DATE, PROJ_NUM, EMP_NUM, ASG_HRS, ASG_CHG_HR, ASG_CHG_AMOUNT)
EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INIT, EMP_HIREDATE, JOB_CODE)
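
A minimal Python sketch of the enhanced ASSIGN row follows. It assumes that ASG_CHG_HR is copied from JOB_CHG_HR when the assignment is recorded and that ASG_CHG_AMOUNT is derived as ASG_HRS × ASG_CHG_HR; the slides name the attributes but do not state the formula. The class name Assignment and the helper make_assignment are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Assignment:
    asg_num: int           # transaction code used as a single-attribute key
    asg_date: date
    proj_num: int
    emp_num: int
    asg_hrs: float
    asg_chg_hr: float      # rate copied from JOB at assignment time
    asg_chg_amount: float  # derived attribute (assumed: asg_hrs * asg_chg_hr)


def make_assignment(asg_num, asg_date, proj_num, emp_num, asg_hrs, job_chg_hr):
    """Freeze the current job rate and the derived charge into the row."""
    amount = round(asg_hrs * job_chg_hr, 2)
    return Assignment(asg_num, asg_date, proj_num, emp_num,
                      asg_hrs, job_chg_hr, amount)


job_chg_hr = 40.0  # made-up current rate for the employee's job
first = make_assignment(1, date(2024, 1, 15), 15, 101, 2.5, job_chg_hr)

job_chg_hr = 44.0  # the rate changes later
second = make_assignment(2, date(2024, 6, 1), 15, 101, 2.5, job_chg_hr)
```

The first assignment keeps the rate that applied on its date, which is the historical accuracy the ASG_CHG_HR attribute is meant to provide.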

Denormalization
▪ Normalization is only one of many database design goals.

▪ However, normalized tables result in:
► additional processing
► loss of system speed

▪ Normalization purity is difficult to sustain because of conflicts among:
► design efficiency
► information requirements
► processing speed

▪ Denormalize by
► use of a lower normal form
► use of controlled data redundancies
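
The trade-off can be sketched with Python's sqlite3 module by keeping a 3NF design and a denormalized copy side by side. The table name EMPLOYEE_2NF, the column types, and the sample data are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 3NF design: the hourly rate is stored once per job class.
    CREATE TABLE JOB (JOB_CLASS TEXT PRIMARY KEY, CHG_HOUR REAL NOT NULL);
    CREATE TABLE EMPLOYEE (
        EMP_NUM   INTEGER PRIMARY KEY,
        EMP_NAME  TEXT NOT NULL,
        JOB_CLASS TEXT NOT NULL REFERENCES JOB (JOB_CLASS)
    );
    INSERT INTO JOB VALUES ('Programmer', 40.0), ('Analyst', 55.0);
    INSERT INTO EMPLOYEE VALUES
        (101, 'Ann', 'Programmer'),
        (102, 'Bob', 'Analyst'),
        (103, 'Cy',  'Programmer');

    -- Denormalized copy: CHG_HOUR is repeated on every employee row.
    CREATE TABLE EMPLOYEE_2NF AS
        SELECT E.EMP_NUM, E.EMP_NAME, E.JOB_CLASS, J.CHG_HOUR
        FROM EMPLOYEE AS E
        JOIN JOB AS J ON J.JOB_CLASS = E.JOB_CLASS;
""")

# Reading rates: the 3NF design needs a join, the denormalized table does not.
with_join = conn.execute("""
    SELECT E.EMP_NUM, J.CHG_HOUR
    FROM EMPLOYEE AS E
    JOIN JOB AS J ON J.JOB_CLASS = E.JOB_CLASS
    ORDER BY E.EMP_NUM
""").fetchall()
without_join = conn.execute(
    "SELECT EMP_NUM, CHG_HOUR FROM EMPLOYEE_2NF ORDER BY EMP_NUM"
).fetchall()

# Changing a rate: one row in 3NF, one row per affected employee otherwise.
rows_touched_3nf = conn.execute(
    "UPDATE JOB SET CHG_HOUR = 42.0 WHERE JOB_CLASS = 'Programmer'"
).rowcount
rows_touched_2nf = conn.execute(
    "UPDATE EMPLOYEE_2NF SET CHG_HOUR = 42.0 WHERE JOB_CLASS = 'Programmer'"
).rowcount
```

Both queries return the same answer, so the choice is between a cheaper read (no join) and a cheaper, safer update (a single row).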

ACID
▪ ACID stands for: Atomicity, Consistency, Isolation,
Durability
▪ ACID is the standard set of properties used in computer
science to judge the reliability of a transaction. In the
context of databases, it applies to data transactions.

Atomicity
▪ “all or nothing”
► If a transaction fails partway through, none of it takes effect.
► If a transaction is aborted, it is as if the transaction never
happened.
► If a transaction is committed, the entire transaction is
completed.

▪ Example
► Transferring money from one bank to another.
► Buying the same book in an online bookstore
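
The money-transfer example can be sketched with Python's sqlite3 module, where using the connection as a context manager commits on success and rolls back if an exception escapes. The ACCOUNT table, the balances, and the fail_midway switch that simulates a failure are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ACCOUNT (ACC_NUM INTEGER PRIMARY KEY, BALANCE REAL NOT NULL)")
conn.executemany("INSERT INTO ACCOUNT VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()


def transfer(conn, source, target, amount, fail_midway=False):
    """Debit source and credit target as one all-or-nothing transaction."""
    with conn:  # commits if the block finishes, rolls back if it raises
        conn.execute(
            "UPDATE ACCOUNT SET BALANCE = BALANCE - ? WHERE ACC_NUM = ?",
            (amount, source),
        )
        if fail_midway:
            raise RuntimeError("simulated failure between debit and credit")
        conn.execute(
            "UPDATE ACCOUNT SET BALANCE = BALANCE + ? WHERE ACC_NUM = ?",
            (amount, target),
        )


def balances(conn):
    return dict(conn.execute("SELECT ACC_NUM, BALANCE FROM ACCOUNT"))


transfer(conn, 1, 2, 30.0)  # completes: both updates are committed
after_success = balances(conn)

try:
    transfer(conn, 1, 2, 30.0, fail_midway=True)  # the debit is rolled back
except RuntimeError:
    pass
after_failure = balances(conn)
```

After the simulated failure the balances match those after the successful transfer, so the half-finished debit left no trace.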

Consistency
▪ Any transaction must be valid according to all predefined
rules (e.g., constraints, triggers).
▪ Any transaction that violates the defined rules will not be
committed.

▪ Example
► Applying for a loan.
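
For the loan example, predefined rules can be declared as constraints so that the database itself rejects invalid transactions. The sketch below uses Python's sqlite3 module; the CUSTOMER and LOAN tables, the 50000 limit, and the function name apply_for_loan are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE CUSTOMER (
        CUS_NUM  INTEGER PRIMARY KEY,
        CUS_NAME TEXT NOT NULL
    );
    CREATE TABLE LOAN (
        LOAN_NUM    INTEGER PRIMARY KEY,
        CUS_NUM     INTEGER NOT NULL REFERENCES CUSTOMER (CUS_NUM),
        -- Made-up business rule: positive amount, at most 50000.
        LOAN_AMOUNT REAL NOT NULL CHECK (LOAN_AMOUNT > 0 AND LOAN_AMOUNT <= 50000)
    );
    INSERT INTO CUSTOMER VALUES (1, 'Ann');
""")


def apply_for_loan(conn, loan_num, cus_num, amount):
    """Return True if the loan satisfies every rule and was committed."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO LOAN VALUES (?, ?, ?)", (loan_num, cus_num, amount)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # a constraint was violated, so nothing was committed


accepted = apply_for_loan(conn, 1, 1, 20000.0)
too_large = apply_for_loan(conn, 2, 1, 90000.0)         # violates the CHECK rule
unknown_customer = apply_for_loan(conn, 3, 99, 1000.0)  # violates the foreign key
loan_count = conn.execute("SELECT COUNT(*) FROM LOAN").fetchone()[0]
```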

Isolation
▪ Determines how and when the changes made by one transaction
become visible to other users and systems.
▪ Can many users access the same data at the same
time?
▪ Will one transaction block another transaction?

▪ Example
► Watching a video: can two users access the video at the
same time?
► Withdrawing money: can you and a family member
withdraw money from the same bank account at the same time?
Durability
▪ It guarantees that transactions that have committed
will survive permanently, even through power loss
and other emergencies.
▪ Transaction logs are used to enforce durability.

▪ Example
► Booking a flight ticket online: even if the system crashes, a
booking that was committed remains booked.
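
The booking example can be approximated in Python with an on-disk SQLite database: one booking is committed, a second is written but never committed, and the connection is then closed. Closing a connection is only a stand-in for a crash, not a real power loss, and the BOOKING table and flight codes are made up for this sketch.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    db_path = os.path.join(tmp_dir, "bookings.db")

    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE BOOKING (BOOK_NUM INTEGER PRIMARY KEY, FLIGHT TEXT NOT NULL)")
    conn.execute("INSERT INTO BOOKING VALUES (1, 'AB123')")
    conn.commit()  # from here on, booking 1 must survive

    conn.execute("INSERT INTO BOOKING VALUES (2, 'CD456')")  # never committed
    conn.close()   # stand-in for the system going down before the second commit

    # A fresh connection sees only what was committed.
    reopened = sqlite3.connect(db_path)
    surviving = reopened.execute(
        "SELECT BOOK_NUM, FLIGHT FROM BOOKING ORDER BY BOOK_NUM"
    ).fetchall()
    reopened.close()
```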
