
MIS 3500 Database Management Systems * Asper School of Business

Instructor: Bob Travica

Lab on Multiplicity and Normalization


(Updated 2017)

This lab demonstrates the concepts of multiplicity, normalization, referential
integrity, and data anomalies associated with the theory of relational databases. You
will use MS Access for exploring the concepts.

You will create some tables, modify them, and test them. The goal is to study
principles of relational database systems, rather than to focus on procedures of
developing tables. A basic level of knowledge of using MS Access is assumed.

1. Multiplicity

To get started, find the MS Access icon on the desktop and click it. Then, create a
new database.

Understanding normalization starts with understanding multiplicity. Discussed below
are the three types of multiplicity we studied in class, which determine three types of
relationships.

1.1 One-to-Many Multiplicity (1:M)

This is the most frequent relationship in database systems. To establish the 1:M
relationship, you need to import the key from the table on the one side into the
table on the many side. Sometimes these are called left table and right table,
although this labeling corresponds to the direction of exporting the key from a source
to a sink table rather than to the physical locations of the tables. Access may call
these tables primary and secondary. See Figure 1.

When tables have single-attribute keys, the direction in which you draw the relationship
determines which table is the one side (left) table. A table can physically sit on the
right side of the screen, but if you start drawing the relationship from its key to an
attribute in the other table, it will be considered the left (one side) table, while
the other table will be the right (many side) table.

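Figure 1. Creating Relationship between Tables

Table X                     Table Y
(many side, *)              (one side, 1)
--------------              --------------
XID                         YID
Attribute                   Attribute
YID  <--------------------  (key exported from Table Y)

Table X: Sink Table, Secondary Table, Many Side Table, Right Table
Table Y: Source Table, Primary Table, One Side Table, Left Table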

The key in the One Side table (YID) is the foreign key in the Many Side table.

To study the one-to-many relationship, run the set of procedures below for creating
tables Customer and Order. Let us assume that the relationship between Customer
and Order is based on the business rule that each customer can place many orders,
while each order must be associated with only one customer. Therefore, this is a
one-to-many relationship.

Be sure to choose Number as the data type for the keys. If you use the default data
type called AutoNumber, you may run into trouble when you want to modify tables
and enter values for a foreign key.

Run this set of procedures:

1. To use Design View, click Create and then Table Design. Then, create a Table
that you will name Customer.

Specifically, create a CustomerID field, choose Number for the data type, and
make this attribute the Primary Key (right-click the row and choose Primary Key).
Then, create a column called CName, and set Short Text for the data type.
See Figure 2. Save the table under the name Customer.

2. Create a Table you will name Order. Create a field OrderID, set the data type
to Number, and make this column the Primary Key. Also, create columns
CustomerID (Number) and OrderDate (Date/Time). Save the table under the
name Order.

3. Establish a relationship between CustomerID in Customer and CustomerID in
Order. To do this, first you have to close the tables you have just created.
Then, use Database Tools/Relationships, and bring the tables on the screen, if
they are not there yet (e.g., right-clicking in the relationship window will pop
up a menu with the function Show Table).

Draw the relationship by placing the cursor on CustomerID in Customer; press
the left mouse button and, while holding it down, drag the cursor from
CustomerID in Customer to CustomerID in Order; then release the button.

4. In the Edit Relationships dialog that pops up, click the Enforce Referential
Integrity checkbox and mark the options for cascading updates and deletions. Look
at the Relationship Type text box in this dialog, and notice that Access
automatically specifies this relationship as One-to-Many.

Click the Create button. If all went OK, you should see in the Relationship
window a link between your tables and multiplicity 1 and many (the infinity
symbol) displayed.

5. Once both tables are finalized, enter some data in each (see Figure 2). First
open a table by clicking its name in the left-hand section of the screen; then
use the View button to switch between Design View and Datasheet View. Data
entry and output happen in Datasheet View, so you need the latter.

To save time, you can copy and paste from the rows in Figure 2. For dates, you
can use the calendar control and choose dates randomly; for the year, use last
year rather than the one shown in the example.

Figure 2. One-to-Many Relationship

Customer
CustomerID  CName
100         Trudeau, Justin
200         Long, John
300         Pirelli, Micaela
400         Hermosa, Rosa

Order
OrderID  CustomerID  OrderDate
1        100         Jan. 3, 2005
2        100         Jan. 13, 2005
3        200         Feb. 5, 2005
4        200         May 15, 2005

Note: The key column is boldfaced.
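
For readers who want to see the same design outside Access, here is a minimal sketch of the two tables and their 1:M relationship in SQL, run through Python's built-in sqlite3 module. The choice of SQLite, the in-memory database, and the ISO text format for dates are assumptions made for illustration; the lab itself uses the Access design tools.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,          -- key of the one side table
    CName      TEXT
);
CREATE TABLE "Order" (                       -- quoted because ORDER is an SQL keyword
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER
               REFERENCES Customer(CustomerID)
               ON UPDATE CASCADE ON DELETE CASCADE,  -- foreign key on the many side
    OrderDate  TEXT                          -- dates stored as ISO text (assumption)
);
""")

# Data from Figure 2.
conn.executemany("INSERT INTO Customer VALUES (?, ?)", [
    (100, "Trudeau, Justin"),
    (200, "Long, John"),
    (300, "Pirelli, Micaela"),
    (400, "Hermosa, Rosa"),
])
conn.executemany('INSERT INTO "Order" VALUES (?, ?, ?)', [
    (1, 100, "2005-01-03"),
    (2, 100, "2005-01-13"),
    (3, 200, "2005-02-05"),
    (4, 200, "2005-05-15"),
])

customer_count = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
order_count = conn.execute('SELECT COUNT(*) FROM "Order"').fetchone()[0]
print(customer_count, order_count)
```

The REFERENCES clause plays the role of the relationship line you drew in the Relationships window, and the two CASCADE options correspond to the update and deletion checkboxes.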

Analysis:

1) Notice how the same values of CustomerID repeat in Order, while OrderID takes
on different values in each row. This is because each (one) instance of Customer can
be associated with many instances of Order (e.g., CustomerID 100 is associated with
OrderIDs 1 and 2). In contrast, each value of OrderID corresponds to only one value
of CustomerID (OrderID 1 is associated with CustomerID 100 only, OrderID 2 with
CustomerID 100 only, OrderID 3 with CustomerID 200 only, etc.).

2) Also depicted in Figure 2 is that table Order contains only those values of
CustomerID that exist in the table Customer. This means that referential integrity is
supported. In other words, only those values of CustomerID that already exist in the
Customer table can be used in the Order table. Without referential integrity, the
Order table could contain customer identifiers that do not exist in customer records:
a perfect springboard for fraudulent transactions!

Check what happens when you violate referential integrity. For example, try to enter
the number 500 in the CustomerID column of the Order table. What happens?
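
The same experiment can be sketched in SQL through Python's sqlite3 module. This is only an illustration under the assumption that SQLite's foreign key enforcement behaves like the Access checkbox for this case; the helper name try_insert_order is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # turn on referential integrity checks
conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CName TEXT)")
conn.execute('''CREATE TABLE "Order" (
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER REFERENCES Customer(CustomerID),
    OrderDate  TEXT)''')
conn.execute("INSERT INTO Customer VALUES (100, 'Trudeau, Justin')")


def try_insert_order(order_id, customer_id):
    """Return True if the row is accepted, False if the engine rejects it."""
    try:
        conn.execute('INSERT INTO "Order" VALUES (?, ?, ?)',
                     (order_id, customer_id, "2005-01-03"))
        return True
    except sqlite3.IntegrityError:
        # Raised when the foreign key value has no matching primary key.
        return False


accepted_existing = try_insert_order(1, 100)   # customer 100 exists
accepted_missing = try_insert_order(5, 500)    # customer 500 does not exist
print(accepted_existing, accepted_missing)
```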

Lastly, notice that the customers with CustomerIDs 300 and 400 are not associated
with any order yet. This fact implies that the minimum multiplicity on the Order side
is 0 (zero). However, the minimum multiplicity is not specified in a schema but just
in a class diagram (if there is a good reason for doing so, since the maximum
multiplicity suffices for most practical purposes).
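
To make the minimum and maximum multiplicity visible, the sketch below counts orders per customer with a LEFT JOIN, so that customers without orders still appear with a count of zero. It uses Python's sqlite3 module with the Figure 2 data; SQLite is an assumption here, not part of the lab.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CName TEXT);
CREATE TABLE "Order" (
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER REFERENCES Customer(CustomerID)
);
INSERT INTO Customer VALUES
    (100, 'Trudeau, Justin'), (200, 'Long, John'),
    (300, 'Pirelli, Micaela'), (400, 'Hermosa, Rosa');
INSERT INTO "Order" VALUES (1, 100), (2, 100), (3, 200), (4, 200);
""")

# LEFT JOIN keeps every customer; COUNT(o.OrderID) ignores the NULLs
# produced for customers that have no matching order.
orders_per_customer = dict(conn.execute("""
    SELECT c.CustomerID, COUNT(o.OrderID)
    FROM Customer AS c
    LEFT JOIN "Order" AS o ON o.CustomerID = c.CustomerID
    GROUP BY c.CustomerID
"""))
print(orders_per_customer)
```

Customers 300 and 400 come back with a count of 0 (the minimum multiplicity), while customers 100 and 200 each have more than one order (the "many" maximum).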

1.2 One-to-One Multiplicity (1:1)

In contrast to the frequently used one-to-many multiplicity you have just created,
one-to-one multiplicity is used much less. The one-to-one multiplicity is created by
sharing the same key between two tables.

An example of 1:1 multiplicity is between the Customer table and the BillingAddress
table shown in Figure 3. The logic behind separating the second address from the
customer record is that not all customers will have a second address, and so
inserting its column in Customer would waste storage.

Concentrate on Figure 3. You only have to create a new table called BillingAddress
with the columns CustomerID (Number, key), and BillingAddress (Short Text).

First, modify the Customer table by adding the column Address (Short Text). To use
the design modification function, open a table, and then click View/Design View.

Establish a relationship between Customer and BillingAddress tables by clicking
CustomerID in Customer and drawing the line to CustomerID in BillingAddress;
enforce referential integrity and cascading updates and deletions.

Figure 3. One-to-One Relationship

Customer
CustomerID  CName             Address
100         Trudeau, Pierre   57 St. Catherine, Montreal, QC
200         Long, John        33 RR Pkwy, London, ON
300         Pirelli, Micaela  75A Taylor Ave., Winnipeg, MB
400         Hermosa, Rosa     101 Rosebud Way, Rosetown, BC

BillingAddress
CustomerID  BillingAddress
100         1-101 Square 1, Gatineau, QC
300         234 Castle St., Toronto, ON
In drawing the relationship, note again that it does matter which is the left
(source) table and which is the right (sink) table, although both are one side
tables in terms of multiplicity.

If you move the BillingAddress table around, you will see that the system really
created a new multiplicity of one next to the Customer table.
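
As a rough SQL illustration of why sharing the key produces a 1:1 relationship, the sketch below makes CustomerID both the primary key and the foreign key of BillingAddress. It is written with Python's sqlite3 module, which is an assumption for illustration; the helper try_insert and the test addresses are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,
    CName      TEXT,
    Address    TEXT
);
CREATE TABLE BillingAddress (
    -- The shared key: unique here (primary key) and tied to Customer (foreign key).
    CustomerID     INTEGER PRIMARY KEY
                   REFERENCES Customer(CustomerID)
                   ON UPDATE CASCADE ON DELETE CASCADE,
    BillingAddress TEXT
);
INSERT INTO Customer VALUES (100, 'Trudeau, Pierre', '57 St. Catherine, Montreal, QC');
INSERT INTO BillingAddress VALUES (100, '1-101 Square 1, Gatineau, QC');
""")


def try_insert(customer_id, address):
    """Return True if BillingAddress accepts the row, False otherwise."""
    try:
        conn.execute("INSERT INTO BillingAddress VALUES (?, ?)", (customer_id, address))
        return True
    except sqlite3.IntegrityError:
        return False


# A second billing address for customer 100 violates the primary key (at most one).
second_for_same_customer = try_insert(100, "Hypothetical second address")
# A billing address for an unknown customer violates referential integrity.
for_unknown_customer = try_insert(999, "Hypothetical address")
print(second_for_same_customer, for_unknown_customer)
```

The primary key caps the BillingAddress side at one row per customer, and the foreign key ensures that each billing address belongs to exactly one existing customer.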

Note: To work on an established relationship, including deleting it, right-click on the
relationship line, and then manipulate the data entry form that pops up.

1.3 Many-to-Many Multiplicity (M:N)

Many-to-many relationships occur in class diagrams to reflect frequent business
situations (e.g., an item can appear on many orders, and an order can contain many
items; an employee can work on many projects, and each project may have many
employees engaged). To be implemented in a relational database system, M:N
relationships must be transformed into 1:M relationships through the technique of
data normalization. (More on this in the section on normalization below.) Therefore,
a schema shows an M:N relationship as two 1:M relationships.

Let us try to implement a design that is not normalized first. Note that this is just for
the purpose of study, not a design you should ever implement in reality.

To create a many side table, you need to have a FK in it. Based on this logic, the M:N
relationship we are trying to simulate here will have FKs in each of the two
associated tables, as shown in Figure 4.

Figure 4. Simulating M:N Relationship (incorrectly)

To implement this relationship, run the following set of procedures.

Create the tables:

1. Make a copy of the Order table, and name it Order1. A quick method is to
open Order table, click Save As/Save Object As, and type Order1.

Open Order1, and delete all the data from it. Rename OrderID to Order1ID.
This procedure speeds up your work, while making sure that you can
manipulate data as you wish. For example, if you add another field to the key
of the copied table to make a combined (concatenated) key but have not
deleted the old data from Order1, Access will report the error that the key
field cannot be null (see step 2).

2. In the table Order1, add a new column ItemID (Number). You can do this in
Design View. Optionally, in the Datasheet View, right-click the column
CustomerID. In the popup menu, click Insert Column. You want to name the
newly created column ItemID. Hint: In the Datasheet View, right-click the
column name, and select Rename Field.

3. Set Order1ID and ItemID to be a combined key. Hint: In Design View, hold down
the Control key and click the left-most cell (the row selector) of the rows for
Order1ID and ItemID. With both rows selected, right-click and select Primary
Key in the popup menu. Alternatively, click Primary Key in the Design ribbon
(the mouse button should be released for the Primary Key function to become
active).

4. Create the table Item with columns ItemID (Number), Order1ID (Number),
and ItemName (Short Text). Make ItemID and Order1ID a combined key.

5. Set an M:N relationship between the tables Order1 and Item by drawing two
relationships. The first relationship is between the column Order1ID in the
table Order1 and the column Order1ID in the table Item. What happens with
referential integrity? When you try to enforce it, you realize that it is not
possible, since the system reports an error. In addition, the relationship
cannot be designed as 1:M but only as indeterminate. This is what happens when
you try to force an M:N relationship on the system: it will not be accepted.
You can just fudge it (make it up), and the system will give you no guarantee
that data will be correct and consistent.

6. Set the other relationship between the column ItemID in the table Item and
the column ItemID in the table Order1. You experience the same problems as
in step 5. The database engine will ask you if you want to edit the existing
relationship, and you should answer No. This will result in creating a new
relationship between Order1 and a new table the system will create. This is
how the database engine responds to your essentially illegal (wrong) design
step.

7. Enter some data in the tables (perhaps it is best for now to use the example
in Figure 5). To keep it simple, let us assume that ItemID takes values of
two-digit numbers 10, 20, etc.

Figure 5. Many-to-Many Relationship
(Note: Tables are not normalized and we use them just for study purposes, not in a
properly designed database system.)

Order1
Order1ID  ItemID  CustomerID  Date
1         10      100         1/3/2005
2         10      200         1/3/2005
Note: Business rule (the first half): Each item appears on many orders

Item
ItemID  Order1ID  ItemName
10      1         Nut
20      1         Bolt
Note: Business rule (the second half): and each order can contain many items.

Notice above that item 10 appears on orders 1 and 2 (table Order1), while order 1
contains items 10 and 20 (table Item). Therefore, the M:N relationship appears to be
implemented. But the design is all wrong!

Analysis: The design you created represents the business rule that each item can be
associated with many orders, while each order can contain many items. Thus, the
many-to-many relationship between Order1 and Item appears as if it is really
implemented.

However, Access (or any relational database) actually does not support this
relationship, and that is why you are getting the strange indeterminate
relationships, the extra table, and a rejection of your request to enforce referential
integrity. Indeed, this design is not normalized, and therefore it is inappropriate for a
relational database system. For the things that can go wrong with this design, please
see the next section.

2. Normalization

Data in a relational database system must be normalized. The purpose of
normalizing data is to preserve data quality (accuracy, integrity) and to avoid
problems (anomalies) with data insertion, modification and deletion.

2.1 Problems with Non-Normalized Data

Referential Integrity Loss. You have already encountered the issue of normalization
when exploring referential integrity in this lab. Normalized data support referential
integrity, while non-normalized data do not.

Consider again the one-to-many relationship between the tables Customer and Order
in Figure 2. When you tried to enter such a value of CustomerID in the table Order
that did not exist already in the table Customer, the database engine stopped you.
Otherwise, a user could enter orders for non-existing customers. In contrast, the
non-normalized design in Figure 4 allows you to enter FKs that do not match PKs.
This is a violation of referential integrity. As an exercise, try to enter Order1ID 2000
in the table Item. What happens?

The system will take this incorrect input, although there are just orders 1 and 2.
Indeed, you can enter any data in the columns for primary and foreign keys, and the
system will not detect errors.
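
The loss of referential integrity can be reproduced in a small SQL sketch using Python's sqlite3 module. The assumption here is that the two non-normalized tables are simply declared without any enforced relationship, which mirrors what Access leaves you with; item 30 ("Washer") is a hypothetical row added for the test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Non-normalized design: combined keys, but no enforced relationship.
CREATE TABLE Order1 (
    Order1ID   INTEGER NOT NULL,
    ItemID     INTEGER NOT NULL,
    CustomerID INTEGER,
    PRIMARY KEY (Order1ID, ItemID)
);
CREATE TABLE Item (
    ItemID   INTEGER NOT NULL,
    Order1ID INTEGER NOT NULL,
    ItemName TEXT,
    PRIMARY KEY (ItemID, Order1ID)
);
INSERT INTO Order1 VALUES (1, 10, 100), (2, 10, 200);
INSERT INTO Item VALUES (10, 1, 'Nut'), (20, 1, 'Bolt');
""")

# Order 2000 does not exist, yet nothing stops this insert.
conn.execute("INSERT INTO Item VALUES (30, 2000, 'Washer')")  # hypothetical row

# Find order numbers used in Item that have no match in Order1.
orphan_orders = [row[0] for row in conn.execute("""
    SELECT DISTINCT Order1ID FROM Item
    WHERE Order1ID NOT IN (SELECT Order1ID FROM Order1)
""")]
print(orphan_orders)
```

The only way to notice the bad value is to query for it after the fact, which is exactly the guarantee that enforced referential integrity would have provided up front.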

To explore other disadvantages of non-normalized data (and advantages of
normalized data), let us create a table CustomerLong that mixes master data with
transactional data. This table will therefore contain a repeating group: customer
data repeat for each order. (A repeating group is considered to be two or more
attributes; note that a real customer record would be much longer.) As shown in
Figure 5, CustomerLong merges tables Customer and Order from Figure 2.

To quickly create the table CustomerLong do the following.

1. Make a copy of the table Customer and name it CustomerLong. Delete rows for
Micaela and Rosa.

2. Open CustomerLong in Design View and remove the key property from the column
CustomerID. This has to be done because you will have some rows with the same values
in CustomerID, and the system will not tolerate this in a key column. So, you have to
fool the system for the sake of this exercise by having a table with no key column for
a bit.

3. Create columns OrderID (Number) and OrderDate (Date/Time).

4. Enter additional data in the two order columns as shown in Figure 5.

5. Make columns CustomerID and OrderID the primary key.

Figure 5. Non-normalized Table with Repeated Group Customer

CustomerLong
CustomerID CName Address OrderID OrderDate
100 Pierre, Justin 57 St. Catherine, Montreal, QC 1 11/16/2016
200 Long, John 33 RR Pkwy, London, ON 2 9/14/2016
100 Pierre, Justin 57 St. Catherine, Montreal, QC 3 12/1/2016
200 Long, John 33 RR Pkwy, London, ON 4 10/26/2016

Now, see what happens if you try to perform standard database operations of
deleting, inserting, and modifying data.

Deletion Anomaly: If you delete the records for orders #1 and #3 in CustomerLong (say,
customer #100 cancels these orders), you will lose the data on this customer
as well. Therefore, there is a deletion anomaly: an unintended loss of data when the
target data are deleted.
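
A minimal sketch of the deletion anomaly, using Python's sqlite3 module with the CustomerLong rows from Figure 5. SQLite and the ISO date format are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CustomerLong (
    CustomerID INTEGER NOT NULL,
    CName      TEXT,
    Address    TEXT,
    OrderID    INTEGER NOT NULL,
    OrderDate  TEXT,
    PRIMARY KEY (CustomerID, OrderID)
);
INSERT INTO CustomerLong VALUES
    (100, 'Pierre, Justin', '57 St. Catherine, Montreal, QC', 1, '2016-11-16'),
    (200, 'Long, John',     '33 RR Pkwy, London, ON',         2, '2016-09-14'),
    (100, 'Pierre, Justin', '57 St. Catherine, Montreal, QC', 3, '2016-12-01'),
    (200, 'Long, John',     '33 RR Pkwy, London, ON',         4, '2016-10-26');
""")

# Customer 100 cancels orders 1 and 3: the target data are the two orders only.
conn.execute("DELETE FROM CustomerLong WHERE OrderID IN (1, 3)")

# Which customers does the database still know about?
remaining_customers = [row[0] for row in conn.execute(
    "SELECT DISTINCT CustomerID FROM CustomerLong ORDER BY CustomerID")]
print(remaining_customers)
```

Only customer 200 is left: the name and address of customer 100 disappeared together with the cancelled orders.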

Insertion Anomaly: To enter a record for a new customer, Micaela, you would also
need to enter an OrderID, because it is part of the key. A
key column cannot be empty (technically, cannot contain a null value). To bypass
this problem, you can fake an order number (say, there is a reserved set of numbers
for this purpose close to the upper limit of the values range, such as 9000001; see
Figure 6). This solves the input problem but clearly reduces the quality of data
(accuracy, consistency). Therefore, there is an insertion anomaly: the desired data
cannot be entered when needed, or some violation of data accuracy must be
applied to enter the desired data. Business consequences of this design, if
implemented, are extremely serious (falsification of facts).
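
The insertion anomaly can be sketched with Python's sqlite3 module as follows. Note one assumption specific to SQLite: it does not forbid nulls in a combined primary key by itself, so the key columns are declared NOT NULL explicitly to match the Access behavior described above. The helper try_insert is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE CustomerLong (
    CustomerID INTEGER NOT NULL,   -- NOT NULL stated explicitly (SQLite-specific need)
    CName      TEXT,
    Address    TEXT,
    OrderID    INTEGER NOT NULL,   -- part of the combined key, so it cannot be null
    OrderDate  TEXT,
    PRIMARY KEY (CustomerID, OrderID)
)""")


def try_insert(row):
    """Return True if CustomerLong accepts the row, False otherwise."""
    try:
        conn.execute("INSERT INTO CustomerLong VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


micaela = (300, "Pirelli, Micaela", "75A Taylor Ave., Winnipeg, MB")

# A new customer with no order yet: OrderID is null, so the row is rejected.
without_order = try_insert(micaela + (None, None))
# The workaround from the text: a fake order number gets the row in.
with_fake_order = try_insert(micaela + (9000001, None))
print(without_order, with_fake_order)
```

The second insert succeeds, but only because the table now records an order that never existed.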

Figure 6. Demonstration of Data Anomalies with Not-Normalized Table


CustomerLong
CustomerID CName Address OrderID OrderDate
100 Pierre, Justin 57 St. Catherine, Montreal, QC 1 11/16/2016
200 Long, John 33 RR Pkwy, London, ON 2 9/14/2016
100 Pierre, Justin 57 St. Catherine, Montreal, QC 3 12/1/2016
200 Long, John 33 RR Pkwy, London, ON 4 10/26/2016
300 Pirelli, Micaela 75A Taylor Ave., Winnipeg, MB 9000001
400 Hermosa, Rosa 101 Rosebud Way, Rosetown, BC 1 1/23/2017

Another instance of the insertion anomaly is in the row with customer 400. OrderID
1, which already belongs to customer 100, is repeated for this customer; the combined
key values are still distinct due to the variation in CustomerID. But the big problem
is that the system fails to detect the error since there is no referential integrity.
The consequence is that data accuracy is compromised.

Modification Anomaly: If you want to change the address of customer 100,
you would need to apply the change to every row in which this customer appears.
Suppose that this customer appears in hundreds of rows. Doing this manually would
almost certainly result in errors. The task can be automated, but the inefficiency of
such an update procedure and the wasted storage are obvious problems.
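
To compare the cost of the same address change in both designs, the sketch below runs one UPDATE against CustomerLong and one against a normalized Customer table and reports how many rows each touched. It uses Python's sqlite3 module; the new address is a hypothetical value chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Non-normalized: the address repeats on every order row.
CREATE TABLE CustomerLong (
    CustomerID INTEGER NOT NULL, CName TEXT, Address TEXT,
    OrderID INTEGER NOT NULL, OrderDate TEXT,
    PRIMARY KEY (CustomerID, OrderID)
);
INSERT INTO CustomerLong VALUES
    (100, 'Pierre, Justin', '57 St. Catherine, Montreal, QC', 1, '2016-11-16'),
    (200, 'Long, John',     '33 RR Pkwy, London, ON',         2, '2016-09-14'),
    (100, 'Pierre, Justin', '57 St. Catherine, Montreal, QC', 3, '2016-12-01'),
    (200, 'Long, John',     '33 RR Pkwy, London, ON',         4, '2016-10-26');

-- Normalized: the address is stored once per customer.
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CName TEXT, Address TEXT);
INSERT INTO Customer VALUES
    (100, 'Pierre, Justin', '57 St. Catherine, Montreal, QC'),
    (200, 'Long, John',     '33 RR Pkwy, London, ON');
""")

new_address = "1 Example St., Ottawa, ON"  # hypothetical new address

# cursor.rowcount reports how many rows the UPDATE modified.
rows_touched_long = conn.execute(
    "UPDATE CustomerLong SET Address = ? WHERE CustomerID = 100",
    (new_address,)).rowcount
rows_touched_normalized = conn.execute(
    "UPDATE Customer SET Address = ? WHERE CustomerID = 100",
    (new_address,)).rowcount
print(rows_touched_long, rows_touched_normalized)
```

With only two orders the difference is two rows against one; with hundreds of orders per customer, the non-normalized update grows accordingly while the normalized one stays at a single row.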

2.2 Normalizing Many-to-Many Relationships

A relational database system is most comfortable with the 1:M relationship. In this
form, data integrity can be preserved and querying properly performed. This means
that a M:N relationship must be transformed into two 1:M relationships.

The method of normalizing this M:N relationship is by inserting a new table that will
bridge (interface) the tables Order and Item. You can think of this new table as the
one that is "absorbing" or "reducing" the multiplicity between tables Order and Item.
Both Order and Item will have a separate 1:M relationship with this bridge table
(OrderItem1 in Figure 7).

To see how this works, run the following set of procedures. Your result should
resemble what is depicted in Figure 7.

Figure 7. Normalized M:N Relationship

Order              OrderItem1         Item1
----------         ----------         ----------
OrderID     1---*  OrderID
CustomerID         Item1ID     *---1  Item1ID
Date               Quantity           Item1Name

To complete this exercise do the following.

1. Use the old Order table.

2. Create a new table Item1 with fields Item1ID (Number, PK) and Item1Name
(Short Text). You can make a copy from the old Item table and then adjust it.

3. Create a new table OrderItem1 with columns OrderID (Number, PK), Item1ID
(Number, PK), and Quantity (Number). OrderID and Item1ID together form the
combined key.

4. Close the tables, and establish a 1:M relationship between Order and OrderItem1,
while enforcing referential integrity and the cascading update and deletion.

5. Establish a relationship between Item1 and OrderItem1, while enforcing
referential integrity and the cascading update and deletion.

6. Enter some data into Item1.

Test your design for normalization. Open all the three tables (Order, Item1 and
OrderItem1).

Enter data in OrderItem1. What determines the range of acceptable values for the
key columns? Why? (Hint: it's about the PK determining values of the FK, which is a
consequence of referential integrity.)
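
The hint can be checked with a small SQL sketch of the bridge table, written with Python's sqlite3 module. SQLite, the sample item names taken from the earlier Nut and Bolt example, and the quantities are assumptions for illustration; try_add_line is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE "Order" (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, OrderDate TEXT);
CREATE TABLE Item1 (Item1ID INTEGER PRIMARY KEY, Item1Name TEXT);
CREATE TABLE OrderItem1 (                       -- bridge table
    OrderID  INTEGER NOT NULL
             REFERENCES "Order"(OrderID) ON UPDATE CASCADE ON DELETE CASCADE,
    Item1ID  INTEGER NOT NULL
             REFERENCES Item1(Item1ID) ON UPDATE CASCADE ON DELETE CASCADE,
    Quantity INTEGER,
    PRIMARY KEY (OrderID, Item1ID)              -- combined key
);
INSERT INTO "Order" VALUES (1, 100, '2005-01-03'), (2, 100, '2005-01-13');
INSERT INTO Item1 VALUES (10, 'Nut'), (20, 'Bolt');
""")


def try_add_line(order_id, item_id, quantity):
    """Return True if OrderItem1 accepts the row, False otherwise."""
    try:
        conn.execute("INSERT INTO OrderItem1 VALUES (?, ?, ?)",
                     (order_id, item_id, quantity))
        return True
    except sqlite3.IntegrityError:
        return False


# Item 10 on orders 1 and 2, order 1 with items 10 and 20: a true M:N situation.
valid_lines = [try_add_line(1, 10, 5), try_add_line(1, 20, 2), try_add_line(2, 10, 1)]
unknown_order = try_add_line(3, 10, 1)   # order 3 is not in Order
unknown_item = try_add_line(1, 99, 1)    # item 99 is not in Item1
print(valid_lines, unknown_order, unknown_item)
```

Only combinations of an existing OrderID and an existing Item1ID are accepted, so the primary keys of Order and Item1 define the range of values available to the bridge table.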

Once you have some records in OrderItem1, try to change values of the keys in
Order and in Item1. Click Save after each change. What happens with foreign keys
in OrderItem1? (Yes, they change accordingly due to the working of referential
integrity in normalized tables, which enables these parallel changes.)

Try to delete some records in Order and in Item1. What happens in OrderItem1?
Why? (The reason is the same as for updating PK values above: referential integrity,
and in this case cascading deletion.)
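
Both tests can be sketched in SQL with Python's sqlite3 module, assuming that SQLite's ON UPDATE CASCADE and ON DELETE CASCADE options behave like the corresponding Access checkboxes. The new key value 11 and the quantities are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE "Order" (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
CREATE TABLE Item1 (Item1ID INTEGER PRIMARY KEY, Item1Name TEXT);
CREATE TABLE OrderItem1 (
    OrderID  INTEGER NOT NULL
             REFERENCES "Order"(OrderID) ON UPDATE CASCADE ON DELETE CASCADE,
    Item1ID  INTEGER NOT NULL
             REFERENCES Item1(Item1ID) ON UPDATE CASCADE ON DELETE CASCADE,
    Quantity INTEGER,
    PRIMARY KEY (OrderID, Item1ID)
);
INSERT INTO "Order" VALUES (1, 100), (2, 100);
INSERT INTO Item1 VALUES (10, 'Nut'), (20, 'Bolt');
INSERT INTO OrderItem1 VALUES (1, 10, 5), (1, 20, 2), (2, 10, 1);
""")


def bridge_keys():
    """Return the (OrderID, Item1ID) pairs currently stored in OrderItem1."""
    return sorted(conn.execute("SELECT OrderID, Item1ID FROM OrderItem1").fetchall())


# Cascading update: change the primary key of order 1 to 11 (hypothetical value).
conn.execute('UPDATE "Order" SET OrderID = 11 WHERE OrderID = 1')
after_update = bridge_keys()

# Cascading deletion: remove item 10 from Item1.
conn.execute("DELETE FROM Item1 WHERE Item1ID = 10")
after_delete = bridge_keys()

print(after_update)
print(after_delete)
```

After the update, the bridge rows that pointed to order 1 point to order 11; after the deletion, every bridge row that referenced item 10 is gone, so no foreign key is ever left without a matching primary key.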

2.3 Normalizing One-to-One Relationships

The tables Customer and BillingAddress in Section 1.2, Figure 3, are already
normalized since all non-key attributes depend on the key only (3NF).

Run some normalization tests. For example, try to insert in BillingAddress a value of
CustomerID that does not exist in Customer. What happens? (Referential integrity
blocks you.)

You can still insert a new customer in the Customer table since this is considered the
left table (the one from which the key is exported, provided that you drew the
relationship from Customer to BillingAddress).

That's all, folks!
(For now)
