Sunteți pe pagina 1din 5

Converting an ERD to a Physical Database

The purpose of this article is to provide the student with an approach to converting an Entity Relationship
Diagram (ERD) into a physical database. It is a top down approach modeled after Theory and Kroenke. It
assumes the database implementer is going to be using the Microsoft Access GUI tool to generate the DDL
"behind the scenes" and is familiar with this tool. If coded by hand many steps would be combined. In
fact, the same is true with a GUI, but thinking of it as four steps can still be helpful.

This article does not treat some advanced topics such as relationships involving three or more entities.
Rather, we will focus on "binary" relationships involving two entities which are those most commonly
encountered in commercial practice.

An alternative bottom up approach modeled after Hernandez will be presented in a later article.

In a nutshell there are four steps:

1. Covert the entities into tables. There are four "sub-steps" here depending on the
relationship(s) the entities participate in:

 one-to-one

 one-to-many

 many-to-many

 supertype-subtype

2. Apply business rules to tables created by setting field (or column) constraints
appropriately

3. Create references between tables (in Access called permanent links) and referential integrity,
applying appropriate business rules.

4. Create appropriate indexes. For the typical OLTP database we are considering here this would
mean ensuring that primary and foreign keys are indexed.

5. I am beginning to think there is another step that can be placed right here: create supporting
views (in Access - saved queries). But we'll leave this off for now.
Step 1: Convert Entities
One-to-one relationships

Two entities participating in a one-to-one relationships are each converted to a table. Typically the
table name is plural where it makes sense. For example: Employees, Parts, Suppliers, etc. Some people
further like to prefix the table name with tbl or an equivalent convention. For example: tblEmployees,
tblSuppliers.

The primary key of one of the tables is placed in the other table as a foreign key. Which table gets
the foreign key depends on your application. If for example we had a field engineering application with car
and engineer entities where at most one car was assigned to an engineer we might place the foreign
key in the engineer table. We would do this because we would typically be working with the
engineer table but might occasionally want to see information about his or her assigned car. Either will
work.

One more thing: optionality. If the assignment is optional, then set the foreign key to allow nulls.

One-to-many relationships

This will probably be your most common case. Fortunately, it is also the simplest. Each of the two
entities is converted to a table.

The primary key of the table on the one side is placed in the table on the many side as a foreign
key.

Again, if the relationship is optional, set the foreign key to allow nulls. (In Access this would mean set the
required property to no. We will discuss field properties in greater depth later.)

Many-to-many relationships

Your design may or may not have any many-to-many relationships. You may have "flattened" the
design already by converting the many-to-many to one-to manies. We'll assume your diagram still
has a many-to-many relationship.

First convert the two entities to tables. Then create a third table, often called an intersection or
junction table (but there are many different terms used). If there is no obvious name, give it a name
representative of the two other tables. Example: students are in a many-to-many relationship with
courses. We could call the intersection table student_courses. In this case, though, we can probably use a
"real" name such as enrollments.
Step 3: Create Relationships
In Access these are called Permanent Links. Once created they are available to the design tools,
wizards, etc. The other kind of a link is a Temporary Link created in the Query Designer that exists only
for the duration of a query.

In the Relationship Designer add the tables you have created to the design surface.

To create a relationship between two tables drag the primary key from the parent table to the
child table's corresponding foreign key. Enforce referential integrity. This maintains the integrity
of the relationship between the two tables. Child table foreign keys will always reference a parent
table record with a matching primary key. What to do when this is not the case will depend on you
business rules.

Child records can be added only if there exists a parent table record with a primary key that matches the
child's foreign key. If no corresponding parent table record exists an error message results.

If an attempt is made to delete a parent table record with child table records that reference it, or to
modify its primary key, an error message results. That is unless cascade options discussed below have
been selected.

Cascade Options

These correspond to ON <action> CASCADE. in SQL.

Cascade Delete

Causes the related child records to be deleted along with the parent table record. You probably want to be
cautious about when to use this one. Set it according to your business rule(s).

Corresponds to ON DELETE CASCADE

Cascade Update

Update the values of the foreign keys in the child records to reflect the new parent record primary key
value.

Corresponds to ON UPDATE CASCADE.

Join Type

You have one more decision. What type of join do you want the relationship to implement. There are three
choices: INNER, LEFT OUTER, AND RIGHT OUTER.

Inner Join
This is the default and is the typical join used in most cases. It will retrieve all records from the two tables
with matching key values.

Left Outer Join

This join will retrieve all records from the two tables with matching key values plus any records from the
first table that do not match. The missing values from the second table will be filled with NULLS.

Most database books do not emphasize how common this type of join really is, especially for reporting
purposes. In our field engineering example, say that you wanted to see a list of all your engineers and
details about the car they were assigned if one was assigned. An inner join would not return all engineers.
Only those with cars assigned. A left join would return all engineers, filling in car information with NULLS if
no car was assigned.

You probably are not going to set up many of these as permanent links but be aware when you go to the
query designer what type of join has been created for you.

Right Outer Join

This join will retrieve all records from the two tables with matching key values plus any records from the
second table that do not match. The missing values from the first table will be filled with NULLS.

I have not seen this join type used very much.

Full Outer Join

There is also a full outer join which includes unmatched records form both tables. Access does not support
this join type, although many other relational databases do.

Step 4: Create Indexes


Back to the table properties. We left off discussing indexes until last because this area is related to
performance. You may need to experiment a bit. A relational database and this includes Access, will work
without explicitly creating indexes. That is the good news. The bad news is that performance may be poor.
Before discussing which columns to place indexes on we need to review the two types.

Unique

This type of index ensures that all entries are unique. Primary keys would have this type of index. It can
also be used to ensure uniqueness of, for example, social security numbers, in an employee table with
badge number as the primary key.

Not Unique (allow duplicates)

These indexes allow duplicate values. They can be used to speed searches and reporting and for foreign
keys.

Guidelines - When and Where to Index


Although a seemingly simple enough subject, indexing is actually a very large subject. In this article we
are barely scratching the surface. As a rough rule of thumb:

 You want indexes on all primary and foreign keys


 In addition to the above, use indexes only on highly selective columns (more about this later) or
when required to enforce a business rule.

 There should be no more than about 4 to 5 indexes total for the typical OLTP type database. A read
intensive reporting database can have up to about 12 or so.

Access will have set up unique indexes for the primary keys automatically. For the foreign keys set the
indexed field property to yes(allow duplicates).

You may also want an index or two on columns that will be searched frequently. However, do this only if
the search column is highly selective. I.e., if you expect the typical retrieval operation to return 20% or
less of the records in the table then an index will help. Otherwise, due to the way relational database
indexes are structured the database will wind up doing less work just scanning through the table a record
at a time. The exact percentage will vary depending on the particulars of your database. Remember, I said
you might have to experiment a bit.

If your business rules require certain fields, other than primary keys to be unique, then set the
corresponding indexed field property to yes(no duplicates).

S-ar putea să vă placă și