2 The ER model

In this chapter:
• Approaches to database design
• Definitions of entity, attribute, and relationship
• Representing one-to-many relationships

“How do I start to design a database?”


Students ask this question frequently. It’s also a good question for
anyone who has just purchased a database software package and wants
to put it to use. We can think of several ways to get started:

• Pick an example from the ones supplied with your software, or
use the software’s built-in guides (Access, for example, calls
these “wizards”). If your application looks like one of theirs,
this may be a good method. But you probably won’t learn the
reasoning behind the examples, and some of the “wizards”
may not be as smart as they should be. (We’ll show examples of
this.)

• List all of the data elements that you need, and build them into
a single table. This is what many naïve designers do, and the
result is guaranteed to be a disaster for anyone who has to use
it or maintain it. We have seen many “horror-stories” of data-
bases in real organizations that were (non-) designed this way;
perhaps you have, too. Actually, listing all of the necessary data
isn’t a bad way to start, but you will need to develop the skills
to turn the list into a well-designed database. From a software
engineering perspective, you could call this a “bottom-up”
design strategy. You’ll learn how to do it later in this book.
24 Practical Relational
Database Design

• Think of the real-world enterprise that you are modeling. Try
to identify what objects, events, or facts have to be represented;
what characteristics of these objects should be stored
in the database; and how the objects are related to one another.
This could be called a “top-down” design strategy, and we
believe that it is the easiest and most effective way to start.

The ER model supports the third design strategy, real-world enterprise
modeling. There are only three elements in an ER model: entities,
attributes, and relationships.

2.1 Entities
An entity is any “thing” that is part of the enterprise and has to be
represented in a database. It could be a physical object: a person, a
piece of furniture, or an item for sale in a store. It could be a concep-
tual object: a movie, for example, that has a title, a plot, dialog and
action but no physical form until it is stored on film or video tape. It
could be an event that happens: a customer places an order, a student
enrolls in a class, or a bus makes a stop to pick up passengers. It could
simply be a fact: a medical patient’s blood serum level is measured at a
specific value.
For our first try at ER modeling, let’s consider the enterprise of a
library. You’re familiar with this enterprise; you’ve used libraries and
you know their purpose and how they operate. What are the “things” in
this enterprise that might need to be represented in a database? Well,
books would probably be the first to come to mind. But there are also
customers (who check out books), librarians (who catalog and maintain
the books), shelves (to store the books), reading tables, and probably
many more. Of these, which are relevant to a database? Since our
purpose in this example is to keep track of which books are currently
loaned out to which customers, those two (books and customers) should
be the entities we start with. We’ll diagram them as simple labeled
rectangles or boxes. As it happens, both of these entities are physical
objects, but we will soon show examples of entities that you can’t
touch or see.

Figure 2-1: Library entities

We will use the word “entity” (and the names of entities) in differ-
ent ways, which normally should be clear from the context. Where it
may not be clear, we will specifically identify which one of the follow-
ing we are talking about.

Figure 2-2: Entity concepts

• An entity individual is one specific “thing” (physical or conceptual
object, fact, or event) associated with the enterprise.
We will always refer to an entity individual in the singular: “a
book” or “a customer.”

• An entity type is a class of individuals associated with the
enterprise, which can be described by a finite set of syntactic
and semantic rules determined by the enterprise—according to
their intrinsic, real-world structure and their meaning in the
enterprise. Any individual can be classified as belonging to or
not belonging to the entity type by evaluating it according to
these rules. The entity type is what is shown in the ER dia-
gram; you might also think of it as a prototype that defines the
characteristics of each individual. We will normally refer to the
entity type in the singular (BOOK or BOOK type), unless the
context of a sentence requires the plural.

• The active set of an entity type consists of all those individuals
of the given type who are currently participating in the enterprise.
We will always refer to the active set in the plural
(BOOKS). As a student, you might fit the semantic rules for
being a library customer—but if you don’t have a library card
yet, you aren’t in the active set of library customers. (Notice
that you don’t have to have any books checked out to be in the
active set of customers.) The concepts of entity individuals and
active sets will be important when we discuss relationships
and mapping (below). At the ER model level, we are not yet
concerned with which specific entity individuals are active in
the enterprise; this will be represented in the relational model
(Chapter 5).

Remember that an entity (individual, type, or active set) is something
that exists in the real world, as we show in Figure 2-2. You can’t
magically store it in a computer. You have to decide what characteris-
tics of it can be (and need to be) represented in the actual database.

2.2 Attributes
In Chapter 1, we claimed that building a model was a process of
selecting relevant characteristics of whatever we were modeling. The
Civil War soldier was represented by his name, rank, height, eye color,
and so on. The pipe, painted by Magritte, was represented by its shape
and color. The cities, roads, and other features on the map were repre-
sented by lines, symbols, colors, and text. The characteristics of each
entity that you select to include in your ER model are called the
attributes of that entity. When you are first thinking of attributes, try
to decide what (relevant) characteristics are needed to describe the
entity. Or to put it another way: what does the client need to know
about this entity?
Unlike entities, attributes can actually be represented in a com-
puter. We can say that an attribute defines storable values that can be
stored on disk, transmitted over a network, and retrieved to an output
device in the form of a text string, a number, a digitized picture, an
audio or video file, and so on.
We can also think of an attribute more formally as a function that
assigns a single storable value (from a well-defined set of storable val-
ues) to each individual in a set of real-world entities. So name (you) =
“Your Name,” birthdate (you) = “1 July 1975” (or whatever it is), and
so on. The set of storable values that each attribute can assume is
called the domain of that attribute. You will learn more about
attribute domains in Chapter 9.
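
The function view of an attribute can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Individual class, the two lookup tables, and the domain check are hypothetical names invented here, and the values are the placeholders used in the text.

```python
from datetime import date


class Individual:
    """Opaque stand-in for a real-world entity individual (hypothetical)."""


you = Individual()

# Lookup tables that back the two attribute functions. Each individual is
# assigned exactly one storable value per attribute.
_NAME = {you: "Your Name"}
_BIRTHDATE = {you: date(1975, 7, 1)}


def name(individual: Individual) -> str:
    """The attribute `name`, viewed as a function on individuals."""
    return _NAME[individual]


def birthdate(individual: Individual) -> date:
    """The attribute `birthdate`, viewed as a function on individuals."""
    return _BIRTHDATE[individual]


def in_birthdate_domain(value: object) -> bool:
    """Assumed domain for birthdate: a calendar date that is not in the future."""
    return isinstance(value, date) and value <= date.today()
```

Here name(you) returns the stored string, and the string “1 July 1975” falls outside the assumed birthdate domain because it is text rather than a date value. A real design would choose the domain to suit the enterprise.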
Let’s think about the entities in our library model. What charac-
teristics—attributes—describe each of them? What does the client
need to know about them?

• Books have titles and authors, but in a library they also have
call numbers and possibly copy numbers (if the library has
multiple copies). If you want to find a specific book, you look
up the call number and go to the shelf where it is stored. The
combination of call number and copy number (if needed) is
enough to identify the particular book that you have in your
hand. When you check out the book, this is how the library
will keep track of it, so it’s clearly an important attribute. For
simplicity in this example, we’ll say that the title is also an
attribute of the book. Later on, we will think of the title as an
attribute of another (conceptual) entity. We’ll avoid the prob-
lem of the author attribute for the moment, since books can be
written by more than one author. You’ll find out how to handle
this situation in the next chapter. You could probably think of
more attributes of a book: whether it is cloth-bound or paper-
back, what size it is, and so on. Some of these may be impor-
tant to the library: size, for example, if extra-large books are
shelved in a separate room. Again, we’ll keep the example sim-
ple by excluding these.

• Customers have names, addresses, and telephone numbers, at
the minimum (so the library can send out overdue notices).
We’ll include these as attributes in our model. Other attributes
of a customer such as status (student/faculty/staff) could be
added later. But we need to think for a moment about the
name attribute. What are the chances that two customers have
the same name? If the library is large, this is very likely—and
you will need to distinguish between them in some way. We’ll
fix the problem by giving each customer an identification num-
ber, which we will guarantee is unique to each person. (We’ll
look at this problem again below, and in much more detail in
Chapter 4.)

We will list the attributes of each entity in the same boxes that we
drew before. This makes an uncluttered, easy-to-read graphic
presentation, which is also reasonably consistent with many of the
modern ER design packages. The attributes that are starred are those
that serve to uniquely identify individuals of the entity type.

Figure 2-3: Customer and book attributes

Attributes are defined for each entity type according to the rules
of the type. If the attribute value A(j) is defined for an individual j of
type E whenever j belongs to the active set for E, then the attribute A
is said to be defined for the type E. The attribute set S_E for an entity
type E is the set of all attributes defined for E. If an attribute is
defined for every individual of a given type then it is defined for that
type. However, the converse is not true. Some attributes that are
defined for the type may not be defined for individuals that are not in
the active set. For example, books that are not in the library do not
have call numbers. However, every book in the library has a call num-
ber. Thus, the attribute call number is defined for the entity type BOOK
in the library, even though call number is not defined for every book
individual in the universe.

2.3 Relationships
Once we have our first entities, and know at least a few of their
attributes, we need to decide how they are connected—what they
have to do with each other. (If an entity has absolutely nothing to do
with any other entity in the model, it shouldn’t be there.) The connec-
tion between two entities is called, not surprisingly, a relationship.

2.3.1 Relationships represent mappings between entities


A relationship describes the connection between two entity types. It
actually represents a mapping between members of each entity set.
When an individual of one entity set is paired with an individual of
another entity set, we say that these individuals are linked by the
relationship. Therefore, the mapping between entity sets is the set of
all links between individuals of those entity sets.

    Entity concept    Relationship concept
    Entity type       Relationship
    Active set        Mapping
    Individual        Link

Figure 2-4: Correspondence between entity and relationship concepts

These concepts are easiest to understand with an example, for
which we will use the library’s customers and books.
In general, what is the connection between customers and books; what
do they have to do with each other? Clearly, customers borrow books;
or, said the other way, books are borrowed by customers. A
relationship can always be described in either direction, and we’ll
show it by simply drawing a line between the two related entities.
(We’ve labeled the line in the illustration. Some development tools
let you do this; others don’t, but you should try to do so whenever
possible.)

Figure 2-5: Relationship between customer and book entities

Let’s be a bit more precise in our definition of this relationship.
Think about only one customer. How many books can that customer
borrow at a given time? Normally, the answer is many—although a
given customer might have only one book (or none at all) checked out
right now. We are concerned here with the largest number, though, so
we will describe the relationship as “each customer borrows many
books.” Now think about the relationship the other way, but start
with only one book. If we are only concerned with the current status
of the book (whether or not it is checked out, and if so, to whom),
then each book is borrowed by (at most) one customer at a time.
The relationship between the entity types, shown in Figure 2-5 above,
represents the mapping of customers (members of the active CUSTOMERS
entity set) to books (members of the active BOOKS entity set) that
they have currently borrowed. Figure 2-6 illustrates the mapping; each
line connecting a customer to a book represents a single link in this
mapping.

Figure 2-6: Mapping Customers to Books

• We call this a one-to-many relationship between customers and
books. For each customer that participates in the relationship
(by checking out books), there can be many books participat-
ing in the relationship (by being checked out).

• But it is a many-to-one relationship between books and customers.
For each book that participates in the relationship (by
being checked out), there is only one customer participating
(by having checked out the book).
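
A minimal Python sketch of this mapping may help. It assumes made-up customer and book identifiers; the names links, books_borrowed_by, and borrower_of are hypothetical and do not come from any particular database package. The mapping is simply the set of all links, and each link pairs one customer with one book.

```python
from typing import Optional, Set

# Hypothetical active sets; the identifiers are invented for illustration.
customers = {"C1", "C2", "C3"}
books = {"B1", "B2", "B3", "B4"}

# The mapping is the set of all links. Each link pairs a customer with a
# book that the customer currently has checked out.
links = {("C1", "B1"), ("C1", "B2"), ("C2", "B4")}


def books_borrowed_by(customer: str) -> Set[str]:
    """One-to-many direction: each customer borrows many books."""
    return {book for (c, book) in links if c == customer}


def borrower_of(book: str) -> Optional[str]:
    """Many-to-one direction: each book is borrowed by (at most) one customer."""
    borrowers = [c for (c, b) in links if b == book]
    return borrowers[0] if borrowers else None
```

Customer C3 is in the active set but takes part in no link, and book B3 is on the shelf, so neither appears in the mapping.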

The easiest way to understand the one-to-many and many-to-one
mappings of a relationship is to always describe them with sentences
constructed like the ones we have shown above. Form these sentences in
standard English syntax: subject – verb – direct object.

Figure 2-7: ER representation of one-to-many relationship

• The subject (one entity name) is always modified by the
determiner “each.”

• The verb describes the relationship.

• The object (the other entity name) is always modified by a
determiner of either “one” or “many.”

• Do this in both directions! “Each customer borrows many
books.” “Each book is borrowed by one customer.”

Actually, the second sentence, “Each book is borrowed...,” is in the
form that your English writing teacher would call “passive voice,”
with the object coming first and the subject coming after the (passive)
verb. In writing, this is usually a mistake; in describing relationships,
it’s unavoidable.
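
The sentence pattern described above is regular enough to write down as a small Python function. This is only a sketch: the function name describe and its parameters are hypothetical, and the caller supplies the singular or plural noun, since the sketch does not attempt English pluralization.

```python
def describe(subject: str, verb: str, obj: str, maximum: str) -> str:
    """Build a sentence of the form 'Each <subject> <verb> <one|many> <object>.'"""
    if maximum not in ("one", "many"):
        raise ValueError("maximum must be 'one' or 'many'")
    return f"Each {subject} {verb} {maximum} {obj}."


# Describe the relationship in both directions.
forward = describe("customer", "borrows", "books", "many")
backward = describe("book", "is borrowed by", "customer", "one")
```

Rejecting any determiner other than “one” or “many” mirrors the rule that the object is always modified by one of those two words.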

2.3.2 Mappings are constrained in number


The mapping between entities is always constrained in number as it is
here between customers and books. We call this constraint the
cardinality of the relationship. (Some books use the term maximum
cardinality.) Informally, we can say that cardinality is a constraint on
a relationship that describes the maximum number of individuals in one
active set that may be linked to a single individual in the other set.
Cardinality may be one-to-many or many-to-one, as in the example
above. It may also be one-to-one, for a relationship in which each
member of one entity set is mapped to (at most) only one member of the
other entity set, in both directions. It may not be many-to-many; you
will find out how to handle what appear to be “many-to-many”
relationships in the next chapter. The symbols we use to represent
“one” and “many” are commonly used (but not standardized) in
textbooks and database software.

Figure 2-8: Cardinality symbols
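
Because cardinality is a constraint, one way to make it concrete is to test whether a given set of links violates it. The Python sketch below does this under stated assumptions: links are (left, right) pairs with invented identifiers, and the function names max_links_per_individual and satisfies are hypothetical. It checks data against a declared cardinality; it does not discover the cardinality, which is a design decision.

```python
from collections import Counter
from typing import Iterable, Tuple

Link = Tuple[str, str]  # (left individual, right individual)


def max_links_per_individual(links: Iterable[Link], position: int) -> int:
    """Largest number of links that any single individual takes part in.

    position 0 counts links per left-hand individual, 1 per right-hand one.
    """
    counts = Counter(link[position] for link in links)
    return max(counts.values(), default=0)


def satisfies(links: Iterable[Link], cardinality: str) -> bool:
    """Check links against 'one-to-one', 'one-to-many', or 'many-to-one'."""
    links = list(links)
    left_word, right_word = cardinality.split("-to-")
    # "one" on the left: each right-hand individual has at most one link.
    left_ok = left_word == "many" or max_links_per_individual(links, 1) <= 1
    # "one" on the right: each left-hand individual has at most one link.
    right_ok = right_word == "many" or max_links_per_individual(links, 0) <= 1
    return left_ok and right_ok


# Customers on the left, books on the right (identifiers are made up).
borrows = {("C1", "B1"), ("C1", "B2"), ("C2", "B4")}
```

With these links, the one-to-many constraint between customers and books holds, while a one-to-one constraint would be violated because customer C1 has two books.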

2.4 Representing relationships: local, surrogate, and foreign attributes
Before we complete the model, we have to identify attributes that will
represent the relationship. Said another way, for every book that is
checked out, we need to know which customer has it (including the
customer’s name, address, and phone number). While the book is
checked out, the attributes of the customer who has it temporarily
become attributes of the book itself.
We are actually dealing here with three types of attributes: local,
surrogate and foreign.

• Local attribute: An attribute A for an entity type E is local if
it is defined from intrinsic properties of E. That means, every
individual p in the active set of E has a well defined value A(p)
which is obtained by applying the semantic rules of E directly
to the individual p. The name, address, and telephone number
are local attributes of a CUSTOMER; the call number, copy number,
and title are local attributes of a BOOK.

• Surrogate attribute: A surrogate attribute for an entity type
E is an attribute that is manufactured by the database designer
and has no relationship to the semantic rules which define E.
The customer ID is a good example of a surrogate attribute.
Surrogate attributes exist solely for the convenience of the
database designer, and have no descriptive value. In this case,
the surrogate attribute is designed to ensure that we can
distinguish between two customers who have the same name. We
will look at surrogate attributes in detail in Chapter 4.

• Foreign attribute: An attribute A of an entity type E is a foreign
attribute if it imports its definition from another entity
type F. The foreign attribute A can only be defined when E
relates to another type F via a relationship r that is one-to-one
or many-to-one. Moreover, r must map each individual in the
active set of E to some individual q in type F. (That is, E’s
participation in the relationship is mandatory.) In this case
attribute A is defined as follows. For each individual p in the
active set of E, A(p) = A′(q), where A′ is an attribute of F and
q is the unique individual of type F that is paired with p by the
relationship r. All of the attributes of the customer (customer
ID, name, address, phone) become foreign (imported) attributes
of the book when the book is checked out.

Although these definitions are wordy, the concepts are straightforward.
Local attributes are intrinsic to the type: home grown. Surrogate
attributes are useful but artificial creations. Foreign attributes are
defined outside the entity type in which they appear, and their
meaning is imported.
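
The definition of a foreign attribute, A(p) = A′(q), can be sketched in Python. Everything here is an assumption made for illustration: the customer names, the identifiers, and the dictionary and function names are invented. The sketch also returns None for a book that is not checked out, which is how it represents a book for which the foreign attribute is undefined.

```python
from typing import Dict, Optional

# Local attribute A' of the CUSTOMER type F (hypothetical data).
customer_name: Dict[str, str] = {"C1": "Ann Lee", "C2": "Raj Patel"}

# The relationship r, many-to-one from BOOK (E) to CUSTOMER (F): each
# checked-out book is paired with exactly one customer.
borrowed_by: Dict[str, str] = {"B1": "C1", "B2": "C1", "B4": "C2"}


def borrower_name(book: str) -> Optional[str]:
    """Foreign attribute A of BOOK: A(p) = A'(q), where q is paired with p by r.

    Returns None when the book is not linked to any customer.
    """
    q = borrowed_by.get(book)
    return None if q is None else customer_name[q]
```

Notice that the name “Ann Lee” is stored once, with the customer, yet it is available for both of the books that customer has borrowed.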

2.4.1 Each attribute is local to one entity


An important design guideline is that every attribute must be local to
at most one entity type. We certainly don’t want to repeat all of the
information about a customer in the BOOK entity. Fortunately, we
don’t need to. We designed the customer identification so that it
uniquely identifies each customer. (It is called the primary key of the
customer entity, and is starred in the ER diagram. We’ll define this
term in Chapter 4.) So all we need to know (about a checked-out
book) is the identification number of the customer who has checked it
out. Given that, we can always find other attributes of the customer
(name, address, etc.) because they are part of the customer entity.
Remember how the soldier in our Civil War example could have
accomplished the same thing by using an officer’s initials to save writ-
ing the name over and over.
In other words, the customer ID is the only foreign attribute that
has to appear in the BOOK entity; all of the other customer attributes
will be inherited by the book along with it. This special foreign
attribute, used to represent the relationship between a customer and a
book, is called a foreign key (which we will also define in Chapter 4).
A primary key value and a matching foreign key value form the link
between two individuals of their respective entity types. In our ER
diagrams, we show foreign keys in italic, so you can immediately see
their function. (Some designers add “fk” to foreign key attribute
names.)
Notice that each customer ID will appear as a member of the active
CUSTOMERS entity set only once (remember that we designed it to be
guaranteed unique). Importantly, the local attributes associated with
that customer ID (name, address, and phone) will also appear only once
in the database, as our design guideline above suggested. But because
the relationship of customers to books is one-to-many, the same cus-
tomer ID might appear as a foreign key attribute value of many books.
When the customer returns a book, the customer ID attribute
value will be removed from that book. In other words, a book that is
not checked out will not have a customer ID associated with it. When
an attribute value like this is missing we call it null. The special con-
stant value null, compatible with the domain of any attribute in a
relational database, means that the attribute is undefined for this
entity individual.
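
To see the foreign key and the null value at work, here is a small sketch using Python’s built-in sqlite3 module. It is one possible rendering of the model, not the only one: the table and column names, the sample rows, and the choice of call number plus copy number as the key of the books table are all assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,  -- surrogate attribute, primary key
        name        TEXT NOT NULL,
        address     TEXT,
        phone       TEXT
    );
    CREATE TABLE books (
        call_number TEXT    NOT NULL,
        copy_number INTEGER NOT NULL,
        title       TEXT    NOT NULL,
        -- Foreign key: NULL while the book is on the shelf.
        customer_id INTEGER REFERENCES customers (customer_id),
        PRIMARY KEY (call_number, copy_number)
    );
    """
)
# Invented sample data: one customer, three books, two of them checked out.
conn.execute("INSERT INTO customers VALUES (1, 'Ann Lee', '12 Elm St', '555-0100')")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?, ?)",
    [
        ("QA76.9 D26", 1, "Database Design", 1),
        ("QA76.9 D26", 2, "Database Design", None),
        ("PS3515 E37", 1, "A Novel", 1),
    ],
)

# The same customer ID appears as a foreign key value in several books.
borrowed_before = conn.execute(
    "SELECT COUNT(*) FROM books WHERE customer_id = 1"
).fetchone()[0]

# Returning a book removes the customer ID from that book.
conn.execute(
    "UPDATE books SET customer_id = NULL WHERE call_number = ? AND copy_number = ?",
    ("PS3515 E37", 1),
)

# The customer's name is found through the key; it is never stored in books.
checked_out = conn.execute(
    """
    SELECT b.call_number, b.copy_number, c.name
    FROM books AS b
    JOIN customers AS c ON c.customer_id = b.customer_id
    ORDER BY b.call_number, b.copy_number
    """
).fetchall()

on_shelf = conn.execute(
    "SELECT COUNT(*) FROM books WHERE customer_id IS NULL"
).fetchone()[0]
```

The join returns only books that are checked out, because a null customer ID matches no customer. Books on the shelf are found with IS NULL rather than with an equality test.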

2.4.2 This model has a problem


Although our model at this point has correctly represented the one-to-
many relationship between customers and books—we can show
which customer has which book checked out—there is still a fairly
serious problem: where should we put the due date? Clearly, this is not
an attribute of the CUSTOMER entity, since each customer may have
many books checked out. Almost as clearly, it really doesn’t make
much sense as an attribute of the BOOK, either—although for the
moment, this is the only place to put it. (Notice that the due date, like
the customer ID, is entered only if a book is checked out; otherwise, it
will be null.)
What has happened is that we’ve not yet identified all of the entities
that are needed to model the checkout process. This is realistic:
complete models don’t just appear, they are built step-by-step. An
apparent problem like the awkwardly located and frequently null
customer ID and due date can help you to identify where a model needs
further work. We’ll refine this one in the next chapter; for the time
being, our library ER model looks like Figure 2-9.

Figure 2-9: Library, Version 1 (incorrect)

2.5 Summary: one-to-many relationships


The one-to-many relationship (or many-to-one, read in the other
direction) is the most common relationship in the ER model. As a
final example, we will look at another enterprise: a medical clinic.
We wish to track the level of various substances (for example, cho-
lesterol or alcohol) in the blood of patients. To do this we develop a
model with two entities: PATIENT and BLOOD SAMPLE.
Within this model there are local, surrogate and foreign attributes.
BLOOD SAMPLE has the local attributes Substance, Date that the sample
was taken, and Level that was measured. PATIENT has the local attribute
patient name and the surrogate attribute patient ID, a number defined
by the hospital to identify the patient and protect privacy. It is
important to note that patient ID can also be defined for every BLOOD
SAMPLE individual. Each sample is related to exactly one patient,
although each patient may give many samples. So,

patient ID (of a particular blood sample) =
patient ID (of the patient who gave the sample).

Figure 2-10: Patient and Blood Sample entities

The surrogate attribute patient ID is the primary key of PATIENT.
Thus the single foreign attribute patient ID in BLOOD SAMPLE
constitutes a foreign key.
The attribute in BLOOD SAMPLE that corresponds to patient ID need
not have the same name in the ER model, although we usually give
matching attributes the same name. The designer could call it donor
ID if this seems more appropriate. In that case:

donor ID (of a particular blood sample) =
patient ID (of the patient who gave the sample).
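
The same idea can be tried out with Python’s built-in sqlite3 module. The sketch below is hedged in several ways: the table and column names, the sample_id column, the patient name, the dates, and the measured levels are all invented for illustration and are not part of the chapter’s model. What it shows is that the foreign key column may be called donor_id and still match patient_id in the other table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE patient (
        patient_id   INTEGER PRIMARY KEY,  -- surrogate attribute, primary key
        patient_name TEXT NOT NULL
    );
    CREATE TABLE blood_sample (
        sample_id  INTEGER PRIMARY KEY,    -- assumed key, not in the chapter
        substance  TEXT NOT NULL,
        date_taken TEXT NOT NULL,
        level      REAL NOT NULL,
        -- Foreign key with a different name than the key it matches.
        donor_id   INTEGER NOT NULL REFERENCES patient (patient_id)
    );
    """
)
conn.execute("INSERT INTO patient VALUES (1, 'Pat Doe')")
conn.executemany(
    "INSERT INTO blood_sample (substance, date_taken, level, donor_id) "
    "VALUES (?, ?, ?, ?)",
    [
        ("cholesterol", "2024-01-15", 182.0, 1),
        ("alcohol", "2024-01-15", 0.0, 1),
    ],
)

# donor ID (of a sample) = patient ID (of the patient who gave it).
rows = conn.execute(
    """
    SELECT s.substance, p.patient_name
    FROM blood_sample AS s
    JOIN patient AS p ON p.patient_id = s.donor_id
    ORDER BY s.substance
    """
).fetchall()

# A sample whose donor ID matches no patient is rejected.
try:
    conn.execute(
        "INSERT INTO blood_sample (substance, date_taken, level, donor_id) "
        "VALUES ('cholesterol', '2024-02-01', 190.0, 99)"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Declaring donor_id as NOT NULL reflects the statement that each sample is related to exactly one patient; compare this with the library model, where the customer ID of a book may be null.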

So far, we’ve seen only relationships in which at least one side has
a cardinality of one. In fact, we’ve said that relationships can only be
one-to-many, many-to-one, or one-to-one. What happens if both sides
of a relationship appear to have a value of many? We’ll look at this sit-
uation in the next chapter.

2.6 Glossary
Entity: any “thing” that is part of the enterprise and has to be repre-
sented in a database. (p. 24)

Entity individual: one specific “thing” (physical or conceptual
object, fact, or event) associated with the enterprise. (p. 25)

Entity type: a class of individuals associated with the enterprise,
which can be described by a finite set of syntactic and semantic rules
determined by the enterprise, according to their intrinsic, real-world
properties and their meaning (semantics) in the enterprise. (p. 25)

Active set (of an entity type): all those individuals of the given type
who are currently participating in the enterprise. (p. 25)

Attribute: a characteristic of an entity that you select to include in
your ER model. Formally, an attribute is a function that assigns a
single storable value (from a well-defined set of storable values) to
each individual in a set of real-world entities. (p. 26)

Domain: the set of storable values that each attribute can assume. (p.
26)

Relationship: a description of the connection between two entity
types, which represents a mapping of members of one active entity set
to members of the other active entity set. (p. 28)

Mapping: the set of all links between individual members of the
active sets of two entity types. (p. 28)

Link: a pairing of an individual of one active entity set with an
individual of another active entity set. (p. 28)

Cardinality: a constraint on a relationship that describes the
maximum number of individuals in one active entity set that may be
mapped to a single individual in the other set. Cardinality may be
one-to-many, many-to-one, or one-to-one. (p. 31)

Local attribute: an attribute of an entity type E that is defined from
intrinsic properties of E. (p. 31)

Surrogate attribute: an attribute of an entity type E that is
manufactured by the database designer and has no relationship to the
semantic rules that define E. (p. 32)

Foreign attribute: an attribute of an entity type E that imports its
definition from another entity type F. (p. 32)

Null: a special constant value, compatible with the domain of any
attribute in a relational database, which means that the attribute is
undefined for this entity individual. (p. 33)

2.7 Exercises
2-1. Any university database will have an entity type called STUDENT.
Develop a model of this entity type, including any attributes which
you think are appropriate. To demonstrate the concept of storable val-
ues, try to include some attributes that cannot be represented in text
(character) form. Describe in your own words what (or who) a set of
students might include, and give an example of an individual member
of the student entity set.

2-2. Many large companies are organized into different departments
(for example, sales, manufacturing, administration, and so on).
Employees are typically hired by (or administratively assigned to)
only one department, regardless of where they actually work. Create
the ER diagram of employees, departments, and the relationship
between them. Don’t forget to show local and foreign attributes, and
the cardinality of the relationship.

2-3. You are developing a database which will contain information
about cities and states in the United States. Each city is located in
only one state (Texarkana, TX is different from Texarkana, AR). Create
the ER diagram of cities, states, and the relationship between
them. Again, include local and foreign attributes, and the cardinality
of the relationship.

2-4. Think of an activity or enterprise that you are interested in. It
could be something from home, school, work, or recreation. Identify
two entities in this enterprise that have a one-to-many relationship,
and develop the ER diagram to illustrate this.

2-5. Build the tables of the library or the patients database, as we have
shown them in this chapter. If your software permits setting up rela-
tionships at design time, link the tables on their related attributes.

• Enter data in each of the tables. Be sure to illustrate how customer
IDs may be copied several times into the BOOKS table
(and patient IDs into the BLOOD SAMPLES table).

• Develop a query that demonstrates how the one-to-many relationship
is actually represented by the data. Normally, the
query result will show only the books that are currently
checked out (compare with the BOOKS table) and the customers
that currently have books checked out (compare with the CUSTOMERS
table). Review Figure 2-5 for a conceptual view of the
mapping between these entities.

• What happens if you enter a customer ID in the BOOKS table
that isn’t already entered in the CUSTOMERS table? (Results may
vary, depending on your software.)