Sunteți pe pagina 1din 11

2.

Object Relational Mapping Strategies


Contents
2.1. Why Mapping?

2.2. Mapping Basics

2.3. Mapping a Class Inheritance Tree

2.4. Mapping Object Relationships

2.5. Modeling with Join Tables

2.6. Database Table Design Guidelines

2.1. Why Mapping?


Writing a Java application that stores data in a RDBMS can be as simple as slapping a few data-
aware components on a form. However, if you rely on data-aware controls you lose the benefits of
encapsulation and set yourself up for a larger maintenance burden ahead. (See the sidebar "Benefits
of the Business Object"). Proper object-relational integration requires a strategy for mapping the
object model to the relational model in order for Java objects to become persistent to the RDBMS.
A persistent object is one that can automatically store and retrieve itself in permanent storage.
The root of the problem is that objects can't be directly saved to and retrieved from relational
databases. While objects have identity, state, and behavior in addition to data, an RDBMS stores
data only. Even data alone can present a problem, since there is often no direct mapping between
Java and RDBMS data types. Furthermore, while objects are traversed using direct references,
RDBMS tables are related via like values in foreign and primary keys. Additionally, current
RDBMS have no parallel to Java's object inheritance for data and behavior. Finally, the goal of
relational modeling is to normalize data (i.e., eliminate redundant data from tables), whereas the
goal of object-oriented design is to model a business process by creating real-world objects with
data and behavior. Robust object-oriented application development requires a mapping strategy
built on a solid understanding of the similarities and differences in these models.

Benefits of the Business Object


If you are still writing applications that use data-aware GUI components to interface
directly with the database, you might want to reconsider. Applications built in this manner
are not object oriented. Such two-layer (GUI/Database) applications violate one of the
primary principles of object-oriented design: encapsulation.

Encapsulation is what allows a client to interact with an object without knowledge of its
implementation details, a primary premise for loose coupling. In applications that use
data-aware widgets, the opposite is the case. The client and the database are very tightly
coupled. GUI code, business logic, and SQL statements are all interwoven throughout
the application source code. As a result, the application can become a maintenance
nightmare. Any changes in the database schema will surely cascade into unexpected
failures.
A much better approach utilizes the business object model. A business object is modeled
after a business or organization concept, such as a person, place, event, or process. It
represents a real-world entity such as an employee, product, invoice, or payment.
Business objects fully encapsulate the behavior that is expected from the entity they
model.
For example, an Item object in an invoicing application can be asked "How much is
available for sale?" or "What is the unit price when 10 units are purchased?" simply by
calling getQtyAvailable() or getUnitPriceForQty( qty ) methods in the object.

Below is a short list of reasons you might want to use business objects in your next
project:

1. Business objects fully insulate clients from business rules, the underlying data storage,
and concurrency issues, resulting in complete encapsulation. Because clients only
interact with business objects, changes can be made in the database without having to
touch a single line of code at the client.
2. Applications have an open architecture and are fully scalable. Business objects can be
implemented in a variety of configurations from client/server to n-tier.
3. Applications built with business objects are also much more maintainable. The
architecture separates responsibility into loosely coupled layers (i.e., user interface,
business objects, and data storage). Business objects can be loosely coupled to the user
interface using the model-view-controller (MVC) architecture. MVC is a modular
approach to GUI construction that separates the user interface, control of user input (i.e.,
event handling), and business objects into three different components that interact
together. Because of this separation of responsibility applications can be modified more
easily when the business needs change.
4. Business object models are based on procedures already established by the
organization. Experts in the problem domain who are not necessarily developers can
design the model. Developers can then implement it using any object-oriented language.
5. Business objects can be implemented as components that are reusable by different
client applications. Clients can have different user views (GUI interfaces) of the same
object depending on purpose.
6. Thin clients that run on the new network computers such us the NC or the NetPC, or
on other devices such as mobile phones, PDAs, or TV boxes, can be more easily
implemented. The client needs contain only user interface code-all business logic and
data storage code resides on other machines across the network.
The extra planning and effort required to implement an application using business objects
are well worth it. Business-object-based applications are much more robust and
maintainable than traditional two-layer applications. It's surprising that most popular IDEs
in use today still endorse the two-layer approach.

2.2. Mapping Basics


Java classes can be mapped to RDBMS tables. The simplest mapping between a persistent class and
a table is one-to-one. In this case, all attributes in the persistent class are represented by all columns
of a table. Each instance of a business class is then stored in a row of the table.
Although this type of mapping is straightforward, it might conflict with the existing object and
entity-relation (ER) models. This is because the goal of the object modeling is to model a business
process using real world objects, whereas the goal of ER modeling is normalization and fast
retrieval of data. As a result, VBSF supports two types of class-to-table mappings that help
overcome the differences between relational and object models: SUBSET and SUPERSET.
SUBSET Mapping. The attributes of a persistent class with a subset mapping represent either a
portion of the columns in a table or all of the columns in the table. This type of mapping is used
when all the attributes of a persistent class are mapped to the same table. It is also useful when a
persistent class is not concerned with some of the columns of its corresponding table in the database
because they are not part of the business model. Subset mappings can also be used to create
"projection classes" for tables with a large number of columns. A projection class has just enough
information to allow a user to select a row for full retrieval from the database. The full row can be
mapped to another persistent class. This type of design reduces the amount of information passed
across the network. Subset mappings are also used to map a class inheritance tree to a table using
filters. This case is explained in more detail later in the section below Mapping an Inheritance Tree.
SUPERSET Mapping. A persistent class with a superset mapping contains attributes derived from
columns of multiple tables. This type of mapping is also known as table spanning. Superset
mappings are used to create "view classes" that hide the underlying data model, or to map a class
inheritance tree to the database using a Vertical mapping approach (see next section). VBSF fully
supports performing insertions, updates, and deletions of objects with this type of mapping, while
transparently updating and maintaining all foreign key columns that join the tables.

2.3. Mapping a Class Inheritance Tree


The three most common strategies for representing a class inheritance tree in the RDBMS are
vertical mapping, horizontal mapping, and filtered mapping. VBSF supports all three methods. A
combination of these types of mappings can also be used within an inheritance tree. To illustrate
each type of mapping we'll use the UML class diagram depicted below:

This diagram models a simple inheritance tree made up of a Person abstract class from which the
Employee and Customer concrete classes inherit.
Vertical Mapping. In vertical mapping, each class in the tree, whether abstract or concrete, is
mapped to a different table. All branch and leaf tables in the tree must be linked to their parent
tables. This can be accomplished by means of a foreign key column that references the primary key
of the parent table. In order to instantiate a concrete class using this method, a join query of the
concrete class table and all its abstract parent class tables must be performed. The table below
shows how our sample model is mapped to an Access database using vertical mapping:

Person Table
Column Name Data Type

PersonID AutoNumber

LastName Text(25)

FirstName Text(25)

Title Text(3)

Employee Table
Column Name Data Type

ID AutoNumber

PersonID Long
Since Date/Time

PayRate Currency

Customer Table
Column Name Data Type

ID AutoNumber

PersonID Long

Terms Text(25)

Owes Currency
The EMPLOYEE and CUSTOMER tables are related to the PERSON table via the PersonID
column. Thus, instantiating an employee object with OID=100 requires a join of the PERSON and
EMPLOYEE tables:
SELECT e.ID, p.LastName, p.FirstName,p.Title, e.Since, e.PayRate
FROM PERSON p, EMPLOYEE e
WHERE e.PersonID=p.ID AND e.ID=100;
This method closely mimics the class inheritance tree in the database, but can result in complex and
time consuming queries on moderately deep inheritance hierarchies. VBSF allows you to define this
type of mapping by mapping each class in the tree to its own table using a SUPERSET mapping.
Horizontal Mapping. Under horizontal mapping, each concrete class in the tree is mapped to a
different table. Each of these mapped tables contains columns for all attributes in its concrete class,
plus all attributes inherited from all its abstract parent classes. In other words, abstract classes are
not mapped to their own table. This approach provides very fast performance, and is simple to
design. However, if an attribute of an abstract parent class is changed, then potentially many tables
must be modified. Consequently, this method is most useful if the inheritance tree is more method
driven than attribute driven. To be more specific, if a substantial number of classes inherit a large
number of attributes from an abstract parent class, then the vertical or filtered methods might be
better choices.
As shown in the table below, the number of tables necessary to map our sample model to an Access
database using horizontal mapping is reduced to two. Retrieving an employee or customer in this
case involves querying only its corresponding table.

Employee Table
Column Name Data Type

ID AutoNumber

LastName Text(25)

FirstName Text(25)

Title Text(3)

Terms String

Owes Currency

Customer Table
Column Name Data Type

ID AutoNumber
LastName Text(25)

FirstName Text(25)

Title Text(3)

Since Date/Time

PayRate Currency
Filtered Mapping. In filtered mapping all concrete classes in the tree are mapped to the same table.
The table must contain columns for all attributes of all the abstract and concrete classes in the
inheritance tree (or the part of the tree using this mapping). In addition, a filter column is created in
the table. The value of the filter column is used to distinguish between subclasses. Abstract classes
are not mapped to this table. This approach provides adequate performance, but violates table-
normalization rules. More specifically, it could lead to a substantial number of NULL columns in the
table, wasting space. Consequently, this method is most useful if most of the attributes are inherited
from the abstract parent classes. As shown in the table below, only one table is necessary to map our
sample model to an Access database using filtered mapping.

Person Table
Column Name Data Type

ID AutoNumber

Type Integer

LastName Text(25)

FirstName Text(25)

Title Text(3)

Since Date/Time

PayRate_Owes Currency

Terms Text(25)
The Type column is used as a filter used to distinguish the type of object being stored. For
example, assuming that a Type=1 is used for employees, then to retrieve all employees from the
Person table we can issue the following query:
SELECT ID, LastName, FirstName, Title, Since, PayRate_Owes
FROM Person WHERE Type=1;
The Since column will always be NULL for customers, as will the Terms column for employees.
Notice, however, that the PayRate_Owes column can be used for both because of the common
data type.
VBSF provides specific support for filters in such a way that the filter column does not have to be
defined as an attribute in your business class. VBSF automatically reads and updates this column as
necessary. For detailed information about defining a filter for a persistent class see section 3.4.
Filter and Concurrency Columns.

2.4. Mapping Object Relationships


The UML class diagram shown below depicts a model of three simplified business classes in a point
of sale (POS) application:
This POS model is mapped to an Access database using the table structures shown below:

Invoice Table
Column Name Data Type

ID AutoNumber

CustomerID Long

Date Date/Time

Sales Person Text(20)

Invlineitem Table
Column Name Data Type

ID AutoNumber

InvoiceID Long

Description Text(30)

Price Currency
Qty Double

Customer Table
Column Name Data Type

ID AutoNumber

Name Text(30)
The POS class diagram and its table structure will be used in the following sections to discuss the
various ways one-to-one (husband-wife), one-to-many (father-child), and many-to-many
(Grandparent-child) objects relationships can be mapped to the relational model.
One-to-One Relationships
A to-one relationship is implemented in Java using a direct reference to another object. VBSF
supports two kinds of to-one relationships: contained and referenced. When an object holds a
contained reference to another object, it can create and delete its contained object. Furthermore,
when the holder of the contained reference is updated in the database (inserted, updated, or deleted),
all its contained references are automatically updated (this default behavior can be overridden at
runtime if necessary). In the POS diagram, the Invoice class has a to-one relationship with the
Customer class. If this reference is not contained, then when an Invoice is updated in the
database, its referenced Customer object is not automatically updated.
In the relational model a to-one relationship is usually maintained by means of an embedded foreign
key column. This foreign key column holds the value of the primary key (the object ID) of the row
(object) being referenced. In the POS table structure the Invoice table has an embedded foreign key
column named CustomerID. This column holds the value of the primary key column (i.e. ID) in the
Customer table.
A to-one relationship is implemented in VBSF by defining a Reference attribute. VBSF
transparently converts the foreign key reference to an object, and updates the value of the foreign
key column from referenced objects.
It is also possible to define a to-one relationship in the relational model using a join table. VBSF
also supports this scenario. In this case, however, the attribute must be defined as a Referenced
Collection. The only difference is that, by virtue of the association, the collection holds just one
referenced object instead of a collection of objects.
One-to-Many Relationships
In the object model there are two types of to-many relationships: aggregation (part-of), and
association (acquaintance). Under VBSF an aggregation relationship is defined by means of an
Owned Collection attribute, and an association relationship by means of a Referenced Collection
attribute. The difference between the two is that in an owned relationship when the owner is
updated in the database, all objects in all its owned collections are automatically updated (this
default behavior can be overridden at runtime if necessary).
In the relational model a one-to-many relationship can be defined either using an embedded foreign
key column, or using a join table. An embedded foreign key is a column defined in the table on the
many side of the relationship that holds a key to the table in the one side of the relationship. A join
table is a table whose sole purpose is to store mapping values between the key and foreign key of
the two tables involved in the relationship.
An owned relationship can only be implemented using an embedded foreign key column. A
referenced relationship can be implemented using either method-embedded foreign key or join
table. The following two sections discuss each case in further detail.
Owned Relationships (Aggregation)
In an owned relationship the owner class holds a reference to its owned collection using an Owned
Collection attribute and the owned class holds a direct reference to its owner using a Reference
attribute. An owner object automatically handles all inserts, updates and deletions of all its owned
objects in the database. VBSF automatically and transparently takes care of updating owned objects
in the database whenever an owner object is updated in the database. This behavior is recursive, so
if any owned objects are in turn owners (i.e. they hold other owned collections), all their owned
objects will also be updated in the database too, and so on. In this manner whole objects graphs can
be updated in the database in one operation. This type of behavior is known as persistence by
reachability, and is fully supported by VBSF.
An owned relationship is implemented in the relational model using an embedded foreign key
column in the table on the many side of the relationship. In the POS diagram the Invoice class
has a one-to-many relationship with the InvoiceLineItem class. In the POS table structure the
Invlineitem table has an embedded foreign key column named InvoiceID. This column holds the
value of the primary key column (i.e. ID) in the Invoice table.
VBSF transparently reads and updates the embedded foreign key column value of owned objects.
Referenced Relationships (Association)
In a one-to-many association, the class on the one side of the relationship does not own the other
class, it merely references it. This means that an object on the one side (also referred to as the
holder of the collection) of the relationship cannot create, update, or delete objects on the many side
of the relationship, it can only retrieve them. In this case, the holder uses a Referenced Collection
attribute to reference its associated collection instead of an Owned Collection attribute. A
Referenced Collection attribute can also be used to hold collections of objects of the same class as
the holder of the collection.
A Referenced Collection attribute can handle the association in the database by using either an
embedded foreign key column or a join table. To handle the association using a join table, the
referenced collection attribute must be mapped to a join table. See section 4.6. Defining Joins for
Referenced Collections for details.
Many-to-Many Relationships
A many-to-many relationship can be thought of as a bi-directional one-to-many association. To
create this type of relationship you simply define a Referenced Collection attribute in each of the
classes involved in the relationship. In the relational model a many-to-many relationship can be
defined either using foreign key columns or using a join table.
To use foreign keys, an embedded foreign key column is defined in each of the tables involved in
the relationship. Each foreign key column holds the key to the other table. The next section
discusses various ways to implement many-to-many associations using join tables.

2.5. Modeling with Join Tables


A join table is a very common way of implementing many-to-many relationships in the relational
model. This section discusses how join tables can be used to implement many-to-many associations
in the object model.
In relational databases a join table is used to maintain a relationship between the rows of two tables
(or two different rows in the same table). At a minimum a join table must contain a foreign key
column that points to each of the primary keys of the tables in the association. However, many
times additional information is maintained in the join table. For example, a bill of materials join
table might contain the columns listed below:

BOM Table
Column Name Description

ID Primary Key

AssemblyID Foreign Key to Assembly Table

PartID Foreign Key to Part Table

Quantity of Components required


In the above case a quantity field is necessary in the join table to keep track of the number of units
of each component that one assembly requires. Whenever one or more additional fields such as a
quantity or date is defined in a join table, the join is said to be of a transactional nature. If the join
only contains foreign key columns it is said to be transparent.
The question when modeling many-to-many relationships in the object model is whether to create a
class that represents the join table. If the join table is transactional, then a corresponding persistent
class should be created in the object model.
Transactional Joins
In a transactional join, the join table is modeled by two or more transaction classes. Each
transaction class is owned by one of the classes involved in the many-to-many relationship.
The transactionaljoin sample application describes a scenario in which three tables: Assembly, Part,
and BOM ( the bill of materials join table described in the above section ) are modeled using four
classes: Assembly, ComponentPart, Part, and BuildsAssembly. The BOM join table is
represented by two transaction classes: ComponentPart representing a component in an
assembly, and BuildsAssembly representing an assembly that a part builds into.
ComponentPart is owned by Assembly, and BuildsAssembly is owned by Part. This
example introduces the concept of mapping two persistent classes to one table to display different
'views' of the same information.
Transparent Joins
In a transparent join, the join table is not modeled by a class. Instead, each class involved in the
relationship has get methods that return its related objects in the other class. This is implemented
using a Referenced Collection attribute on each class involved in the relationship. VBSF
automatically maintains join tables so your classes are unaware of any issues related to join tables.
The transparentjoin sample application describes a scenario in which three tables: Assembly, Part,
and BOM ( the bill of materials join table shown above without the quantity column ) are modeled
using two classes: Assembly, and Part. A Referenced Collection attribute is used by the
Assembly class to references its component parts, and by the Part class to reference the
assemblies it builds into.
When two classes are related via join table, one of the classes must be designated as the join
manager. In our example the Assembly class acts as the join manager responsible for updating the
BOM join table in the database. The join manager has no functionality in the object model. It is
simply a designation that tells VBSF which persistent class updates the join table when it is updated
in the database. The choice of a join manager is arbitrary, but only one persistent class can be
assigned as the join manager of a join table. Consequently, in this example the Part class cannot
be a join manager for the BOM join table.
The PartApp application included in the transparentjoin sample application demonstrates how a
client uses the Assembly class to retrieve, add, and remove parts that belong to an assembly.
Self-Join Tables
A variation of the many-to-many relationship described in the transparentjoin sample application
discussed in the previous section occurs when both foreign keys in the join table point to a primary
key on the same table. The selfjoin sample application illustrates this scenario. In this application
both assemblies and component parts are stored in the same Part table. The BOM join table
contains the following fields:

BOM Table
Column Name Description

ID Primary Key

Part_AssemblyID Foreign Key to Part Table

ComponentPartID Foreign Key to Part Table


Note that both foreign key columns ( Part_AssemblyID and ComponentPartID ) point to the PartID
primary key column in the Part table.
This scenario is modeled using one class, Part, which defines a referenced collection attribute to
access its component parts. The referenced collection attribute is mapped to the BOM join table. If
the self-join table contained a Quantity column, then it would have to be modeled by creating a
ComponentPart class.

2.6. Database Table Design Guidelines


This section wraps up the information discussed in the previous sections by summarizing the design
of database tables for persistent classes. A table that stores instances of persistent classes should
have the columns outlined below defined:
1. One or more ID columns that map to the object ID.
An embedded foreign key column if the class is owned by another persistent class.
A column of the correct type for each simple attribute.
2. A foreign key column for each object reference attribute.
Object ID Column
Every persistent class has an object ID that must be mapped to an object ID column. The value of
the object ID column must be unique within the table. This can be ensured by defining it as the
table's primary key, or by defining a unique index for it. VBSF also supports multi-attribute object
IDs mapped to composite primary keys.
Foreign Key Column for Owned Classes
A foreign key column must be defined for persistent classes that are owned by another persistent
class if the relationship is maintained in the relational model by means of an embedded foreign key
column. An embedded foreign key is a column defined directly in the owned table that holds the
object ID of its owner. The foreign key column must be of the same data type as the object ID (or
another join column) of the owner, and should be indexed for faster retrieval. VBSF also supports
owner references mapped to composite foreign key columns that reference tables with composite
primary key columns.
Columns for Simple Attributes
For each simple attribute defined in a persistent class a corresponding column must be defined in its
database table. The corresponding column data type for each attribute type is dependent on the
database used. See section 4.2.2. Mapping Simple Attributes for recommended column data types..
Columns for Object Reference Attributes
For each object reference attribute defined in a persistent class a corresponding foreign key column
must be defined in its database table. This column must be of the same data type as the object ID of
the class being referenced. If the table will be queried often for rows that match a specified value in
a foreign key column, then that column should be indexed for faster retrieval.VBSF also supports
object references mapped to compound foreign key columns that reference tables with compound
primary key columns.
Examples
Below are three examples of the Access table structures used in the contactdemo application.
The Category table below is a very simple table that contains only two simple attributes named
Number, and Description, and is not owned by another persistent class:

Category Table
Column Name Data Type

ID AutoNumber

Number Text(10)

Description Text(25)

The Locator table below contains three simple attributes named Type, LocatorNo, and Description,
and an additional embedded foreign key field named ContactID that links it to its Contact owner:

Locator Table
Column Name Data Type

ID AutoNumber

ContactID Long

Type Integer

LocatorNo Text(15)
Description Text(25)

The Contact table below contains an object reference foreign key field named CategoryID:

Contact Table
Column Name Data Type

ID AutoNumber

LastName Text(25)

FirstName Text(25)

Title Text(3)

CategoryID Integer