
Long Running User Transactions

in Database Systems

Master-Master Database Replication

Fabian Merki
merkisoft informatik

Table of Contents
1 Preface
2 Application
3 Concept overview
3.1 Long running user transactions
3.2 Master-master or multi master replication
4 Use Case
5 Evaluation
5.1 Oracle Workspace
5.1.1 Example
5.1.2 Conflict resolution
5.1.3 Conclusion of Oracle's Workspace
5.2 Daffodil Replicator
5.2.1 Testing Daffodil Replicator
5.2.2 Conclusion
5.3 Hibernate
5.3.1 What does Hibernate offer?
5.3.2 Replication modes
5.3.3 What is missing in Hibernate?
5.4 Others
5.4.1 Microsoft SQL Server
5.4.2 Slony I / II
5.5 Conclusion
6 Design and implementation of replication with Hibernate
6.1 Methods of replication
6.2 Replication Framework
6.3 Algorithm
6.4 Additional replication table
6.5 Conflict resolution
6.6 Dependencies
6.7 Additional unique constraint vs. UUID
6.8 Transaction handling
7 Testing
8 Conclusion
9 References

Fabian Merki, merkisoft informatik 5.11.2006 2 / 26



1 Preface
The subject "Long Running User Transactions in Database Systems" was selected for this
assignment because of a requirement that arose from a Course Administration software
that was under development at that time. Refer to http://kursweb.merkisoft.ch for more
information on the Course Administration application.
This application had specific database replication requirements the likes of which I had not
encountered before. The assignment was a perfect opportunity to spend some time to
investigate what solutions are available on the market and then to test a solution in a real
world scenario.
I decided to write the document in English, because I feel the Course Administration
application is not the only software that could benefit from such a solution. Documenting
my findings in English on the web will open up the results to a broader spectrum of people
than if written in my native German. It also gave me the opportunity to practise writing
technical documentation in English which is a requirement in my current employment.
Acknowledgements
I would like to thank my lecturer Mr. Herbet Bitto for his help and support for the content of
this assignment, Mr. Steven Hawkes for reviewing the document and the SBB for offering
a comfortable environment where due to time pressures, most of this assignment was
conducted. I found it a challenging experience researching and developing this solution
whilst commuting on a daily basis.
I hope you will enjoy reading this document.

Fabian Merki

I hereby confirm that everything in this assignment was created, written and drawn by myself
unless otherwise stated.

________________ __________________________

Date / Place Fabian Merki


2 Application
The following diagram outlines a problem scenario for an application that uses multiple
databases. Customers subscribe for courses via the internet. The administrator manages
courses, subscriptions, teachers and additional data.

Figure 1

A local database was selected because the administrators have a slow internet connection
but require fast data access. The centralised database on the internet can be modified at
the same time as the local database. At some point the local and the central database
must be replicated. Because both databases are master databases, such a replication is
called master-master replication. The term master means that the database is updated by
a user and therefore becomes the master of the modified data. Multiple local databases
might exist since more than one administrator can manage the data. An organisational
process must be established by the administrators so that changes do not get overwritten
by others.
A further requirement of this architecture is that changes may be stored before they are
published. Such time-consuming changes by a user are called long-running user
transactions.
Because a course is not a single database record (actually it is a complex structure) the
replication process must replicate the whole database as a single entity – replication of
single tables would most likely fail because of the foreign key constraints that exist within
the database.
This document outlines how the previous requirements can be addressed using existing “off
the shelf” products together with custom software.


3 Concept overview

3.1 Long running user transactions


Databases in general only support the concept of all-or-nothing transactions. Either the
whole transaction completes and is visible for everyone or the transaction has to be rolled
back because of an error condition.
If a user works for several hours or days on a specific task, they will most certainly require
data persistence to protect themselves from data loss. Most database vendors offer
such functionality, e.g. Oracle with its Workspace feature. Oracle originally developed this
feature for geographical data management. Generating a map from the raw data
probably takes months, requiring many people to work in parallel on subsets of the data.
From time to time the changes from the team need to be brought together.

3.2 Master-master or multi master replication


When using local databases for faster performance, the data has to be replicated
(synchronized) with the central database. If the local database is only used for queries
then the replication is called master-slave replication, i.e. changes from the master are
replicated to all slaves. This is simple and almost every database vendor offers products
that provide this feature.
Master-master replication is where the local database is used for updates, inserts or
deletes and these modifications must be replicated to other databases which themselves
are concurrently serving user requests (including data modifications). In this scenario there
are many master databases which are updating each other.
This concept is well known from source control systems such as CVS or Subversion,
where files are modified by several programmers concurrently. Once a developer has
made a change, the work is committed to a central repository from where others
merge their changes against the latest version. Because most of the time changes are
made in different lines within a file, no conflicts occur and the source control system performs
these merges automatically. Sometimes two developers change a file in the same area
(i.e. the same line). In this scenario, when the second developer checks in their changes, they
will be asked to resolve the conflict. Either their own version or the latest version in the
repository is chosen. Very seldom does it happen that one developer deletes a
method, variable etc. while the other newly refers to it in new code, or that two developers
introduce a new symbol in the same namespace twice. A source control system does not
check for such problems, whereas a database will always perform constraint checks.


4 Use Case
Title           Database replication
Precondition    Local and remote databases exist.
                At least the local database is filled with data.
Description     For all tables which need to be replicated, the versions in both
                systems are checked and the corresponding action for each row is
                applied.
Postcondition   The local and the remote database are identical in terms of the data
                in the replicated tables.
                The versions of the replicated objects are stored.
Variations      Database download:
                1. clean / delete local tables
                2. start replication
Actors          Administrator


5 Evaluation
A major proportion of the available time for this assignment was allocated to locating and
evaluating existing products offering solutions for the specified requirements. This chapter
provides detailed information on the key features of the evaluated products.
It was found that some replication products only cover the simple case of master-slave
replication and therefore do not fit the requirements of this assignment. Others do have
master-master concepts but most of them only work if the second instance of the database
is always available and are not able to accommodate a dynamic set of master databases.
The evaluation below focuses on Oracle Workspace, Daffodil Replicator and Hibernate and
provides an insight into how these products can be used to solve the master-master
replication problem, covering issues such as merging, conflict resolution etc.

5.1 Oracle Workspace


Since version 9i, the Oracle DBMS has built-in support for long-running user
transactions through the use of stored procedures. The basic operations are to create,
switch to, refresh, merge back and finally delete a Workspace. The following code
illustrates how this can be realised using Workspace functionality.

5.1.1 Example
Session 1                                           Session 2
execute DBMS_WM.EnableVersioning('emp');
execute DBMS_WM.CreateWorkspace('NEWWORKSPACE');
execute DBMS_WM.GotoWorkspace('NEWWORKSPACE');
update emp set ename='-' || ename;
commit;
select ename from emp;
                                                    select ename from emp;
execute DBMS_WM.MergeWorkspace('NEWWORKSPACE');
                                                    select ename from emp;
                                                    update emp set ename=substr(ename,2);
                                                    commit;
execute DBMS_WM.RefreshWorkspace('NEWWORKSPACE');
select ename from emp;
execute DBMS_WM.RemoveWorkspace('NEWWORKSPACE');

Example 1


5.1.2 Conflict resolution


What happens when user A updates (and commits) the same record in their Workspace as
user B does at the same time? When user A merges the changes into the Workspace 'LIVE',
nothing special happens: the change is successfully stored (assuming there is no
constraint violation) and everything looks as expected. When user B then merges this table, a
conflict becomes visible in the view <table name>_CONF and the merge aborts. The
conflicting rows now have to be handled manually by invoking DBMS_WM.BeginResolve,
DBMS_WM.ResolveConflicts and DBMS_WM.CommitResolve. Finally a second merge
must be performed. Because new conflicts may occur in the meantime, additional merges
might be required.
The following example illustrates resolving a conflict in a very simple manner.
Session 1                                           Session 2
--- assuming the NEWWORKSPACE exists
execute DBMS_WM.GotoWorkspace('NEWWORKSPACE');      update emp set ename='X' || ename;
update emp set ename='*' || ename;
commit;                                             commit;
execute DBMS_WM.MergeWorkspace('NEWWORKSPACE');
--- merge fails because
--- session 1 & 2 updated the records
--- overwrite the changes
--- with the ones of NEWWORKSPACE
execute DBMS_WM.BeginResolve('NEWWORKSPACE');
execute DBMS_WM.ResolveConflicts('NEWWORKSPACE', 'emp', 'empno>=0', 'child');
execute DBMS_WM.CommitResolve('NEWWORKSPACE');
execute DBMS_WM.MergeWorkspace('NEWWORKSPACE');
                                                    select ename from emp;

Example 2

More information is available in [RES-ORACLE-WS].


5.1.3 Conclusion of Oracle's Workspace


Oracle's solution meets almost all requirements. The stored procedures of the
package dbms_wm make it possible to manage two different states of the data in a database.
As long as all users use the same database and different workspaces, the workspace concept
works very well. It becomes harder when each user has their own database. In this case the
central repository has to reference the remote database. Additionally the merge becomes
more complex: copy operations must be performed from the remote table into the central
table in a different workspace and vice versa to be able to perform the workspace
operations in single tables.
However there is one major drawback: the price! Therefore Oracle Workspace was not an
option for my solution.


5.2 Daffodil Replicator


Daffodil Replicator is a Java-based data replication tool available in two versions: an open
source and an enterprise version. It offers the following features:
• Bi-directional Data Synchronization
• Supports two merge strategies, merges single columns of a row
• Supports replication across heterogeneous databases
• Conflict detector and resolution
• Partial data (Tables, Rows and column) Replication
• Large datatype support
• Scheduling
• Platform independent synchronization
• Debugging

This product supports bi-directional data replication by either capturing a data source
snapshot or by synchronizing the changes. It monitors for data changes in the tables and
synchronizes all data changes made by the subscriber and the publisher on a periodic
basis or on demand by the subscriber. While synchronizing with one or more target data
sources, Replicator uses pre-defined conflict resolution algorithms to resolve conflicts
between the publisher and subscriber. The publications and subscriptions are defined
using a GUI or APIs on existing database servers.

Figure 2

Source: [RES-DAFFODIL]


5.2.1 Testing Daffodil Replicator


I wrote a small Java tool to test Daffodil Replicator. To simulate a production environment,
two processes run separately and perform db manipulations: one is the publisher, one the
subscriber. The subscriber, which also is the test driver, communicates via a socket to the
publisher to generate commands. This setup helped to create simple and complex tests.
Unfortunately Daffodil Replicator in the current version (2.1) has problems deleting rows.
The following exception is thrown after a course with its subscriptions was deleted in the
publisher database and the synchronize method called.
Caught exception: com.daffodilwoods.replication.RepException:
Problem in synchronizing data due to -- 'DELETE on table 'COURSE' caused a
violation of foreign key constraint 'SQL060919051050792' for key (2). The
statement has been rolled back.'.

The problem only occurs when a course row is deleted on the publisher side and there is no
clear reason why.
A minor issue is that Daffodil Replicator adds triggers to tables, and sometimes in my
tests the replication completed without actually performing any changes on the other side.
The reason for this was that the test cases drop and recreate tables for a clean test setup,
which caused the deletion of the triggers. Therefore tables should never be dropped and
recreated because Daffodil will not recreate the triggers.

5.2.2 Conclusion
It was quite easy to use the open source edition of Daffodil Replicator. Apart from the
problem with deletes, the replication process works very well. One useful feature is that
the smallest unit of the merge operation is a single cell and not an entire row.
The detection of the delete bug rendered this product in its current version unusable for my
application.


5.3 Hibernate
Hibernate is an open-source object-relational mapping framework for Java. It is able to
create a database schema to persist the Java objects and to query the database with
either SQL or HQL (Hibernate's object-oriented query language, which resembles SQL).
Model classes have to be annotated with @Entity (and optionally @Table) and fields with
@Id, @Basic, @OneToMany etc.
Alternatives to annotated classes exist but have not been considered in this research.
The following code illustrates the usage of model classes with annotations from the
javax.persistence package:
package model;

import java.util.ArrayList;
import java.util.List;

import javax.persistence.*;

@Entity
public class Student extends BaseEntity {

    // mandatory column
    @Column(nullable = false)
    private String name;

    // every operation on a student cascades to the address
    @ManyToOne(cascade = CascadeType.ALL)
    private Address address = new Address();

    // deleting a student also deletes the student's subscriptions
    @OneToMany(targetEntity = Subscription.class,
               cascade = {CascadeType.REMOVE}, mappedBy = "student")
    private List<Subscription> subscription = new ArrayList<Subscription>();

    // [...]
}

Hibernate can automatically create or alter tables of model classes.

5.3.1 What does Hibernate offer?


Database access is performed via the Session interface, which contains methods to save,
update, saveOrUpdate and delete objects as well as to obtain a transaction or to query data.
The following extract illustrates how a student is persisted.
Session s = ...;
Student student = new Student();
student.setName("Merki");
// [...]
s.saveOrUpdate(student);

Apart from this basic database access the Session also has support for replication. The
method replicate can take an object from another database and persist it into the current
database. To maintain key constraints, Hibernate keeps the primary key id even if a
unique key generator is used. It works best when using the UUID key generator. It is very
important to have unique keys across more than one database; therefore UUIDs are
considered reasonable.
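
To illustrate why UUID keys suit a multi-database setup, the following minimal sketch lets two simulated databases assign keys independently, using only the standard java.util.UUID class. The class UuidKeyDemo and its methods newKey and collisions are hypothetical names chosen for this illustration; they are not part of Hibernate, whose UUID generator may build its identifiers differently.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical demo class; not part of Hibernate or the replication framework.
class UuidKeyDemo {

    // Generates a random UUID as a 36-character string key.
    static String newKey() {
        return UUID.randomUUID().toString();
    }

    // Counts keys that appear in both sets, i.e. primary key collisions.
    static int collisions(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<String>(a);
        common.retainAll(b);
        return common.size();
    }

    public static void main(String[] args) {
        Set<String> local = new HashSet<String>();
        Set<String> central = new HashSet<String>();
        // Each "database" assigns keys without consulting the other.
        for (int i = 0; i < 1000; i++) {
            local.add(newKey());
            central.add(newKey());
        }
        System.out.println("collisions: " + collisions(local, central));
    }
}
```

With sequence-based integer keys, both databases would hand out 1, 2, 3 and so on, and every independently inserted row would collide on replication. Random UUIDs make such collisions extremely improbable without any coordination between the databases.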
In contrast to Oracle's Workspace, Hibernate is able to manage the full object relationship
model and will replicate related objects or cascade deletes to child objects.


5.3.2 Replication modes


When objects are replicated by invoking Session.replicate(), a replication mode can be
passed to tell Hibernate what to do when a conflict is detected.
The replication modes are:
EXCEPTION        Throw an exception when a row already exists.
IGNORE           Ignore replicated entities when a row already exists.
LATEST_VERSION   When a row already exists, choose the latest version.
OVERWRITE        Overwrite existing rows when a row already exists.
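
The behaviour of the four modes can be summarised in a small decision function. The sketch below is a simplified model written for this document, not Hibernate's implementation: the enum constants mirror the mode names listed above, while the class ReplicationModeDemo, the method shouldWrite and the use of plain integer versions are assumptions made for illustration.

```java
// Simplified model of the four replication modes. The constant names mirror
// the modes listed in the text; the logic is an illustrative assumption.
class ReplicationModeDemo {

    enum Mode { EXCEPTION, IGNORE, LATEST_VERSION, OVERWRITE }

    // Returns true if the incoming row should be written to the target.
    // existingVersion is null when no row with that id exists yet.
    static boolean shouldWrite(Mode mode, Integer existingVersion, int incomingVersion) {
        if (existingVersion == null) {
            return true; // no conflict: a plain insert in every mode
        }
        switch (mode) {
            case EXCEPTION:
                throw new IllegalStateException("row already exists");
            case IGNORE:
                return false;
            case LATEST_VERSION:
                return incomingVersion > existingVersion;
            case OVERWRITE:
                return true;
            default:
                throw new AssertionError(mode);
        }
    }

    public static void main(String[] args) {
        // An existing row has version 3, the replicated object has version 2.
        for (Mode mode : new Mode[] { Mode.IGNORE, Mode.LATEST_VERSION, Mode.OVERWRITE }) {
            System.out.println(mode + ": " + shouldWrite(mode, 3, 2));
        }
    }
}
```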

5.3.3 What is missing in Hibernate?


Hibernate offers less functionality than the previous products. Each object needs to be
replicated individually; no method to replicate everything exists. If an object was deleted
and the delete has to be replicated, different logic and methods must be called.
When UUIDs are not used, the framework has difficulty informing the programmer of
replication problems. This issue is neither solved nor documented yet, but it must be
considered when using Hibernate for replication.


5.4 Others

5.4.1 Microsoft SQL Server


Microsoft SQL Server offers a 'Merge Replication' feature which sounds promising but is
quite expensive to use because it is part of MS SQL Server. Details about Microsoft's
solution can be found at [RES-MSSQL]. The free version cannot be used in commercial
applications.

5.4.2 Slony I / II
Slony-I is a "master to multiple slaves" replication system supporting cascading and
slave promotion.
[..]
But Slony-I, by only having a single origin for each set, is quite unsuitable for really
asynchronous multi-way replication. For those that could use some sort of
"asynchronous multi master replication with conflict resolution" akin to what is
provided by Lotus Notes™ or the "syncing" protocols found on Palm OS systems,
you will really need to look elsewhere. These sorts of replication models are not
without merit, but they represent different replication scenarios that Slony-I does not
attempt to address.
(Source: [RES-SLONY])

It looks as if Slony would not fit my requirements because it only provides “single origin for
each set”. Therefore I did not look more deeply into it.
Nevertheless the following paragraph states very well the issues of conflict resolution:
Some async multimaster systems try to resolve conflicts by finding ways to apply
partial record updates. For instance, with an address update, one user, on one
node, might update the phone number for an address, and another user might
update the street address, and the conflict resolution system might try to apply
these updates in a non-conflicting order.
Conflict resolution systems almost always require some domain knowledge of the
application being used.
It is absolutely true that domain specific knowledge is needed and that a general conflict
resolving mechanism does (most likely) not exist.


5.5 Conclusion
Because the primary goal of this assignment is a solution to replicate multiple master
databases within Java applications and no standard product fits the requirements exactly,
a custom solution must be developed. Hibernate was chosen as the basis for this solution.
The reasons for this choice:
● Successful proof of concept
● Database independent (JDBC)
● Free, open-source and 100% pure Java
● No additional server is required
● No additional database mapping required (This was already created for the course
application. No redundancy, reduces the number of required changes when adding,
renaming, removing columns or tables.)


6 Design and implementation of replication with Hibernate


In this chapter the techniques and the code to implement a generic replication framework
using Hibernate are described.

6.1 Methods of replication


The replication process can be designed in a number of different ways:
Data manipulations could be captured in order to perform the replication later. This is
preferably achieved by the use of triggers or, better still, by using Hibernate's trigger
alternative (the Interceptor interface).
As a first approach, one could record changes in a transaction log like database systems
do. This log would include all inserts, updates and deletes. On synchronisation, the
transaction log of one db has to be applied to the other db. This can become very complex
since the other db might already have undergone change. Updating deleted rows or
inserting child records without a parent row will most certainly occur.
A second approach would be to update a version column on modification. On
synchronisation, the replicated ids (rows), the system id and the current version would be
stored in a replication table and kept for the next replication. The system id would be a
unique identifier across all databases and enables the central database to support more than
one replicated database. Once a row is deleted, the corresponding id is still available in
the replication table and therefore it can be determined whether a row has to be inserted or
deleted.
The second approach (the version column) simplifies the replication because the whole row
will be replicated, whereas merging a transaction log is not very simple. Therefore the
second approach was chosen.


6.2 Replication Framework


The framework should be designed in such a way that the calling application has limited, if
any, knowledge of the replication process. The following code will start the process:
Replication.mergeAll(false);

The boolean argument freshDownload is used to determine if a cleanup prior to the
replication is required (see chapter 4, Use Case). This corresponds to the snapshot
operation used by the Daffodil Replicator.
The following class diagram provides an overview of the core classes of the developed
framework.

Figure 3

6.3 Algorithm
After performing the cleanup, if freshDownload is set to true, the order of the classes to be
processed is evaluated and stored in a list. Classes which do not reference other model
classes are at the start of this list, while classes which reference other model classes come
after the classes they reference. The algorithm to generate this ordering from the
dependency graph is explained in section 6.6.
For each class in the list a replication object is created. In the construction phase the
following query is performed on both databases:
select x.id, (select r.replicatedVersion from ReplicationVersion r where
r.id=x.id and r.system=:SYSTEM), x.version from <tablename> x order by id

Now the results are processed simultaneously to determine if a row has to be inserted,
updated or deleted in one or the other database. Because both results are ordered by id,
the algorithm is quite simple.
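
The simultaneous processing of two id-ordered results can be sketched as a classic sorted merge. The following example is a simplified illustration, not the framework's actual code: the class SortedMergeDemo and the method classify are hypothetical, the ids are plain strings, and it is assumed that both databases sort ids in the same order as String.compareTo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of walking two id-ordered result sets in parallel.
class SortedMergeDemo {

    // Classifies every id as present only in A, only in B, or in both.
    // Both input lists must be sorted ascending and free of duplicates.
    static List<String> classify(List<String> idsA, List<String> idsB) {
        List<String> result = new ArrayList<String>();
        int a = 0;
        int b = 0;
        while (a < idsA.size() || b < idsB.size()) {
            int cmp;
            if (a == idsA.size()) {
                cmp = 1;   // A exhausted: remaining ids exist only in B
            } else if (b == idsB.size()) {
                cmp = -1;  // B exhausted: remaining ids exist only in A
            } else {
                cmp = idsA.get(a).compareTo(idsB.get(b));
            }
            if (cmp < 0) {
                result.add(idsA.get(a++) + ": only in A");
            } else if (cmp > 0) {
                result.add(idsB.get(b++) + ": only in B");
            } else {
                result.add(idsA.get(a) + ": in both");
                a++;
                b++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> local = Arrays.asList("a1", "b2", "d4");
        List<String> remote = Arrays.asList("b2", "c3", "d4");
        for (String line : classify(local, remote)) {
            System.out.println(line);
        }
    }
}
```

Each id is visited exactly once, so the cost is linear in the total number of rows. Whether an id found on only one side means an insert or a delete cannot be decided from the ids alone; that requires the stored replicated version.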


The next step is to perform the delete operations for each replication object. It is important
to perform the delete operations before the insert operations, otherwise newly inserted
objects may conflict with objects whose deletion has not yet been replicated.

6.4 Additional replication table


Because delete operations are allowed on both sides (local and remote) and should be
managed, the replication process must remember what has already been replicated. If a
record exists on side A, but is missing on side B, the program must determine if the record
has to be deleted on side A or inserted on side B.
The class ReplicationVersion holds the following information about a replicated object: the
uuid of the replicated object, the system code to enable multiple master databases and the
replicated version id. Consider the example where a record exists on side A but is missing
on side B. If the corresponding ReplicationVersion does not yet exist, the record must be
inserted on side B. If it does exist, the record must be deleted on side A, since it was once
replicated and then deleted on side B.
The ReplicationVersion can also be used to determine which side was updated. The
object's current version is compared with the one in the ReplicationVersion. If it is not
the same, the object needs to be replicated to the other side.
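
The rules above can be condensed into a single decision per row. The following sketch is an illustration under simplifying assumptions, not the framework's code: the class ReplicationDecisionDemo, the Action enum and the method decide are hypothetical, versions are plain integers, and a single replicated version per row stands in for the ReplicationVersion entry.

```java
// Hypothetical sketch of the per-row decision; not the framework's actual code.
class ReplicationDecisionDemo {

    enum Action { NONE, COPY_A_TO_B, COPY_B_TO_A, DELETE_IN_A, DELETE_IN_B, CONFLICT }

    // versionA / versionB: current row version on each side, null if the row is missing.
    // replicated: version stored at the last replication, null if never replicated.
    static Action decide(Integer versionA, Integer versionB, Integer replicated) {
        if (versionA == null && versionB == null) {
            return Action.NONE; // the row is gone on both sides
        }
        if (versionB == null) {
            // Never replicated: new row on A. Replicated before: it was deleted on B.
            return replicated == null ? Action.COPY_A_TO_B : Action.DELETE_IN_A;
        }
        if (versionA == null) {
            return replicated == null ? Action.COPY_B_TO_A : Action.DELETE_IN_B;
        }
        // The row exists on both sides: compare each side with the replicated version.
        boolean changedA = !versionA.equals(replicated);
        boolean changedB = !versionB.equals(replicated);
        if (changedA && changedB) {
            return Action.CONFLICT; // needs domain-specific resolution
        }
        if (changedA) {
            return Action.COPY_A_TO_B;
        }
        if (changedB) {
            return Action.COPY_B_TO_A;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        System.out.println(decide(1, null, null)); // new row on side A
        System.out.println(decide(1, null, 1));    // replicated once, then deleted on B
        System.out.println(decide(2, 1, 1));       // updated on side A only
        System.out.println(decide(2, 3, 1));       // updated on both sides
    }
}
```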

6.5 Conflict resolution


What if on both sides the object's current version id differs from the ReplicationVersion?
Such a case can occur when users on both sides are updating the same records. In the
course application where this framework will be used it is wise to overwrite the local
changes from the administrator with the ones from the customers because the customer
might make changes which should not be lost. In other situations the overwrite might be
performed in the opposite direction or might even need to be decided on a one-by-one basis.
As already discussed, conflict resolution requires domain-specific logic.
Currently the replication code contains no hook or callback method to allow this, but it could
easily be extended when needed.
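
One possible shape for such an extension is a small strategy interface that the application implements. Everything in the following sketch is hypothetical: the interface ConflictResolver, the enum Winner and the two sample policies do not exist in the framework described here and only indicate how domain-specific logic could be plugged in.

```java
// Hypothetical callback; the framework described here has no such hook yet.
class ConflictHookDemo {

    enum Winner { LOCAL, REMOTE }

    // Application-specific strategy that decides which side wins a conflict.
    interface ConflictResolver {
        Winner resolve(Class<?> entityType, String id, int localVersion, int remoteVersion);
    }

    // Policy of the course application: the customers' (remote) changes always win.
    static final ConflictResolver REMOTE_WINS = new ConflictResolver() {
        public Winner resolve(Class<?> entityType, String id, int localVersion, int remoteVersion) {
            return Winner.REMOTE;
        }
    };

    // Alternative policy for illustration: the side with the higher version number wins.
    static final ConflictResolver HIGHER_VERSION_WINS = new ConflictResolver() {
        public Winner resolve(Class<?> entityType, String id, int localVersion, int remoteVersion) {
            return localVersion > remoteVersion ? Winner.LOCAL : Winner.REMOTE;
        }
    };

    public static void main(String[] args) {
        System.out.println(REMOTE_WINS.resolve(Object.class, "some-uuid", 5, 2));
        System.out.println(HIGHER_VERSION_WINS.resolve(Object.class, "some-uuid", 5, 2));
    }
}
```

Passing the entity type and the id to the callback would allow different policies per table, or a per-record decision, without changing the replication code itself.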


6.6 Dependencies
To be able to perform the replication, the relationships between entities must be evaluated.
No child record can be inserted unless the corresponding parent record exists.
The framework should determine how the classes, which are mapped to database tables
by using annotations, are related to each other. The following Java code performs a
dependency sort. A list of classes which require replication is passed in. The method
returns a list of classes where classes without references to other classes are at the top of
the list followed by classes with references to already processed classes.

Figure 4

This class diagram can be converted into the following list: B, E, C, A, F, D


It is very important to perform merge and insert operations in the correct order (no child
record shall exist without a parent record).
Initially the idea was to perform the deletes in the opposite order. But due to Hibernate's
cascading of deletes, this was not required: child records may already have been
deleted when their parent was deleted.
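import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Sorts the given model classes so that classes without references to other
// model classes come first, followed by classes that only reference classes
// which are already in the list.
public static List<Class> getClassStack(List<Class> l) {
    // adjacency list: class -> model classes it references
    Map<Class, List<Class>> graph = new HashMap<Class, List<Class>>();
    for (Class clazz : l) {
        graph.put(clazz, getClassStack(clazz));
    }

    List<Class> classStack = new ArrayList<Class>();
    while (!graph.isEmpty()) {
        int sizeBefore = classStack.size();
        // move every class without remaining dependencies to the result
        for (Iterator<Class> iterator = graph.keySet().iterator();
             iterator.hasNext();) {
            Class clazz = iterator.next();
            List<Class> list = graph.get(clazz);
            if (list.isEmpty()) {
                classStack.add(clazz);
                iterator.remove();
            }
        }
        // cycles are not supported: fail instead of looping forever
        if (classStack.size() == sizeBefore) {
            throw new IllegalStateException(
                    "Cyclic or unresolved dependency: " + graph.keySet());
        }
        // classes already in the result no longer block the remaining ones
        for (List<Class> list : graph.values()) {
            list.removeAll(classStack);
        }
    }

    // remove the ones mapped by CascadeType.ALL:
    classStack.remove(DBFileBuffer.class);
    classStack.remove(DBFile.class);
    classStack.remove(Adresse.class);

    return classStack;
}

// Returns the model classes (subclasses of BaseEntity) referenced by the
// fields of the given class, including inherited fields.
public static List<Class> getClassStack(Class c) {
    List<Field> f = new ArrayList<Field>();
    addFields(f, c);
    List<Class> ret = new LinkedList<Class>();
    for (Field field : f) {
        if (BaseEntity.class.isAssignableFrom(field.getType())) {
            ret.add(field.getType());
        }
    }
    return ret;
}

// Collects the declared fields of the class and of all its superclasses.
private static void addFields(List<Field> fields, Class c) {
    fields.addAll(Arrays.asList(c.getDeclaredFields()));
    if (c.getSuperclass() != null) {
        addFields(fields, c.getSuperclass());
    }
}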

The model is an adjacency list or, in an object-oriented manner, a map where the key is a
class and the value is the list of classes it depends on.
Classes which have no dependency are added one by one to the sorted list. When a class
is added because it has no more dependencies, it is removed from the dependency
lists of the other classes. Remark: this algorithm only works if no cycle exists, because
classes in a cycle always have a dependency on other classes. Cyclic dependencies are
very unlikely in a database model and are therefore not considered in this solution.
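
The cycle limitation can be made explicit instead of leaving the loop running forever. The following is a minimal sketch of the same idea, not the framework's code: the names DependencySort and sort are hypothetical, the graph is keyed by plain strings rather than entity classes, and the relationships in main are assumed for illustration only. Dependencies on nodes that are not part of the graph are ignored, which is a design choice of this sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DependencySort {

    /**
     * Returns the keys of graph ordered so that each key follows all of its dependencies.
     * graph maps a node to the nodes it depends on; the map itself is not modified.
     * Throws IllegalStateException if the dependencies contain a cycle.
     */
    static <T> List<T> sort(Map<T, List<T>> graph) {
        // working copy: node -> dependencies that are not sorted yet
        Map<T, Set<T>> open = new LinkedHashMap<>();
        for (Map.Entry<T, List<T>> e : graph.entrySet()) {
            Set<T> deps = new HashSet<>(e.getValue());
            deps.retainAll(graph.keySet()); // assumption: ignore nodes outside the graph
            open.put(e.getKey(), deps);
        }

        List<T> sorted = new ArrayList<>();
        while (!open.isEmpty()) {
            // collect every node whose dependencies are all sorted already
            List<T> ready = new ArrayList<>();
            for (Map.Entry<T, Set<T>> e : open.entrySet()) {
                if (e.getValue().isEmpty()) {
                    ready.add(e.getKey());
                }
            }
            // no progress means the remaining nodes depend on each other
            if (ready.isEmpty()) {
                throw new IllegalStateException("Cyclic dependency among: " + open.keySet());
            }
            for (T node : ready) {
                open.remove(node);
            }
            for (Set<T> deps : open.values()) {
                deps.removeAll(ready);
            }
            sorted.addAll(ready);
        }
        return sorted;
    }

    public static void main(String[] args) {
        // assumed relationships, for illustration only
        Map<String, List<String>> graph = new LinkedHashMap<>();
        graph.put("Subscription", List.of("Student", "Course"));
        graph.put("Student", List.of("City"));
        graph.put("Course", List.of());
        graph.put("City", List.of());
        System.out.println(sort(graph));
    }
}
```

With the assumed relationships, Course and City are emitted first, then Student, then Subscription. A graph in which two nodes depend on each other raises the exception in the first round that makes no progress.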

6.7 Additional unique constraint vs. UUID


When a class contains an additional unique constraint, the replication process as described
will eventually fail. Assume that on both sides a record with the same value for the field
with the unique constraint is inserted. The replication process will see that the object in A
must be replicated to B and vice versa. But the database will abort the insert, because
Hibernate does not consider additionally defined unique constraints when performing the
merge operation. Therefore a time-consuming check was built into the replication process.
For each class with an additional unique constraint (@UniqueConstraint annotation) both
tables are queried. First the two columns for the rows to be inserted are retrieved. Then a
query per row is executed to find a conflicting row. If one is found, the conflict is resolved by
deleting the older row.
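
The check can be pictured with a small in-memory sketch. It is not the replication code: Row, uniqueValue, lastModified and rowsToDelete are hypothetical names, a map lookup stands in for the query per row, and the sketch assumes that every row carries a modification timestamp which decides which row is the older one.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class UniqueConflictCheck {

    /** Hypothetical row: primary key, value of the unique column, modification time. */
    record Row(String id, String uniqueValue, long lastModified) {}

    /**
     * For each row that is about to be inserted into targetTable, looks for a different
     * row with the same unique value and returns the older row of each conflicting pair.
     * On equal timestamps the incoming row is chosen (an arbitrary choice of this sketch).
     */
    static List<Row> rowsToDelete(List<Row> toInsert, Collection<Row> targetTable) {
        // index the target table by unique value (replaces one query per row)
        Map<String, Row> byValue = new HashMap<>();
        for (Row existing : targetTable) {
            byValue.put(existing.uniqueValue(), existing);
        }

        List<Row> older = new ArrayList<>();
        for (Row incoming : toInsert) {
            Row existing = byValue.get(incoming.uniqueValue());
            // the same id on both sides is the same record, not a conflict
            if (existing != null && !existing.id().equals(incoming.id())) {
                older.add(incoming.lastModified() <= existing.lastModified() ? incoming : existing);
            }
        }
        return older;
    }

    public static void main(String[] args) {
        List<Row> localInserts = List.of(
                new Row("uuid-1", "anna@example.org", 100L),
                new Row("uuid-2", "hans@example.org", 150L));
        List<Row> remoteTable = List.of(
                new Row("uuid-9", "anna@example.org", 200L));
        System.out.println(rowsToDelete(localInserts, remoteTable));
    }
}
```

In the example only the row uuid-1 is reported, because it shares its unique value with the newer remote row uuid-9. The same call would be made a second time with the roles of the two databases swapped.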


6.8 Transaction handling


The replication process acts on two databases concurrently. A perfect solution in terms of
committing would be to start a transaction on a central transaction manager, run the
process and finally commit the transaction. The transaction manager would then commit
the changes to both databases and, in case of a problem, roll back both. The
Java API contains a javax.transaction.xa package with the XAConnection and XAResource
interfaces. Database vendors offer implementations of these to communicate with a
transaction manager.
The current implementation of the replication process does not make use of this feature,
but the solution could be extended to use it if required.
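
To illustrate what such a transaction manager does, the commit protocol can be reduced to a few lines. The sketch below is a toy under stated assumptions and does not use the javax.transaction.xa API: Participant, FakeDatabase and commitAll are hypothetical names standing in for the two databases and the transaction manager.

```java
import java.util.List;

final class TwoPhaseCommitSketch {

    /** Hypothetical stand-in for one database taking part in the transaction. */
    interface Participant {
        boolean prepare();  // phase 1: true if the participant is able to commit
        void commit();      // phase 2: make the changes permanent
        void rollback();    // discard the changes
    }

    /** Commits all participants only if every one votes yes; otherwise rolls all back. */
    static boolean commitAll(List<Participant> participants) {
        for (Participant p : participants) {
            boolean ok;
            try {
                ok = p.prepare();
            } catch (RuntimeException e) {
                ok = false; // a failing prepare counts as a "no" vote
            }
            if (!ok) {
                for (Participant q : participants) {
                    q.rollback();
                }
                return false;
            }
        }
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }

    /** In-memory participant that only records what happened to it. */
    static final class FakeDatabase implements Participant {
        final boolean canCommit;
        String state = "active";

        FakeDatabase(boolean canCommit) {
            this.canCommit = canCommit;
        }

        public boolean prepare() { return canCommit; }
        public void commit() { state = "committed"; }
        public void rollback() { state = "rolled back"; }
    }

    public static void main(String[] args) {
        FakeDatabase local = new FakeDatabase(true);
        FakeDatabase remote = new FakeDatabase(false); // simulates a failure on one side
        boolean committed = commitAll(List.of(local, remote));
        System.out.println(committed + ", local: " + local.state + ", remote: " + remote.state);
    }
}
```

Because the remote side votes no in the example, neither database keeps its changes. A real transaction manager additionally has to handle failures during the second phase and recovery after a crash, which this sketch leaves out.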


7 Testing
JUnit, a powerful test framework for Java, was used to verify the functionality and
correctness of the replication framework. This approach helped during development of the
software and additionally allowed quick regression testing.
A very simple data model was used for the tests. The following diagram shows the
relationship between the classes.

The following extract of the test class shows how the tests are written:
public void testSubscription() throws ParseException {
    Session local = DAO.getLocal().openSession();
    Session remote = DAO.getRemote().openSession();

    // create three subscriptions in the local database only
    subscribe(local, "Anna", "Football");
    subscribe(local, "Hans", "Football");
    subscribe(local, "Hans", "Diving");

    // replicate and compare the counters with the expected values
    checkSubscription(0, 0, 0, 3, 0, 0);

    megaSubscriptionTest(local, remote);
    checkSubscription(1, 0, 0, 1, 1, 0);

    initLocalDatabase();

    // same scenario with the roles of the two databases swapped
    megaSubscriptionTest(remote, local);
    checkSubscription(0, 1, 1, 0, 0, 1);

    // the same subscription is created on both sides
    subscribe(local, "Anna", "Diving");
    subscribe(remote, "Anna", "Diving");

    checkSubscription(0, 0, 1, 0, 0, 0);
}

protected void setUp() throws Exception {
    initLocalDatabase();
}

protected void tearDown() throws Exception {
    checkCleanup();
}

private void checkSubscription(int dlStudent, int drStudent, int ulSubscription,
        int urSubscription, int dlSubscription, int drSubscription) {
    // replicate both databases, then check the counters reported for each class
    Replication[] rr = Replication.mergeAll(false);
    check(rr[0], Course.class, 0, 0, 0, 0);
    check(rr[1], City.class, 0, 0, 0, 0);
    check(rr[2], Student.class, 0, 0, dlStudent, drStudent);
    check(rr[3], Subscription.class, ulSubscription, urSubscription,
            dlSubscription, drSubscription);
}

The checkSubscription method replicates the databases and then checks whether the
expected number of records was updated. With this approach it is simple to write complex
test cases where both sides insert, update and delete records, without writing much code.
Please see the code for full details of the test cases.


8 Conclusion
Depending on the requirements, many products are available to perform replication. For
the course application, the Hibernate solution was a good choice (see chapter 5.5).
The complete course administration software, including a homepage and the replication
process described in this document, was successfully deployed to administer more than
700 children in three regions. In this production environment the software worked very
well. A few minor bugs were quickly fixed. The performance of the application was good: a
replication regularly completed within 5-20 seconds. In scenarios where more than 1000
records had to be replicated, e.g. after a mass update, it took up to 5 minutes.
To address this performance issue, I undertook another project in parallel to this
assignment to develop a zipped tunnel solution. Early tests look promising and show that
a 2- to 5-fold reduction in communication load can be achieved. If successful, the
zipped tunnel will be integrated with the work of this assignment and used in the course
administration software.


9 References
[HIBERNATE]
http://www.hibernate.org

[JAVA]
http://java.sun.com

[ORACLE-WS]
http://www.oracle.com/technology/products/database/workspace_manager/index.html

[DAFFODIL]
http://www.daffodildb.com/replicator/dbreplicator.html

[RES-SLONY]
http://developer.postgresql.org/~wieck/slony1/adminguide-1.1.rc1/slonyintro.html

[RES-MSSQL]
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/replsql/repltypes_30z7.asp

[RES-DAFODIL]
http://opensource.replicator.daffodilsw.com/what-is-replicator.html

[RES-ORACLE-WS]
http://www.adp-gmbh.ch/blog/2006/05/09.php
http://www.idevelopment.info/data/Oracle/DBA_tips/Workspace_Manager/WM_1.shtml
