Distributed Database Management Systems in the Modern Enterprise

Jason C. Stollings

Bowie State University

Abstract

This research paper studies the use of distributed database management systems (DDBMSs) in the
information infrastructure of modern organizations. The key purpose of the research is to determine the
feasibility and applicability of DDBMSs for today's business applications. The forces that drove the
selection of this topic were the improvements in distributed features of leading database management systems
(DBMSs) in recent years, as well as the potential of distributed databases to provide competitive advantages
for organizations.

Part one of the paper explores the business and information technology forces driving the development of
DDBMSs. The business forces covered include operating a geographically dispersed organization and
supporting merger and acquisition activity. Information technology forces include
the growth of client-server systems, and the need to integrate data stored on legacy systems.

Part two of the paper studies distributed database systems, first from a general technology perspective and
then from a specific product perspective. The key concepts of distributed database technology are described
and the future of the technology is discussed. The coverage of specific products includes a discussion of the
product offerings from the five major DDBMS vendors and a sample of non-DBMS middleware.

Part three explores the ability of DDBMSs to meet the needs of businesses identified in part one. First, four
short case studies of organizations which have recently implemented distributed databases using DDBMSs
are presented. Second, the paper analyzes the ability of the DDBMS products explored in part two to meet
the specific requirements identified in part one.

Table of Contents

Introduction and Thesis Statement
Forces Shaping Demand for DDBMSs
    Business Forces
    Technology Forces
Distributed Database Systems
    Distributed Database Technologies
    The Future Prospects for DDBMSs
    DDBMSs Currently Available
    Middleware Alternatives to DDBMSs
Ability of DDBMSs to meet Business Requirements
    Distributed Database Cases
    Analysis

Introduction and Thesis Statement

The information requirements of large and medium-size firms and the state of distributed database technology
have both advanced tremendously in recent years. In fact, nearly all modern DBMSs come standard with
powerful distributed features, but these features must be implemented and administered by skilled
professionals. Distributed databases are much more complex than their centralized database cousins, but
when properly implemented in the appropriate enterprise applications, they can provide great benefits to the
organizations they support.

Forces Shaping Demand for DDBMSs

Prior to the popular acceptance of DDBMSs, corporations normally relied on centralized databases designed
to serve very structured information requirements. These centralized databases had some characteristics in
common. First, they ran on powerful and expensive hardware that could handle very large portions of a firm's
data reliably. Second, they were administered by a small number of well-trained people who could manage
the organization's complex mainframe or minicomputer. Third, the dedicated data lines forming the corporate
wide area network (WAN) had to be highly reliable and have a large capacity, because any downtime would
preclude at least one site from operating, and every operation had to be transmitted to and from the central
database in real time. These centralized databases could provide adequate performance to firms able to work
around their shortcomings. These shortcomings included inflexibility in the application of the firm's
information and the creation of a single point of failure for the entire enterprise.

This section explores the lessons learned about the limitations of centralized database systems over the thirty
years they have been in general use. First the business forces are explored. Each of these business issues has
generated information technology requirements that distributed database architectures are uniquely capable of
supporting. Second, the technology issues are explored. These have come about from advances in
information technology that have made the centralized database model less relevant in today's organizations.

Business Forces

Geographic Dispersion

Geographic dispersion of organizations is not an entirely new concept. Large firms have connected major
regional offices to their centralized databases using dedicated lines for years. The difference now is that
geographic dispersion is taken to greater extremes to provide cost savings and improved contact with the
firm's customers. Large regional offices are increasingly replaced with smaller locations in all of the firm's
markets. This change greatly increases the number of dedicated lines which, if provided at the same service
levels as in the older centralized systems, could add up to an enormous expense. Clearly, the traditional
centralized database model creates a problem for firms wishing to benefit by such increased geographic
dispersion.

Another aspect of the geographic dispersion problem is the growing prevalence of portable computer use by
mobile professionals. A common example of this is the travelling salesperson using a laptop-based database
to query available inventory and take customer orders. The nature of this work prevents a full-time network
connection, and the database on the mobile system must somehow be linked to the firm's master database at
regular intervals to update the distributed copies of any data that has been changed. This is another case
where geographic dispersion has rendered the centralized database architecture obsolete.

Geographically dispersed organizations require an architecture that allows the bulk of data retrieval and
updates to be performed on fast and inexpensive local area networks (LANs). This architecture should
reserve the more expensive WAN for data updates that are relevant to other sites. Mobile users should have
a copy of the data for their local use and an efficient means to update using a part-time connection.

Information as a Resource

Business leaders today understand the importance of information as a business resource. With centralized
database systems, an organization's information is maintained and controlled by a few highly skilled individuals
at one location. Two major factors have led many business users to reject the centralized database model: the
natural tendency for humans not to share and the introduction of personal computer (PC)-based DBMSs
powerful enough to handle many concurrent users. Armed with such tools, departments and workgroups can
easily build their own databases, wresting control of the information resource from the administrators of the
organization's central databases and satisfying their natural tendency not to share.

The explosion of individual databases running on PC platforms can provide new opportunities to heads of
departments, but may also pose problems for the organization as a whole. Information that could benefit the
entire organization often becomes out of reach for users unable to access it or unaware of its existence.
Additionally, because of the cheaper hardware and software used, and generally lower skills of the personnel
administering these systems, reliability can be significantly less than with centralized systems. Data
inconsistency is another problem that occurs in such an environment, as the same data is stored in many
databases with no system for managing the multiple copies.

The centralized and decentralized models described above both generate major problems for large
organizations. Some type of architecture that provides the advantages of both without the drawbacks would
be ideal. This architecture should allow decentralized use of data, while providing for database administration
that can be performed by personnel with the interests of the whole firm in mind.

Mergers and Acquisitions

Mergers and acquisitions have become an increasingly common experience for corporations in recent years.
Figures provided by Thomson Financial Securities Data showed year-to-date worldwide merger and
acquisition volume for 1999 at a record $1.3 trillion, a 12.87% increase over the previous year. The premise of this
activity is that a newly created organization will be more competitive as a result of the merger or acquisition.
Financial assets, equipment, and personnel can be fairly easily combined, but combining different information
resources into one can be very challenging, especially when large centralized databases are part of the mix.

Different hardware and software platforms, as well as different database schemas, pose problems for separate
information technology (IT) departments suddenly faced with the task of combining their information
resources.

Recent mergers at Nortel Networks, Bank of America, PricewaterhouseCoopers, and People's Bank
generated significant challenges for the IT executives involved. Their experiences showed that factors such as
accelerated timelines, confusion about the goals of the newly formed organization, and network,
application, and support issues generated the toughest problems (Schachtman, 1999).

Corporations anticipating such activity require an IT infrastructure that will allow easy interconnection with
other firms. A database structure that is easily integrated into others will ease the burden on IT departments of
merging corporations. Such an information infrastructure can also increase the value of a potential takeover or
merger candidate as potential suitors realize the efficiencies this can generate.

Corporate Rightsizing

Modern corporations expand and contract frequently as they respond to changing competitive pressures. A
study of 3,628 companies between 1980 and 1994 by Cascio and Young, reported in Morris, Cascio,
and Young (1999), found that one third had fired at least 15% of their employees during the period of the
study. The study also concluded that in most cases the companies had expanded back to their original sizes, often
within less than three years. Such activity is referred to as corporate downsizing or rightsizing, not to be
confused with the same terms applied to information technology and client-server systems. It is often through
the use of information technology that executives identify such business opportunities and transmit the
decisions and plans to make the changes very rapidly. Ironically, the information technology
resource of an organization is often the least able to respond to such rightsizing decisions.

Centralized databases running on complex and expensive mainframes and minicomputers are usually very
difficult to scale. Adding or removing processing capacity and storage can be expensive and
difficult. Many organizations require a scalable database system that allows administrators to
handle changing demand with nothing more than the incremental purchase or removal of commodity hardware
and software. Such a system should give a growing firm rapid integration into
the existing architecture and a predictable increase in capacity and performance.

Technology Forces

Infusion of PCs and LANs in the Workplace

The stage for distributed databases was set in the 1980s when PCs began to take hold of the corporate
desktop in large numbers. The natural extension of these machines being on many desks was to connect them
using local networks and servers. Office fileservers provided small organizations with decentralized server
power. The culture and infrastructure of corporate computing reflected an increasingly decentralized bias,
driven in many cases by end-users who began to understand and explore the power of decentralized
computing.

Growth of Client-Server Systems

The introduction of large numbers of PCs and networks into the workplace led to the development and
widespread adoption of client-server systems in the 1990s. This development has been significant to the
growth of distributed databases. Distributed databases require processing power at each site where data is
physically located. Processing power is also usually required at each individual workstation to resolve the
complex issues of where and how data should be stored and retrieved in a distributed database environment.
Client-server architectures provide processing power at all locations. In a traditional mainframe architecture,
the combination of processing power and data storage is located at only one site - implementing a distributed
database is not possible.

Although client-server systems are usually identified with distributed data storage, there is no requirement for
data storage to be distributed in client-server environments - data may be centralized on one mainframe,
distributed widely throughout the organization, or anything in between. Many organizations have seen the
ability to move to client-server as an opportunity to replace their expensive mainframe data centers with less
expensive minicomputers and microcomputers. Such a strategy has come to be known as downsizing or
rightsizing. With the cost of good RDBMS software for mainframe systems at around $250,000, the lure of
using smaller Unix-based systems running RDBMSs costing $10,000 is strong (Burleson, 1994, p.91).

The use of client-server has not proven universally successful, however. A 1994 survey by the Gartner Group
(as cited in Applegate, McFarlan, McKenney, 1996, p. 372) estimated the move to client-server to cost an
additional $50,000 to $65,000 per workstation over a five-year period. Many organizations also
underestimate the work involved in training users to operate, and support staff to maintain, these more
complex systems and networks. One estimate places the cost and effort of maintaining the network and
distributed information architecture in client-server environments at 40 percent (Ryan as cited in Applegate,
McFarlan, McKenney, 1996, p. 372). Many IT managers fail to recognize that moving to client-server may
cause their support staff requirements to double or more, especially when many
geographically dispersed sites are involved or the user population is not familiar with the use of client
workstations. The added costs of such systems have soured some firms on the technology and slowed its
implementation. These recent lessons highlight the importance of implementing client-server as a strategic
transformation of the organization's IT architecture, rather than simply a replacement for older technologies.

In the last few years client-server systems have grown to include three or more tiers. These three tiers have
also been referred to as front-end software, back-end software, and middleware (Burleson, 1994, p.79).
Such layered architectures allow separate platforms for data storage, processing of data according to
business rules, and user interface. This promotes code reuse and the ability to change or upgrade individual
platforms while requiring little or no modification to the other layers.
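
To make the separation of tiers concrete, the following minimal Python sketch shows a middle-tier service that enforces a business rule and mediates all database access; the class, table, and column names are hypothetical illustrations, not part of any vendor's API.

    # Hypothetical three-tier sketch: the front end calls only the middle
    # tier, and only the middle tier touches the back-end database.
    class OrderService:                      # middle tier: business rules
        def __init__(self, db_connection):
            self.db = db_connection          # back end: data storage

        def place_order(self, customer_id, item_id, quantity):
            # The business rule is enforced here, not in the user interface.
            if quantity <= 0:
                raise ValueError("quantity must be positive")
            self.db.execute(
                "INSERT INTO orders (customer_id, item_id, qty) VALUES (?, ?, ?)",
                (customer_id, item_id, quantity),
            )

    # The front end (user interface) would call, for example:
    # service.place_order(customer_id=42, item_id=7, quantity=3)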

The three-tiered client-server architecture promotes the implementation of new distributed database systems
by limiting the impact of the changed database architecture on the application as a whole. Most commonly,
the distributed database architecture can be changed without any modifications to workstation code, since the
middle tier handles all interaction with the database. Such multi-tiered client-server systems have helped
overcome the drawbacks of the increased cost of client-server systems by adding to the value they bring to
the organization.

Integration of Legacy Systems

Today's typical medium to large size firm has several databases running on different hardware and software
platforms appropriate to the size of the database and skills of the people using them. In addition, because no
single database model can support every application a firm may require, there may be many types of database
systems in use. Buretta points out that her experience has shown most medium and large size firms have an
average of six distinct hardware and software platforms across which they attempt to maintain data
consistency. This mix usually includes mainframes, Unix-based systems, client-server architectures, decision
support systems, groupware, and more (Buretta, 1997, p. 7). Such heterogeneity has developed into a
demand for a solution that allows firms to combine their data resources while retaining their existing legacy
systems. A popular solution for these organizations is to combine their diverse database platforms into one
distributed database system.

Increasing Demands of the Internet

The growth in Internet use and the explosion of web pages with real-time information have dramatically
increased the demands on business web sites in recent years. Web pages full of dynamic content equate to
large numbers of select and update operations on the database server. Web site administrators have found it
hard to maintain quality, reliable service in such a demanding environment. Another factor is the increased
importance of reliability of the web site for many organizations, especially in e-commerce where even short
outages can have a tremendous impact. A study by Zona Research, Inc. reported in Computerworld
determined that as much as $362.2 million in U.S. e-commerce sales could be at risk each month as a result
of unacceptably slow download times. This was based on the frequency of greater than eight-second page
download times and an average $200 sale per site visit (Dillon & Silwa, 1999). Slow page generation
caused by database server overload is a common cause of this problem.

The ability to operate backup servers with information identical to that contained on the primary server would
be an asset to any business web site where high availability is a critical requirement. Even more advantages
would be gained by having many servers geographically dispersed across the globe to provide users at all
locations with faster response times. With a distributed database system, both of these are possible. Rather
than one heavily-used database server that users may have to connect to from halfway around the globe,
organizations can have distributed servers with the same data using a distributed database.

Distributed Database Systems

Distributed Database Technologies

Distributed database systems are based upon several models and their implementations can include a number
of different features. This variety has developed from the many situations and requirements organizations
face in putting the technology to use. The topics described below are key to understanding the
capabilities and limitations of distributed database systems.

Fragmentation

The fragmentation technique for distributed databases involves splitting the centralized database into portions
and moving them to different locations. This distribution is accomplished by horizontal and/or vertical
partitioning. No data is stored redundantly with the exception of primary keys in the case of vertical
fragmentation. Using the relational model, horizontal fragmentation is accomplished by separating rows and
vertical fragmentation is accomplished by separating columns. Data is normally fragmented according to the
section of the organization which uses or modifies the data most frequently. For example, a firm may use a
department code field to determine which department is responsible for each record and where the data
should physically reside in a horizontally fragmented system.
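
A minimal Python sketch of this routing logic follows; the department codes, site names, and table are invented for illustration, assuming standard DB-API connections to each site.

    # Hypothetical horizontal fragmentation: each department's rows live
    # at exactly one site, so no data is stored redundantly.
    FRAGMENT_MAP = {
        "SALES": "new_york",
        "ENGINEERING": "chicago",
        "FINANCE": "dallas",
    }

    def site_for(record):
        # The department code field determines where the row resides.
        return FRAGMENT_MAP[record["dept_code"]]

    def insert_employee(record, connections):
        # The row is written once, at the owning site only.
        conn = connections[site_for(record)]
        conn.execute(
            "INSERT INTO employee (emp_id, dept_code, name) VALUES (?, ?, ?)",
            (record["emp_id"], record["dept_code"], record["name"]),
        )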

Key principles of the fragmented distribution model are that only one copy of the data exists in the database,
and that ownership and ability to update the database are shared. This model is similar to the centralized
model in that the data is always consistent and current. The only redundancies exist with primary key fields
when using vertical fragmentation. Fragmented database systems are more complex than centralized systems,
but simpler than replicated systems. The issue of a single point of failure is reduced, but not eliminated.
Network usage is generally lower in fragmented systems than in centralized systems.

Replication

Distributed databases can use replication techniques to maintain multiple copies of the same data. Replicated
database systems seek to maintain data consistency while allowing redundancy. There are two basic models
for replicated databases: the master/slave model and the update anywhere model. These replication
techniques can also be combined with fragmentation for a more complex distribution pattern.

The master/slave model maintains a dedicated master for each individual data element of the database. This
master data is used as the source of replication to slave data elements, which are distributed as copies at
multiple locations. These distributed replicas are used exclusively for read-only access; all updates must be
made directly to the master. All replication is assumed to take place asynchronously to the updating of the
master. This means that all of the data replicas are made consistent with the master at some time after an
update is made to the master.

The critical component of any master/slave system is the asynchronous replication service. The goal of the
replication service is to maintain all replicas in a state of consistency with the master within a timeliness
standard required for the replicas to be useful to the application. This is accomplished through complete or
incremental refresh of the master to slaves, or through delta propagation of events to slaves. These are also
commonly referred to as table-based and transaction-based replication techniques, respectively. The choice of
which method to use is one the designer makes, taking into account such factors as the data consistency
requirements of the system, the network's capabilities, and the anticipated volume of updates to the database.
In practice the designer may be constrained, however, since DDBMSs typically employ only one of these two
techniques.
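
The difference between the two techniques can be sketched as follows; the interfaces are assumptions for illustration and are not drawn from any particular product.

    # Table-based (refresh) replication: periodically overwrite the
    # slave's copy with the master's current table contents.
    def table_based_refresh(master, slave, table):
        rows = master.execute("SELECT * FROM " + table).fetchall()
        slave.execute("DELETE FROM " + table)
        if rows:
            placeholders = ", ".join("?" for _ in rows[0])
            slave.executemany(
                "INSERT INTO " + table + " VALUES (" + placeholders + ")", rows)

    # Transaction-based (delta propagation) replication: forward each
    # committed master transaction, in commit order, to every slave.
    def transaction_based_propagation(log, slaves):
        for txn in log.committed_since_last_push():
            for slave in slaves:
                slave.apply(txn)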

Systems using the master/slave model may allow the designation of the master to change from one node to
another node containing a copy of the data. This allows the system some flexibility to respond to changing
load requirements. For example, the system could move the master designation to a node preparing to
perform a large number of batch updates or to an alternate node in the event the master experiences a failure.
However, the basic principle of the master/slave model is always in effect. Only one node can be the master
at any one time.

The update anywhere model, as its name suggests, allows updates to be performed on any copy of the data in
the distributed database - there is no designated master for any data element. Read and update operations
may be performed at any location. Updates are propagated to all copies within the database. The update-
anywhere model complies with C.J. Date's second rule for distributed databases, "no reliance on a central
site".

Replication in the update-anywhere model can be performed either synchronously or asynchronously.
Synchronous replication requires the use of the two-phase commit protocol, and ensures all atomicity,
consistency, isolation, and durability (ACID) properties of transactions. Asynchronous replication may be
used in some cases, with the loss of isolation between transactions operating on identical data at different sites
at the same time. The choice between the two methods is one the system designer must make based upon the
requirements of the system, the capabilities of the network, and the anticipated volume of updates to the
database. The costs of implementing synchronous replication are higher relative to asynchronous replication
due to the increased requirements on the network and the database system hardware.

Asynchronous replication under the update-anywhere model can cause a great deal of complexity when data
consistency must be maintained. No design using this technique can completely prevent data conflicts from
occurring. Conflicts are only detected after they occur, and must be repaired using manual operations or
system-generated compensating transactions. Because the original transactions are sometimes undone, this
violates the concept of durability of transactions. Undoing transactions can have a ripple effect as undoing one
transaction creates referential integrity or other problems for later transactions. Operating update-anywhere
databases with asynchronous replication can be compared to running a centralized database with all locking
mechanisms off - performance is greatly increased, but transactional integrity is often lost (Buretta, 1997,
p.46).
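
A minimal sketch of after-the-fact conflict handling follows, assuming (hypothetically) that each row carries a version number and that each site is assigned a numeric priority.

    # Hypothetical conflict handling for update-anywhere asynchronous
    # replication. Rows are represented as dictionaries for brevity.
    SITE_PRIORITY = {"headquarters": 1, "branch_a": 2, "branch_b": 3}

    def apply_replicated_update(local_row, incoming):
        # A conflict exists when the incoming update was based on an
        # older version of the row than the one stored locally.
        if incoming["base_version"] != local_row["version"]:
            return resolve_conflict(local_row, incoming)
        local_row.update(incoming["values"])
        local_row["version"] += 1
        return local_row

    def resolve_conflict(local_row, incoming):
        # Site-priority rule: the lower-numbered site wins. The losing
        # update is effectively undone, which is the durability
        # violation described above.
        if SITE_PRIORITY[incoming["origin"]] < SITE_PRIORITY[local_row["origin"]]:
            local_row.update(incoming["values"])
            local_row["origin"] = incoming["origin"]
        return local_row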

Concurrency Control

Concurrency control in replicated databases is implemented only in systems using synchronous replication
under the update-anywhere model. Distributed databases which use fragmentation techniques or master/slave
replication do not require concurrency control beyond what is required in traditional centralized database
systems. Distributed databases using asynchronous replication do not implement distributed concurrency
control; instead they rely on conflict detection and compensating transactions.

The two-phase commit protocol is a required component of distributed databases that use synchronous
replication. The two-phase commit protocol consists of the coordinating node issuing a transaction to all
affected nodes, waiting for each node to acknowledge that it is prepared to commit, and then issuing a
commit order to the affected nodes. If not all affected nodes acknowledge that they are prepared to commit
after the first phase, then the coordinating node issues a rollback instruction to all nodes. With the two-phase
commit protocol, there is a short period of vulnerability between commit times at different nodes during the
second phase when data consistency could possibly be lost due to a node or network failure. The probability
of this is extremely small, however, and most DDBMSs do not implement any measures to compensate
(Burleson, 1994, pp. 192-193).
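
The protocol can be sketched in a few lines of Python; the node interface (prepare, commit, rollback) is an assumption for illustration rather than any DDBMS's actual API.

    # Sketch of a two-phase commit coordinator.
    def two_phase_commit(transaction, nodes):
        # Phase one: every affected node must acknowledge that it is
        # prepared to commit.
        if not all(node.prepare(transaction) for node in nodes):
            # Any refusal aborts the transaction at all nodes.
            for node in nodes:
                node.rollback(transaction)
            return False
        # Phase two: order the commit everywhere. A node or network
        # failure inside this window is the brief vulnerability noted
        # above.
        for node in nodes:
            node.commit(transaction)
        return True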

Failure Recovery

One of the advantages of replicated database systems is that they can provide a level of fault-tolerance
beyond what can be achieved through more traditional means such as the use of redundant array of
inexpensive disks (RAID). By replicating the database so that it is on two separate machines in different
physical locations on the network, the probability that failures will cause a loss of service is significantly
reduced. Two options for implementing failure recovery through database replication are available: warm
standby and hot standby.

Warm standby uses asynchronous replication to maintain the standby server in a state nearly consistent with
that of the primary server. Due to the lag between transactions being committed on the primary server and
replication to the standby server, a small number of transactions are normally lost during a primary server
failure and switchover to the standby server.

Hot standby uses synchronous replication to maintain the standby server in a state always consistent with the
primary server. From an availability perspective this is the preferred solution, but the higher costs and
potential lower performance of synchronous replication databases cause many organizations to select a warm
standby solution. Buretta recommends a combination of local hot standby, normally RAID, and offsite warm
standby server (Buretta, 1997, pp. 59-61).
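
A warm standby arrangement amounts to an asynchronous log-shipping loop, sketched below under assumed interfaces; transactions committed after the last shipment are exactly those lost on a failover.

    import time

    def ship_log_to_standby(primary_log, standby, interval_seconds=60):
        # Warm standby: replay committed primary transactions on the
        # standby at intervals, so the standby lags the primary by at
        # most one interval's worth of transactions.
        last_shipped = 0
        while True:
            for txn in primary_log.committed_after(last_shipped):
                standby.apply(txn)
                last_shipped = txn.sequence_number
            time.sleep(interval_seconds)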

OLTP/OLAP

In the earliest days of centralized databases, vendors typically used a one-size-fits-all approach to DBMS
software. As database systems grew, increased in importance to the organization, and began operating in
diverse applications, it became apparent that DBMSs should be specialized. One of the major shifts occurred
when data centers running mainframes with large relational databases optimized for transaction speed began
to notice poor performance during times when reports and queries were processed. It was at this point that
the different requirements of online transaction processing (OLTP) and online analytical processing (OLAP)
became apparent. Some organizations responded to the performance issues by restricting OLAP to late night
and other off-peak times. This is a less-than-ideal solution, as it limits the use of OLAP for the competitive
advantage it should provide, and may be impossible when an organization operates around the clock or in
many time zones.

The solution most commonly implemented now is the establishment of separate data stores for OLTP and
OLAP. Each data store is optimized for the type of services it provides. Buretta (1997) outlines the
distinctions between the requirements of OLTP and OLAP environments. OLTP environments include
relatively short updating transactions; a high number of users sharing either the same physical data store or a
replica with real-time or near-real-time data consistency with the primary sources; and highly normalized data
structures (usually to third normal form). OLAP environments include complex queries that tend to
summarize, consolidate, and apply complex calculations to data; a small number of users sharing data
structures with a data latency level that generally does not need real-time or near-real-time consistency
with the primary source(s); and denormalized data structures with precalculated values that reduce the
number of joins across data structures (Buretta, 1997, pp. 21-24).
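
The contrast can be made concrete with two hypothetical operations; the schemas below are invented for illustration.

    # Typical OLTP operation: a short, targeted update against
    # normalized tables, issued by many concurrent users.
    oltp_update = """
        UPDATE account
        SET balance = balance - 150.00
        WHERE account_id = 40123
    """

    # Typical OLAP operation: a long-running aggregation over a
    # denormalized summary table with precalculated values, issued by
    # a few analysts against a separate data store.
    olap_query = """
        SELECT region, product_line, SUM(monthly_revenue) AS total_revenue
        FROM sales_summary
        GROUP BY region, product_line
        ORDER BY total_revenue DESC
    """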

Security

Implementing effective security in a widely distributed database is no small task. Buretta notes that possible
security services in a multitier architecture include authentication, authorization, nonrepudiation, confidentiality,
and data integrity. Authentication is the process of having each user, host, or application server prove who it
says it is. Authorization is the process of ensuring that each authenticated user has the necessary permission
level to perform the requested tasks. Nonrepudiation is ensuring that authenticated and authorized users may
not deny that they used a designated resource. Confidentiality prevents unauthorized users from accessing
sensitive data. Data integrity prevents data from being modified in an unauthorized manner (Buretta, 1997, pp.
98-99).

Buretta makes some recommendations for implementing security in a replicated database environment. The
first is that all stored and/or displayed passwords must be encrypted so that unauthorized persons and
processes may not obtain them. Pseudo-user accounts, those established for systems to automatically log on
to the network, are common in distributed database environments. Buretta points out that these accounts must
comply with the firm's security policies and knowledge of their passwords should be limited. All file systems,
raw devices, and/or database structures used to store queued data and/or messages must be secure. This
item points out the many avenues in a distributed system available to unauthorized users, which must be
protected. Finally, encryption techniques must be integrated within the replication service. This prevents
interception of the data transmitted over the network (Buretta, 1997, p. 203).

Burleson makes the point that distributed database systems may use either application- or data-level security.
Application-level security, as its name suggests, is programmed into the application logic. Each application is
responsible for governing user access to the data. Data-level security is implemented in the database engine.
Profiles of acceptable data items and operations are stored and checked by the database engine against the
end-user's permission level on each database operation. Burleson recommends that application-level security
be removed and replaced with data-level security to make the distributed database more secure. The
argument for this is that a skilled end-user with a workstation and commonly available development tools
could easily write an application that does not follow the organization's security policy. Such a security hole
may be created either unintentionally by a well-meaning employee or intentionally by someone with malicious
intent. When data-level security is implemented, such security holes are not possible (Burleson, 1994, pp.
208-219).
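
The two approaches can be contrasted in a short sketch; the role names, table, and statements are hypothetical.

    # Application-level security: the check lives in application code,
    # so any tool that bypasses the application bypasses the check.
    def update_salary(user, conn, emp_id, new_salary):
        if user.role != "payroll":
            raise PermissionError("payroll role required")
        conn.execute("UPDATE employee SET salary = ? WHERE emp_id = ?",
                     (new_salary, emp_id))

    # Data-level security: the equivalent rule held in the database
    # engine itself, where it applies to every connection and tool.
    data_level_rules = [
        "REVOKE UPDATE ON employee FROM PUBLIC",
        "GRANT UPDATE (salary) ON employee TO payroll_role",
    ]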

The issues above demonstrate that a good DDBMS must provide security services, and that organizations
must know how to properly implement them. As data is distributed and end-users are given more processing
power, the potential for security problems increases. Organizations with distributed databases must be competent
and vigilant in their execution of security.

The Future Prospects for DDBMSs

DDBMS technology has potential, but its continued growth in popularity is not guaranteed. Just as DDBMSs
grew in popularity alongside client-server computing, their further growth will likely be tied to the fortunes of
client-server. The following section examines the potential for future success or failure of DDBMSs
with a focus on client-server trends as indicators of what the future may hold.

Growth of Internet Computing

Experience with distributed database client-server architectures has shown that the complexity and expense of
these approaches can be overwhelming. As a result, some firms have decided to go back in the direction of
centralized databases. A 1998 study by Computer Economics Inc. reported in Computer Reseller News
found that 37 percent of companies surveyed were using or implementing server-consolidation strategies
(Burbank & Roberts, 1998). Vendors have responded to this demand by providing solutions using servers
based on mainframe or minicomputer platforms and thin clients, which in most cases run only a web browser.
This architecture is sometimes referred to as Internet computing.

Proponents of Internet computing claim that simplifying the distributed components of the architecture and
moving data to one professionally-managed location provides higher reliability and lower operating costs. One
of the original arguments for client-server was the ability to replace character-based terminals with GUI-
based workstations, which are more flexible and easier to use. Internet computing retains the benefits of a
central data store and GUI-based workstations. The benefits of Internet computing may generate stiff
competition for widely distributed database systems.

Immaturity of Client-Server

Although client-server technology has been in widespread use for over a decade, some argue that it is not yet
developed to the level to provide sufficient advantages to businesses implementing new systems. A panel of
industry experts speaking at the Client-Server Leadership Forum in Toronto in 1996 concluded that the
client-server market is still in adolescence (Williamson, 1996). The panel reiterated the popular view that
three-tiered client-server systems are far superior to two-tiered systems. Three-tiered architectures require
more resources to implement, but are generally more scalable and allow for thinner, easier to maintain clients
than two-tiered architectures. The panel saw the client-server industry as immature, due to the low numbers
of three-tiered systems implemented - only 4.9 percent of all client-server systems. Many panel members felt
that until three-tiered systems become the norm, the benefits of client-server architectures cannot be realized.
Some of the members felt that this weakness may lead to client-server being replaced by Internet computing
(Williamson, 1996).

Lack of DDBMS Standards

Of C.J. Date's twelve commandments for a distributed database, four refer to open standards issues required
for DDBMSs to reach their full potential. These are hardware independence, operating system independence,
network independence, and database independence. Today's DDBMS products still do not meet these
four standards.

Burleson described the causes of the lack of open standards in the DDBMS market and the impact this has had
on DDBMS growth (Burleson, 1994, pp. 72-73). Five years later, this remains the case. DDBMS technology is
relatively new, and is still suffering from vendors fighting to develop and hold on to proprietary features.
Today the situation is improving, but cross-vendor connectivity is sometimes limited, especially for legacy
systems that do not implement newer standards.

DDBMSs Currently Available

Several years ago finding the right tools for implementing distributed databases was a challenge due to the
lack of robust DDBMSs available. Today distributed features are common in the latest DBMS offerings from
all major vendors. In fact, it is rare for DBMSs with distributed features to be referred to as "distributed"
databases at all - the feature is so prevalent that it does not distinguish one product from another. The major
differences between products now are the technical details of how the data distribution is performed and the
special features the DBMS provides.

The DBMS market is fiercely competitive, with no one vendor dominating completely. According to
Dataquest Inc. figures reported in Computer Reseller News, the 1998 database license revenue leaders
were IBM with 32.3%, Oracle with 29.4%, Microsoft with 10.2%, Informix with 4.4%, and Sybase with
3.5% (Willett, 1999). IBM's lead is due primarily to its dominance in the mainframe and AS/400 platforms;
on all other platforms Oracle is the leader. This section first provides an overview of features common among
today's leading distributed database product offerings. Later it examines the differences between each
vendor's products, as well as their diverse strategies, to provide an indication of the offerings that may
become available in the future.

Modern Distributed Features

In today's DBMS market, it would be difficult for a product to compete without distributed features, as they
are frequently used and nearly all products have them in some form. The most common distribution model
employed in today's DBMSs is update anywhere asynchronous replication. Other forms of replication are
also popular, while the purely fragmented model is not generally supported. Some products allow multiple
replication models to be used simultaneously, such as update anywhere synchronous and asynchronous
replication. Combinations of replication and fragmentation are also usually allowed. Most products provide
sophisticated replication management features to define replication rules, detect and handle conflicts, schedule
updates of occasionally connected systems, and other tasks. Typically this comes in the form of a replication
manager with a graphical user interface.

All of the major DBMS vendors offer products for multiple platforms. This is a common means of providing
scalability. Often a database can be moved from a version running on a single central processing
unit (CPU) Pentium server to a multi-CPU minicomputer with very few modifications by using different
versions of the same vendor's DBMS. This scalability feature is enhanced by the ability of all major DBMSs
to replicate between different platform versions of the same software. Cross-vendor replication is also
possible, though normally only for DBMSs that support standards such as open database connectivity
(ODBC), or those that are designed to work with another specific DBMS.

One of the reasons for using distributed database systems is to allow small portable systems to use a copy of
the database, normally a subset, for a mobile worker who connects to the network on an infrequent basis. To
support this, most vendors offer small single user versions of their databases for laptop, palmtop, or even
smaller devices. These single user versions can be thought of as fat clients that know how to send and receive
updates to the master database when a connection is available.
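
A fat-client synchronization cycle for such an occasionally connected system might look like the sketch below; the method names are assumptions, not any vendor's API.

    def synchronize(local_db, master, network):
        # Occasionally connected client: queue changes locally, then
        # exchange deltas with the master when a link is available.
        if not network.is_connected():
            return
        # Push locally queued updates up to the master database.
        for change in local_db.pending_changes():
            master.apply(change)
        local_db.clear_pending()
        # Pull down the subset of master data this user subscribes to.
        for change in master.changes_for_subscriber(local_db.subscriber_id):
            local_db.apply(change)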

Security in modern DDBMSs is normally implemented using data-level techniques. Often this employs
validation, authentication, and encryption built into modern network operating systems. These DBMSs
normally allow administrators to assign access privileges for particular data elements and operations. With
these systems, security of the distributed database is as much a responsibility of the network administrator as
the database administrator.

Object-oriented databases do not figure prominently in the distributed database market. As with stand-
alone DBMSs, the relational model is by far the most popular. These DDBMSs do, however, offer some
object-oriented features within the relational model. Such products are typically referred to as object-
relational DBMSs. They provide the ability to store and access data types such as sound and video.
These features may be significant to a distributed database designer, as these large data types can generate a
large load on the network when propagated.

Microsoft

Microsoft offers RDBMS software with very robust distributed features. Microsoft's Structured Query
Language (SQL) Server 7.0, targeted for enterprise application servers, and Access 97, targeted for the
desktop environment, both include replication services in the basic package. While the distributed features in
Access are somewhat limited, many are unique in the desktop environment. SQL Server, the flagship
Microsoft product, provides support for nearly every type of data replication used today.

These products reflect Microsoft's strategy, which is focused on growing the demand for robust Windows-
based PCs. Microsoft sees the Internet computing move to thin clients as a threat. To support its strategy and
counter the Internet computing movement, Microsoft has designed its DBMSs to be distributed widely in an
organization's information infrastructure. As a result, SQL Server and Access run on Windows 95/98 or NT
PCs and servers, and provide robust replication features that support very widely distributed databases.
Microsoft is the only major DBMS vendor selling DBMSs for only its own operating system platforms.

SQL Server offers replication techniques that Microsoft refers to as snapshot replication, transactional
replication, and merge replication. Snapshot replication is a form of asynchronous table-based replication.
Transactional replication is a form of asynchronous transaction-based replication in which committed
transactions are propagated to subscribing sites. Merge replication is another form of asynchronous
table-based replication for occasionally connected systems that require a high degree of autonomy. Under all
asynchronous models, conflicts in replication are resolved by either a numeric priority assigned to each site or
a user-defined custom procedure (Replication for Microsoft SQL Server 7.0, 1998).

Microsoft's distributed database solutions also provide a fair degree of interoperability with DBMSs from
other vendors using Microsoft's ODBC and OLE DB standards. Replication both to and from heterogeneous
data sources is supported. The catch here is that this interoperability functions only with Windows operating
systems and with DBMSs that conform to the Microsoft standards.

Oracle

Oracle, like Microsoft, offers DBMSs in both server and workstation versions with powerful replication
features. Oracle's 8i is available for many platforms ranging from Unix-based minicomputers down to
Windows-based single-CPU Pentium systems. Oracle 8i Lite is a small-footprint Java and Hypertext Transfer
Protocol (HTTP) based client object-relational database that comes in versions for Windows, Windows CE,
Palm OS, and other mobile platforms.

Oracle's strategy follows the Internet computing movement. The 8i DBMS is rich with support for networked
architectures and distributed data. At first glance, the distributed features of 8i seem very similar to SQL
Server. In fact the strategy differs in that Oracle sees advantages in using fewer servers rather than more.
Oracle 8i will run on small Windows NT servers, but performs best in a distributed enterprise environment
with only one or a handful of Unix servers. In an interview with Bull and Vizard in InfoWorld, Oracle
Chairman and Chief Executive Officer (CEO) Larry Ellison explained the relative disadvantages of
Microsoft's distributed data architecture:

Having complicated desktop software spread across your company is bad enough. But server
software requires a lot of talented, expensive labor to manage. If you think client/server is labor-
intensive, wait until you distribute servers all over the place. The small PC servers don't cost as
much as Unix servers, but the labor cost to run them is exactly the same.

Oracle is generally acknowledged to provide superior scalability compared to SQL Server. A recent article in
Computerworld described the situation at three firms that have replaced existing SQL Server infrastructure
with Oracle in an effort to handle increased demands. One such firm, Insurance Holdings of America Inc.,
uses an extranet that allows agents to get quotes and sell policies, originally built on SQL Server 6.5. CEO
Brian McCarthy summed up his experience by saying, "SQL 7.0 is great for nonenterprise applications, but if
you're going to run a heavy transaction-intense platform you've gotta go with IBM DB2 or Oracle." (Deck,
1999)

Asynchronous replication in Oracle 8i is accomplished through what Oracle calls "triggers", which is a form of
transaction-based replication, or through "snapshot refresh", which is a form of table-based replication
optimized to reduce network traffic. Synchronous replication is also available. Combinations of synchronous
and asynchronous replication are possible within the same database. Another replication option in Oracle is
called procedural replication, which allows a predefined batch procedure to be stored at each location and
executed locally on demand, rather than propagated as data updates. This method can significantly reduce
network traffic. Asynchronous replication conflicts are resolved by one of Oracle's built-in conflict resolution
routines or by a user-defined routine (Oracle 8i Advanced Replication, 1999).

Oracle 8i is designed for a high level of integration with 8i Lite. 8i Lite allows small distributed systems to have
access to data, perform asynchronous updates, and perform fast snapshot updates when an Internet
connection is available. With 8i Lite, data can be replicated between the PDA versions and PC versions, but
replication between PCs is not possible - it must be performed with a server running 8i. The column- and
row-level subsetting features of 8i allow a combination of fragmentation and replication - a distributed copy
may be restricted to only a relevant portion of a table fragmented by rows and/or columns.
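
A hypothetical example of such a row- and column-subsetted replica is shown below as a snapshot defining query; the schema is invented, though the general CREATE SNAPSHOT form reflects Oracle's syntax of this period.

    # Hypothetical subsetted snapshot: a regional replica holds only its
    # own region's rows (horizontal subset) and three columns (vertical
    # subset) of the master orders table at headquarters.
    create_west_snapshot = """
        CREATE SNAPSHOT west_orders
        REFRESH FAST
        AS SELECT order_id, customer_id, status
           FROM orders@headquarters
           WHERE region = 'WEST'
    """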

Sybase

The Sybase strategy is focused on middleware and integrating multi-vendor heterogeneous systems into one
distributed database. The Sybase product line is diverse, with separate components available to perform the
database management functions a firm may require. Sybase customers select the components they need to
perform their necessary database functions.

Sybase Adaptive Server Enterprise Replication is the component that adds replication features to Sybase's
flagship Adaptive Server Enterprise 11.9.2. The product offers a great deal of heterogeneous data
replication capability, providing access to more than 25 different enterprise databases. With the addition of
the Mainframe Connect option, replication with large IBM mainframe DBMSs is possible. Detailed
information regarding the replication features of Sybase products is not readily available. Sybase also receives
little attention in industry journals relative to IBM, Oracle, and Microsoft. Buretta notes that at the time of her
writing, Sybase supported only the master/slave model, expecting that each piece of data will have only a single
primary source at any point in time (Buretta, 1997, p. 75).

Informix

Informix Dynamic Server runs on Windows NT and Unix systems and provides some data replication
features. Informix is a relatively small DBMS developer and focuses its resources on producing DBMS
servers with very high performance and reliability. Multithreading, parallel processing, and multi-drive data
fragmentation are prominent Informix features. Informix claims that Dynamic Server supports a full peer-to-
peer replication model with update anywhere capability, but the replication features are marketed principally
as a means for maintaining backup servers that can take over in the event of a primary server failure. Little
information related to Dynamic Server's database distribution features is available from Informix or in recent
industry periodicals.

IBM

IBM's DB2 Universal Database runs on systems ranging from Windows 95 laptops to mainframes. Versions
are available for OS/390, AS/400, AIX, HP-UX, Linux, OS/2, SCO UnixWare, Sun Solaris, and Windows
NT/98/95. IBM DB2 Everywhere, a 50KB embedded version, runs on mobile phones, Palm OS, and
Windows CE devices. IBM markets DB2 as a total solution for any database requirement, covering the full
range of database categories. Databases built using DB2 are scalable from small portable systems to enterprise
systems on mainframes. Extenders for DB2 are available which support storing text, image, audio, video,
spatial, and other data types. DB2 Enterprise-Extended Edition is designed to handle both OLTP and OLAP
concurrently using symmetric multiprocessing on Windows NT, Solaris, and AIX. Support for Internet
computing is provided through Java and Java Database Connectivity (JDBC).

IBM is strongest in its traditional market area of large systems. As a vendor of hardware, it has an interest in
growing the demand for its profitable mainframe and minicomputer systems through the development of very
capable DBMSs, which increase the value of these hardware platforms. By also developing DB2 versions for
other vendors' products, IBM is in a position to market DB2 as a uniquely universal and scalable DBMS.
The brokerage Salomon Smith Barney, which runs DB2 Universal Database on OS/390 mainframes, as
well as on smaller platforms, is a good example of the demand for IBM's popular database. In an interview
with Information Week, Robert Perih, database systems VP, explained that when applications outgrow
Windows NT or Unix, they can be moved to the mainframe without recoding the whole system (Whiting,
1999). Such examples are typical of the demand created by IBM's strength in the mainframe database
market. The high end of this scalability comes with a high price tag, however - Whiting also reported that the
license fee for DB2 Universal Database for OS/390 is $3000 per month.

Data distribution in DB2 is provided through IBM's Data Propagator Relational, which provides very robust
synchronous and asynchronous replication features and also operates on systems from mainframes to laptops.
This product provides replication to and from a wide variety of DBMSs from other vendors, including Oracle,
Sybase, Informix, and Microsoft SQL Server. DB2 DataJoiner is an optional component that allows users to
view and perform SQL queries on heterogeneous distributed databases as if the data were all on a local
relational database server. DataJoiner runs on AIX, HP/UX, and Windows NT. The DB2 Universal
Database Control Center allows GUI-based management of replication. Data Propagator Relational also
provides support for database updating on mobile and other occasionally connected systems. Data
Propagator Relational allows data subsetting, or replication of only a portion of the database to a target. This
functionality has the unique feature of allowing data to be subsetted based on join predicates or subselects,
rather than only field values as other DBMSs do (DB2 Data Propagator Relational, 1997).

Middleware Alternatives to DDBMSs

The large number of firms operating legacy databases that are unable to function together in a distributed
database has generated demand for products that provide connectivity between different database
servers and clients without adding another database platform to existing architectures. Such server-based
middleware products can solve the data transfer problems of many organizations while keeping the demands
on developers and administrators down. A leading example of this product class is OpenLink Software Inc.'s
Virtuoso (Morgan, 1999). This section will provide an overview of Virtuoso as a case in point of the
distributed features such non-DBMS products can provide. As with most such products available today,
Virtuoso lacks some of the distributed features many firms require. IT managers should be wary of promises
made by vendors of these products and verify that they can meet the firm's specific needs before committing
to such a solution.

Virtuoso

OpenLink's Virtuoso is targeted to meet the need for Universal Data Access (UDA) middleware that
integrates disparate database applications from different vendors. OpenLink describes Virtuoso as a "virtual
database" that abstracts the details of separate databases into a single application or development
environment for end users and application developers. Virtuoso can integrate only ODBC-compliant
databases such as Oracle, SQL Server, DB2, Informix, PostgreSQL, CA-Ingres, Sybase, and others. It
includes managers for security, queries, metadata, transactions, concurrency, and replication. It can provide
cross-platform support for Windows 95/98/NT, Linux, Solaris, HP-UX, AIX, Digital Unix, and others.

The Virtuoso replication manager provides for bi-directional (update anywhere) asynchronous replication
across the supported database servers. The current version does not implement a two-phase commit protocol
necessary to perform synchronous replication. Virtuoso's transaction manager maintains a transaction log to
ensure a consistent state across replicated nodes. Multiple Virtuoso servers may be established within an
enterprise to provide increased performance and lower network traffic for geographically dispersed locations.
Virtuoso servers can be managed through an included web-based administration tool (Idehen, 1999).

Ability of DDBMSs to meet Business Requirements

This section will explore the cases of four organizations that have recently implemented distributed databases
and the results they have obtained. These cases are representative of the use of distributed databases in
business today. Each case illustrates specific capabilities of today's DDBMSs. Later this section will analyze
the capabilities of DDBMS products to meet the business and technology demands for them outlined earlier.

Distributed Database Cases

Sea-Land Service, Inc.

Sea-Land Service Inc. is the largest U.S.-based ocean carrier with a significant worldwide presence. Sea-
Land uses its integrated network of ships, railroads, barge lines, and trucking operations to provide efficient
containerized transportation to nearly any location in the world. To manage the large volumes of cargo data
necessary to support this operation, Sea-Land recently replaced much of its mainframe architecture with a
client-server system based on 42 quad-processor Dell PowerEdge 6300 servers running Microsoft SQL Server,
distributed at locations throughout the world. The distributed database also includes some legacy IBM DB2
mainframe servers in a heterogeneous architecture. Sea-Land plans to phase out the mainframes completely
by the year 2000 to lower costs.

Sea-Land's Shipment Management system continuously tracks its 220,000 containers, providing
real-time updates on the location of cargo and transportation assets. Sea-Land provides access to its
customers through a web site that allows users to obtain shipment status and complete bookings online. These
services are key factors in Sea-Land's leading competitive position in the transportation industry. The SQL
Server databases handle heavy loads of between 30 and 175 concurrent user connections on nodes typically
containing 4-6GB of data. SQL Server's replication features synchronize corporate data and move shipment
data between locations (Sea-Land Service, 1999).

E-Plus Mobilfunk GmbH

E-Plus Mobilfunk GmbH has grown from a startup company with zero customers in 1994 to become
Germany's largest mobile phone provider by 1998. In early 1999 it had over two million customers. E-Plus
has differentiated itself from its competitors by providing very fast activation times for new accounts -
customers can have an active mobile phone within minutes of walking into an outlet and signing a contract.
The information system that supports account activation runs on four quad-CPU DEC 4100 Alpha servers
running SQL Server. The system has scaled up over the years from a single 90 MHz Intel server running an
earlier version of SQL Server. E-Plus anticipates a doubling of customers every six to twelve months.

SQL Server's replication functions allow E-Plus to store up-to-date mission-critical data on redundant warm
standby servers to ensure high availability. Replication also allows relevant data to be copied to servers
performing other business functions such as management reporting and product distribution. Key requirements
of this system are very high availability and scalability. E-Plus has been very successful in both these areas,
using the same DBMS software platform through explosive growth while providing excellent customer service
(E-Plus Mobilfunk GmbH, 1999).

Surridge Dawson Ltd.

Surridge Dawson, Ltd. is the United Kingdom's third largest wholesale distributor of newspapers, magazines,
and periodicals. Surridge Dawson commands a 20 percent market share and approximately £550 million in
annual sales from approximately 13,000 customers. The firm's 25 warehouses receive and distribute
thousands of copies of various publications daily. Like many periodical distributors, Surridge Dawson
generates revenue through sales but must pay for unsold copies returned from retailers. To optimize its
distribution and reduce returns, Surridge Dawson uses a custom-designed application that reviews 60 weeks
of sales history to allocate new issues to retailers.

The firm's database architecture consists of one master server running Oracle on a Compaq Alpha 4100 and
21 slave servers running Oracle on Alpha 1200 machines. The master server contains 25GB of replicated
data while the slaves each contain 2GB. The servers perform asynchronous replication through commercial
integrated services digital network (ISDN) lines. Each day's local sales and returns information consisting of
hundreds of thousands of transactions is replicated to the master server nightly, a process that typically takes
only two hours. Surridge Dawson's custom distribution software runs on each slave server, providing daily
issue allocation instructions for each warehouse. In turn, special instructions and changes are replicated from
the headquarters server to appropriate slave servers. The master server also sends sales and returns data to
publishers for analysis and decision making. The distributed database is a critical component of the enterprise
- even a short failure could have a significant financial impact on the firm. Surridge Dawson has found Oracle
to be stable enough to run its distributed database with only two database administrators, saving IT support
costs (Oracle at Work with Surridge Dawson, 1999).
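
The nightly master-slave cycle described above can be pictured as a high-water-mark scan: the master records
the last transaction it has received from each site and pulls only newer rows. The sketch below is a
simplified illustration of that pattern in Python with pyodbc, not Oracle's actual replication mechanism;
the DSNs, tables, and columns are hypothetical, and real replication adds conflict handling, logging, and
recovery that this omits.

    import pyodbc

    # Hypothetical DSNs for one warehouse (slave) and headquarters (master).
    slave = pyodbc.connect("DSN=WarehouseDB")
    master = pyodbc.connect("DSN=HeadOfficeDB")

    def replicate_sales(site_id):
        """Push all sales rows the master has not yet seen for this site."""
        s, m = slave.cursor(), master.cursor()

        # High-water mark: the last transaction already copied to the master.
        m.execute("SELECT last_txn FROM replication_state WHERE site_id = ?",
                  site_id)
        high_water = m.fetchone()[0]

        # Only new rows cross the ISDN link, keeping the nightly window short.
        s.execute("SELECT txn_id, item_id, qty, sold_at FROM sales "
                  "WHERE txn_id > ? ORDER BY txn_id", high_water)
        rows = s.fetchall()
        if rows:
            m.executemany("INSERT INTO hq_sales "
                          "(site_id, txn_id, item_id, qty, sold_at) "
                          "VALUES (?, ?, ?, ?, ?)",
                          [(site_id, *row) for row in rows])
            m.execute("UPDATE replication_state SET last_txn = ? "
                      "WHERE site_id = ?", rows[-1].txn_id, site_id)
            master.commit()
        return len(rows)

    print("replicated", replicate_sales(7), "rows")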

Northwest Airlines

Northwest Airlines is the world's fourth largest airline, serving more than 400 cities. Its 415 airplanes fly over
1,700 flights daily. The airline uses a central DB2 database to track flight routes, schedules, and crew lists. To
maximize the value of this information, Northwest Airlines had to make the data easily accessible to its
various divisions from their local databases. This required a heterogeneous replication solution that could
move the operations data from DB2 to other DB2, Sybase, and Oracle databases. Initially the airline used a
combination of customized programs, file transfer protocol (FTP), and middleware to implement a simple
form of data replication. This project replaced Northwest's locally developed solution with one based on
IBM's DataPropagator Relational and DB2 DataJoiner.

With this new seamless distributed database solution, Northwest Airlines was able to implement an OLAP
application for planning future flight schedules on the operations control group's Sybase server. Because
DataPropagator Relational performs automatic log-based replication, the project required no changes to the
existing application code. With the success of this project, Northwest Airlines is planning similar projects
to support OLAP for strategic planning in other divisions of the airline (DB2 replication flies data to its
destination at Northwest Airlines, 1998).
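
IBM's log-based approach works in two stages: a capture component writes committed changes from the source's
transaction log into staging tables, and an apply component replays them at each target, which is why the
existing applications needed no modification. The sketch below imitates only the apply stage; the
change_data staging table, the flights target table, and the DSNs are hypothetical and far simpler than
DataPropagator Relational itself.

    import pyodbc

    source = pyodbc.connect("DSN=OpsStaging")      # hypothetical capture area
    target = pyodbc.connect("DSN=PlanningSybase")  # hypothetical OLAP target

    def apply_changes():
        """Replay captured changes at the target, oldest first."""
        src, tgt = source.cursor(), target.cursor()
        # Each captured row records the operation type so it can be replayed.
        src.execute("SELECT seq, op, flight_no, details "
                    "FROM change_data ORDER BY seq")
        for seq, op, flight_no, details in src.fetchall():
            if op == "I":
                tgt.execute("INSERT INTO flights (flight_no, details) "
                            "VALUES (?, ?)", flight_no, details)
            elif op == "U":
                tgt.execute("UPDATE flights SET details = ? "
                            "WHERE flight_no = ?", details, flight_no)
            elif op == "D":
                tgt.execute("DELETE FROM flights WHERE flight_no = ?",
                            flight_no)
            src.execute("DELETE FROM change_data WHERE seq = ?", seq)
        target.commit()
        source.commit()

    apply_changes()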

Analysis

Thanks to recent developments in DDBMS technology, many of the business and technology requirements
for distributed databases can be met. Below are the business and technology requirements discussed earlier
with an analysis of how well today's DDBMS products meet each one.

Geographic Dispersion

DDBMS vendors have done a remarkable job of meeting the demands of businesses to support their
geographically dispersed operations. Many of the advances in this area come from more efficient use of
network connections: network loads resulting from replication activity have been significantly reduced in
recent versions. Surridge Dawson's use of ISDN to replicate hundreds of thousands of transactions in only two
hours every day illustrates this point. Recent work on lightweight replicating DBMS versions has also given
organizations a powerful tool for supporting mobile workers, whose laptop client databases replicate over
occasional dial-up connections. With a careful analysis of business requirements and proper network design,
DDBMSs can support
most geographically dispersed business operations.

Control of the Information Resource

DDBMSs now provide information managers with a means of centrally controlling and exploiting information
scattered by server proliferation. Robust tools for handling heterogeneous server platforms, and replication
that is transparent to legacy applications, are key factors in this. Northwest Airlines illustrates how
organizations can use the latest distributed database products to leverage hardware and software originally
meant for a single part of the organization to make the whole enterprise more competitive. Because the
distributed database model works well with centralized planning and decentralized operations, database
designers and administrators can maintain control of the firm's information while allowing it to be used
flexibly.

Mergers and Acquisitions

The distributed database products available now can facilitate mergers and acquisitions, especially if the
organization is anticipating such activity and plans ahead. However, the previously cited experiences of
PricewaterhouseCoopers and other large corporations attempting to merge information resources with other
firms on short notice (Shachtman, 1999) highlight the limitations of DDBMSs and middleware in solving very
complex problems. Issues with differing database schemas, incompatible network infrastructure, and pressure
to implement a solution rapidly are factors that will pose significant challenges to a smooth merger of
information resources for many years to come. In these cases DDBMS software features are no substitute for
high quality personnel and proper planning.

Corporate Rightsizing

DBMS vendors continue to make advances in the scalability of their products - both in the capacity of
individual servers and the number of distributed servers that may be included in a distributed database.
Modern database products give firms various options for growing or reducing their deployed databases. E-
Plus is an example of a firm that was able to manage explosive growth while staying with one DBMS product
family. Replication also allows organizations to move easily into new market areas. Surridge Dawson, for
example, could open a new warehouse quickly by adding another slave server identical to the 21 it already
has and connecting it with commercially available ISDN service. Current DDBMSs provide scalability
adequate for most business applications, and the emphasis DDBMS developers place on this area will ensure
that future versions scale even further.

Client-Server Systems

The latest versions of DDBMSs and middleware make a developer's task of implementing a three-tier client-
server architecture much simpler. Many of the components that formerly required a heavy programming effort
are now available in off-the-shelf versions robust and flexible enough to handle most tasks. Organizations that
integrate such products into their information architecture will reap the benefits of the three-tier
client-server model: greater flexibility and the ability to take rapid advantage of business and technology
opportunities that arise in the future. However, three-tier client-server systems will always require
skilled planning and implementation to ensure that the present and future needs of the firm can be met. This is
another area where DDBMS features will never replace talented people.
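
As a rough illustration of the three-tier division of labor, the sketch below places a thin middle tier
between clients and the database, so that presentation-tier clients speak HTTP and never connect to the
DBMS directly. It uses Python's standard http.server module with pyodbc; the DSN, URL path, and table are
hypothetical, and a production middle tier would add connection pooling, security, and error handling.

    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    import pyodbc

    # Tier 2 holds the only database connection; tier 1 clients use HTTP.
    db = pyodbc.connect("DSN=ShipmentsDB")  # hypothetical data source

    class MiddleTier(BaseHTTPRequestHandler):
        def do_GET(self):
            # Business logic sits here, between the client and the data tier.
            cur = db.cursor()
            cur.execute("SELECT container_id, status FROM shipments")
            body = json.dumps(
                [{"container": c, "status": s} for c, s in cur.fetchall()])
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body.encode())

    # Any thin client (a browser or a simple GUI) can now reach the data.
    HTTPServer(("", 8080), MiddleTier).serve_forever()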

Integration of Legacy Systems

The Northwest Airlines case exemplifies the ability of modern DDBMS products and middleware to integrate
heterogeneous legacy database systems into a single distributed database. As standards such as ODBC, OLE
DB, and JDBC become more widely accepted, such integration is becoming more common. Burleson (1994)
devotes a great deal of attention to the problems of heterogeneity and lack of standards, but since 1994 these
issues have diminished significantly. Examples of heterogeneous database integration like the Northwest
Airlines case are common now. Continued development of heterogeneous integration features in DDBMSs
will minimize the significance of such issues for IT managers in the coming years.

Demands of the Internet

Sea-Land Service is in a business normally considered low-tech, but it has exploited the Internet as a source
of competitive advantage through its web site, which allows customers to track shipments and enter bookings
online. This illustrates how new online services increase loads on database servers as large numbers of
firms leverage the Internet to improve service. By replicating the relevant data to a server dedicated to
the web site, firms can shield internal operations from this increased demand while avoiding the excessive
page-generation times that drive online customers away. DBMS developers have embraced the Internet as a
market for their products, and these products will continue to grow more capable of supporting e-business.

OLTP/OLAP

All of the major DBMS developers have significantly improved their newer products' handling of high loads of
simultaneous OLTP and OLAP operations on the same server. Recent advances such as improved use of
multiprocessor hardware, multithreading, and row-level locking have made this possible. However, some OLAP
applications still generate such high system demands that they cannot function effectively alongside OLTP
applications on the same server. The replication features of today's major DBMSs address this problem
nicely: firms can use asynchronous replication to maintain an OLAP server separate from the OLTP server and
provide high performance for both applications. Future advances in single-server support for simultaneous
OLTP and OLAP, together with improved replication performance, will mean that IT managers need not
compromise on performance in either area.
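
In practice the split is enforced at the connection level: transactional work is directed at the primary
server while heavy read-only analysis goes to an asynchronously refreshed replica. A minimal sketch,
assuming hypothetical data source names and accepting that reports may lag the primary by one replication
cycle:

    import pyodbc

    # Hypothetical DSNs: the OLTP primary and its asynchronous replica.
    OLTP_DSN = "DSN=OrdersPrimary"
    OLAP_DSN = "DSN=OrdersReplica"

    def connect(analysis=False):
        """Route heavy read-only analysis away from the transactional server."""
        return pyodbc.connect(OLAP_DSN if analysis else OLTP_DSN)

    # OLTP: short updates hit the primary server.
    oltp = connect()
    oltp.cursor().execute(
        "UPDATE orders SET status = 'shipped' WHERE order_id = ?", 1042)
    oltp.commit()

    # OLAP: a long aggregation scans the replica, leaving the primary alone.
    olap = connect(analysis=True)
    for row in olap.cursor().execute(
            "SELECT status, COUNT(*) FROM orders GROUP BY status"):
        print(row)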

References

Applegate, Lynda M., McFarlan, F. Warren, & McKenney, James L. (1996) Corporate Information
Systems Management. Chicago: Irwin

Bull, Katherine, & Vizard, Michael. (1998, September 14) Oracle's Mission. InfoWorld [Online], 20(37), 5
paragraphs. Available: Infotrac SearchBank/A21134348 [1999, June 19]

Burbank, Keith, & Roberts, John. (1998, March 9) Corporate Server Consolidation -- Percent Of
Companies Citing Each Choice. Computer Reseller News [Online], (779), 1 paragraph. Available: Infotrac
SearchBank/A20397031 [1999, June 19]

Buretta, Marie. (1997) Data Replication. New York: John Wiley & Sons

Burleson, Donald K. (1994) Managing Distributed Databases. New York: John Wiley & Sons

Deck, Stewart. (1999, May 10) SQL users turn to Oracle8 for bulk. Computerworld [Online], 2
paragraphs. Available: http://www.computerworld.com/home/print.nsf/all/990510A506 [1999, June 21]

Dillon, Nancy, & Sliwa, Carol. (1999, July 2) Sluggish site performance could cost millions. Computerworld
[Online], 1 paragraph. Available: http://www.computerworld.com/home/news.nsf/all/9907025zona [1999,
July 3]

Idehen, Kingsley. (1999) Introducing OpenLink Virtuoso. [Online]. Available:
http://www.openlinksw.com/virtuoso/virtuowp/right.htm [1999, July 3]

IBM Corporation. (1997, September) DB2 DataPropagator Relational. [Online]. Available:
http://www.software.ibm.com/data/datapropr/index.html [1999, June 26]

IBM Corporation. (1998, July 10) DB2 replication flies data to its destination at Northwest Airlines.
[Online]. Available:
http://www2.software.ibm.com/casestudies/swcsdm.nsf/customername/83A2DD537FDC17EC0025672100626DD
[1999, July 4]

IBM Corporation. (1999) DB2 Universal Database. [Online]. Available:
http://www.software.ibm.com/data/db2/udb/ [1999, June 26]

Microsoft Corporation. (1999) E-Plus Mobilfunk GmbH. [Online]. Available:
http://www.microsoft.com/sql/70/casestudies/Eplus.htm [1999, July 5]

Microsoft Corporation. (1998, December) Replication for SQL Server 7.0 White Paper. [Online].
Available: http://www.microsoft.com/SQL/70/whpprs/repwp.htm [1999, June 21]

Microsoft Corporation. (1999) Sea-Land Service. [Online]. Available:
http://www.microsoft.com/sql/70/casestudies/SeaLand.htm [1999, July 5]

Microsoft Corporation. (1998) Using Replication in Your Application. [Online]. Available:
http://www.microsoft.com/ACCESSDEV/Articles/Bapp97/Chapters/ba20_1.htm [1999, June 21]

Morgan, Cynthia. (1999, February 22) Middleware promises to ease data transfer. Computerworld
[Online], 2 paragraphs. Available: http://www.computerworld.com/home/print.nsf/all/9902229176 [1999,
July 3]

Morris, James R., Cascio, Wayne F., & Young, Clifford E. (1999, Winter) Downsizing after all these years:
Questions and answers about who did it, how many did it, and who benefited from it. Organizational
Dynamics [Online], 27(3), 16 paragraphs. Available: ABI/INFORM Global/00902616 [1999, June 17]

Oracle Corporation. (1999) Oracle 8i Advanced Replication. [Online]. Available:
http://www.oracle.com/database/features/advrepl.html [1999, June 21]

Oracle Corporation. (1999) Oracle at Work with Surridge Dawson. [Online]. Available:
http://www.oracle.com/corporate/oracle_at_work/html/surridge.html [1999, July 4]

Shachtman, Noah. (1999, May 17) Managing Mergers -- Make It Work -- When A Merger Hits, There's
No Time To Think Long Term. Internetweek [Online], (765), 32 paragraphs. Available: ABI/INFORM
Global/10969969 [1999, June 17]

Thomson Financial Securities Data. (1999, June 18) YTD Market Totals - Worldwide M&A. [Online].
Available: http://www.secdata.com/ [1999, June 18]

Whiting, Rick. (1999, May 24) Changes Due for IBM's DB2. InformationWeek [Online], 4 paragraphs.
Available: http://www.informationweek.com/735/db2.htm [1999, June 30]

Willett, Shawn. (1999, March 22) Dataquest: IBM Overtakes Oracle In Database Race. Computer Reseller
News [Online], 3 paragraphs. Available: http://www.crn.com/dailies/digest/breakingnews.asp?
ArticleID=2806 [1999, June 26]

Williamson, Margaret. (1996, March 28) Panel contends client-server still immature. Computing Canada
[Online], 22(7), 8 paragraphs. Available: Infotrac SearchBank/A18165877 [1999, June 19]
