Sunteți pe pagina 1din 8

JDBC Drivers: How Do You Know What You Need?

JDBC Drivers: How Do You Know What You Need?


Comparing features across various JDBC data access drivers isn't easy if you don't know what all
the features mean—or which ones you're supposed to care about. We lend you a hand.

Software drivers, if I may borrow a quote from Rodney Dangerfield, "Don't get no respect." Most
developers think about drivers as minor utilities with limited functionality. But drivers often do
more than serve as simple data translators or pipes. They have important features that can
impact both the performance and the functionality of your applications.

Java developers often require access to a variety of data sources, including relational databases.
JDBC drivers connect Java programs to these data sources using the JDBC standard. Before
JDBC, Java developers accessing data sources, such as relational databases, were forced to
wrestle with complex SQL statements to write applications with intensive database transactions.
So, Sun Microsystems and DataDirect Technologies (formerly the DataDirect division of
MERANT) developed JDBC as an API intended to make it easier to write Java applications that
could access a variety of data sources.

Once you start to investigate JDBC drivers, you will soon discover just how overwhelming are the
choices you have. Choice is good; but the Sun Microsystems web site, for example, lists over 150
JDBC drivers without much guidance on how to tell a good driver from a bad one! Although each
driver is described as supporting some of the main feature sets, you are left with the question:
what do those features really mean to me in a practical way? That's what we'll discuss in this
article.

So Many Choices
Let's begin with some of the fundamentals of the JDBC standard and drivers that support it. First,
JDBC provides methods that do these basic things:

Create and manage connections to data sources based on a URL or DataSource object
registered with a Java Naming and Directory Interface (JNDI) naming service. Thus, no client-
side configuration is required.

Compose and send SQL statements to the data sources.

Retrieve and process result sets that are returned to the Java application or applet.

JDBC allows developers to make Java program calls that are translated into SQL statements to
access tabular data sources. These data sources include enterprise SQL databases such as
Oracle, SQL Server, IBM UDB, Informix, Sybase, and others, but also include spreadsheets and
flat-file data sources such as CSV, TSV, and mainframe data files.

With JDBC, it's possible to update data from multiple data fields with a single command and
multiple data sources (tables) in a single transaction. This has the following implications:

It means you can optimize data access performance through batching.

It provides a common framework by which multiple vendors' products may be accessed,


regardless of whether these products interoperate with one another
It provides a means of abstracting the data source, making it easier to re-specify a data source if
it is changed or moved.

JDBC provides a way to access data stores over the Internet when the software used by the
client or the middleware used by middle-tier servers is out of the control of the developer. It
leverages the portable, self-contained nature of Java and obviates the need for multiple versions
of the same application.

JDBC offers the following benefits:

Access to existing and legacy data sets

Database connectivity to a variety of industry-leading databases

A simplified and standardized API based on Java that provides a "responsible broker" to middle-
tier applications

Zero configuration for network computers, with no client-side support, leading to centralized
software maintenance and reduced Total Cost of Ownership (TCO).

The JDBC Specification


Now let's take a look at the specifics. JDBC has gone through several major version releases:
 JDBC 1.0 was designed to provide basic functionality with an emphasis on ease of use.
 JDBC 2.0 offered more advanced features and server-side capabilities.
 JDBC 3.0 "rounded out" the API by providing performance optimization. It added
improvements in the areas of connection pooling, statement pooling, and provided a
migration path to Sun Microsystems' connector architecture.

The optional features in JDBC 2.0 (such as connection and distributed transactions) are now
required features in JDBC 3.0, along with the new features in JDBC 3.0, such as prepared
statement pooling.

JDBC Driver Types—Know the Difference


Database access by Java applications wasn't part of the original Java specification. It didn't take
long for Sun Microsystems and other vendors to fill the gap. Early data-access Java methods
relied on bridging the Microsoft-sponsored ODBC (Open Database Connectivity) standard for
access to data sources, resulting in the JDBC-ODBC bridge drivers.

Today, there are four types of JDBC drivers in use:

 Type 1: JDBC-ODBC bridge, plus an ODBC driver


 Type 2: native API, part-Java driver
 Type 3: pure Java driver for database middleware
 Type 4: pure Java driver for direct-to-database

Type 3 and Type 4 JDBC drivers are both pure Java drivers, and therefore, offer the best
performance, portability, and range of features for Java developers.

Type 1
A Type 1 JDBC driver is a JDBC-ODBC Bridge, plus an ODBC driver. Sun Microsystems
recommends using Type 1 drivers for prototyping only and not for production purposes. The
bridge driver is provided by Sun without support to developers and is intended to support legacy
products. When a major patch is required, Sun provides it, but they do not provide end-user
support for this software. Typically, the bridge is used when there is already an investment in
ODBC technology, such as in Windows application servers.

Sun Microsystems provides a JDBC-ODBC Bridge driver, but because ODBC loads binary code
and database client code on the client using the bridge, this technology isn't suitable for a high-
transaction environment. Type 1 drivers also don't support the complete Java command set and
are limited by the functionality of the ODBC driver.

Type 2
A Type 2 JDBC driver is a native-API, part-Java driver. Type 2 drivers are used to convert JDBC
calls into native calls of the major database vendor APIs. These drivers suffer from the same
performance issues as Type 1 drivers, namely binary-code client loading, and they are platform-
specific.

Type 2 drivers force developers to write platform-specific code, something no Java developer
really wants to do. But, because large database vendors, such as Oracle and IBM, use Type 2
drivers for their enterprise databases, developers who use these drivers must keep up with
different driver releases for each database vendor's product release and each operating system.

Also, because Type 2 drivers don't use the full Java API, developers find themselves having to
perform additional configuration when connecting Java applications to data sources. Often, Type
2 drivers aren't architecturally compatible with mainframe data sources, and when they are, they
are less than ideal.

For these reasons and others, most Java database developers opt for either Type 3 or the newer
and more flexible Type 4 pure Java JDBC drivers.

Figure 1. The Architectural Difference Between Type 1 and Type 2 Drivers. While Type 1
JDBC drivers offer convenience of access to ODBC data sources, they are limited in their
functionality and performance. Type 2 JDBC drivers are OS-specific and compiled, and although
they offer more Java functionality and higher performance than Type 1 drivers, still require a
controlled environment. Figure courtesy of Sun Microsystems .
Type 3
Type 3 JDBC drivers are pure Java drivers for database middleware. JDBC calls are translated
into a middleware vendor's protocol, and the middleware converts those calls into a database's
API. Type 3 JDBC drivers offer the advantage of being server-based, meaning that they do not
require native client code, which makes Type 3 drivers faster than Type 1 and Type 2 drivers.
Developers can also use a single JDBC driver to connect to multiple databases.

Type 4
Type 4 JDBC drivers are direct-to-database pure Java drivers ("thin" drivers). A Type 4 driver
takes JDBC calls and translates them into the network protocol (proprietary protocol) used
directly by the DBMS. Thus, client machines or application servers can make direct calls to the
DBMS server. Each DBMS requires its own Type 4 driver; therefore, there are more drivers to
manage in a heterogeneous computing environment, but this is outweighed by the fact that Type
4 drivers provide faster performance and direct access to DBMS features.

Figure 2. The architectural difference between Type 3 and Type 4 drivers. Type 3 drivers leverage
the advantages of middleware products to supply heterogeneous database access, and are a
strong server-side solution when that middleware product runs on a single OS. Type 4 drivers
provide fast and powerful direct access from Java clients to the databases themselves, but do not
provide some of the server-side OS optimization found in Type 3 drivers. Figure courtesy of: Sun
Microsystems.

JDBC Driver Features


Sun Microsystems maintains a listing of JDBC drivers at:
http://industry.java.sun.com/products/jdbc/drivers, shown in Figure 3. This web page offers a
driver selector tool that allows you to generate a list of drivers that support specific features,
including driver types, JDBC version support, and access to one or multiple specific databases.
As of the writing of this article (March 2002) there were 155 drivers listed on this page!

Note: The driver selector tool shown in Figure 3 hasn't yet been updated to account for JDBC
version 3.0, so expect to see some changes in this tool in the months to come. What you see in
Figure 3 might not match what is on the Sun Microsystems web page at a later date.
Let's consider the feature choices listed on the Sun Microsystems JDBC driver web page in a little
detail, because, as we said, many people use this tool to select one driver over another:

 JDBC API version number (Any, 1.x or 2.x (select one)).


 Certified for J2EE: J2EE 1.3 or J2EE 1.2 (select any or all).
 Driver Type: 1, 2, 3, 4, All, or Any (select any or all).
 Supported DBMS. There are 83 choices, mostly vendor products, but a few of which are
industry standards like JDBC, LDAP, ODBC, OLE DB Provider, Text (TSV), SQL/DS, and
XML. One or a group of selections are supported.
 Required Features: DataSource, Connection Pooling, Distributed Transactions, and
RowSets (pick any or all).
 Returns per page (specifies the number of matches returned).

Figure 3. Sun's JDBC Driver home page offers a selection tool that helps a developer select a
driver based on its type and a number of other factors.

Let's take a look at each feature in turn, and see if we can make sense out of each feature and
their options.

JDBC API
The JDBC API is important because it determines the Java functionality available to a developer.
While older Java applications may not be able to take advantage of the more advanced features
provided by JDBC 3.0, any high-transaction, distributed application certainly would. What
developers get when they use later versions of the API is any new DBMS and operating-system
security enhancements, and the latest performance improvements such as advancements in
connection pooling, statement pooling, RowSet objects, and so forth.

Certification
Remember that JDBC is a specification, and not a standardized piece of software. Vendors are
free to implement their JDBC drivers to that specification as they see fit, and while some JDBC
drivers are fully J2EE compliant; many other drivers are not. When considering one driver over
another, the Certified for J2EE logo, as specified by the Sun Certification Test Suite (CTS) and
administered by Key Labs, is one yardstick a company can use to measure a driver's quality. Any
company who undertakes the certification process is more likely to pay attention to quality control
in their product. You will find a listing of vendors who have endorsed the JDBC standard and have
products in this area at: http://java.sun.com/products/jdbc/industry.html.

Required Features
The "Required Features" section shown in Figure 3 lists some features, introduced in JDBC 2.0,
but required for JDBC version 3.0.

A DataSource is an object containing the connection information to a database that is managed


by a JDBC driver. Data sources work with a JNDI (the Java Naming Directory Interface) service,
and a connection is instantiated and managed independently of the applications that use it.
Connection information, such as path and port number, can be quickly changed in the properties
of the DataSource object without requiring code changes in the applications that use the data
source. Currently 50 of the 155 listings (not all are drivers) on the Sun Microsystems web page
support this feature.

JDBC supports connection pooling, which essentially involves keeping open a cache of
database connection objects and making them available for immediate use for any application
that requests a connection. Instead of performing expensive network roundtrips to the database
server, a connection attempt results in the re-assignment of a connection from the local cache to
the application. When the application disconnects, the physical tie to the database server is not
severed, but instead, the connection is placed back into the cache for immediate re-use,
substantially improving data access performance.

Opening a connection is the most resource-expensive step in database transactions. When an


application creates a connection, multiple separate network roundtrips must be performed to
establish that connection (for an Oracle connection, that number is nine). However, once the
connection object has been created, there is little penalty in leaving the connection object in place
and reusing it for future connections. This feature offers significant potential to improve data
transfer performance and scalability, especially for application servers. This feature is supported
by 50 of the listings on the Sun Microsystems web page.

In a distributed system, applications often must retrieve data using multiple transactions, and
often from multiple data sources. Any transaction that requires the coordination of independent
cooperating transactional systems is referred to as a distributed transaction. Aside from the
performance characteristics of JDBC drivers, distributed transaction support is probably the most
requested feature by Java database developers.

Distributed transactions require additional resources to be reliably supported, primarily because


of the differences in latencies when retrieving data from different data sources, and because of
interoperability issues. For example, a failed transaction is not easily differentiated from a slow
transaction, requiring resource managers in a DTS be both registered and coordinated to handle
ROLLBACK or COMMIT operations correctly without a lot of code development.

Typically, distributed transactions use a transaction manager to provide the coordination that the
different resource managers need. With a transaction manager in a Distributed Transaction
Processing (DTP) architecture providing the mediation between applications and resources, it's
possible to provide applications with ACID transactions across multiple data sources.

In a situation where multiple steps are required to produce the desired result, there needs to be a
software module, which is often a transaction manager, coordinating the process. A product like
iPlanet Trustbase Transaction Manager is one example of this type of software. It's used in global
banking and B2B systems to provide the services required to secure processing and routing
based on a key structure, and includes cryptographic support and identity checking. Only 34 of
the drivers on the Sun Microsystems JDBC driver web page offer this feature.
A RowSets object is the result set of a query containing rows from a tabular data source.
RowSets have properties and event notification similar to JavaBeans and are a JavaBean
component that can be created and used programmatically in a development tool. There are
connected and disconnected rowsets. Disconnected rowsets are connected to a data source that
is populated and do not require a JDBC driver when disconnected. These types of rowsets are
small, and often are used to send data to a thin client. Disconnected rowsets are stored in
memory along with their metadata and connection and execution instructions. A connected rowset
maintains an open connection while the rowset is being used. Only 18 of the drivers now listed
support the rowsets feature, which as this small number suggests, isn't an often used feature.

A note worth mentioning: On the Sun JDBC Driver site, Sun defines "support" for RowSets to
mean actually packaging and shipping the RowSets with the JDBC driver. Thus, you will find that
there are drivers on this site that, while not showing a check mark next to "RowSets," in fact do
support multiple types of RowSets.

Not on the current list, but sure to be added as a searchable feature because it's in JDBC version
3.0, is a feature called prepared statement pooling. Essentially, statement pooling caches SQL
queries that have been previously optimized and run so that, should they be needed again, they
do not have to go through optimization pre-processing again. Statement pooling can be a
significant performance booster and is something to look for in any JDBC driver.

Prepared statements are particularly valuable when the same SQL statement is executed many
times. A query that populates the current status of a particular category of inventory, run many
times during the day, would be one example. To execute the statement a second time, only the
values of the statement's parameters need to be passed to it. All the optimization steps, such as
checking syntax, validating addresses, and optimizing access paths and execution plans, are
already cached in memory. Statement caching is a JDBC driver feature.

What's even better from a developer's standpoint is that statement pooling operates without a
developer having to code for it (as does connection pooling). You get the performance
enhancement when you buy a driver that supports this feature.

Statement pooling and connection pooling can work together in your applications, as long as your
driver supports them. When a J2EE Enterprise JavaBean and session bean makes use of a
connection from the connection pool, any connection that was previously used to run a SQL
statement will have its statement pool already defined. Therefore, even the application's first
attempt to prepare a statement for a particular connection will not require the statement
preparation process. Statement pooling can be used on any connection, regardless of whether
the connection came from a non-pooled data source, or from an application server that uses a
pooled data source or a JDBC driver's connection pool.

Why Drivers Matter


A JDBC driver can make a world of difference in how well your application performs. Features
supported by your JDBC driver are in fact critical to system performance. This is particularly true
in enterprise database applications, and more so in applications accessing distributed data
sources.

Many database products ship with their own JDBC drivers. So why should you purchase a
separate driver when you already may get one free in your DBMS? There are several reasons.
First, the handful of vendors who actually develop these drivers (most database vendors license
their drivers) can offer you better support for their drivers than the database vendors can offer for
drivers they only license. This is not surprising: Creating commercial-quality drivers is hard work.
The driver creator would be expected to know the driver details better than anyone else—Driver
design is what they do.
Many developers don't realize that their JDBC driver can be a performance bottleneck for their
application because of issues like connection handling. They are pleasantly surprised, for
example, when they upgrade their native database JDBC driver to a Type 3 or Type 4 JDBC
product from a dedicated driver vendor. Many companies make major purchases of a JDBC
driver vendor's product even when a version of that product is part of the DBMS that they already
own.

As you look to the future, consider the features that JDBC 3.0 provides as a marker in deciding
which driver to consider, including DataSource objects, connection pooling, distributed transaction
support, RowSets, and prepared statement pooling. When possible, choose a vendor whose
drivers work directly with database vendors APIs, and not drivers that have been reversed
engineered to work with those APIs. The former type is typically a faster performer.

Type 3 and Type 4 drivers are the drivers to use when performance and compliance with Java
standards is a must. It's extremely worthwhile to find a vendor whose drivers support the very
latest versions of JDBC and the widest array of JDBC features. Another factor to consider is that
different JDBC drivers offer tools that don't normally come with native DBMS drivers—tools that
can be quite important in your development work.

S-ar putea să vă placă și