Sunteți pe pagina 1din 7

Five Performance Hints for Efficient SQL

Welcome to OReview's new performance-tuning column. In this column I hope to


provide regular and reliable performance-related strategies, news, and information that
will supplement - although not replace - the more in-depth performance-related feature
articles that appear periodically in OReview. In particular, I'll provide:

 advice on new performance-related features of the Oracle server and tools


 coverage of Oracle performance-optimization techniques at beginner,
intermediate, and advanced levels
 information about undocumented or little-known performance-tuning facilities

Most performance experts agree that tuning application SQL is usually the most effective
way of improving performance, and that SQL tuning is an essential prerequisite to
effective server or operating-system tuning. Unfortunately, SQL tuning often takes
second seat to server tuning because SQL tuning is less of an exact science and often out
of a DBA's control.

I have 10 hints that can help you tune your SQL and improve your application
performance. In this inaugural column, I give you five hints; look for five more hints in
my August column.

Hint #1: Establish a Tuning Environment

It's not uncommon for SQL that works well in a development environment to exhibit poor
performance once it is in a production system. One of the primary causes of such
unpleasant surprises is an inadequate development or volume-testing environment. In
particular, environments without realistic or representative data volumes are bound to
lead to unrealistic SQL performance.

The ideal tuning or development environment is one in which:

 data volumes are realistic or at least proportional - With today's large production
databases, it's often not possible to duplicate production data volumes exactly.
However, you should always be able to construct a reasonable subset of the data.
For example, a 20 percent sample of a 1GB database may be adequate for
performance tuning. You must, however, avoid the situation in which SQL
developers are constructing and testing code against empty tables - even the
proverbial "SQL query from hell" will run fast in these environments.
 tuning facilities are available - Supply your volume and development
environments with as many tuning tools as you have available. This may involve
third-party SQL tuning tools, but it will at least involve enabling the default
Oracle tools. Make sure that developers know how to use EXPLAIN PLAN,
SQL_TRACE, and tkprof facilities. Ensure that relevant database options are
effectively set.
 documentation is available - Documentation should include the database design,
index details, performance requirements, and volume metrics. The SQL
programmer needs all of this information in order to produce efficient SQL.

Hint #2: Use SQL Tuning Tools

Oracle provides powerful facilities for measuring the execution of SQL statements, such
as the EXPLAIN PLAN command, which can generate the execution plan for a SQL
statement. To use EXPLAIN PLAN:

 Create the PLAN_TABLE using the script in


$Oracle_HOME/rdbms/admin/utlxplan.sql (the location might be slightly
different under non-Unix hosts).
 Issue an EXPLAIN PLAN FOR sql_statement statement.
 Display the execution plan for the statement using a query such as:

SELECT rtrim(lpad(' ',2*level)||


rtrim(operation)||' '||
rtrim(options)||' '||
object_name) query_plan
FROM plan_table
CONNECT BY prior id=parent_id
START WITH id=0;
 Delete the contents of the PLAN_TABLE.

A more powerful relative of the EXPLAIN PLAN command is the SQL_TRACE


facility. To use SQL_TRACE you do the following:

 Execute the statement ALTER SESSION SET SQL_TRACE=TRUE from within an


Oracle session.
 Locate the trace file - its location is specified by the server parameter
USER_DUMP_DEST.
 Use the tkprof program to format the trace file. For example:

tkprof input_trace_file output_report_file


sort='(prsela,exeela,fchela)'
explain=username/password

Interpreting tkprof output and execution plans is a complex topic that I can't give
adequate coverage here.

Hint #3: Index Carefully

Indexes exist to improve the performance of SQL statements. In order to establish the
"best" indexes to achieve optimal performance, you should do the following:
 Index Selectively. A "selective" index is an index with columns that have a large
number of distinct values. Such indexes are efficient because each index key
lookup identifies only a small number of rows and hence requires less database
I/O. The most selective index is a unique index - each index key points to only
one row. It's usually not worth indexing on columns with only a few distinct
values, unless one of the values is very rare - and even then you may need to
implement histograms (which I discuss below) to use the index appropriately.
 Use Concatenated Indexes. Try not to use two indexes when one will suffice.
For example, if you are searching for SURNAME and FIRSTNAME, don't create
separate indexes for each column. Instead, create a concatenated index on both
SURNAME and FIRSTNAME. You can then use the "leading" portion of the
concatenated index on its own; thus, if you sometimes query on the SURNAME
column without supplying the FIRSTNAME, SURNAME should come first in the
index. Note that you can't use the trailing portions of a concatenated index. For
example, the concatenated index on SURNAME and FIRSTNAME can't help you find
all people with a first name of "Fred."
 Overindex to Avoid a Table Lookup. You can sometimes improve SQL execution
by "overindexing." Overindexing involves concatenating columns that appear in
the SELECT clause to the index, but not in the WHERE clause. Imagine that you are
searching on SURNAME and FIRSTNAME in order to find EMPLOYEE_ID. Your
concatenated index on SURNAME and FIRSTNAME will let you quickly locate the
row containing the appropriate EMPLOYEE_ID, but you will need to access both the
index and the table. If there is an index on SURNAME, FIRSTNAME, and
EMPLOYEE_ID, then the query can be satisfied using the index alone. This
technique is particularly useful when optimizing joins, because intermediate
tables in a join are sometimes queried merely to obtain the join key for the next
table.
 Consider Advanced Indexing Options. Oracle's default B-tree indexes are
flexible, efficient, and suitable for the majority of situations you will encounter.
However, Oracle offers a number of alternative indexing schemes that can
improve your performance in specific situations:

- Index clusters let you locate rows from one or more tables in cluster key order.
Clustering tables can result in a substantial improvement in join performance.
However, the performance of table scans of individual tables in the cluster may be
severely degraded. Index clusters are usually recommended only for tables that
are always accessed together. Even then, you should consider alternatives such as
denormalization.

- In hash clusters, the key values are translated mathematically to a hash value.
Rows are stored in the hash cluster based on this hash value. Locating a row when
the hash key is known may require only a single I/O rather than the two or three
I/Os required by an index lookup. However, you cannot perform range scans of
the hash key. Furthermore, if the cluster is poorly configured or the size of the
cluster changes, overflows on the hash keys may occur or the cluster may become
sparsely populated. In the first case, the performance of hash-key retrieval can
degrade; in the second case, table scans will be less efficient.

- Bitmapped indexes were introduced in Oracle 7.3. These indexes suit queries on
multiple columns that each have only a few distinct values. They are more
compact than a corresponding concatenated index and, unlike the concatenated
index, you may query columns in any combination. However, bitmapped indexes
are not suitable for tables with high modification rates, because locking of
bitmapped indexes occurs at the block level rather than the row level. Bitmapped
indexes are also not suitable for columns with large numbers of distinct values.

 Ensure that your query uses the best index. Novice SQL programmers are
often satisfied that the execution plan for their SQL statement uses an index - any
index. However, sometimes you have a choice of indexed retrievals, and the
Oracle optimizer - especially the older rule-based optimizer - will not always
choose the best index. Make sure that the indexes selected by Oracle are the most
appropriate, and use hints (which I discuss later in this article) to change the index
if necessary.

Hint #4: Reduce Parsing

Parsing a SQL statement includes the process of validating the SQL and determining the
optimal execution plan. For SQL that has low I/O requirements but is frequently executed
(for example, SQL generated by OLTP-type applications), reducing the overhead of SQL
parsing is extremely important.

When an Oracle session needs to parse an SQL statement, it first looks for an identical
shared SQL statement in the Oracle shared pool. If it cannot find a matching statement,
Oracle will determine the optimal execution plan for the statement and store the parsed
representation in the shared pool.

The process of parsing SQL is CPU-intensive. When I/O is well tuned, the overhead of
parsing a SQL statement can be a significant portion of the total overhead of executing
that statement. You can reduce the parsing overhead by:

 using bind variables - Bind variables allow the variable part of a query to be
represented by "pointers" to program variables. If you use bind variables, the text
of a SQL statement will not change from execution to execution, and Oracle will
usually find a match in the shared pool, which dramatically reduces parsing
overhead.
 reusing cursors - Cursors (or context areas) are areas of memory that store the
parsed representation of SQL statements. If you execute the same SQL statement
more than once, you can reopen an existing cursor and completely avoid issuing a
parse call. The mechanism of reusing cursors varies from tool to tool. Table 1
summarizes the techniques you can use in some popular development tools.
 using a cursor cache - If your development tool makes it difficult or impossible to
reuse cursors, you can instruct Oracle to create a cursor "cache" for each session
using the SESSION_CACHED_CURSORS server parameter. If
SESSION_CACHED_CURSORS is greater than zero, Oracle will store recently
executed cursors in a cache and, if you reexecute such a SQL statement, you can
avoid reparsing the statement.

Hint #5: Take Advantage of the Cost-Based Optimizer

The Oracle software component that determines the execution plan for a SQL statement
is called the optimizer. Oracle supports two approaches to query optimization: rule-based
and cost-based. The rule-based optimizer determines the execution plan based on a set of
rules that rank various access paths. For example, an index-based retrieval has a lower
rank than a full table scan; therefore, the rule-based optimizer will use indexes wherever
possible.

The cost-based optimizer determines the execution plan based on an estimate of the
computer resources (the cost) required to satisfy various access methods. The cost-based
optimizer uses statistics, including the number of rows in a table and the number of
distinct values in indexes, to determine this ideal plan.

Many people's early experiences with the cost-based optimizer in Oracle 7.0 and 7.1 were
disappointing, which gave the cost-based optimizer a bad reputation in some quarters.
However, the cost-based optimizer has been improving with each release of Oracle7, and
many advanced SQL access methods, such as star and hash joins, are only available when
you use the cost-based optimizer.

The cost-based optimizer is your best choice for almost all new projects, and it may be
cost-effective to convert your existing projects from rule-based to cost-based
optimization. Consider the following guidelines for getting the most from the cost-based
optimizer:

 The default mode of the cost-based optimizer (OPTIMIZER_MODE=CHOOSE)


attempts to optimize the throughput - the time taken to retrieve all rows - of SQL
statements and often favors full table scans over index lookups. When converting
to cost-based optimization, many users are disappointed to find that previously
well-tuned index lookups change to long-running table scans. To avoid this
situation, set OPTIMIZER_MODE=FIRST_ROWS in the server's parameter file or
ALTER SESSION SET OPTIMIZE=FIRST_ROWS in your code. These settings
instruct the cost-based optimizer to minimize the time taken to retrieve the first
row in your result set and encourages the use of indexes.
 No matter how sophisticated the cost-based optimizer becomes, there will still be
occasions when you need to modify its execution plan. SQL "hints" are usually
the best way to do this. Using hints you can instruct the optimizer to pursue your
preferred access paths (such as a preferred index), use the parallel query option,
select a join order, and so on. You enter hints as comments following the first
word in a SQL statement. The plus sign "+" in the comment lets Oracle know that
the comment contains a hint. Hints are fully documented in the Oracle Server
Tuning guide, and some of the most popular hints are summarized in Table 2.

The following example uses a hint to instruct the optimizer to use the CUST_I2
index:

SELECT /*+ INDEX(customers cust_i2) */ * FROM customers WHERE


name=:cust_name
 The cost-based optimizer's execution plans are calculated using table statistics
collected by the ANALYZE command. Make sure that you analyze all of your tables
regularly and analyze them at peak volumes (for example, don't analyze a table
just before it is about to be loaded by a batch job). For small or medium-sized
tables, use the ANALYZE TABLE table_name COMPUTE STATISTICS statement.
For larger tables, analyzing every row of the table may be too time-consuming. In
this case, analyze a random sample of rows using a statement such as ANALYZE
TABLE table_name ESTIMATE STATISTICS SAMPLE 20 PERCENT.
 Prior to Oracle7.3, the cost-based optimizer could determine the number of
distinct values in a column but not the distribution of data within the column. This
meant that it might decline to use an index on a column with only a few values,
even if the particular value in question was rare and would benefit from an index
lookup. Histograms, introduced in Oracle 7.3, enable you to collect column
distribution data and let the cost-based optimizer make better decisions. You
create histograms with the FOR COLUMNS clause of the ANALYZE command (for
example, ANALYZE TABLE table_name COMPUTE STATISTICS FOR ALL
INDEXED COLUMNS). Note that you can't take advantage of histograms if you are
using bind variables (which I discussed earlier).

In Retrospect

To briefly review this month's SQL tuning hints:

Hint #1. Establish a tuning environment that will enable you to effectively write and tune
efficient SQL. This environment should contain realistic or representative data volumes,
include tuning and diagnostic tools, and have application design documentation available.

Hint #2. Use the tuning tools that are included in Oracle. In particular, learn to use
EXPLAIN PLAN, SQL_TRACE, and tkprof.

Hint #3. Make effective use of Oracle indexes. Take advantage of concatenated indexes,
overindexing, and "advanced" indexing options, such as bitmapped indexes and hash and
index clusters.
Hint #4. Reduce the overhead of SQL statement parsing by using bind variables,
retaining SQL "cursors," and implementing an Oracle cursor cache.

Hint #5. Take advantage of the Oracle cost-based optimizer. If you're still using the older
rule-based optimizer, consider switching to the cost-based optimizer. When using the
cost-based optimizer, ensure that you analyze all of your tables regularly. Also, make sure
that OPTIMIZER_MODE is set appropriately. Using hints and column histograms can
improve the performance of SQL statements that are particularly difficult to optimize.

Next month, I'll look at some of the performance features of Oracle version 8. In August,
I'll present a final list of five hints for optimizing SQL, which will include tips on
avoiding accidental table scans, optimizing unavoidable scans, using array processing,
effective locking strategies, and leveraging features introduced in recent releases of
Oracle.

S-ar putea să vă placă și