31/01/2012
The following Ask Tom excerpt comes in response to the question "Can u give a methodology of tuning the sql statements". The link to the full answer is at: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:8764517459743 Despite the answer being five years old, in the intervening time artificial intelligence has not been built into the RDBMS whereby it can re-write your SQL such that it will run in the most efficient manner possible. Advisors take some of the leg work out of tuning, and tools such as DBMS_XPLAN, v$ views etc. constantly change, evolve and improve; however, SQL tuning and writing efficient SQL is not a prescriptive process that can be captured in a recipe. I will however try to present useful techniques and good practice to demystify some of the art behind this.
1.1 Efficient SQL

This was probably the hardest part of the book to write - this chapter. That is not because the material is all that complex, rather because I know what people want - and I know what can be delivered. What people want: The 10 step process by which you can tune any query. What can be delivered: Knowledge about how queries are processed, knowledge you can use and apply day to day as you develop them. Think about it for a moment.
If there were a 10 step or even 1,000,000 step process by which any query can be tuned (or even X% of queries for that matter), we would write a program to do it. Oh don't get me wrong, there are many programs that actually try to do this - Oracle Enterprise Manager with its tuning pack, SQL Navigator and others. What they do is primarily recommend indexing schemes to tune a query, suggest materialized views, offer to add hints to the query to try other access plans.
They show you different query plans for the same statement and allow you to pick one. They offer "rules of thumb" (what I generally call ROT, since the acronym and the word it maps to are so appropriate for each other) SQL optimizations - which, if they were universally applicable, the optimizer would apply as a matter of course. In fact, the cost based optimizer does that already - it rewrites our queries all of the time. These tuning tools use a very limited set of rules that sometimes can suggest that index or set of indexes you really should have thought of during your design.
I'll close this idea out with this thought - if there were an N step process to tuning a query, to writing efficient SQL - the optimizer would incorporate it all and we would not be having a discussion about this topic at all. It is like the search for the holy grail - maybe someday the software will be sophisticated enough to be perfect in this regard; it will be able to take our SQL, understand the question being asked and process the question - rather than the syntax.
Section 1
Mechanics Of The Cost Based Optimizer
I will focus on the Cost Based Optimizer, which has been around since Oracle 7. It devises the best plan for executing a query based on cost. It is transparent to applications and users, except when the wrong plan is selected. The stages of optimisation will be covered on the next set of slides.
Stages Of Optimisation
On 10g, parsed SQL statements are assigned unique identifiers called SQL IDs; prior releases use a combination of SQL address and hash value. If a query does not exist in parsed form in the shared pool, a hard parse or optimisation takes place (a gross over-simplification, but basically what happens). Oracle can also perform optimisations at run time. A parsed SQL statement (cursor) can have multiple child cursors; these are effectively different plans for the same SQL text when different variables are supplied (an over-simplification, but basically what happens).
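The parent/child cursor relationship can be seen in the shared pool. A sketch of how you might inspect it (the v$sql view and its columns are standard; the filter text is illustrative only):

```sql
-- One row per child cursor; the same sql_id can map to several
-- child_number values, each potentially with its own plan_hash_value.
SELECT sql_id, child_number, plan_hash_value, executions
FROM   v$sql
WHERE  sql_text LIKE 'SELECT /* my_query */%'
ORDER  BY sql_id, child_number;
```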
Query transformation:
- Sub-query un-nesting
- Complex view merging
- Set join conversion
- Predicate move-around
- Establish the base statistics of all the relevant tables and indexes.
- Single table access cardinality estimation.
- Join order consideration.
- Object statistics
- Object data types
- Oracle initialisation parameters; refer to v$sys_optimizer_env, v$ses_optimizer_env and v$sql_optimizer_env
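The optimizer environment views can be queried directly. A minimal sketch against v$sys_optimizer_env (the view and its name/value/isdefault columns are standard):

```sql
-- Instance-wide optimizer environment; isdefault = 'NO' highlights
-- parameters that have been changed from their defaults.
SELECT name, value, isdefault
FROM   v$sys_optimizer_env
ORDER  BY name;
```

v$ses_optimizer_env and v$sql_optimizer_env take the same shape at session and cursor level respectively.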
SQL> select * from sys.aux_stats$;

SNAME          PNAME
-------------- ----------
SYSSTATS_INFO  STATUS
SYSSTATS_INFO  DSTART
SYSSTATS_INFO  DSTOP
SYSSTATS_INFO  FLAGS
SYSSTATS_MAIN  CPUSPEEDNW
SYSSTATS_MAIN  IOSEEKTIM
SYSSTATS_MAIN  IOTFRSPEED
SYSSTATS_MAIN  SREADTIM
SYSSTATS_MAIN  MREADTIM
SYSSTATS_MAIN  CPUSPEED
SYSSTATS_MAIN  MBRC
SYSSTATS_MAIN  MAXTHR
SYSSTATS_MAIN  SLAVETHR
System statistics facilitate something called the CPU costing model, introduced in Oracle 9i. Until this came along, the optimizer took into account neither CPU performance nor the difference in performance between single and multi-block reads. In Oracle 9i, no system statistics are present out of the box. In 10g, out-of-the-box statistics called noworkload statistics are provided. Stats that are not populated in aux_stats$ only appear after system statistics have been gathered. The optimizer cost model can be set via the hidden parameter _optim_cost_model; set this to IO or CPU.
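System statistics are gathered via DBMS_STATS.GATHER_SYSTEM_STATS, which is a real API; the interval value below is purely illustrative:

```sql
-- Re-gather the 10g-style noworkload statistics:
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('NOWORKLOAD');

-- Or capture workload statistics (SREADTIM, MREADTIM, MBRC etc.)
-- over a representative window, here 30 minutes:
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('INTERVAL', interval => 30);
```

After gathering, the previously empty rows in sys.aux_stats$ (SREADTIM, MREADTIM, CPUSPEED, MBRC and so on) should be populated.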
SQL Usage
Unless the NO_MONITOR hint is used, col_usage$ in the data dictionary will be updated whenever a statement is parsed. DBMS_STATS uses this in order to determine whether a histogram should be created on a column when SIZE AUTO is specified in the method_opt. Oracle can get confused about what to record in col_usage$ for columns used with LIKE predicates, and hence not create histograms when SIZE AUTO is specified.
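The recorded usage can be inspected directly. A sketch joining the internal dictionary tables (sys.col_usage$ is undocumented; the column names below are as they appear in 10g, and the table name 'TR_JOB' is just an example from later in this deck):

```sql
-- Which predicate styles has the optimizer seen against each column?
-- SIZE AUTO histogram decisions are driven by these counters.
SELECT o.name   AS table_name,
       c.name   AS column_name,
       u.equality_preds,
       u.range_preds,
       u.like_preds
FROM   sys.col_usage$ u,
       sys.obj$       o,
       sys.col$       c
WHERE  u.obj#    = o.obj#
AND    u.obj#    = c.obj#
AND    u.intcol# = c.intcol#
AND    o.name    = 'TR_JOB';
```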
Section 2
A Plan Of Attack For Investigating A Bad Plan
Many ways of doing this:
- Automatic Database Diagnostic Monitor: $ORACLE_HOME/rdbms/admin/addmrpt
- Automatic Workload Repository reports: $ORACLE_HOME/rdbms/admin/awrrpt
- Toad
- SQL Tuning Advisor: $ORACLE_HOME/rdbms/admin/sqltrpt
SELECT *
FROM   (SELECT sql_id, elapsed_time, cpu_time, user_io_wait_time, concurrency_wait_time
        FROM   v$sql
        ORDER  BY elapsed_time DESC)
WHERE  rownum <= 10;

From 10g onwards SQL statements are uniquely identified by SQL IDs; for previous releases this is a combination of hash value and SQL address. As a first pass at tuning, try running the SQL Tuning Advisor on the query; my preferred way of doing this is via the sqltrpt script in $ORACLE_HOME/rdbms/admin.
- Stale or missing statistics (includes missing histograms).
- Lack of appropriate indexes to support your query.
- Bugs in the optimizer; there are the best part of 100 bug fixes related to the optimizer in 10.2.0.4.0.
- Optimizer flaws, such as the predicate independence assumption, more on this later.
- SQL that is inefficient by design.
- SQL that is optimizer unfriendly, more on this later.
- Index, table type or partitioning schemes that are not conducive to good performance with the query mix you are running.
- Misuse or poor use of Oracle features.
- Abuse of the Oracle optimizer environment.
- Poor schema design.
Ask the following questions:
- Has what I need to tune ever run fast? If so, what has changed in the environment?
- Do not confuse something that is running slow because it has an abnormally high workload with something that is inefficient at what it does.
- Is the part of the application running slowly because it is not designed to deal with the shape and/or size of the data it is processing?
When tuning
- Only make one change at a time, so as to be able to accurately gauge the effect of each change on your query.
- Try to make the scope of any change match the scope of the problem; e.g. if you have a problem with one query, a global change such as a change to an initialisation parameter is unlikely to address the root cause of the issue, and it may cause performance regressions for queries that are currently performing well.
- Use realistic test data.
- Try to use a clean environment without factors that may skew your results; e.g. for accurate and consistent response times, avoid databases and servers which are being used by other users if possible.
Tuning a piece of SQL may result in:
- Indexes being added or modified.
- Statistic gathering regimes being modified.
- SQL being re-written.
- Object types being changed, e.g. heap to index organized tables.
- Schema design changing, e.g. extra columns added to tables in order to reduce numbers of joins.
- The use of hints.
Based on my experience this:
- Recommends indexes.
- Identifies stale statistics.
- Notifies you of Cartesian joins.
- Spots predicates not conducive to the efficient use of indexes, e.g. col <>, col NOT IN (1, 2, . . .
- Identifies parts of statements where view merging cannot take place.
- Recommends SQL profiles.
If the tuning advisor does anything else, I am yet to see it. SQL profiles are enhanced hints that provide the optimizer with extra information with which to get cardinality estimates correct. The use of SQL profiles can cause a problem if your base data changes such that they are no longer relevant. It is the best tool of its type that Oracle have produced to date; it is not artificially intelligent, and it smacks of the ROT tools that Tom Kyte mentioned. It is good for a first pass at identifying the more obvious causes of a query running sub-optimally.
Despite what the Oracle marketing machine may say, tuning is not a prescriptive process. The tuning advisor in its 10g incarnation is no substitute for a knowledgeable DBA; at best it semi-automates the task of tuning. Some people instantly assume that because a plan contains a full table scan the execution plan is therefore bad. This is an outdated method of tuning with origins based around the rule based optimizer.
Do not automatically jump to the conclusion that a bad plan contains full table scans and a good plan only uses index range scans. Reasons for full table scans might include:
- Queries that use un-selective predicates, i.e. that result in all or a large proportion of a table's data being returned.
- Queries that return most of the columns in a table.
- The table is small in size.
- The indexes you think the query should use have poor clustering factors.
What I will outline is a plan of attack for tackling query issues. There is no such thing as a definitive tuning methodology as alluded to by Tom Kyte. Even when Oracle support is engaged with a tuning related service request, you will not always be asked for the same information or to use the same diagnostic tools depending on who the service request is assigned to. This may change with the DBMS_SQLDIAG package in 11g, aka the SQL test case builder.
My plan of attack will not cover the writing of efficient SQL, this would require a different presentation entirely.
I will assume:
- All reasonable indexes, partitioning schemes etc . . . are in place.
- Nothing daft has been done to abuse the optimizer environment.
- Oracle has produced sub-optimal plans due to discrepancies between predicted cardinalities and actual cardinalities.
If you run an explain plan from the SQL prompt this will give you the plan that Oracle predicts the statement will use, i.e.:

SQL> EXPLAIN PLAN FOR SELECT * FROM dual;
SQL> SELECT * FROM table(dbms_xplan.display);

A better way of explaining the execution plan is by taking it straight from the shared pool:

SQL> SELECT * FROM table(dbms_xplan.display_cursor('<sql_id>'));

SQL IDs are new to Oracle 10g and uniquely identify parsed SQL statements.
The DBMS_XPLAN package is the best tool here. Where possible obtain the plan after running the query:

SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));

If you use explain plan, this will give you what the optimizer predicts the plan will be; this may not necessarily be what the plan is when the query runs, due to such things as bind variable peeking.
- Does the plan contain any Cartesian joins?
- Is the most restricted table the driving table? This is the first table in the join order and usually the table that is furthest indented into the plan, i.e. the table for which the filter predicates reduce the result the most.
- Is the access for the driving table via a full table scan? Not always an issue, but it could indicate that the join order is incorrect.
There is a Cartesian join due to the OR condition; specifically, if one of the branches evaluates to TRUE, the join predicates for two of the tables do not get applied. We have an index fast full scan (the index equivalent of a full table scan) on TR_CONSUMPTION_HISTORY_IX2; note that the only reason we are going to the TR_CONSUMPTION_HISTORY table is to get the contents of the sd_reading_type column.
Are statistics up to date for all the objects used by the query? Do appropriate indexes exist? You can run the sqltrpt script from $ORACLE_HOME/rdbms/admin, supply the SQL ID for your query, and the tuning advisor will go to work for you; this will also highlight any tables with stale statistics.
Are the predicates and variables used in your statement going to retrieve a significant or small proportion of rows from the query's base tables? The following predicates come from a query whose performance I have investigated:

UPPER (a.walkroute_reviewed_flag) = 'Y'
AND ( ( a.sd_service_type = '10000' AND a.sd_servicepoint_status = '10001' )
   OR ( a.sd_service_type = '10001' AND a.sd_occupied_status = '10001' AND a.sd_servicepoint_status = '10001' )
   OR a.sd_servicepoint_status = '10000'
   OR a.sd_servicepoint_status = '10003' )

It returns a third of the data in the AD_SERVICE_POINT table; therefore, if the clustering factors of the appropriate indexes are not particularly good, a full table scan is probably the most efficient way to retrieve the relevant data.
Generally speaking if the predicated cardinalities and actual cardinalities for a row source are close, the optimizer will pick the best plan for the query. To start down this route of tuning we need to obtain the SQL text for the query in question along with any bind variable values it uses. The best way of doing this is via one of the most useful tools in our tuning toolbox, the DBMS_XPLAN package.
Tuning by cardinality feedback is borrowed from one of the reviewers of Jonathan Lewis's Cost-Based Oracle Fundamentals book: www.centrexcc.com/Tuning by Cardinality Feedback.pdf. Other Oracle professionals may call this something different. This approach is also endorsed by the Oracle Real World Performance Group: http://structureddata.org/2007/11/21/troubleshooting-bad-executionplans/ With the GATHER_PLAN_STATISTICS hint and the 'ALLSTATS LAST' format string in DBMS_XPLAN (Oracle 10g), this is a really easy method to use. I would not advocate playing around with density settings, as this may fix queries suffering from the predicate independence assumption but cause bad plans for other queries.
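The basic workflow can be sketched in two steps (the hint and the DBMS_XPLAN call are real 10g features; the table, column and bind names are hypothetical):

```sql
-- Step 1: run the problem query with rowsource statistics collection enabled.
SELECT /*+ GATHER_PLAN_STATISTICS */ *
FROM   my_table
WHERE  my_column = :b1;

-- Step 2: immediately afterwards, in the same session, pull the plan with
-- estimated (E-Rows) and actual (A-Rows) cardinalities side by side.
SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Rows where E-Rows and A-Rows diverge by orders of magnitude are where the optimizer's model has gone wrong.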
SQL> select * from table(dbms_xplan.display_cursor('ft33c3agapy0k',0,'TYPICAL +PEEKED_BINDS'));

SQL_ID  ft33c3agapy0k, child number 0
-------------------------------------
UPDATE TR_CYCLIC_WORK_PACK SET PACK_JOB_COUNTER = PACK_JOB_COUNTER + 1 WHERE ID_PACK = :1

Plan hash value: 115273857

--------------------------------------------------------------------------
| Id  | Operation          | Name                | Rows  | Bytes | Cost  |
--------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |                     |       |       |     2 |
|   1 |  UPDATE            | TR_CYCLIC_WORK_PACK |       |       |       |
|*  2 |   INDEX UNIQUE SCAN| TR_CYC_WORKPACK_PK  |     1 |     9 |     1 |
--------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------
   1 - :1 (NUMBER): 80310011

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID_PACK"=:1)

Note
-----
   - cpu costing is off (consider enabling it)
SQL_ID  cqnxyqmp08rtu, child number 0
-------------------------------------
UPDATE /*+ GATHER_PLAN_STATISTICS */ TR_CYCLIC_WORK_PACK SET PACK_JOB_COUNTER = PACK_JOB_COUNTER + 1 WHERE ID_PACK = :b1

Plan hash value: 115273857

----------------------------------------------------------------------------------------------
| Id  | Operation          | Name                | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------
|   1 |  UPDATE            | TR_CYCLIC_WORK_PACK |      1 |        |      1 |00:00:00.01 |       4 |
|*  2 |   INDEX UNIQUE SCAN| TR_CYC_WORKPACK_PK  |      1 |      1 |      1 |00:00:00.01 |       2 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID_PACK"=:B1)

Note
-----
   - cpu costing is off (consider enabling it)
The example on the previous slide was the simplest example that can be provided. Line 2 in the execution plan is what is known as a row source. Note that for line 2 the values in the E-Rows and A-Rows columns matched. If nested loop joins are used: Starts x E-Rows = A-Rows; otherwise E-Rows = A-Rows.

Usually for a bad plan the estimated cardinalities will differ from the actual cardinalities by orders of magnitude; this inaccuracy will ripple throughout the rest of the plan and lead to poor performance. On the next slide a more complex example will be provided.
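The same estimate-versus-actual comparison can be pulled straight from the shared pool. A sketch against v$sql_plan_statistics_all (a standard view; the sql_id value is taken from the earlier example and requires the query to have run with rowsource statistics enabled):

```sql
-- CARDINALITY is the optimizer's per-execution estimate (E-Rows);
-- LAST_OUTPUT_ROWS is the actual row count from the last execution (A-Rows).
SELECT id,
       operation,
       cardinality               AS e_rows,
       last_starts * cardinality AS e_rows_x_starts,
       last_output_rows          AS a_rows
FROM   v$sql_plan_statistics_all
WHERE  sql_id       = 'cqnxyqmp08rtu'
AND    child_number = 0
ORDER  BY id;
```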
- Columns with skewed data containing more than 254 distinct values (a histogram has at most 254 buckets); this can be rectified with the DBMS_STATS API.
- DBMS_STATS in 10g can produce statistically incorrect estimates of distinct values when auto sampling / sampling is used.
- It is not possible until 11g to gather statistics on correlated columns; this is due to what is sometimes called data correlation or the predicate independence assumption, more on this later. In 10g dynamic sampling or hints can help here.
- It is not possible until 11g to gather statistics on functions of columns unless function-based indexes are used.
- Statistics are missing or stale.
- During the course of processing, data changes such that it no longer reflects the statistics in the data dictionary; a particular problem with scratch data. Dynamic sampling can help here.
- Lack of histograms, or histograms with too few buckets, on columns with skewed data.
- Sampled statistics taken with too small a sample size; in 11g auto sampling gives 100% statistically accurate statistics.
Using values in WHERE clauses with data types different from those of the columns they are compared against.
Other factors:
Optimizer bugs.
Hints may work fine for the data in your database at the time of testing; however, as soon as the data changes, the plan you have forced via hints may no longer be the best plan.
#1 In the first place, your stats should be correct.
#2 Write SQL that gives the optimizer a fighting chance of getting cardinality estimates correct, i.e. watch out for the use of functions and expressions.
#3 If you run into the predicate independence assumption, this is a tough nut to crack; look at using dynamic sampling, and read this article first: http://structureddata.org/2008/03/05/there-is-no-time-like-now-to-use-dynamic-sampling/
Histograms should be present for columns with skewed data, with the correct number of buckets. Only create histograms where they are required; they can have the side effect of increased hard parsing through bind peeking. !!! A histogram with only two end points stores no information on data distribution !!! Other than DYNAMIC_SAMPLING, only use access path hints as a last resort.
Section 3
Worked Examples Of Tuning By Cardinality Feedback
Let's create a histogram on sd_job_type. First, let's look at the distinct values in the sd_job_type and sd_job_status columns:-
SQL> select count(distinct sd_job_type), count(distinct sd_job_status) from tr_job;

COUNT(DISTINCTSD_JOB_TYPE) COUNT(DISTINCTSD_JOB_STATUS)
-------------------------- ----------------------------
                         3                            5

SQL> select column_name, count(*)
  2  from user_tab_histograms
  3  where column_name in ('SD_JOB_TYPE', 'SD_JOB_STATUS')
  4  group by column_name;

COLUMN_NAME      COUNT(*)
---------------- --------
SD_JOB_STATUS           5
SD_JOB_TYPE             2
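The histogram itself can be created with DBMS_STATS.GATHER_TABLE_STATS, which is the real API for this; the bucket count below is illustrative (anything at or above the number of distinct values works for a frequency histogram):

```sql
-- Gather a histogram on sd_job_type only; with just 3 distinct values
-- Oracle will build a frequency histogram.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'TR_JOB',
    method_opt => 'FOR COLUMNS sd_job_type SIZE 254');
END;
/
```

Re-querying user_tab_histograms afterwards should show one bucket end point per distinct value.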
The estimated cardinality is going badly wrong on line 16 of the plan. We'll flush the shared pool, re-run the query and get the estimated and actual cardinalities from the shared pool with DBMS_XPLAN. You will see that on line 13 of the plan the estimated and actual cardinalities are closer. We now need to look at line 12 of the plan, but this illustrates the general principle of tuning by this approach.
Let's turn the complexity setting up a notch. We will use a divide and conquer strategy to work out where the predicted cardinality is going wrong. In cases where queries may take hours to run, running statements with the GATHER_PLAN_STATISTICS hint may not be practical without taking the query apart.
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
       ("A"."SD_SERVICEPOINT_STATUS"=10001 AND "A"."SD_OCCUPIED_STATUS"=10001 AND "A"."SD_SERVICE_TY
  13 - filter("A"."ID_BEST_ADDRESS" IS NOT NULL)
  15 - access("A"."ID_WALKROUTE"="C"."ID_WALKROUTE")
  16 - access("C"."SD_FIELD_REGION"="ID")
  17 - access("A"."ID_CUSTOMER"="D"."ID_CUSTOMER")
  18 - filter(UPPER("B"."INCOMP_POSTCODE")='N')
  19 - access("A"."ID_BEST_ADDRESS"="B"."ID_BEST_ADDRESS")
  20 - filter("RNUM">0)
  21 - filter(ROWNUM<=26)
  23 - filter(ROWNUM<=26)
  25 - access("A"."ID_WALKROUTE"="C"."ID_WALKROUTE")
  27 - access("A"."ID_PORTFOLIO_ADDRESS"="B"."ID_PORTFOLIO_ADDRESS")
  28 - access("A"."ID_CUSTOMER"="D"."ID_CUSTOMER")
  29 - filter(("A"."ID_WALKROUTE" IS NOT NULL AND "A"."ID_BEST_ADDRESS" IS NULL AND UPPER("A"."WALKR
       AND "A"."SD_SERVICE_TYPE"=10000) OR ("A"."SD_SERVICEPOINT_STATUS"=10001 AND "A"."SD_OCCUPIED_
       INTERNAL_FUNCTION("A"."SD_SERVICEPOINT_STATUS"))))
  31 - filter(UPPER("B"."INCOMP_POSTCODE")='N')
  32 - access("C"."SD_FIELD_REGION"="ID")

Note
-----
   - cpu costing is off (consider enabling it)
Start by looking for the row source which is the furthest into the execution plan where the cardinality is out by an order of magnitude. This is line 28 in our plan:

|* 28 |  HASH JOIN  | | 1 | 5945 | 1800K|00:00:34.85 | 102K| 137K|

We have an estimated cardinality of 5945 versus an actual cardinality of 1,800,000. Notice the * at the beginning of the line; this means that there are predicates associated with this row source. Let's look at these:
29 - filter(("A"."ID_WALKROUTE" IS NOT NULL AND "A"."ID_BEST_ADDRESS" IS NULL AND UPPER("A"."WALKR AND "A"."SD_SERVICE_TYPE"=10000) OR ("A"."SD_SERVICEPOINT_STATUS"=10001 AND "A"."SD_OCCUPIED_ INTERNAL_FUNCTION("A"."SD_SERVICEPOINT_STATUS"))))
Due to a possible bug, Oracle thinks that the predicates associated with line 28 are for line 29. Now let's use a divide and conquer strategy to work out where exactly, in this list of predicates that are OR-ed and AND-ed together, the estimate is going wrong.
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
| Id  | Operation          | Name             | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads |
----------------------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE    |                  |      1 |      1 |      1 |00:00:17.77 |   92450 | 92426 |
|*  2 |   TABLE ACCESS FULL| AD_SERVICE_POINT |      1 |  19524 |  2043K |00:00:18.42 |   92450 | 9
----------------------------------------------------------------------------------------------------
. . . .
We have two sets of predicates: the id_best_address IS NULL and UPPER(a.walkroute_reviewed_flag) = 'Y' predicates, and a load of stuff relating to standing data columns within brackets.

SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM   ad_service_point a
WHERE  a.id_best_address IS NULL
AND    UPPER (a.walkroute_reviewed_flag) = 'Y'
AND    ( ( a.sd_service_type = '10000' AND a.sd_servicepoint_status = '10001' )
      OR ( a.sd_service_type = '10001' AND a.sd_occupied_status = '10001' AND a.sd_servicepoint_status = '10001' )
      OR a.sd_servicepoint_status = '10000'
      OR a.sd_servicepoint_status = '10003' )

For our second pass, let's work out which one of these sections is causing the cardinality estimate to go wrong.
It appears that the cardinality estimate is going wrong in the section in brackets.
SQL> SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*) FROM ad_service_point a WHERE a.id_best_address IS NULL;
SQL> select * from table(dbms_xplan.display_cursor(NULL,NULL,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  2545fjyq0m5nm, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*) FROM ad_service_point a WHERE a.id_best_address IS NULL

Plan hash value: 185642247

----------------------------------------------------------------------------------------------------
| Id  | Operation          | Name             | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads |
----------------------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE    |                  |      1 |      1 |      1 |00:00:16.37 |   92450 | 92426 |
|*  2 |   TABLE ACCESS FULL| AD_SERVICE_POINT |      1 |  6712K |  6712K |00:00:13.46 |   92450 | 9
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("A"."ID_BEST_ADDRESS" IS NULL)

Note
-----
   - cpu costing is off (consider enabling it)
SQL> select * from table(dbms_xplan.display_cursor(NULL,NULL,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  4rjg28wg48zv7, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*) FROM ad_service_point a WHERE a.sd_servicepoint_status = '10000' OR a.sd_servicepoint_status = '10003'

Plan hash value: 2502039880

----------------------------------------------------------------------------------------------------
| Id  | Operation          | Name                 | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads |
----------------------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE    |                      |      1 |      1 |      1 |00:00:00.03 |       6 |     5 |
|   2 |   INLIST ITERATOR  |                      |      1 |        |    282 |00:00:00.02 |       6 |       |
|*  3 |    INDEX RANGE SCAN| AD_SERVICE_POINT_IX5 |      2 |    282 |    282 |00:00:00.03 |       6 |     5 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access(("A"."SD_SERVICEPOINT_STATUS"=10000 OR "A"."SD_SERVICEPOINT_STATUS"=10003))

Note
-----
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
| 1 |  SORT AGGREGATE    |                  | 1 |     1 |     1 |00:00:17.90 | 92450 | 92426 |
|*2 |   TABLE ACCESS FULL| AD_SERVICE_POINT | 1 | 1952K | 2070K |00:00:18.67 | 92450 | 9
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("A"."SD_SERVICEPOINT_STATUS"=10001 AND ("A"."SD_SERVICE_TYPE"=10000 OR
       ("A"."SD_OCCUPIED_STATUS"=10001 AND "A"."SD_SERVICE_TYPE"=10001))))
We could re-compute statistics on the AD_SERVICE_POINT table; this may be OK for a test / development environment, but it may not be practical to do this on a production environment on an ad-hoc basis. If table monitoring is enabled, which it should be by default, you can flush the monitoring information:-

EXEC DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;
Then run:-
SELECT num_rows, last_analyzed, inserts, updates, deletes
FROM   user_tables t, user_tab_modifications m
WHERE  t.table_name = m.table_name
AND    table_name = < your table name >
SELECT t.table_name, last_analyzed, SUM(inserts), SUM(updates), SUM(deletes)
FROM   user_tables t, user_tab_modifications m
WHERE  t.table_name = m.table_name
AND    timestamp > last_analyzed
AND    t.table_name = <your table name>
GROUP  BY t.table_name, last_analyzed
/
You can see from this that it wouldn't be that difficult to produce something similar that works for indexes.
This section covers what some people from Oracle call data correlation, and what some people from outside of Oracle coin the predicate independence assumption.

The exact nature of the problem is that if there are predicates on the same table for which the data in the relevant columns is related, the optimizer always assumes that they are independent. This culminates in incorrect selectivities and hence incorrect cardinality estimates.
This issue has been covered in great depth by the part of the Oracle community that focuses on the CBO. Jonathan Lewis describes it as follows: . . . Assume everyone in the audience knows which star sign they were born under. If I ask all the people born under Aries to raise their hands, I expect to see 100 hands: there are 12 star signs, and assuming an even distribution of data, the selectivity is 1/12 and the cardinality is 1,200 / 12 = 100. How many people will raise their hands if I ask for all the people born under Aries and in December to raise their hands? What about all the people born under Aries in March? What about the people born under Aries in April? According to Oracle the answer will be the same for all three questions:
Selectivity (month AND star sign) = selectivity (month) * selectivity (star sign) = 1/12 * 1/12 = 1/144
Cardinality = 1,200 * 1/144 = 8.33 (rounded to 8 or 9 depending on the version of Oracle) . . .
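The arithmetic behind the star-sign example can be sketched in Python (the figures are taken from the quote above):

```python
# The optimizer's predicate independence assumption, using Jonathan
# Lewis's star-sign example: an audience of 1,200 people.
num_rows = 1200
sel_sign = 1 / 12    # selectivity of "born under Aries"
sel_month = 1 / 12   # selectivity of "born in March"

# Single predicate: cardinality = rows * selectivity
print(num_rows * sel_sign)           # 100.0

# Combined predicates: the optimizer multiplies the selectivities,
# even though star sign and month of birth are completely correlated.
combined = sel_sign * sel_month      # 1/144
print(round(num_rows * combined))    # 8 -- the true answer is 100
```

The gap between the estimate (8) and the true cardinality (100) is exactly what leads the optimizer to pick a poor plan.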
77
Solutions to the data correlation / predicate independence assumption issue include: dynamic sampling (9i onwards), which causes Oracle to sample the data being queried, with the amount sampled depending on the dynamic sampling level specified; hints to force the optimizer down the appropriate execution path; SQL profiles (Oracle 10g onwards), where Oracle uses what is known as offline optimization to sample data and partially execute the query in order to create a profile containing cardinality scaling information; extended statistics (Oracle 11g onwards), which allow you to gather statistics on groups of columns containing related data; and possibly Oracle 11g automatic tuning.
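For example, a statement-level dynamic sampling hint might look like this (a sketch; the column names and sampling level 4 are illustrative only):

```sql
-- Sample blocks of AD_SERVICE_POINT at parse time so the optimizer can
-- see the real combined selectivity of the two correlated predicates.
SELECT /*+ dynamic_sampling(t 4) */ *
FROM   ad_service_point t
WHERE  t.col1 = :b1
AND    t.col2 = :b2
```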
78
Section 4
Summary & Wrap Up
79
Summary
Having good technical knowledge will only get you so far. To be effective at tuning, your technical knowledge needs to be augmented with good practice and knowledge of common pitfalls. This will be covered on the remaining slides.
80
Tuning Objectives
Set clear and well defined performance goals in terms of data volume, concurrent users, timings, and CPU / IO loading. How do you know you have met your goals without targets? Remember that timings are ultimately what is important to the business.
81
Do not tune using magic bullets: when faced with a problem, be aware of likely causes, but appreciate that one size does not fit all. Do not pick up a text book or an article from the internet describing a performance enhancing feature and apply it blindly across an application. Do not assume you can use the tuning advisor to solve all your ills because Oracle have mentioned it. Do not use dynamic_sampling all over the place because I have mentioned it. Etc . . . !!! ONE SIZE DOES NOT ALWAYS FIT ALL !!!
82
Hit ratios can hide a multitude of sins. Take the buffer cache hit ratio (BCHR) for example: a piece of SQL performing meaningless work can perform lots of logical reads and hence produce a good BCHR. Refer to the custom hit ratio from www.oracledba.co.uk.
83
85
This is a meaningful hit ratio that I first came across in a Jonathan Lewis presentation (http://www.jlcomp.demon.co.uk/hit_ratio.pdf):
100 * ( 1 - least (1, desired response time / actual response time ) )
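A minimal Python sketch of this formula (the function name and figures are mine): it evaluates to 0 when you meet the target response time and approaches 100 as the actual response time increasingly exceeds the desired one.

```python
# Jonathan Lewis's response-time-based "hit ratio":
# 100 * (1 - least(1, desired / actual))
def response_time_ratio(desired, actual):
    return 100 * (1.0 - min(1.0, desired / actual))

print(response_time_ratio(2.0, 2.0))   # 0.0  -> target met, nothing to tune
print(response_time_ratio(2.0, 8.0))   # 75.0 -> 75% of the response time is excess
```

Unlike the BCHR, this cannot be "improved" by doing more meaningless work; it is tied directly to what the business cares about, i.e. timings.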
86
Oracle 11g is self tuning => tuning is dead !!! So why has Oracle produced an SQL test case packager in 11g? => http://optimizermagic.blogspot.com/2008/03/oraclesupport-keeps-closing-my-tar.html. Every Oracle release has bugs and flaws, and 11g will be no exception. When 11.1.0.7 comes out, digest the list of bugs fixed in the optimizer . . . New optimizer features are constantly introduced, and every feature has its quirks and boundary cases under which things can break.
87
Some organisations that publish on the internet are more interested in dominating search engine results and advertising their books than in disseminating advice grounded on well documented test cases and evidence. Prefer web sites and experts that provide worked examples, e.g. Tom Kyte and Jonathan Lewis.
88
The scope of a solution to a performance problem should match the scope of the problem. For example, an issue caused by one particular SQL statement is more likely to be resolved by something such as a new index, an index change or a histogram than by a system wide parameter change. Yes, there are DBAs out there who, when faced with a performance issue, will look at adjusting obscure parameters which most DBAs in their professional careers will never have any need to touch.
89
Always tune what you know: find where the time is going first and then apply tuning techniques accordingly. Understand the flight envelope of your application, i.e. how it behaves under load. Refer to Cary Millsap's presentation on performance and skew from Oracle Open World 2008. Do not blindly pick an Oracle feature conducive to good performance; pick the bottlenecks off one by one.
90
The RDBMS is constantly evolving; in 11g release 1 alone we have: extended statistics, null aware joins, adaptive cursors, plan baselining, and a new method of calculating density.
Response time = wait time + service time. Service time = CPU time => joining, parsing etc. Wait time = time spent waiting on an event => I/O, contention etc. Understand this basic equation and its context within your application. Avoid blind scatter gun techniques, such as looking at your software with a monitoring tool and saying "hey, there is a lot of latching taking place, therefore that must be my problem"; work off the top wait events and drill down.
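The equation can be sketched as follows (all figures are hypothetical, purely to illustrate working off the top contributor rather than the event that merely looks busy in a monitoring tool):

```python
# Response time = service time (CPU) + wait time (I/O, contention, ...).
service_time = 1.5                  # seconds on CPU: parsing, joining etc.
waits = {
    "db file sequential read": 6.0, # single-block I/O dominates here
    "latch free": 0.25,             # some latching, but it is NOT the problem
    "log file sync": 0.25,
}

response_time = service_time + sum(waits.values())
print(response_time)                # 8.0

# Drill into the biggest wait event first.
top_event = max(waits, key=waits.get)
print(top_event)                    # db file sequential read
```

Here latching is visible but accounts for a tiny fraction of the response time; the I/O wait is where the tuning effort should go.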
92
You get a mechanic to look at your car and say there is a performance issue with the engine: it is doing 3,000 rpm at 60 mph. The mechanic will probably ask: When did this first start happening? Under what driving conditions does it happen? Does it happen consistently? Have there been any changes to your car, and when was it last serviced?
93
The moral of this story is that a single statistic in isolation is not that useful. In order to put statistics into context you also need to understand things such as: Where is most of the time going in your application? What is the normal performance base line or flight envelope for your application? If you think you are seeing strange and anomalous behaviour, what conditions does it occur under?
94
Performance views; the 10g tuning infrastructure: ASH, advisors, ADDM and the time model; SQL trace, SQL extended trace, tkprof, trcsess; O/S tools such as iostat, sar etc.; DBMS_XPLAN. Your tuning arsenal should consist of more than explain plan and the ability to spot full table scans.
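For instance, DBMS_XPLAN.DISPLAY_CURSOR can show the actual runtime plan with estimated versus actual row counts for the last statement run in the session (a sketch; it requires STATISTICS_LEVEL=ALL or the gather_plan_statistics hint, and the table name is illustrative):

```sql
SELECT /*+ gather_plan_statistics */ COUNT(*) FROM ad_service_point;

-- Display the plan of the last cursor executed by this session,
-- including estimated (E-Rows) vs actual (A-Rows) row counts.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Comparing estimated with actual row counts is one of the quickest ways to spot the cardinality mis-estimates discussed earlier.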
95
Histograms, frequency and height balanced (MetaLink note 72539.1). Index quality => clustering factors: many DBAs do not understand this, and it gets a whole chapter in Jonathan Lewis's Cost-Based Oracle Fundamentals book (MetaLink notes 39836.1 and 223117.1). Predicate selectivity (MetaLink note 68992.1). System statistics, aux_stats$ (MetaLink note 153761.1). Sample sizes. Numbers of distinct values. Etc . . .
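As a rough sketch of how to eyeball index quality (my own query; AD_SERVICE_POINT is used for illustration): a clustering factor close to the number of table blocks means the table rows are well ordered with respect to the index, while one close to the number of rows means they are not.

```sql
-- Compare each index's clustering factor against the table's
-- block count (good) and row count (bad).
SELECT i.index_name, i.clustering_factor, t.blocks, t.num_rows
FROM   user_indexes i, user_tables t
WHERE  i.table_name = t.table_name
AND    t.table_name = 'AD_SERVICE_POINT'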
96
The Oracle Real World Performance group blog: http://www.structureddata.org Jonathan Lewis's blog: http://jonathanlewis.wordpress.com/ The Oracle Optimizer group blog: http://optimizermagic.blogspot.com/ Wolfgang Breitling's web site (a reviewer of Jonathan Lewis's CBO book): www.centrexcc.com General Oracle papers, in particular look out for those from Christian Antognini: http://www.trivadis.com/en/know-how-community/download-area.html The Search For Intelligent Life In The Cost-Based Optimizer by Anjo Kolk: http://www.evdbt.com/SearchIntelligenceCBO.doc
97