31/01/2012
The following Ask Tom excerpt comes in response to the question "Can u give a methodology of tuning the sql statements". The link to the full answer is at: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:8764517459743 Despite the answer being five years old, in the intervening time artificial intelligence has not been built into the RDBMS whereby it can re-write your SQL such that it will run in the most efficient manner possible. Advisors take some of the leg work out of tuning, and tools such as DBMS_XPLAN, v$ views etc. constantly change, evolve and improve; however, SQL tuning and writing efficient SQL is not a prescriptive process that can be captured in a recipe. I will however try to present useful techniques and good practice to demystify some of the art behind this.
1.1 Efficient SQL

This was probably the hardest part of the book to write - this chapter. That is not because the material is all that complex, rather because I know what people want - and I know what can be delivered. What people want: The 10 step process by which you can tune any query. What can be delivered: Knowledge about how queries are processed, knowledge you can use and apply day to day as you develop them. Think about it for a moment.
If there were a 10 step or even 1,000,000 step process by which any query can be tuned (or even X% of queries for that matter), we would write a program to do it. Oh don't get me wrong, there are many programs that actually try to do this - Oracle Enterprise Manager with its tuning pack, SQL Navigator and others. What they do is primarily recommend indexing schemes to tune a query, suggest materialized views, offer to add hints to the query to try other access plans.
They show you different query plans for the same statement and allow you to pick one. They offer "rules of thumb" (what I generally call ROT, since the acronym and the word it maps to are so appropriate for each other) SQL optimizations - which, if they were universally applicable, the optimizer would apply as a matter of course. In fact, the cost based optimizer does that already - it rewrites our queries all of the time. These tuning tools use a very limited set of rules that sometimes can suggest that index or set of indexes you really should have thought of during your design.
I'll close this idea out with this thought - if there were an N step process to tuning a query, to writing efficient SQL - the optimizer would incorporate it all and we would not be having a discussion about this topic at all. It is like the search for the holy grail - maybe someday the software will be sophisticated enough to be perfect in this regard; it will be able to take our SQL, understand the question being asked and process the question - rather than the syntax.
Section 1
Mechanics Of The Cost Based Optimizer
I will focus on the Cost Based Optimizer, which has been around since Oracle 7. It devises the best plan for executing a query based on cost. It is transparent to applications and users, except when the wrong plan is selected. The stages of optimisation will be covered on the next set of slides.
Stages Of Optimisation
On 10g, parsed SQL statements are assigned unique identifiers called SQL IDs; prior releases use a combination of SQL address and hash value. If a query does not exist in parsed form in the shared pool, a hard parse or optimisation takes place (a gross over-simplification, but basically what happens). Oracle can also perform optimisations at run time. A parsed SQL statement (cursor) can have multiple child cursors; these are effectively different plans for the same SQL text when different variables are supplied (an over-simplification, but basically what happens).
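The parent/child cursor relationship can be seen in the shared pool. A sketch of how you might inspect it (the v$sql view and its columns are standard; the filter text is illustrative only):

```sql
-- One row per child cursor; the same sql_id can map to several
-- child_number values, each potentially with its own plan_hash_value.
SELECT sql_id, child_number, plan_hash_value, executions
FROM   v$sql
WHERE  sql_text LIKE 'SELECT /* my_query */%'
ORDER  BY sql_id, child_number;
```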
Query transformation:
- Sub-query un-nesting
- Complex view merging
- Set join conversion
- Predicate move-around
- Establish the base statistics of all the relevant tables and indexes.
- Single table access cardinality estimation.
- Join order consideration.
- Object statistics
- Object data types
- Oracle initialisation parameters; refer to v$sys_optimizer_env, v$ses_optimizer_env and v$sql_optimizer_env
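The optimizer environment views can be queried directly. A minimal sketch against v$sys_optimizer_env (the view and its name/value/isdefault columns are standard):

```sql
-- Instance-wide optimizer environment; isdefault = 'NO' highlights
-- parameters that have been changed from their defaults.
SELECT name, value, isdefault
FROM   v$sys_optimizer_env
ORDER  BY name;
```

v$ses_optimizer_env and v$sql_optimizer_env take the same shape at session and cursor level respectively.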
SQL> select * from sys.aux_stats$;

SNAME          PNAME
-------------- ----------
SYSSTATS_INFO  STATUS
SYSSTATS_INFO  DSTART
SYSSTATS_INFO  DSTOP
SYSSTATS_INFO  FLAGS
SYSSTATS_MAIN  CPUSPEEDNW
SYSSTATS_MAIN  IOSEEKTIM
SYSSTATS_MAIN  IOTFRSPEED
SYSSTATS_MAIN  SREADTIM
SYSSTATS_MAIN  MREADTIM
SYSSTATS_MAIN  CPUSPEED
SYSSTATS_MAIN  MBRC
SYSSTATS_MAIN  MAXTHR
SYSSTATS_MAIN  SLAVETHR
System statistics facilitate something called the CPU costing model, introduced in Oracle 9i. Until this came along, the optimizer took into account neither CPU performance nor the difference in performance between single and multi-block reads. In Oracle 9i, no system statistics are present out of the box. In 10g, out-of-the-box statistics called noworkload statistics are provided. Stats that are not populated in aux_stats$ only appear after system statistics have been gathered. The optimizer cost model can be set via the hidden parameter _optim_cost_model; set this to IO or CPU.
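System statistics are gathered via DBMS_STATS.GATHER_SYSTEM_STATS, which is a real API; the interval value below is purely illustrative:

```sql
-- Re-gather the 10g-style noworkload statistics:
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('NOWORKLOAD');

-- Or capture workload statistics (SREADTIM, MREADTIM, MBRC etc.)
-- over a representative window, here 30 minutes:
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('INTERVAL', interval => 30);
```

After gathering, the previously empty rows in sys.aux_stats$ (SREADTIM, MREADTIM, CPUSPEED, MBRC and so on) should be populated.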
SQL Usage
Unless the NO_MONITOR hint is used, col_usage$ in the data dictionary will be updated whenever a statement is parsed. DBMS_STATS uses this in order to determine whether a histogram should be created on a column when SIZE AUTO is specified in the method_opt. Oracle can get confused about what to record in col_usage$ for columns used with LIKE predicates, and hence not create histograms when SIZE AUTO is specified.
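The recorded usage can be inspected directly. A sketch joining the internal dictionary tables (sys.col_usage$ is undocumented; the column names below are as they appear in 10g, and the table name 'TR_JOB' is just an example from later in this deck):

```sql
-- Which predicate styles has the optimizer seen against each column?
-- SIZE AUTO histogram decisions are driven by these counters.
SELECT o.name   AS table_name,
       c.name   AS column_name,
       u.equality_preds,
       u.range_preds,
       u.like_preds
FROM   sys.col_usage$ u,
       sys.obj$       o,
       sys.col$       c
WHERE  u.obj#    = o.obj#
AND    u.obj#    = c.obj#
AND    u.intcol# = c.intcol#
AND    o.name    = 'TR_JOB';
```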
Section 2
A Plan Of Attack For Investigating A Bad Plan
Many ways of doing this:
- Automatic Database Diagnostic Monitor: $ORACLE_HOME/rdbms/admin/addmrpt
- Automatic Workload Repository reports: $ORACLE_HOME/rdbms/admin/awrrpt
- Toad
- SQL Tuning Advisor: $ORACLE_HOME/rdbms/admin/sqltrpt
SELECT *
FROM   (SELECT sql_id, elapsed_time, cpu_time, user_io_wait_time, concurrency_wait_time
        FROM   v$sql
        ORDER  BY elapsed_time DESC)
WHERE  rownum <= 10;

From 10g onwards SQL statements are uniquely identified by SQL IDs; for previous releases this is a combination of hash value and SQL address. As a first pass at tuning, try running the SQL Tuning Advisor on the query; my preferred way of doing this is via the sqltrpt script in $ORACLE_HOME/rdbms/admin.
- Stale or missing statistics (includes missing histograms).
- Lack of appropriate indexes to support your query.
- Bugs in the optimizer; there are the best part of 100 bug fixes related to the optimizer in 10.2.0.4.0.
- Optimizer flaws, such as the predicate independence assumption, more on this later.
- SQL that is inefficient by design.
- SQL that is optimizer unfriendly, more on this later.
- Index, table type or partitioning schemes that are not conducive to good performance with the query mix you are running.
- Misuse or poor use of Oracle features.
- Abuse of the Oracle optimizer environment.
- Poor schema design.
Ask the following questions:
- Has what I need to tune ever run fast? If so, what has changed in the environment?
- Do not confuse something that is running slow because it has an abnormally high workload with something that is inefficient at what it does.
- Is the part of the application running slowly because it is not designed to deal with the shape and/or size of the data it is processing?
When tuning
- Only make one change at a time, so as to be able to accurately gauge the effect of each change on your query.
- Try to make the scope of any change match the scope of the problem; e.g. if you have a problem with one query, a global change such as a change to an initialisation parameter is unlikely to address the root cause of the issue, and it may cause performance regressions for queries that are currently performing well.
- Use realistic test data.
- Try to use a clean environment without factors that may skew your results; e.g. for accurate and consistent response times, avoid databases and servers which are being used by other users if possible.
Tuning a piece of SQL may result in:
- Indexes being added or modified.
- Statistic gathering regimes being modified.
- SQL being re-written.
- Object types being changed, e.g. heap to index organized tables.
- Schema design changing, e.g. extra columns added to tables in order to reduce numbers of joins.
- The use of hints.
Based on my experience this:
- Recommends indexes.
- Identifies stale statistics.
- Notifies you of Cartesian joins.
- Spots predicates not conducive to the efficient use of indexes, e.g. col <>, col NOT IN (1, 2, . . .
- Identifies parts of statements where view merging cannot take place.
- Recommends SQL profiles.
If the tuning advisor does anything else, I am yet to see it. SQL profiles are enhanced hints that provide the optimizer with extra information with which to get cardinality estimates correct. The use of SQL profiles can cause a problem if your base data changes such that they are no longer relevant. It is the best tool of its type that Oracle have produced to date; it is not artificially intelligent, and it smacks of the ROT tools that Tom Kyte mentioned. It is good for a first pass at identifying the more obvious causes of a query running sub-optimally.
Despite what the Oracle marketing machine may say, tuning is not a prescriptive process. The tuning advisor in its 10g incarnation is no substitute for a knowledgeable DBA; at best it semi-automates the task of tuning. Some people instantly assume that because a plan contains a full table scan the execution plan is therefore bad. This is an outdated method of tuning with origins based around the rule based optimizer.
Do not automatically jump to the conclusion that a bad plan contains full table scans and a good plan only uses index range scans. Reasons for full table scans might include:
- Queries that use un-selective predicates, i.e. that result in all or a large proportion of a table's data being returned.
- Queries that return most of the columns in a table.
- The table is small in size.
- The indexes you think the query should use have poor clustering factors.
What I will outline is a plan of attack for tackling query issues. There is no such thing as a definitive tuning methodology as alluded to by Tom Kyte. Even when Oracle support is engaged with a tuning related service request, you will not always be asked for the same information or to use the same diagnostic tools depending on who the service request is assigned to. This may change with the DBMS_SQLDIAG package in 11g, aka the SQL test case builder.
My plan of attack will not cover the writing of efficient SQL, this would require a different presentation entirely.
I will assume:
- All reasonable indexes, partitioning schemes etc . . . are in place.
- Nothing daft has been done to abuse the optimizer environment.
- Oracle has produced sub-optimal plans due to discrepancies between predicted cardinalities and actual cardinalities.
If you run an explain plan from the SQL prompt this will give you the plan that Oracle predicts the statement will use, i.e.:

SQL> EXPLAIN PLAN FOR SELECT * FROM dual;
SQL> SELECT * FROM table(dbms_xplan.display);

A better way of explaining the execution plan is by taking it straight from the shared pool:

SQL> SELECT * FROM table(dbms_xplan.display_cursor('<sql_id>'));

SQL IDs are new to Oracle 10g and uniquely identify parsed SQL statements.
The DBMS_XPLAN package is the best tool here. Where possible obtain the plan after running the query:

SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));

If you use explain plan, this will give you what the optimizer predicts the plan will be; this may not necessarily be what the plan is when the query runs, due to such things as bind variable peeking.
- Does the plan contain any Cartesian joins?
- Is the most restricted table the driving table? This is the first table in the join order and usually the table that is furthest indented into the plan, i.e. the table for which the filter predicates reduce the result the most.
- Is the access for the driving table via a full table scan? Not always an issue, but it could indicate that the join order is incorrect.
There is a Cartesian join due to the OR condition; specifically, if one of the branches evaluates to TRUE, the join predicates for two of the tables do not get applied. We have an index fast full scan (the index equivalent of a full table scan) on TR_CONSUMPTION_HISTORY_IX2; note that the only reason we are going to the TR_CONSUMPTION_HISTORY table is to get the contents of the sd_reading_type column.
Are statistics up to date for all the objects used by the query? Do appropriate indexes exist? You can run the sqltrpt script from $ORACLE_HOME/rdbms/admin, supply the SQL ID for your query, and the tuning advisor will go to work for you; this will also highlight any tables with stale statistics.
Are the predicates and variables used in your statement going to retrieve a significant or small proportion of rows from the query's base tables? The following predicates come from a query whose performance I have investigated:

UPPER (a.walkroute_reviewed_flag) = 'Y'
AND ( ( a.sd_service_type = '10000' AND a.sd_servicepoint_status = '10001' )
   OR ( a.sd_service_type = '10001' AND a.sd_occupied_status = '10001' AND a.sd_servicepoint_status = '10001' )
   OR a.sd_servicepoint_status = '10000'
   OR a.sd_servicepoint_status = '10003' )

It returns a third of the data in the AD_SERVICE_POINT table; therefore, if the clustering factors of the appropriate indexes are not particularly good, a full table scan is probably the most efficient way to retrieve the relevant data.
Generally speaking if the predicated cardinalities and actual cardinalities for a row source are close, the optimizer will pick the best plan for the query. To start down this route of tuning we need to obtain the SQL text for the query in question along with any bind variable values it uses. The best way of doing this is via one of the most useful tools in our tuning toolbox, the DBMS_XPLAN package.
Tuning by cardinality feedback is borrowed from one of the reviewers of Jonathan Lewis's Cost-Based Oracle Fundamentals book: www.centrexcc.com/Tuning by Cardinality Feedback.pdf. Other Oracle professionals may call this something different. This approach is also endorsed by the Oracle Real World Performance Group: http://structureddata.org/2007/11/21/troubleshooting-bad-executionplans/ With the GATHER_PLAN_STATISTICS hint and the 'ALLSTATS LAST' format string in DBMS_XPLAN (Oracle 10g), this is a really easy method to use. I would not advocate playing around with density settings, as this may fix queries suffering from the predicate independence assumption but cause bad plans for other queries.
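The basic workflow can be sketched in two steps (the hint and the DBMS_XPLAN call are real 10g features; the table, column and bind names are hypothetical):

```sql
-- Step 1: run the problem query with rowsource statistics collection enabled.
SELECT /*+ GATHER_PLAN_STATISTICS */ *
FROM   my_table
WHERE  my_column = :b1;

-- Step 2: immediately afterwards, in the same session, pull the plan with
-- estimated (E-Rows) and actual (A-Rows) cardinalities side by side.
SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Rows where E-Rows and A-Rows diverge by orders of magnitude are where the optimizer's model has gone wrong.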
SQL> select * from table(dbms_xplan.display_cursor('ft33c3agapy0k',0,'TYPICAL +PEEKED_BINDS'));

SQL_ID  ft33c3agapy0k, child number 0
-------------------------------------
UPDATE TR_CYCLIC_WORK_PACK SET PACK_JOB_COUNTER = PACK_JOB_COUNTER + 1 WHERE ID_PACK = :1

Plan hash value: 115273857

--------------------------------------------------------------------------
| Id  | Operation          | Name                | Rows  | Bytes | Cost  |
--------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |                     |       |       |     2 |
|   1 |  UPDATE            | TR_CYCLIC_WORK_PACK |       |       |       |
|*  2 |   INDEX UNIQUE SCAN| TR_CYC_WORKPACK_PK  |     1 |     9 |     1 |
--------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------
   1 - :1 (NUMBER): 80310011

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID_PACK"=:1)

Note
-----
   - cpu costing is off (consider enabling it)
SQL_ID  cqnxyqmp08rtu, child number 0
-------------------------------------
UPDATE /*+ GATHER_PLAN_STATISTICS */ TR_CYCLIC_WORK_PACK SET PACK_JOB_COUNTER = PACK_JOB_COUNTER + 1 WHERE ID_PACK = :b1

Plan hash value: 115273857

----------------------------------------------------------------------------------------------
| Id  | Operation          | Name                | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------
|   1 |  UPDATE            | TR_CYCLIC_WORK_PACK |      1 |        |      1 |00:00:00.01 |       4 |
|*  2 |   INDEX UNIQUE SCAN| TR_CYC_WORKPACK_PK  |      1 |      1 |      1 |00:00:00.01 |       2 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID_PACK"=:B1)

Note
-----
   - cpu costing is off (consider enabling it)
The example on the previous slide was the simplest example that can be provided. Line 2 in the execution plan is what is known as a row source. Note that for line 2 the values in the E-Rows and A-Rows columns matched. If nested loop joins are used: Starts x E-Rows = A-Rows; otherwise E-Rows = A-Rows.

Usually for a bad plan the estimated cardinalities will differ from the actual cardinalities by orders of magnitude; this inaccuracy will ripple throughout the rest of the plan and lead to poor performance. On the next slide a more complex example will be provided.
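The same estimate-versus-actual comparison can be pulled straight from the shared pool. A sketch against v$sql_plan_statistics_all (a standard view; the sql_id value is taken from the earlier example and requires the query to have run with rowsource statistics enabled):

```sql
-- CARDINALITY is the optimizer's per-execution estimate (E-Rows);
-- LAST_OUTPUT_ROWS is the actual row count from the last execution (A-Rows).
SELECT id,
       operation,
       cardinality               AS e_rows,
       last_starts * cardinality AS e_rows_x_starts,
       last_output_rows          AS a_rows
FROM   v$sql_plan_statistics_all
WHERE  sql_id       = 'cqnxyqmp08rtu'
AND    child_number = 0
ORDER  BY id;
```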
- Columns with skewed data containing more than 254 distinct values (a histogram has at most 254 buckets); this can be rectified with the DBMS_STATS API.
- DBMS_STATS in 10g can produce statistically incorrect estimates of distinct values when auto sampling / sampling is used.
- It is not possible until 11g to gather statistics on correlated columns; this is due to what is sometimes called data correlation or the predicate independence assumption, more on this later. In 10g dynamic sampling or hints can help here.
- It is not possible until 11g to gather statistics on functions of columns unless function-based indexes are used.
- Statistics are missing or stale.
- During the course of processing, data changes such that it no longer reflects the statistics in the data dictionary; a particular problem with scratch data. Dynamic sampling can help here.
- Lack of histograms, or histograms with too few buckets, on columns with skewed data.
- Sampled statistics taken with too small a sample size; in 11g auto sampling gives 100% statistically accurate statistics.
Using values in WHERE clauses with data types different from those of the columns they are compared against.
Other factors:
Optimizer bugs.
Hints may work fine for the data in your database at the time of testing; however, as soon as the data changes, the plan you have forced via hints may no longer be the best plan.
#1 In the first place, your stats should be correct.
#2 Write SQL that gives the optimizer a fighting chance of getting cardinality estimates correct, i.e. watch out for the use of functions and expressions.
#3 If you run into the predicate independence assumption, this is a tough nut to crack; look at using dynamic sampling, and read this article first: http://structureddata.org/2008/03/05/there-is-no-time-like-now-to-use-dynamic-sampling/
Histograms should be present for columns with skewed data, with the correct number of buckets. Only create histograms where they are required; they can have the side effect of increased hard parsing through bind peeking. !!! A histogram with only two end points stores no information on data distribution !!! Other than DYNAMIC_SAMPLING, only use access path hints as a last resort.
Section 3
Worked Examples Of Tuning By Cardinality Feedback
Let's create a histogram on sd_job_type. First, let's look at the distinct values in the sd_job_type and sd_job_status columns:-
SQL> select count(distinct sd_job_type), count(distinct sd_job_status) from tr_job;

COUNT(DISTINCTSD_JOB_TYPE) COUNT(DISTINCTSD_JOB_STATUS)
-------------------------- ----------------------------
                         3                            5

SQL> select column_name, count(*)
  2  from user_tab_histograms
  3  where column_name in ('SD_JOB_TYPE', 'SD_JOB_STATUS')
  4  group by column_name;

COLUMN_NAME      COUNT(*)
---------------- --------
SD_JOB_STATUS           5
SD_JOB_TYPE             2
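The histogram itself can be created with DBMS_STATS.GATHER_TABLE_STATS, which is the real API for this; the bucket count below is illustrative (anything at or above the number of distinct values works for a frequency histogram):

```sql
-- Gather a histogram on sd_job_type only; with just 3 distinct values
-- Oracle will build a frequency histogram.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'TR_JOB',
    method_opt => 'FOR COLUMNS sd_job_type SIZE 254');
END;
/
```

Re-querying user_tab_histograms afterwards should show one bucket end point per distinct value.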
The estimated cardinality is going badly wrong on line 16 of the plan. We'll flush the shared pool, re-run the query and get the estimated and actual cardinalities from the shared pool with DBMS_XPLAN. You will see that on line 13 of the plan the estimated and actual cardinalities are closer. We now need to look at line 12 of the plan, but this illustrates the general principle of tuning by this approach.
Let's turn the complexity setting up a notch. We will use a divide and conquer strategy to work out where the predicted cardinality is going wrong. In cases where queries may take hours to run, running statements with the GATHER_PLAN_STATISTICS hint may not be practical without taking the query apart.
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
       ("A"."SD_SERVICEPOINT_STATUS"=10001 AND "A"."SD_OCCUPIED_STATUS"=10001 AND "A"."SD_SERVICE_TY
  13 - filter("A"."ID_BEST_ADDRESS" IS NOT NULL)
  15 - access("A"."ID_WALKROUTE"="C"."ID_WALKROUTE")
  16 - access("C"."SD_FIELD_REGION"="ID")
  17 - access("A"."ID_CUSTOMER"="D"."ID_CUSTOMER")
  18 - filter(UPPER("B"."INCOMP_POSTCODE")='N')
  19 - access("A"."ID_BEST_ADDRESS"="B"."ID_BEST_ADDRESS")
  20 - filter("RNUM">0)
  21 - filter(ROWNUM<=26)
  23 - filter(ROWNUM<=26)
  25 - access("A"."ID_WALKROUTE"="C"."ID_WALKROUTE")
  27 - access("A"."ID_PORTFOLIO_ADDRESS"="B"."ID_PORTFOLIO_ADDRESS")
  28 - access("A"."ID_CUSTOMER"="D"."ID_CUSTOMER")
  29 - filter(("A"."ID_WALKROUTE" IS NOT NULL AND "A"."ID_BEST_ADDRESS" IS NULL AND UPPER("A"."WALKR
       AND "A"."SD_SERVICE_TYPE"=10000) OR ("A"."SD_SERVICEPOINT_STATUS"=10001 AND "A"."SD_OCCUPIED_
       INTERNAL_FUNCTION("A"."SD_SERVICEPOINT_STATUS"))))
  31 - filter(UPPER("B"."INCOMP_POSTCODE")='N')
  32 - access("C"."SD_FIELD_REGION"="ID")

Note
-----
   - cpu costing is off (consider enabling it)
Start by looking for the row source which is the furthest into the execution plan where the cardinality is out by an order of magnitude. This is line 28 in our plan:

|* 28 |  HASH JOIN  | | 1 | 5945 | 1800K|00:00:34.85 | 102K| 137K|

We have an estimated cardinality of 5945 versus an actual cardinality of 1,800,000. Notice the * at the beginning of the line; this means that there are predicates associated with this row source. Let's look at these:
29 - filter(("A"."ID_WALKROUTE" IS NOT NULL AND "A"."ID_BEST_ADDRESS" IS NULL AND UPPER("A"."WALKR AND "A"."SD_SERVICE_TYPE"=10000) OR ("A"."SD_SERVICEPOINT_STATUS"=10001 AND "A"."SD_OCCUPIED_ INTERNAL_FUNCTION("A"."SD_SERVICEPOINT_STATUS"))))
Due to a possible bug, Oracle thinks that the predicates associated with line 28 are for line 29. Now let's use a divide and conquer strategy to work out where exactly, in this list of predicates that are OR-ed and AND-ed together, the estimate is going wrong.
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
| Id  | Operation          | Name             | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads |
----------------------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE    |                  |      1 |      1 |      1 |00:00:17.77 |   92450 | 92426 |
|*  2 |   TABLE ACCESS FULL| AD_SERVICE_POINT |      1 |  19524 |  2043K |00:00:18.42 |   92450 | 9
----------------------------------------------------------------------------------------------------
. . . .
We have two sets of predicates: the id_best_address IS NULL and UPPER(a.walkroute_reviewed_flag) = 'Y' predicates, and a load of stuff relating to standing data columns within brackets.

SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM   ad_service_point a
WHERE  a.id_best_address IS NULL
AND    UPPER (a.walkroute_reviewed_flag) = 'Y'
AND    ( ( a.sd_service_type = '10000' AND a.sd_servicepoint_status = '10001' )
      OR ( a.sd_service_type = '10001' AND a.sd_occupied_status = '10001' AND a.sd_servicepoint_status = '10001' )
      OR a.sd_servicepoint_status = '10000'
      OR a.sd_servicepoint_status = '10003' )

For our second pass, let's work out which one of these sections is causing the cardinality estimate to go wrong.
It appears that the cardinality estimate is going wrong in the section in brackets.
SQL> SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*) FROM ad_service_point a WHERE a.id_best_address IS NULL;
SQL> select * from table(dbms_xplan.display_cursor(NULL,NULL,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  2545fjyq0m5nm, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*) FROM ad_service_point a WHERE a.id_best_address IS NULL

Plan hash value: 185642247

----------------------------------------------------------------------------------------------------
| Id  | Operation          | Name             | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads |
----------------------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE    |                  |      1 |      1 |      1 |00:00:16.37 |   92450 | 92426 |
|*  2 |   TABLE ACCESS FULL| AD_SERVICE_POINT |      1 |  6712K |  6712K |00:00:13.46 |   92450 | 9
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("A"."ID_BEST_ADDRESS" IS NULL)

Note
-----
   - cpu costing is off (consider enabling it)
SQL> select * from table(dbms_xplan.display_cursor(NULL,NULL,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  4rjg28wg48zv7, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*) FROM ad_service_point a WHERE a.sd_servicepoint_status = '10000' OR a.sd_servicepoint_status = '10003'

Plan hash value: 2502039880

----------------------------------------------------------------------------------------------------
| Id  | Operation          | Name                 | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads |
----------------------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE    |                      |      1 |      1 |      1 |00:00:00.03 |       6 |     5 |
|   2 |   INLIST ITERATOR  |                      |      1 |        |    282 |00:00:00.02 |       6 |       |
|*  3 |    INDEX RANGE SCAN| AD_SERVICE_POINT_IX5 |      2 |    282 |    282 |00:00:00.03 |       6 |     5 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access(("A"."SD_SERVICEPOINT_STATUS"=10000 OR "A"."SD_SERVICEPOINT_STATUS"=10003))

Note
-----
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
| 1 |  SORT AGGREGATE    |                  | 1 |     1 |     1 |00:00:17.90 | 92450 | 92426 |
|*2 |   TABLE ACCESS FULL| AD_SERVICE_POINT | 1 | 1952K | 2070K |00:00:18.67 | 92450 | 9
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("A"."SD_SERVICEPOINT_STATUS"=10001 AND ("A"."SD_SERVICE_TYPE"=10000 OR
       ("A"."SD_OCCUPIED_STATUS"=10001 AND "A"."SD_SERVICE_TYPE"=10001))))
We could re-compute statistics on the AD_SERVICE_POINT table; this may be OK for a test / development environment, but it may not be practical to do this on a production environment on an ad-hoc basis. If table monitoring is enabled, which it should be by default, you can flush the monitoring information:-

EXEC DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;
Then run:-
SELECT num_rows, last_analyzed, inserts, updates, deletes
FROM   user_tables t, user_tab_modifications m
WHERE  t.table_name = m.table_name
AND    table_name = < your table name >
SELECT t.table_name, last_analyzed, SUM(inserts), SUM(updates), SUM(deletes)
FROM   user_tables t, user_tab_modifications m
WHERE  t.table_name = m.table_name
AND    timestamp > last_analyzed
AND    t.table_name = <your table name>
GROUP  BY t.table_name, last_analyzed
/
You can see from this that it wouldn't be that difficult to produce something similar that works for indexes.
This section covers what some people from Oracle call data correlation, and what some people from outside of Oracle coin the predicate independence assumption.

The exact nature of the problem is that if there are predicates on the same table for which the data in the relevant columns is related, the optimizer always assumes that they are independent. This culminates in incorrect selectivities and hence incorrect cardinality estimates.
This issue has been covered in great depth by the part of the Oracle community that focuses on the CBO. Jonathan Lewis describes it as follows: . . . Assume everyone in the audience knows which star sign they were born under. If I ask all the people born under Aries to raise their hands, I expect to see 100 hands: there are 12 star signs, and assuming an even distribution of data, the selectivity is 1/12 and the cardinality is 1,200 / 12 = 100. How many people will raise their hands if I ask for all the people born under Aries and in December to raise their hands? What about all the people born under Aries in March? What about the people born under Aries in April? According to Oracle the answer will be the same for all three questions:
Selectivity (month AND star sign) = selectivity (month) * selectivity (star sign) = 1/12 * 1/12 = 1/144
Cardinality = 1,200 * 1/144 = 8.33 (rounded to 8 or 9 depending on the version of Oracle) . . .
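The arithmetic behind the star-sign example can be sketched in Python (the figures are taken from the quote above):

```python
# The optimizer's predicate independence assumption, using Jonathan
# Lewis's star-sign example: an audience of 1,200 people.
num_rows = 1200
sel_sign = 1 / 12    # selectivity of "born under Aries"
sel_month = 1 / 12   # selectivity of "born in March"

# Single predicate: cardinality = rows * selectivity
print(num_rows * sel_sign)           # 100.0

# Combined predicates: the optimizer multiplies the selectivities,
# even though star sign and month of birth are completely correlated.
combined = sel_sign * sel_month      # 1/144
print(round(num_rows * combined))    # 8 -- the true answer is 100
```

The gap between the estimate (8) and the true cardinality (100) is exactly what leads the optimizer to pick a poor plan.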
77
Solutions to the data correlation / predicate independence assumption issue include: dynamic sampling (9i onwards), which causes Oracle to sample the data being queried, with the amount sampled depending on the dynamic sampling level specified; hints to force the optimizer down the appropriate execution path; SQL profiles (Oracle 10g onwards), where Oracle uses what is known as offline optimization to sample data and partially execute the query in order to create a profile containing cardinality scaling information; extended statistics (Oracle 11g onwards), which allow you to gather statistics on groups of columns containing related data; and possibly Oracle 11g automatic tuning.
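For example, a statement-level dynamic sampling hint might look like this (a sketch; the column names and sampling level 4 are illustrative only):

```sql
-- Sample blocks of AD_SERVICE_POINT at parse time so the optimizer can
-- see the real combined selectivity of the two correlated predicates.
SELECT /*+ dynamic_sampling(t 4) */ *
FROM   ad_service_point t
WHERE  t.col1 = :b1
AND    t.col2 = :b2
```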
78
Section 4
Summary & Wrap Up
79
Summary
Having good technical knowledge will only get you so far. To be effective at tuning, your technical knowledge needs to be augmented with good practice and knowledge of common pitfalls. This will be covered on the remaining slides.
80
Tuning Objectives
Set clear and well defined performance goals in terms of data volume, concurrent users, timings, and CPU / IO loading. How do you know you have met your goals without targets? Remember that timings are ultimately what is important to the business.
81
Do not tune using magic bullets: when faced with a problem, be aware of likely causes, but appreciate that one size does not fit all. Do not pick up a text book or an article from the internet describing a performance enhancing feature and apply it blindly across an application. Do not assume you can use the tuning advisor to solve all your ills because Oracle have mentioned it. Do not use dynamic_sampling all over the place because I have mentioned it. Etc . . . !!! ONE SIZE DOES NOT ALWAYS FIT ALL !!!
82
Hit ratios can hide a multitude of sins. Take the buffer cache hit ratio (BCHR) for example: a piece of SQL performing meaningless work can perform lots of logical reads and hence produce a good BCHR. Refer to the custom hit ratio from www.oracledba.co.uk.
83
85
This is a meaningful hit ratio that I first came across in a Jonathan Lewis presentation (http://www.jlcomp.demon.co.uk/hit_ratio.pdf):
100 * ( 1 - least (1, desired response time / actual response time ) )
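A minimal Python sketch of this formula (the function name and figures are mine): it evaluates to 0 when you meet the target response time and approaches 100 as the actual response time increasingly exceeds the desired one.

```python
# Jonathan Lewis's response-time-based "hit ratio":
# 100 * (1 - least(1, desired / actual))
def response_time_ratio(desired, actual):
    return 100 * (1.0 - min(1.0, desired / actual))

print(response_time_ratio(2.0, 2.0))   # 0.0  -> target met, nothing to tune
print(response_time_ratio(2.0, 8.0))   # 75.0 -> 75% of the response time is excess
```

Unlike the BCHR, this cannot be "improved" by doing more meaningless work; it is tied directly to what the business cares about, i.e. timings.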
86
Oracle 11g is self tuning => tuning is dead !!! So why has Oracle produced an SQL test case packager in 11g? => http://optimizermagic.blogspot.com/2008/03/oraclesupport-keeps-closing-my-tar.html. Every Oracle release has bugs and flaws, and 11g will be no exception. When 11.1.0.7 comes out, digest the list of bugs fixed in the optimizer . . . New optimizer features are constantly introduced, and every feature has its quirks and boundary cases under which things can break.
87
Some organisations that publish on the internet are more interested in dominating search engine results and advertising their books than in disseminating advice grounded on well documented test cases and evidence. Prefer web sites and experts that provide worked examples, e.g. Tom Kyte and Jonathan Lewis.
88
The scope of a solution to a performance problem should match the scope of the problem. For example, an issue caused by one particular SQL statement is more likely to be resolved by something such as a new index, an index change or a histogram than by a system wide parameter change. Yes, there are DBAs out there who, when faced with a performance issue, will look at adjusting obscure parameters which most DBAs in their professional careers will never have any need to touch.
89
Always tune what you know: find where the time is going first and then apply tuning techniques accordingly. Understand the flight envelope of your application, i.e. how it behaves under load. Refer to Cary Millsap's presentation on performance and skew from Oracle Open World 2008. Do not blindly pick an Oracle feature conducive to good performance; pick the bottlenecks off one by one.
90
The RDBMS is constantly evolving; in 11g release 1 alone we have: extended statistics, null aware joins, adaptive cursors, plan baselining, and a new method of calculating density.
Response time = wait time + service time. Service time = CPU time => joining, parsing etc. Wait time = time spent waiting on an event => I/O, contention etc. Understand this basic equation and its context within your application. Avoid blind scatter gun techniques, such as looking at your software with a monitoring tool and saying "hey, there is a lot of latching taking place, therefore that must be my problem"; work off the top wait events and drill down.
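The equation can be sketched as follows (all figures are hypothetical, purely to illustrate working off the top contributor rather than the event that merely looks busy in a monitoring tool):

```python
# Response time = service time (CPU) + wait time (I/O, contention, ...).
service_time = 1.5                  # seconds on CPU: parsing, joining etc.
waits = {
    "db file sequential read": 6.0, # single-block I/O dominates here
    "latch free": 0.25,             # some latching, but it is NOT the problem
    "log file sync": 0.25,
}

response_time = service_time + sum(waits.values())
print(response_time)                # 8.0

# Drill into the biggest wait event first.
top_event = max(waits, key=waits.get)
print(top_event)                    # db file sequential read
```

Here latching is visible but accounts for a tiny fraction of the response time; the I/O wait is where the tuning effort should go.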
92
You get a mechanic to look at your car and say there is a performance issue with the engine: it is doing 3,000 rpm at 60 mph. The mechanic will probably ask: When did this first start happening? Under what driving conditions does it happen? Does it happen consistently? Have there been any changes to your car, and when was it last serviced?
93
The moral of this story is that a single statistic in isolation is not that useful. In order to put statistics into context you also need to understand things such as: Where is most of the time going in your application? What is the normal performance base line or flight envelope for your application? If you think you are seeing strange and anomalous behaviour, what conditions does it occur under?
94
Performance views; the 10g tuning infrastructure: ASH, advisors, ADDM and the time model; SQL trace, SQL extended trace, tkprof, trcsess; O/S tools such as iostat, sar etc.; DBMS_XPLAN. Your tuning arsenal should consist of more than explain plan and the ability to spot full table scans.
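For instance, DBMS_XPLAN.DISPLAY_CURSOR can show the actual runtime plan with estimated versus actual row counts for the last statement run in the session (a sketch; it requires STATISTICS_LEVEL=ALL or the gather_plan_statistics hint, and the table name is illustrative):

```sql
SELECT /*+ gather_plan_statistics */ COUNT(*) FROM ad_service_point;

-- Display the plan of the last cursor executed by this session,
-- including estimated (E-Rows) vs actual (A-Rows) row counts.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Comparing estimated with actual row counts is one of the quickest ways to spot the cardinality mis-estimates discussed earlier.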
95
Histograms, frequency and height balanced (MetaLink note 72539.1). Index quality => clustering factors: many DBAs do not understand this, and it gets a whole chapter in Jonathan Lewis's Cost-Based Oracle Fundamentals book (MetaLink notes 39836.1 and 223117.1). Predicate selectivity (MetaLink note 68992.1). System statistics, aux_stats$ (MetaLink note 153761.1). Sample sizes. Numbers of distinct values. Etc . . .
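As a rough sketch of how to eyeball index quality (my own query; AD_SERVICE_POINT is used for illustration): a clustering factor close to the number of table blocks means the table rows are well ordered with respect to the index, while one close to the number of rows means they are not.

```sql
-- Compare each index's clustering factor against the table's
-- block count (good) and row count (bad).
SELECT i.index_name, i.clustering_factor, t.blocks, t.num_rows
FROM   user_indexes i, user_tables t
WHERE  i.table_name = t.table_name
AND    t.table_name = 'AD_SERVICE_POINT'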
96
The Oracle Real World Performance group blog: http://www.structureddata.org Jonathan Lewis's blog: http://jonathanlewis.wordpress.com/ The Oracle Optimizer group blog: http://optimizermagic.blogspot.com/ Wolfgang Breitling's web site (a reviewer of Jonathan Lewis's CBO book): www.centrexcc.com General Oracle papers, in particular look out for those from Christian Antognini: http://www.trivadis.com/en/know-how-community/download-area.html The Search For Intelligent Life In The Cost-Based Optimizer by Anjo Kolk: http://www.evdbt.com/SearchIntelligenceCBO.doc
97