
Excessive Temp Space Usage From Parallel Operations

The Sources of Temp Space Usage


SORT ORDER BY (PGA)
SORT GROUP BY (PGA)
HASH GROUP BY (PGA)
WINDOW SORT (Analytic Function) (PGA)
HASH JOIN (PGA and join order)
HASH JOIN BUFFERED (PX related, needs more research)
BUFFER SORT (PX related, excessive)
PX SEND BROADCAST (PX distribute BROADCAST, excessive)

How to Identify SQLs with Temp Space Issue


Use view V$SQL (or DBA_HIST_SQLSTAT in AWR). Check the direct_writes column and compare its value with disk_reads. If direct_writes is significant and the query is not a direct load, it is very likely that the query has high temp space usage.
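The check above can be sketched as a query against V$SQL. This is a minimal sketch, not the query used to produce the example that follows: the two thresholds are assumptions to tune per system, and ELAPSED_TIME and USER_IO_WAIT_TIME are converted from microseconds to seconds.

```sql
-- Cursors whose direct writes are large relative to their disk reads.
-- Both thresholds below are illustrative assumptions, not recommended values.
SELECT sql_id,
       ROUND(elapsed_time / 1000000)      AS elapsed_sec,
       ROUND(user_io_wait_time / 1000000) AS io_wait_sec,
       disk_reads,
       direct_writes
FROM   v$sql
WHERE  direct_writes > 100000               -- assumed "significant" floor
AND    direct_writes >= 0.5 * disk_reads    -- assumed ratio to disk reads
ORDER  BY direct_writes DESC;
```

Statements that perform direct loads will also appear in this output and need to be excluded by inspection.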

V$SQL Example (UAD)


SQL_ID          ELAPSED TIME (SEC)   IO WAIT TIME (SEC)   DISK_READS   DIRECT_WRITES
6jbvpvurr02rh   4129                 2494                 2,545,487    5,481,060

Note: DIRECT_WRITES is much greater than DISK_READS because the query was still writing data to temp space and had yet to read it back when v$sql was checked.

Locate the Source of Temp Space Usage


For 11g, try v$sql_plan_monitor, column workarea_max_tempseg.
For 11g and 10g, try v$sql_workarea_active, column tempseg_size.
Any significant value in these columns identifies the execution steps with large temp space usage.

Example to Use V$SQL_PLAN_MONITOR


SQL_ID          SQL_EXEC_ID   PLAN ID   PLAN PARENT ID   OPERATION   READ REQUESTS   WRITE REQUESTS   TEMP SPACE (MB)
6jbvpvurr02rh   16777217      17        12               HASH JOIN   0               365,794          45,732

Note: Read requests (PHYSICAL_READ_REQUESTS) is 0 because the query was still building the first hash table from the first row source.
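A query along the following lines could produce output of this shape. It is a sketch under assumptions: the original query is not shown in the source, and summing across the rows that V$SQL_PLAN_MONITOR keeps for each process of a parallel execution is a choice made here. WORKAREA_MAX_TEMPSEG is reported in bytes.

```sql
-- Temp space per plan step for one monitored statement,
-- totalled across the query coordinator and PX servers.
SELECT sql_id,
       sql_exec_id,
       plan_line_id                                AS plan_id,
       plan_parent_id,
       RTRIM(plan_operation || ' ' || plan_options) AS operation,
       SUM(physical_read_requests)                 AS read_requests,
       SUM(physical_write_requests)                AS write_requests,
       ROUND(SUM(workarea_max_tempseg) / 1024 / 1024) AS temp_space_mb
FROM   v$sql_plan_monitor
WHERE  sql_id = '6jbvpvurr02rh'
GROUP  BY sql_id, sql_exec_id, plan_line_id, plan_parent_id,
          plan_operation, plan_options
HAVING SUM(workarea_max_tempseg) > 0
ORDER  BY temp_space_mb DESC;
```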

Example to Use V$SQL_WORKAREA_ACTIVE


SQL_ID: 6jbvpvurr02rh

Operation   Plan Id   SID    Temp Space (MB)
HASH JOIN   17        1042   11,435
HASH JOIN   17        1107   11,433
HASH JOIN   17        1156   11,432
HASH JOIN   17        1223   11,432
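The per-session figures above could be retrieved with a query like the one below. This is a sketch, since the source does not show the query it used. TEMPSEG_SIZE is reported in bytes and is NULL while a work area has not spilled to temp.

```sql
-- Active work areas that have spilled to temp for one statement,
-- one row per session (each PX server has its own SID).
SELECT sql_id,
       operation_type                       AS operation,
       operation_id                         AS plan_id,
       sid,
       ROUND(tempseg_size / 1024 / 1024)    AS temp_space_mb
FROM   v$sql_workarea_active
WHERE  sql_id = '6jbvpvurr02rh'
AND    tempseg_size IS NOT NULL
ORDER  BY tempseg_size DESC;
```

In this example the four per-session values add up to 45,732 MB, the same total reported for plan Id 17 in V$SQL_PLAN_MONITOR.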

Analyze The Plan

1. The temp space usage comes from plan Id 17: HASH JOIN.
2. Since temp space is used, the first row source (Id 19 35) must be very large.
3. There is a PX SEND BROADCAST for the first row source. It amplifies the temp space usage by a factor of the DOP; in this case, DOP = 4.
4. When a row source of a HASH JOIN is already very large, BROADCAST PX distribution makes the join much harder.

Using Realtime Monitor (V$SQL_PLAN_MONITOR)

1. Up to plan step 20, the first row source had generated 112,679,920 rows. Plan step 19, PX SEND BROADCAST, amplified it to 450,719,680 rows (112,679,920 x DOP 4). This definitely made the join much harder.
2. BROADCAST is meant for distributing a small row source, and that is what Oracle estimated for this query: 10421 rows for the first row source. Since Oracle estimated the second row source at 2.9M records, it considered this join order the better one.

The Root Cause


Excessive temp space usage with BROADCAST PX distribution is usually the result of a bad cardinality estimate for the first row source. The root cause is either inaccurate table stats or Oracle's inability to estimate JOIN cardinality. In this case, both are to blame:

1. The fact table involved does not have global stats.
2. There is no explicit partition range that would let Oracle use partition-level stats.
3. The multi-column range partition scheme complicates cardinality estimates, join estimates, and partition pruning.
4. BLOOM filter is disabled on the UAD DB, which makes partition pruning by join almost impossible.

The work around is to add two hints


Dynamic sampling hint: dynamic_sampling(2). Note that no table alias is used, so it applies to all tables involved. The purpose is to get a better cardinality estimate.
OPT_PARAM('_bloom_filter_enabled' 'true'), to enable the bloom filter for join-related partition pruning.
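Combined in one statement, the two hints might look like the sketch below. The table names, aliases, and columns (fact_tab, dim_tab, dim_id, dim_name) are hypothetical placeholders, since the original query is not shown. Because _bloom_filter_enabled is a hidden parameter, it is worth confirming its use before relying on it.

```sql
-- Hypothetical tables and columns; only the hint text comes from the source.
SELECT /*+ dynamic_sampling(2)
           opt_param('_bloom_filter_enabled' 'true') */
       d.dim_name,
       COUNT(*) AS row_count
FROM   fact_tab f
JOIN   dim_tab  d ON d.dim_id = f.dim_id
GROUP  BY d.dim_name;
```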

PX BUFFER SORT Example

1. BUFFER SORT in PX occurs when the operations on one row source/table are not parallelized while the rest of the query runs in parallel. The BUFFER SORT operation happens where the query switches from serial operation to parallel operation. The temp space usage can be identified using v$sql_workarea_active or v$sql_plan_monitor, or by examining the plan itself if the query completed a long time ago.
2. In the above case (DIRECT MARKETING, SEM), the query ran with DOP 32, but the operation on the major row source, the fact table AGG_BY_SPACEID_KWOID_7D, was serial.

The Impact of BUFFER SORT


If the BUFFER SORT is on the major row source and results in significant temp space usage, it basically triples the IO requests (one additional round of writes and reads on top of the original read).
The more interesting thing is that the whole query runs in parallel, even with a very high DOP, but the slowest operation, reading a very large table, runs in serial. This is basically a waste of PX resources.
The workaround is to identify the operations running in serial (in the plan, those operations have empty TQ and IN-OUT columns) and see whether parallel hints can be added to the appropriate tables. This not only makes the PX operation more efficient but also reduces temp space usage.
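The workaround can be sketched in two steps. First display the plan of the cursor and look for steps whose TQ and IN-OUT columns are empty, then add a parallel hint on the table read by those steps. The SQL_ID placeholder and the alias f are assumptions. The DOP of 32 mirrors the DOP of the example query, and the SELECT is a stand-in for the real statement, which is not shown in the source.

```sql
-- Step 1: show the plan of a cursor still in the shared pool.
-- Replace the placeholder with the SQL_ID under investigation.
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR('<sql_id>', NULL, 'TYPICAL'));

-- Step 2: request a parallel scan of the table that was read serially.
-- Stand-in statement; only the table name and DOP come from the source.
SELECT /*+ parallel(f, 32) */
       COUNT(*)
FROM   agg_by_spaceid_kwoid_7d f;
```

After the change, the plan should show a TQ value on the scan of the fact table and the BUFFER SORT step should no longer appear above it.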
