Excessive Temp Space Usage From Parallel Operations

The Sources of Temp Space Usage

SORT ORDER BY (PGA)
SORT GROUP BY (PGA)
HASH GROUP BY (PGA)
WINDOW SORT (Analytic Function) (PGA)
HASH JOIN (PGA and join order)
HASH JOIN BUFFERED (PX related, need more research)
BUFFER SORT (PX related, excessive)
PX SEND BROADCAST (PX distribute BROADCAST, excessive)
How to Identify SQL Statements with Temp Space Issues

Use the view V$SQL (or the AWR view DBA_HIST_SQLSTAT). Check the DIRECT_WRITES column and compare its value with DISK_READS. If DIRECT_WRITES is significant and the query is not a direct-path load, the query is very likely using a lot of temp space.
V$SQL Example (UAD)

SQL_ID         ELAPSED TIME (SEC)  IO WAIT TIME (SEC)  DISK_READS  DIRECT_WRITES
6jbvpvurr02rh  4129                2494                2,545,487   5,481,060
Note: DIRECT_WRITES is much greater than DISK_READS because, when V$SQL was checked, the query was still writing data to temp space and had not yet read it back.
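
A query along the following lines could produce the columns shown in the example. It is only a sketch, assuming Oracle 11g, where ELAPSED_TIME and USER_IO_WAIT_TIME in V$SQL are reported in microseconds. The 100000 cutoff on DIRECT_WRITES is an arbitrary illustration value, not a figure from the original analysis.

```sql
-- Sketch: cached statements with heavy direct writes, largest first.
-- Assumption: the 100000 threshold is illustrative only; tune it per system.
SELECT sql_id,
       ROUND(elapsed_time / 1e6)      AS elapsed_sec,   -- microseconds to seconds
       ROUND(user_io_wait_time / 1e6) AS io_wait_sec,   -- microseconds to seconds
       disk_reads,
       direct_writes
FROM   v$sql
WHERE  direct_writes > 100000
ORDER  BY direct_writes DESC;
```

Statements that do direct-path loads will also show up here, so each hit still needs to be checked against what the query actually does.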
Locate the Source of Temp Space Usage

For 11g, try V$SQL_PLAN_MONITOR, column WORKAREA_MAX_TEMPSEG.
For 11g and 10g, try V$SQL_WORKAREA_ACTIVE, column TEMPSEG_SIZE.
Any significant value in these columns points to the execution steps with large temp space usage.
Example Using V$SQL_PLAN_MONITOR

SQL_ID         SQL_EXEC_ID  PLAN ID  PLAN PARENT ID  OPERATION  READ REQUESTS  WRITE REQUESTS  TEMP SPACE (MB)
6jbvpvurr02rh  16777217     17       12              HASH JOIN  0              365,794         45,732
Note: Read requests (PHYSICAL_READ_REQUESTS) is 0 because the query was still building the hash table from the first row source.
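
The lookup in this example might be written as follows. This is a sketch, assuming 11g Release 2 (for the PHYSICAL_*_REQUESTS columns) and assuming the view returns one row per plan line per process for a parallel query, so values are summed across the query coordinator and the PX slaves. WORKAREA_MAX_TEMPSEG is taken to be in bytes.

```sql
-- Sketch: plan lines of one monitored statement that spilled to temp, largest first.
SELECT sql_id,
       sql_exec_id,
       plan_line_id,
       plan_parent_id,
       RTRIM(plan_operation || ' ' || plan_options)  AS operation,
       SUM(physical_read_requests)                   AS read_requests,
       SUM(physical_write_requests)                  AS write_requests,
       ROUND(SUM(workarea_max_tempseg) / 1024 / 1024) AS temp_space_mb  -- bytes to MB
FROM   v$sql_plan_monitor
WHERE  sql_id = '6jbvpvurr02rh'
GROUP  BY sql_id, sql_exec_id, plan_line_id, plan_parent_id,
          plan_operation, plan_options
HAVING SUM(workarea_max_tempseg) > 0
ORDER  BY temp_space_mb DESC;
```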
Example Using V$SQL_WORKAREA_ACTIVE

SQL_ID: 6jbvpvurr02rh

Operation  Plan Id  SID   Temp Space (MB)
HASH JOIN  17       1042  11,435
HASH JOIN  17       1107  11,433
HASH JOIN  17       1156  11,432
HASH JOIN  17       1223  11,432
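
A per-slave breakdown like this one could come from a query of roughly this shape. It is a sketch that assumes TEMPSEG_SIZE is reported in bytes and is NULL for work areas that have not spilled to temp.

```sql
-- Sketch: active work areas of one statement that currently hold a temp segment.
SELECT operation_type                     AS operation,
       operation_id                       AS plan_id,
       sid,
       ROUND(tempseg_size / 1024 / 1024)  AS temp_space_mb  -- bytes to MB
FROM   v$sql_workarea_active
WHERE  sql_id = '6jbvpvurr02rh'
AND    tempseg_size IS NOT NULL
ORDER  BY tempseg_size DESC;
```

The four per-session values add up to 11,435 + 11,433 + 11,432 + 11,432 = 45,732 MB, which matches the temp space reported for plan Id 17 in the V$SQL_PLAN_MONITOR example.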
Analyze The Plan
1. The temp space usage comes from plan Id 17: HASH JOIN.
2. Since temp space is used, the first row source (Id 19 35) must be very large.
3. There is a PX SEND BROADCAST for the first row source. It amplifies the temp space usage by a factor equal to the DOP, in this case DOP = 4.
4. When a row source of a HASH JOIN is already very large, BROADCAST PX distribution makes the join much harder.
Using Realtime Monitor (V$SQL_PLAN_MONITOR)
1. Up to plan step 20, the first row source had generated 112,679,920 rows. The PX SEND BROADCAST at plan step 19 amplified that to 450,719,680 rows (4 x 112,679,920, one copy per PX slave at DOP 4), which made the join much harder.
2. BROADCAST is meant for distributing small row sources, and small is what Oracle estimated for this query: 10421 rows for the first row source. Since Oracle estimated the second row source at 2.9M rows, it judged this join order to be better.
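
The gap between estimated and actual rows can be read from V$SQL_PLAN_MONITOR directly. The query below is a sketch, again assuming one row per plan line per process, so OUTPUT_ROWS is summed across processes while PLAN_CARDINALITY (the optimizer estimate) is the same on every row for a given plan line.

```sql
-- Sketch: optimizer estimate vs. rows actually produced, per plan line.
SELECT plan_line_id,
       RTRIM(plan_operation || ' ' || plan_options) AS operation,
       MAX(plan_cardinality)                        AS estimated_rows,
       SUM(output_rows)                             AS actual_rows
FROM   v$sql_plan_monitor
WHERE  sql_id      = '6jbvpvurr02rh'
AND    sql_exec_id = 16777217
GROUP  BY plan_line_id, plan_operation, plan_options
ORDER  BY plan_line_id;
```

Plan lines where actual_rows exceeds estimated_rows by several orders of magnitude, especially directly below a PX SEND BROADCAST, are the ones to investigate first.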
The Root Cause

The bad temp space usage with BROADCAST PX distribution is usually the result of bad cardinality estimates for the first row source. The root cause is either inaccurate table stats or Oracle's inability to estimate JOIN cardinality. In this case, both are to blame:

1. The fact table involved does not have global stats.
2. There is no explicit partition range that would let Oracle use partition-level stats.
3. The multi-column range partition scheme complicates cardinality estimates, join estimates, and partition pruning.
4. BLOOM filter is disabled on the UAD DB, which makes partition pruning by join almost impossible.

The Workaround Is to Add Two Hints

Dynamic sampling hint: dynamic_sampling(2). Note that no table alias is given, so it applies to all tables involved. The purpose is to get better cardinality estimates.
OPT_PARAM('_bloom_filter_enabled' 'true'), to enable the Bloom filter for join-related partition pruning.
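
Placed in a statement, the two hints might look like the sketch below. The original query is not shown in the source, so the tables fact_tab and dim_tab and all of their columns are hypothetical placeholders; only the hint text itself comes from the slides.

```sql
-- Sketch only: fact_tab, dim_tab and their columns are hypothetical.
SELECT /*+ dynamic_sampling(2)
           opt_param('_bloom_filter_enabled' 'true') */
       d.dim_name,
       SUM(f.amount) AS total_amount
FROM   fact_tab f
       JOIN dim_tab d ON d.dim_id = f.dim_id
GROUP  BY d.dim_name;
```

Because dynamic_sampling(2) carries no table alias, it acts at statement level, and OPT_PARAM changes the parameter for this statement only rather than for the session or the instance.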
PX BUFFER SORT Example
1. BUFFER SORT in PX arises when the operations on one row source or table are not parallelized while the rest of the query runs in parallel. The BUFFER SORT happens where the query switches from serial to parallel operation. The temp space usage can be identified with V$SQL_WORKAREA_ACTIVE or V$SQL_PLAN_MONITOR, or by studying the plan itself if the query completed long ago.
2. In this case (DIRECT MARKETING, SEM), the query ran with DOP 32, but the operation on the major row source, the fact table AGG_BY_SPACEID_KWOID_7D, was serial.
The Impact of BUFFER SORT

If the BUFFER SORT is on the major row source and results in significant temp space usage, it basically triples the IO requests (one additional round of writes and reads on top of the original read).
More interesting, the whole query runs in parallel, even with a very high DOP, yet the slowest operation, reading a very large table, runs in serial. This is basically a waste of PX resources.
The workaround is to identify the operations running in serial (in the plan, those operations have empty TQ and IN-OUT columns) and see whether parallel hints can be added for the appropriate tables. This not only makes the PX operation more efficient but also reduces temp space usage.
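
The two steps of this workaround could be sketched as follows. The first statement prints the cached plan so that the TQ and IN-OUT columns can be inspected. The second shows a parallel hint on the fact table named in the example. The alias f, the column spaceid, and the query shape are hypothetical, since the original statement is not shown; DOP 32 is taken from the example.

```sql
-- Step 1 (sketch): display the cached plan; for a parallel plan the TYPICAL
-- format includes the TQ and IN-OUT columns. '&sql_id' is a placeholder.
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&sql_id', NULL, 'TYPICAL'));

-- Step 2 (sketch): ask for a parallel scan of the large fact table.
-- Hypothetical: alias f, column spaceid and the aggregation itself.
SELECT /*+ parallel(f 32) */
       f.spaceid,
       COUNT(*) AS row_count
FROM   agg_by_spaceid_kwoid_7d f
GROUP  BY f.spaceid;
```

After the hint is added, the scan of the fact table should appear in the plan with a TQ value and a parallel IN-OUT entry, and the BUFFER SORT above it should no longer be needed.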
