
8/7/19

OraPub LVCs Are Different


You learn to master the topic at a deeper, higher-confidence, and more practical level than with any other teaching method I offer.

This occurs because the class is spread out over multiple weeks, each session is 2 hours with a day in between, you do homework/practice on your real systems, and I personally work with you through the entire class.

Details & Registration: https://www.orapub.com/lvc


Event Calendar: https://www.orapub.com/events


Hash Joins

Basic Construct

STEP ABOVE HASH JOIN
  HASH JOIN
    TABLE/ROW SOURCE A (BUILD TABLE): Smaller Result Set
    TABLE/ROW SOURCE B (PROBE TABLE): Larger Result Set
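
The build/probe construct can be sketched in a few lines of Python. This is only an illustration of the general technique, not Oracle's implementation; the function name `hash_join` and the sample rows are made up for the example.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, key):
    """Join two lists of dicts on `key` with a build phase and a probe phase."""
    # Build phase: read the smaller row source once, bucketing rows by join key.
    buckets = defaultdict(list)
    for row in build_rows:
        buckets[row[key]].append(row)

    # Probe phase: read the larger row source once, looking up each row's key.
    joined = []
    for probe_row in probe_rows:
        for build_row in buckets.get(probe_row[key], []):
            joined.append((build_row, probe_row))
    return joined

# Hypothetical sample data: the parent rows are the build side.
parents = [{"parent_id": 1, "name": "a"}, {"parent_id": 2, "name": "b"}]
children = [
    {"parent_id": 1, "child": "x"},
    {"parent_id": 1, "child": "y"},
    {"parent_id": 3, "child": "z"},
]
result = hash_join(parents, children, "parent_id")
```

Each row source is read exactly once; the child with `parent_id` 3 finds no bucket and drops out.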


What do you mean by “Bigger” and “Smaller” result set?

select ...
from ...
where <filter on big table>

Massive Table (filtered): Smaller result set, so it is the Build Table
Smaller Table: Bigger result set, so it is the Probe Table


Extremely Wide Table: Probe Table
“Fewer Rows” but they’re wider (more/longer columns)

Narrow Table: Build Table
Smaller result set due to column width, in spite of having more rows.
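
To make the width point concrete, here is a small back-of-the-envelope calculation. All row counts and row widths below are hypothetical numbers chosen for illustration; they do not come from the slides.

```python
def result_set_bytes(row_count, avg_row_bytes):
    # Rough size of a result set: rows times average bytes per row.
    return row_count * avg_row_bytes

# Hypothetical figures: the wide table has fewer rows, but each row is large.
wide = result_set_bytes(row_count=1_000, avg_row_bytes=2_000)
narrow = result_set_bytes(row_count=50_000, avg_row_bytes=20)

# The smaller result set in bytes is the better build table candidate.
build_side = "narrow" if narrow < wide else "wide"
```

With these made-up numbers the narrow table has 50 times the rows yet half the bytes, so it would be the build side.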

[Figure: PROBE_TABLE rows matched against numbered hash buckets built from BUILD_TABLE]


But aren’t there Multiple Rows in each bucket?
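
A minimal sketch of what sharing a bucket looks like, assuming a deliberately tiny bucket count and a stand-in hash function (`key % NUM_BUCKETS`); neither is what Oracle actually uses. Because unrelated keys can land in the same bucket, the probe still compares the real key value.

```python
NUM_BUCKETS = 8  # hypothetical; kept tiny so that keys collide

def bucket_of(key):
    # Stand-in hash function for integer keys.
    return key % NUM_BUCKETS

def build(rows):
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for key, payload in rows:
        buckets[bucket_of(key)].append((key, payload))
    return buckets

def probe(buckets, key):
    # Different keys can share a bucket, so compare the actual key as well.
    return [payload for k, payload in buckets[bucket_of(key)] if k == key]

# 3, 11 and 19 all fall into bucket 3.
buckets = build([(3, "three"), (11, "eleven"), (19, "nineteen")])
match = probe(buckets, 11)
miss = probe(buckets, 27)   # 27 also maps to bucket 3, but no row has that key
```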


When Hash Joins Trounce Nested Loops

select *
from parent p, child c
where
p.parent_id = c.parent_id

CHILD_TABLE: 1,000,000 Rows, 10,000 Blocks, 100 Rows per Block
PARENT_TABLE: 10,000 Rows, 100 Blocks, 100 Rows per Block
Index: Height of 2, 27 Leaf Blocks


Nested Loops

[Figure: the first lookup from CHILD_TABLE into PARENT_TABLE, annotated 1 block, 2 blocks, 3 blocks]

[Figure: each later lookup, annotated “free” block, 1 block, 2 blocks]

2 Million Blocks of just the parent table

Hash Join

10,100 Blocks. For everything. (plus some CPU for hashing)

Nested Loops: 2 Million Blocks
Hash Join: 10,100 Blocks + CPU
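
The two totals can be reproduced with simple arithmetic. The sketch below assumes that, after the first lookup, each child row costs about 2 block visits on the parent side (the index root counted as “free”); that reading of the figures is an interpretation, and the model ignores caching and CPU.

```python
CHILD_ROWS = 1_000_000
CHILD_BLOCKS = 10_000
PARENT_BLOCKS = 100

# Assumption: roughly 2 block visits per child row on the parent side.
PARENT_SIDE_VISITS_PER_ROW = 2

# Nested loops: one lookup into the parent for every child row.
nested_loops_parent_side = CHILD_ROWS * PARENT_SIDE_VISITS_PER_ROW

# Hash join: each row source is read once, start to finish.
hash_join_total = CHILD_BLOCKS + PARENT_BLOCKS

print(nested_loops_parent_side)  # 2000000
print(hash_join_total)           # 10100
```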


How would this be different if we were just selecting One Row?

Nested Loops

[Figure: a single lookup from CHILD_TABLE into PARENT_TABLE, annotated 1 block, 2 blocks, 3 blocks]

10,003 Blocks Total.

Nested Loops: 10,003 Blocks
Hash Join: 10,100 Blocks + CPU
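
Under a simplified model of these numbers, one can estimate how many driving rows it takes before the hash join wins. The model is an assumption made for illustration: the child table is always scanned in full (10,000 blocks), the first parent lookup costs 3 blocks, and every later lookup costs 2. Real systems add caching, multiblock reads, and CPU, so treat the result as a rough guide only.

```python
CHILD_BLOCKS = 10_000
PARENT_BLOCKS = 100
FIRST_LOOKUP_BLOCKS = 3   # assumed: index root, index leaf, table block
LATER_LOOKUP_BLOCKS = 2   # assumed: index root is "free" after the first visit

def nested_loops_blocks(driving_rows):
    if driving_rows == 0:
        return CHILD_BLOCKS
    return (CHILD_BLOCKS + FIRST_LOOKUP_BLOCKS
            + LATER_LOOKUP_BLOCKS * (driving_rows - 1))

def hash_join_blocks():
    # Both row sources are read once, regardless of how many rows match.
    return CHILD_BLOCKS + PARENT_BLOCKS

# Smallest number of driving rows where nested loops reads more blocks.
break_even = next(n for n in range(1, 1_000_001)
                  if nested_loops_blocks(n) > hash_join_blocks())
```

In this model, nested loops reads 10,003 blocks for one row and only falls behind the hash join's 10,100 blocks once about 50 rows drive lookups.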


Use Hash Joins when…

the cost of reading each row source ONLY ONCE is less than (better than) the cost of using rows in your first table to find rows in your second table.


Use Nested Loops when it’s cheaper to find rows in your second table using rows in your first table.

Pairings

There’s usually a Reason things “pair well” together.

But pairings are NOT hard and fast rules.
(think rough guidelines)


Hash Joins pair with Full Table Scans

Here’s Why:


With hash joins, you cannot use rows in your first result set to “pluck” rows out of your second result set.

[Figure: Nested Loops between TABLE_A and TABLE_B]


[Figure: Hash Join between TABLE_A and TABLE_B]

Since you can’t even start scanning the second table until you’ve scanned the first…


…hash joins usually have some “build time” associated with them before you start seeing any rows.
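
The “build time” can be made visible with a generator that logs every read. This is a sketch of the general technique with made-up sample rows, not a model of Oracle internals: the first joined row cannot appear until every build row has been read.

```python
def hash_join_rows(build_rows, probe_rows, log):
    # Build phase: every build row is read before any output is possible.
    table = {}
    for key, payload in build_rows:
        log.append(("build read", key))
        table.setdefault(key, []).append(payload)
    # Probe phase: rows start flowing only now.
    for key, payload in probe_rows:
        log.append(("probe read", key))
        for match in table.get(key, []):
            yield (key, match, payload)

log = []
rows = hash_join_rows([(1, "a"), (2, "b"), (3, "c")],
                      [(1, "x"), (3, "z")], log)
first = next(rows)
reads_before_first_row = len(log)
```

Here three build reads and one probe read happen before the first row comes back.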

Where does all this hashing take place?

In Memory
(specifically the PGA)


When the whole build table is small enough to fit in memory, we call it an Optimal Hash Join.

What happens if our Build Table Doesn’t Fit in memory?

Partition 1: buckets 0 to 6
Partition 2: buckets 7 to 13
Partition 3: buckets 14 to 20

[Figure: PGA Memory alongside temp space, which holds Partition 3 (Build) and Partition 3 (Probe)]


[Figure: Partition 3 (Build) is brought from temp space into PGA Memory and joined with Partition 3 (Probe), buckets 14 to 20]

This is called a One-Pass Hash Join

What happens if BOTH PARTITIONS (Partition 3 build and Partition 3 probe) are too big to fit into the PGA hash area?
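
The partitioning idea can be sketched as follows. This is a simplified, textbook-style partitioned hash join and not Oracle's actual algorithm: the hash function (`key % 21`), the 21 buckets, and the three partitions of 7 buckets are assumptions taken from the numbers in the figure. Rows whose keys fall in the same partition can only match each other, so one build partition at a time needs to be in memory.

```python
from collections import defaultdict

NUM_BUCKETS = 21            # buckets 0..20, as in the figure
BUCKETS_PER_PARTITION = 7

def partition_of(key):
    bucket = key % NUM_BUCKETS                   # stand-in hash function
    return bucket // BUCKETS_PER_PARTITION + 1   # partitions numbered 1..3

def split(rows):
    parts = defaultdict(list)
    for key, payload in rows:
        parts[partition_of(key)].append((key, payload))
    return parts

def partitioned_hash_join(build_rows, probe_rows):
    build_parts, probe_parts = split(build_rows), split(probe_rows)
    out = []
    for p, build_part in build_parts.items():
        # Only this one build partition has to fit in memory right now.
        table = defaultdict(list)
        for key, payload in build_part:
            table[key].append(payload)
        for key, payload in probe_parts.get(p, []):
            for match in table.get(key, []):
                out.append((key, match, payload))
    return out

joined = partitioned_hash_join(
    [(0, "b0"), (8, "b8"), (15, "b15")],
    [(15, "p15"), (8, "p8"), (4, "p4")],
)
```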


[Figure: PGA Memory unable to hold either Partition 3 piece in full]


“Chubby Bunny” Syndrome

[Chart: Time versus Number of Marshmallows, annotated: Start the “trying not to choke and die” phase…]


Join Order

Flying to a Destination


What order did I travel in?

“Please take me to Hotel ABC”
“I’d like to book a room please”
“I’d like a plane ticket to XYZ”

What order did I travel in?

“Meet me at the airport at this time to take me to Hotel ABC”
“I’d like to book a room please”
“I’d like a plane ticket to XYZ”


Left-Deep Tree (traditional)

[Figure: a left-deep join tree with Build Tables on one side and Probe Tables on the other, ending in the Final Result]

HASH JOIN
  HASH JOIN
    HASH JOIN
      TABLE_A
      TABLE_B
    TABLE_C
  TABLE_D


[Figure: a right-deep join tree with Build Tables on one side and Probe Tables on the other, ending in the Final Result]

Left-Deep Tree

HASH JOIN
  HASH JOIN
    HASH JOIN
      TABLE_A
      TABLE_B
    TABLE_C
  TABLE_D

Right-Deep Tree

HASH JOIN
  TABLE_D
  HASH JOIN
    TABLE_C
    HASH JOIN
      TABLE_B
      TABLE_A


Right-Deep Trees Require More Memory

HASH JOIN
  TABLE_D
  HASH JOIN
    TABLE_C
    HASH JOIN
      TABLE_B
      TABLE_A

What’s the Advantage of a Right-Deep Tree?



What if TABLE_A were HUGE?

HASH JOIN
  TABLE_D
  HASH JOIN
    TABLE_C
    HASH JOIN
      TABLE_B
      TABLE_A

Oracle always wants to use the smallest table for the build table.
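
A rough sketch of how a right-deep tree lets a huge TABLE_A stay on the probe side. It assumes, purely for simplicity, that all four tables join on the same key; the function names are illustrative, and this is not how Oracle is implemented. The three smaller tables are hashed up front and held together (hence the extra memory), while TABLE_A is only streamed past them.

```python
def build_table(rows):
    table = {}
    for key, payload in rows:
        table.setdefault(key, []).append(payload)
    return table

def right_deep_join(table_a, table_b, table_c, table_d):
    # All three build tables are hashed first and kept in memory at once.
    hashed = [build_table(t) for t in (table_b, table_c, table_d)]
    for key, a_payload in table_a:        # TABLE_A is streamed, never hashed
        partial = [(a_payload,)]
        for table in hashed:
            partial = [p + (m,) for p in partial for m in table.get(key, [])]
        for combo in partial:
            yield (key,) + combo

joined = list(right_deep_join(
    [(1, "a1"), (2, "a2")],   # stands in for the huge table
    [(1, "b1")],
    [(1, "c1"), (2, "c2")],
    [(1, "d1")],
))
```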

Sometimes it makes sense to start with your Largest Table from a join order perspective

Oracle can now


So why Smaller = Build and Larger = Probe?


Two Reasons

#1 The Build Table determines if we have an Optimal Hash Join

#2 The build table is COPIED, rows and all

Which would you rather copy, a small table or a large table?

[Figure: CHILD_TABLE and PARENT_TABLE beside numbered hash buckets]


Hash Outer Join

What if our preserved table is huge?

[Figure: NON_PRESERVED rows probing numbered hash buckets built from PRESERVED]

STEP ABOVE HASH JOIN
  HASH OUTER JOIN
    PRESERVED ROW SOURCE A (BUILD TABLE)
    NON-PRESERVED ROW SOURCE B (PROBE TABLE)


Hash Outer Right Join

[Figure: PRESERVED rows probing numbered hash buckets built from NON_PRESERVED]

STEP ABOVE HASH JOIN
  HASH OUTER RIGHT JOIN
    NON-PRESERVED ROW SOURCE A (BUILD TABLE)
    PRESERVED ROW SOURCE B (PROBE TABLE)
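
The difference between the two outer variants can be sketched like this. The sketch assumes that the “right” variant swaps the roles, so the non-preserved row source is hashed and the preserved row source probes it; function names and sample rows are illustrative, and this is not Oracle's implementation.

```python
def hash_outer_join(preserved_build, optional_probe):
    """Preserved rows are the build table, so track which ones were matched."""
    table = {}
    for key, payload in preserved_build:
        table.setdefault(key, []).append([payload, False])
    out = []
    for key, payload in optional_probe:
        for entry in table.get(key, []):
            entry[1] = True
            out.append((key, entry[0], payload))
    # Unmatched preserved rows can only be emitted after the whole probe.
    for key, entries in table.items():
        out.extend((key, e[0], None) for e in entries if not e[1])
    return out

def hash_outer_right_join(optional_build, preserved_probe):
    """Preserved rows are the probe table, so each one is emitted as it passes."""
    table = {}
    for key, payload in optional_build:
        table.setdefault(key, []).append(payload)
    out = []
    for key, payload in preserved_probe:
        out.extend((key, payload, m) for m in table.get(key, [None]))
    return out

preserved = [(1, "p1"), (2, "p2")]
optional = [(1, "o1")]
left_result = hash_outer_join(preserved, optional)
right_result = hash_outer_right_join(optional, preserved)
```

Both return the same rows; what changes is which side has to be held in memory, which matters when the preserved table is huge.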


Hash Semi Join and Hash Semi Right Join
(Only used when a semi-join subquery is unnested)

select *
from some_table a
where exists (select *
              from some_other_table b
              where a.id = b.id)

select *
from some_table a
where a.id = some (select b.id
                   from some_other_table b)

select *
from some_table a
where a.id in (select b.id
               from some_other_table b)

select *
from some_table a
where a.id = any (select b.id
                  from some_other_table b)


Hash Semi Join

[Figure: SUBQUERY_TABLE rows probing numbered hash buckets built from MAIN_TABLE]

STEP ABOVE HASH JOIN
  HASH SEMI JOIN
    MAIN ROW SOURCE A (BUILD TABLE)
    SUBQUERY ROW SOURCE B (PROBE TABLE)


Hash Semi Right Join

[Figure: MAIN_TABLE rows probing numbered hash buckets built from SUBQ_TABLE]

STEP ABOVE HASH JOIN
  HASH SEMI RIGHT JOIN
    SUBQUERY ROW SOURCE A (BUILD TABLE)
    MAIN ROW SOURCE B (PROBE TABLE)
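
Both semi-join shapes can be sketched in Python. In either one, a main row comes back at most once no matter how many subquery rows match it. The function names and sample data are illustrative only, and the code shows the general technique, not Oracle's implementation.

```python
def hash_semi_right_join(subquery_keys, main_rows):
    """Subquery is the build table; main rows probe it."""
    seen = set(subquery_keys)      # duplicate subquery keys collapse
    return [row for row in main_rows if row[0] in seen]

def hash_semi_join(main_rows, subquery_keys):
    """Main rows are the build table; subquery keys probe it."""
    table = {}
    for row in main_rows:
        table.setdefault(row[0], []).append(row)
    out = []
    for key in subquery_keys:
        # pop() removes matched rows, so a main row is returned at most once.
        out.extend(table.pop(key, []))
    return out

main = [(1, "a"), (2, "b"), (3, "c")]
subquery = [3, 1, 1, 9]
semi_right = hash_semi_right_join(subquery, main)
semi = hash_semi_join(main, subquery)
```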


Hash Joins only work with Equijoins

select *
from table_a a
, table_b b
where a.id = b.id


select *
from table_a a
, table_b b
where a.id > b.id

Hash Joins require Join Access Predicates
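
A short sketch of why equality matters. A hash of a key tells you where an equal key would live, but nothing about which keys are larger or smaller, so a `>` predicate gets no help from the hash table. The dictionary below stands in for a build table, and the names are hypothetical.

```python
def probe_equal(table, key):
    # Equality: hash the key and go straight to the one bucket that can match.
    return table.get(key, [])

def probe_greater_than(table, key):
    # a.id > b.id: hashing preserves no ordering, so every build key
    # must be compared; the hash table offers no shortcut.
    return [v for k, vals in table.items() if key > k for v in vals]

table = {1: ["b1"], 5: ["b5"], 9: ["b9"]}
equal_matches = probe_equal(table, 5)
range_matches = probe_greater_than(table, 5)
```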


Plan hash value: 3387384502

---------------------------------------------------------------------------------
| Id  | Operation                            | Name             | Rows  | Bytes |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                  |     1 |    21 |
|*  1 |  HASH JOIN                           |                  |     1 |    21 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| TABLE_A          |     1 |    15 |
|*  3 |    INDEX RANGE SCAN                  | TABLE_A_ID_IDX   |     1 |       |
|   4 |   TABLE ACCESS BY INDEX ROWID BATCHED| TABLE_B          |     1 |     6 |
|*  5 |    INDEX RANGE SCAN                  | TABLE_B_CODE_IDX |     1 |       |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("TABLE_A"."ID"="TABLE_B"."CODE")
   3 - access("TABLE_A"."ID"=1)
   5 - access("TABLE_B"."CODE"=1)

How to Tune a Hash Join

Make the build table take up less memory:

• Select fewer columns on your build table
• Select fewer rows from build table
• If two tables are PARTITIONED THE SAME on their join key, Oracle can work on a partition at a time

Make sure you have sufficient PGA memory to avoid multi-pass or one-pass operations.


Summary
• Hash joins only work with equijoins
• Oracle always wants the build table to be the smaller result set
• The build table’s rows are copied into the hash table, selected columns and all (select as few columns as possible!)
• You can tune a hash join by reducing the memory requirements of the build table (or through partition-wise joins)
• Right-deep trees don’t change the join order, just the order of operations

Thanks!

Questions? Comments? Feedback?


kaley@tuningsql.com
