This occurs because the class is spread out over multiple weeks: each session
is 2 hours with a day in between, you do homework and practice on your real
systems, and I personally work with you through the entire class.
1
8/7/19
Hash Joins
Basic Construct
STEP ABOVE HASH JOIN
  HASH JOIN
    TABLE/ROW SOURCE A (BUILD TABLE): Smaller Result Set
    TABLE/ROW SOURCE B (PROBE TABLE)
Build Table: smaller result set?
Probe Table: bigger result set
[Figure: BUILD_TABLE rows hashed into numbered buckets, probed by rows from PROBE_TABLE]
STEP ABOVE HASH JOIN
  HASH JOIN
    TABLE/ROW SOURCE A (BUILD TABLE)
    TABLE/ROW SOURCE B (PROBE TABLE)
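The build and probe roles can be sketched in a few lines of Python. This is only an illustration of the general technique, not Oracle's implementation: the function name `hash_join`, the dict-based rows, and the sample data are all assumptions made for the example.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, key):
    """Join two lists of dict rows on `key` using a build phase and a probe phase."""
    # Build phase: hash every row of the (ideally smaller) build input into buckets.
    buckets = defaultdict(list)
    for row in build_rows:
        buckets[row[key]].append(row)

    # Probe phase: read the probe input once and look up matching build rows.
    joined = []
    for probe_row in probe_rows:
        for build_row in buckets.get(probe_row[key], []):
            joined.append({**build_row, **probe_row})
    return joined

# Hypothetical sample data: the smaller input is used as the build table.
parents = [{"parent_id": 1, "name": "p1"}, {"parent_id": 2, "name": "p2"}]
children = [
    {"parent_id": 1, "child_id": 10},
    {"parent_id": 1, "child_id": 11},
    {"parent_id": 3, "child_id": 12},
]
result = hash_join(parents, children, "parent_id")
```

Note that no row comes back until the whole build input has been hashed, and that each input is read exactly once.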
Multiple Rows in Each Bucket?
Trounce
Nested Loops
select *
from parent p, child c
where p.parent_id = c.parent_id

CHILD_TABLE: 1,000,000 Rows, 10,000 Blocks, 100 Rows per Block
PARENT_TABLE: 10,000 Rows, 100 Blocks, 100 Rows per Block
Index: Height of 2, 27 Leaf Blocks
Nested Loops
Blocks read per CHILD_TABLE row: 2 blocks + 1 block = 3 blocks
Nested Loops
With one “free” block: 2 blocks per CHILD_TABLE row
Nested Loops
1,000,000 CHILD_TABLE rows x 2 blocks = 2 Million Blocks
Hash Join
10,100 Blocks. For everything.
(10,000 CHILD_TABLE blocks + 100 PARENT_TABLE blocks, plus some CPU for hashing)
One Row?
Nested Loops
10,000 CHILD_TABLE blocks + 3 blocks (2 blocks + 1 block) = 10,003 Blocks Total.
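The block counts on these slides can be reproduced with a little arithmetic. The sketch below uses the figures from the slides; the helper names are hypothetical, and reading the 10,000 blocks in the one-row case as a scan of CHILD_TABLE is an assumption based on the table sizes shown.

```python
# Figures taken from the slides.
CHILD_ROWS = 1_000_000
CHILD_BLOCKS = 10_000
PARENT_BLOCKS = 100
BLOCKS_PER_LOOKUP = 3         # 2 blocks + 1 block per lookup
BLOCKS_PER_LOOKUP_FREE = 2    # the same lookup once one block is "free"

def nested_loops_blocks(driving_rows, blocks_per_lookup, driving_scan_blocks=0):
    """Blocks read when each driving row triggers one lookup into the other table."""
    return driving_scan_blocks + driving_rows * blocks_per_lookup

def hash_join_blocks(build_blocks, probe_blocks):
    """Blocks read when both inputs are scanned exactly once."""
    return build_blocks + probe_blocks

# Joining every row: nested loops versus hash join.
all_rows_nl = nested_loops_blocks(CHILD_ROWS, BLOCKS_PER_LOOKUP_FREE)
all_rows_hj = hash_join_blocks(PARENT_BLOCKS, CHILD_BLOCKS)

# Joining one row (assumed: scan CHILD_TABLE to find it, then one lookup).
one_row_nl = nested_loops_blocks(1, BLOCKS_PER_LOOKUP, driving_scan_blocks=CHILD_BLOCKS)

print(all_rows_nl, all_rows_hj, one_row_nl)
```

With these numbers the hash join wins easily when every row is joined, while for a single row nested loops comes out slightly ahead of the hash join's fixed cost.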
Use
Hash Joins
when …
Use Nested Loops when it’s cheaper to find rows in your second table
using rows in your first table.
Pairings
There’s usually a reason things “pair well” together.
Not hard and fast rules (think rough guidelines).
Hash Joins
Pair with
Here’s Why:
Nested Loops
TABLE_A TABLE_B
Hash Joins
TABLE_A TABLE_B
Hash joins have a “build time” associated with them before you start seeing
any rows.
In Memory
[Figure: BUILD_TABLE rows hashed into numbered buckets in memory]
STEP ABOVE HASH JOIN
  HASH JOIN
Optimal Hash Join
Doesn’t Fit In Memory?
PGA Memory
Partition 3 (Build) Partition 3 (Probe)
(temp space)
One-Pass Hash Join
PGA Memory: Partition 3 (Build)
(temp space)
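When the build input does not fit in memory, the partitioning idea on these slides can be sketched as follows. This is a generic partitioned hash join written for illustration, not a description of Oracle's internals: plain Python lists stand in for partitions spilled to temp space, and the function names and partition count are assumptions.

```python
from collections import defaultdict

def partition(rows, key, n_partitions):
    """Split rows into n_partitions lists by hashing the join key."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        parts[hash(row[key]) % n_partitions].append(row)
    return parts

def partitioned_hash_join(build_rows, probe_rows, key, n_partitions=4):
    # Both inputs use the same hash function, so rows with matching keys land
    # in the same partition number. The lists stand in for temp space.
    build_parts = partition(build_rows, key, n_partitions)
    probe_parts = partition(probe_rows, key, n_partitions)

    joined = []
    for build_part, probe_part in zip(build_parts, probe_parts):
        # Only one build partition has to fit in memory at a time.
        buckets = defaultdict(list)
        for row in build_part:
            buckets[row[key]].append(row)
        for probe_row in probe_part:
            for build_row in buckets.get(probe_row[key], []):
                joined.append({**build_row, **probe_row})
    return joined

# Hypothetical sample data.
build = [{"id": i, "b": i * 10} for i in range(8)]
probe = [{"id": i % 10, "p": i} for i in range(20)]
result = partitioned_hash_join(build, probe, "id")
```

Each partition pair is read back once, which is the extra I/O a one-pass join pays compared with a join that fits entirely in memory.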
PGA Memory
What happens if BOTH PARTITIONS (Partition 3 build and Partition 3 probe)
are too big to fit into the PGA hash area?
PGA Memory
Partition 3 Partition 3
PGA Memory
Partition 3
“Chubby Bunny”
Syndrome
Number of Marshmallows
Join Order
Flying to a Destination
Left-Deep Tree
[Figure labels: Build Tables, Probe Tables, Final Result!]
HASH JOIN
  HASH JOIN
    HASH JOIN
      TABLE_A
      TABLE_B
    TABLE_C
  TABLE_D
Build Tables
Final Result!
Probe Tables
What’s the
Advantage
of a
HUGE?
HASH JOIN
  TABLE_D
  HASH JOIN
    TABLE_C
    HASH JOIN
      TABLE_B
      TABLE_A
smallest table
for the
build table.
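One way to picture the plan shape with TABLE_D, TABLE_C and TABLE_B as build tables is the sketch below. It assumes a star-schema style layout in which TABLE_A carries a key for each of the other tables; the column names (`b_id`, `c_id`, `d_id`) and function names are hypothetical and do not come from the slides.

```python
def build_hash_table(rows, key):
    """Hash a list of dict rows on `key`."""
    table = {}
    for row in rows:
        table.setdefault(row[key], []).append(row)
    return table

def right_deep_join(table_a, table_b, table_c, table_d):
    # All three build tables are hashed up front, each from a base table...
    hash_b = build_hash_table(table_b, "b_id")
    hash_c = build_hash_table(table_c, "c_id")
    hash_d = build_hash_table(table_d, "d_id")
    # ...then each TABLE_A row is probed through B, C and D in turn, so the
    # join order is still A, B, C, D and no intermediate result is hashed.
    for a in table_a:
        for b in hash_b.get(a["b_id"], []):
            for c in hash_c.get(a["c_id"], []):
                for d in hash_d.get(a["d_id"], []):
                    yield {**a, **b, **c, **d}

# Hypothetical sample data.
table_a = [
    {"a_id": 1, "b_id": 1, "c_id": 1, "d_id": 1},
    {"a_id": 2, "b_id": 2, "c_id": 1, "d_id": 9},
]
table_b = [{"b_id": 1, "b": "x"}, {"b_id": 2, "b": "y"}]
table_c = [{"c_id": 1, "c": "z"}]
table_d = [{"d_id": 1, "d": "w"}]
rows = list(right_deep_join(table_a, table_b, table_c, table_d))
```

In a left-deep arrangement of the same joins, the result of each join would itself become the next build table, so the size of the intermediate results would decide how much memory is needed.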
Largest Table
from a join order perspective
Oracle
can now
So Why
Smaller = Build?
Larger = Probe?
Two Reasons
#1: The Build Table determines if we have an Optimal Hash Join
COPIED
Rows and all
large table?
[Figure: CHILD_TABLE and PARENT_TABLE rows in numbered hash buckets]
What if our preserved table is huge?
[Figure: PRESERVED rows hashed into numbered buckets, probed by rows from NON_PRESERVED]
STEP ABOVE HASH JOIN
  HASH JOIN OUTER
    PRESERVED ROW SOURCE A (BUILD TABLE)
    NON-PRESERVED ROW SOURCE B (PROBE TABLE)
[Figure: NON_PRESERVED rows hashed into numbered buckets, probed by rows from PRESERVED]
STEP ABOVE HASH JOIN
  HASH JOIN RIGHT OUTER
    NON-PRESERVED ROW SOURCE A (BUILD TABLE)
    PRESERVED ROW SOURCE B (PROBE TABLE)
select *
from some_table a
where exists (select *
              from some_other_table b
              where a.id = b.id)

select *
from some_table a
where a.id = some (select b.id
                   from some_other_table b)

select *
from some_table a
where a.id in (select b.id
               from some_other_table b)

select *
from some_table a
where a.id = any (select b.id
                  from some_other_table b)
[Figure: MAIN_TABLE rows hashed into numbered buckets, probed by rows from SUBQUERY_TABLE]
STEP ABOVE HASH JOIN
  HASH JOIN SEMI
    MAIN ROW SOURCE A (BUILD TABLE)
    SUBQUERY ROW SOURCE B (PROBE TABLE)
[Figure: SUBQ_TABLE rows hashed into numbered buckets, probed by rows from MAIN_TABLE]
STEP ABOVE HASH JOIN
  HASH JOIN RIGHT SEMI
    SUBQUERY ROW SOURCE A (BUILD TABLE)
    MAIN ROW SOURCE B (PROBE TABLE)
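The semi-join behaviour behind EXISTS and IN subqueries can be illustrated with a small sketch. It follows the arrangement in which the subquery is the build side and the main row source is the probe side; the function name and sample rows are assumptions, and the sketch says nothing about how Oracle stores the build side internally.

```python
def hash_semi_join(main_rows, subquery_rows, key):
    """Return each main row at most once if any subquery row shares its key."""
    # Build side: only the distinct join keys of the subquery are needed.
    subquery_keys = {row[key] for row in subquery_rows}
    # Probe side: a main row qualifies on its first match and is never duplicated.
    return [row for row in main_rows if row[key] in subquery_keys]

# Hypothetical sample data: id 2 appears twice in the subquery table.
some_table = [{"id": 1}, {"id": 2}, {"id": 3}]
some_other_table = [{"id": 2}, {"id": 2}, {"id": 3}, {"id": 4}]
result = hash_semi_join(some_table, some_other_table, "id")
```

Unlike an ordinary join, the duplicate `id` 2 in the subquery table does not produce a duplicate row in the output.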
Hash Joins only work with Equijoins
select *
from table_a a
   , table_b b
where a.id = b.id
select *
from table_a a
   , table_b b
where a.id > b.id
Hash Joins require Join Access Predicates
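A short sketch may help show why equality matters. A hash lookup can only answer "which rows have exactly this key", so an equality predicate maps to one bucket lookup per row, while a predicate such as `a.id > b.id` gives the hash table nothing to look up. The function names and sample data below are hypothetical.

```python
def equijoin_with_hash(table_a, table_b):
    """a.id = b.id: one bucket lookup per table_a row."""
    buckets = {}
    for b in table_b:
        buckets.setdefault(b["id"], []).append(b)
    return [(a["id"], b["id"]) for a in table_a for b in buckets.get(a["id"], [])]

def range_join(table_a, table_b):
    """a.id > b.id: matches may sit in any bucket, so every pair is compared."""
    return [(a["id"], b["id"]) for a in table_a for b in table_b if a["id"] > b["id"]]

# Hypothetical sample data.
table_a = [{"id": 1}, {"id": 2}, {"id": 3}]
table_b = [{"id": 2}, {"id": 3}, {"id": 4}]
equal_pairs = equijoin_with_hash(table_a, table_b)
greater_pairs = range_join(table_a, table_b)
```

The second function is effectively a nested loop over both inputs, which is the kind of work the hash table is meant to avoid.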
Summary
• Hash joins only work with equijoins
• Oracle always wants the build table to be the smaller result set
• The hash table stores the selected columns of each build row, not just the join key (select as few columns as possible!)
• You can tune a hash join by reducing the memory requirements of the build table (or through partition-wise joins)
• Right-deep trees don’t change the join order, just the order of operations
Thanks!