Understanding Indexes

International Oracle Users Group Live 2004
Technical Session #549:
Understanding Indexes
Tim Gorman Principal - SageLogix, Inc.
Email: tim@sagelogix.com
www.SageLogix.Com
Agenda
B*Tree index architecture Issues with B*Tree indexes in Oracle
Real issues
Sparsely-populated indexes Contention on INSERT Uneven data distribution Low data cardinality (a.k.a. selectivity)
Imagined or mythical issues

Indexes needing rebalancing
???
Some interesting options for indexing

Function-based, descending, compression
2
B*Tree index architecture

Oracle implements balanced B*Tree indexes
as its primary indexing method
Tree structure consisting of root, branch, and leaf nodes each node is one database block in Oracle root node is the top-level branch node starting point for searches into the index branch nodes are connectors point to other branch nodes or to leaf nodes leaf nodes contain data values and pointers to rows
3

Branch entries contain: Index: PK_X Max data value piece DBA of next level branch or leaf root Leaf entries contain: Data value ROWID of table row branch branch branch
branch
leaf
leaf
leaf
leaf
leaf Table: X
leaf
leaf

Index read I/O is sequential in nature
Oracle wait-event for single-block I/O requests used during indexed access db file sequential read Used with RANGE, UNIQUE, and FULL scans FAST FULL index-scans are an exception Uses sequential multiblock I/O similar to FULL tablescans Conventional wisdom that tables and indexes must be separated to different tablespaces to enable some form of parallel I/O is a myth Think of how a treasure hunt is performed.
5

When indexes are built on populated
tables using CREATE INDEX or ALTER INDEX xxx REBUILD
1. Table is scanned and column values are sorted 2. Leaf blocks of index are populated first 3. Then supporting branch levels are built to support those leaves
All blocks are filled up to PCTFREE

threshold
6

When indexes are built transactionally
using conventional-path INSERT or UPDATE statements
How the indexes grow is dependent upon two possible design decisions Optimize either for: random data values
Or:
sequentially ascending data values
7

If optimizing for random data values
As blocks fill, split them in half, move upper half to new block, leave lower half behind Leaves behind half-full blocks Allows back-filling as lower data values occur
If optimizing for sequential data values

As blocks fill, simply overflow into a new block Leaves behind full blocks No need to worry about back-filling as lower data values could never occur

Overflow: Grows like this
Overflowing uses space more efficiently in a heap

(non-sorted) data structure Also uses space more efficiently when optimizing sequential data values
Split:
Grows like this
Splitting anticipates back-filling in a sorted data

structure Therefore uses space more efficiently when optimizing random data values
9

If the last index entry inserted is the highest
data value in the block
Then data will overflow into the next block on the same level
If the last index entry inserted is not the

highest data value in the block
Then the current block will split, with the lower set of values remaining in place and the higher set of values moving to the next block on the same level
10

If the branch block above cannot
accommodate another child
Then this algorithm is applied recursively first
If this algorithm iterates recursively all the

way back to the root block, and the root becomes full
Then the root splits/overflows to generate a new branch level (BLEVEL)
Root block stays in place, generates two new blocks at a new 1st BLEVEL below
11

Balanced B*Tree indexes grow balanced by
design, automatically
New levels are added to the tree structure above, dynamically, as needed
Indexes do not shrink automatically when

deletions occur
Instead, empty index entries are left in place For possible reuse by newly inserted data ROWID (pointer to data) portion is set to NULL Deletions may cause an index to become sparse and thus less efficient over time
12
Issues with B*Tree indexes
B*Tree indexes optimize:

Both the random and sequential data values! Even distribution of distinct data values High data cardinality or selectivity of values Sparseness Resulting from data deletions
13
As a result, issues with B*Tree indexes are:
Block contention when inserting sequential data Uneven distribution of data values Popular and unpopular data values Low data cardinality
Sparsely-populated indexes
Indexes never become unbalanced over time
If so, then why does performance sometimes deteriorate over time?
Instead, indexes can become sparsely

populated due to:
Deletions of row data Unfortunate patterns of inserted random data
Sparseness simply means that more I/O is

required to perform the same work
14
How is this condition detected?
ANALYZE INDEX VALIDATE STRUCTURE command Populate session-private view named INDEX_STATS with only one row Column PCT_USED is the average percentage of space utilized in the blocks belonging to the index Derived from the ratio of the value of the columns USED_SPACE and BTREE_SPACE Expect PCT_USED to be 90 by default Lesser values may indicate sparseness developing Best to watch values over time
15
How is this condition detected?
If PCT_USED is much less than 90 Non-zero values in the column DEL_LF_ROWS Cause of sparseness is probably row deletions, if the value in the column DEL_LF_ROWS is a large percentage of value in column LF_ROWS
16
Rebuild or coalesce are two possible solutions for index sparseness

Dont guess!!!
Make certain using ANALYZE INDEX VALIDATE STRUCTURE

Be aware that the ANALYZE command locks the index, however
To rebuild an index:
Use ALTER INDEX REBUILD Use ALTER INDEX COALESCE
17
To coalesce space within an index:
ALTER INDEX REBUILD is the most commonlyused solution for sparsely-populated indexes
Unlike CREATE INDEX, uses the existing index as the source, which is faster because: Index is usually smaller than the table (less I/O) Index is already sorted (some sorting still needed) Other features of ALTER INDEX REBUILD include: Parallel execution (throw more resources at task) Direct-path operations (no rollback/undo generated)
NOLOGGING (no redo generated)

COMPUTE STATISTICS (better CBO stats cheaply) ONLINE (allows full usage of index during task)
18
ALTER INDEX COALESCE is another
alternative to rebuilding
Merges unused space within indexes to free blocks for reuse
Slower than ALTER INDEX REBUILD

because it is a transactional operation, not a bulk operation
No direct-path, parallel, nologging, etc
But COALESCE is implicitly an ONLINE operation
19
Contention on INSERT
Multiple concurrently-executing INSERTs into
an index on column(s) with sequential data values can also bottle-neck on buffer busy waits (a.k.a. block-level contention)
Pseudo-randomizing such data with REVERSE key indexes relieves this performance problem Instead of many processes attempting to insert index entries into the right-most leaf block
Insertions are evenly scattered over all of the available leaf blocks
20
[ CREATE | ALTER ] INDEX REVERSE
Pseudo-randomizes non-random data By flipping or reversing the physical order of the data value during storage Causes sequential data values to become more random by simply reversing data 123456 becomes 654321 123457 becomes 754321, etc... Index data values are transparently converted (flipped) and unconverted (unflipped) upon insert and retrieval, respectively
21
Adverse impacts of using REVERSE indexes:
only equivalence operations will use the index =, !=, <>, IN, and NOT IN range-scans will not use the index >, >=, <, <=, LIKE, BETWEEN
But, is this really a problem?

Depends on how the sequential data values are used: Sequence numbers or surrogate system-generated keys are usually sought using equivalence operations
No problem here
But timestamps are often range-scanned There is likely a problem with these situations
22
Uneven distribution of data values

Basic rule of thumb for using indexes:
Use indexes when selecting unpopular data Otherwise, use FULL table-scan Or a different index? Or partitioning? Or
By default, the Oracle cost-based optimizer

(CBO) assumes even data distribution
Only LOVAL, HIVAL, and DISTINCT_KEYS gathered during ANALYZE or DBMS_STATS
Real-life often invalidates this assumption

Selectively gathering column-level statistics provides the CBO with information on data values which are popular and which are unpopular
23

There are two commands for gathering CBO
statistics
ANALYZE TABLE FOR [ ALL | INDEXED ] COLUMNS DBMS_STATS.GATHER_COLUMN_STATS
Populates data dictionary views

DBA_TAB_HISTOGRAMS and DBA_PART_HISTOGRAMS with rows representing buckets of data values
All buckets assumed to have the same number of rows Each bucket is defined by its highest value By default, only one row populated
24

Please dont gather column-level statistics
unless the problem is proven
Gathering column-level statistics is not a good choice as a default operation Perform SQL tuning with SQL Trace/TKPROF or STATSPACK to provide proof of a problem
If the CBO fails to choose an index

it could be because it has detected low cardinality on average
If the CBO chooses an index

it could perform poorly if the data value is overly popular
25
Illustrating the importance of data

SQL> create table t1 2 ( c1 varchar2(30), 3 c2 number, 4 c3 number 5 ) tablespace tools; Table created. SQL> begin 2 for i in 1..100000 loop 3 insert into t1 4 values(to_char(mod(i,187)), i, mod(i,187)); 5 end loop; 6 end; 7 / PL/SQL procedure successfully completed.
26

SQL> create index i1 on t1(c1) tablespace tools; Index created. SQL> analyze table t1 compute statistics; Table analyzed. SQL> set autotrace on SQL> select c1 from t1 where c1 = '10000'; no rows selected Execution Plan ----------------------------------------------------0 SELECT STATEMENT Optimizer=CHOOSE (Cost=37 Card=535 Bytes=1605) 1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=37 Card=535 Bytes=1605)
27

SQL> select num_rows, blocks 2 from user_tables 3 where table_name = 'T1'; NUM_ROWS BLOCKS ------------- --------100,000 232 SQL> select num_rows, distinct_keys, 2 avg_leaf_blocks_per_key, avg_data_blocks_per_key 3 from user_indexes where index_name = 'I1'; Avg Leaf Avg Data Distinct Blocks Blocks Nbr Rows Key Per Key Per Key -------- -------- ---------- --------100,000 187 1 231
28

SQL> truncate table t1
Table truncated.
SQL> begin 2 for i in 1..100000 loop 3 insert into t1 4 values(to_char(round(i/187,0)), i, 5 round(i/187,0)); 6 end loop; 7 end; 8 / PL/SQL procedure successfully completed.
29

SQL> create index i1 on t1(c1) tablespace tools; Index created. SQL> analyze table t1 compute statistics; Table analyzed. SQL> set autotrace on SQL> select c1 from t1 where c1 = '10000'; no rows selected Execution Plan ----------------------------------------------------0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=187 Bytes=561) 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=2 Card=187 Bytes=561) 2 1 INDEX (RANGE SCAN) OF 'I1' (NON-UNIQUE) (Cost=1 Card=187)
30

SQL> select num_rows, blocks 2 from user_tables 3 where table_name = 'T1'; NUM_ROWS BLOCKS ------------- --------100000 242 SQL> select num_rows, distinct_keys, 2 avg_leaf_blocks_per_key, avg_data_blocks_per_key 3 from user_indexes where index_name = 'I1'; Avg Leaf Avg Data Distinct Blocks Blocks Nbr Rows Key Per Key Per Key -------- -------- ---------- --------100000 536 1 1
31
Low data cardinality

B*Tree indexes work best with high
cardinality
Think of the example of a telephone book Would you use the standard indexing method to find all occurances of a name comprising 25% of the book? Or, would it be better to just scan through the book?
32

Bitmap indexes are extremely compact
That is their primary advantage No real magic involved Less I/O is faster, plain and simple
Still, no matter how compact they are

a FULL table scan is still likely faster than a single bitmap index on low cardinality data However, merging two or more bitmap indexes brings out the real power of the mechanism Scanning of merged bitmaps using fast lowlevel bitmasking operations Sifts through huge numbers of rows quickly
33

Bitmap indexes are designed for quickly scanning low
cardinality data
each database block is comprised of two bitmap segments each bitmap segment is comprised of a bitmap, a ROWID list, and a list of distinct data values
ROWID list is a forward-compressed list of ROWIDs, positioned according to bits in bitmap Bitmap is organized into columns for distinct data values and rows of bits representing rows
ROWID list Distinct values list Bitmap ROWID list Distinct values list Bitmap
More distinct data values means less room for rows within each data value
34

Bitmap indexes extremely slow during DML
INSERT, UPDATE, and DELETE
For each change, entries within the bitmap segments must be:
Expanded/decoded Manipulated Re-encoded
For multiple concurrently-executing DML changes

Slower individual changes cause queueing Only two bitmap segments per block increases contention
Because of this, bitmap indexes are most feasible on partitioned tables in non-transactional applications
Data changes should be performed in bulk using some variation on the technique of EXCHANGE PARTITION
35
Function-based indexes
Indexes can be based on functions and
expressions
create index xxx_ix1 on xxx(upper(c1)); create index xxx_ix2 on xxx(upper(c1)||yadda_yadda(c2));
36
User-defined functions usable along with standard
built-in functions
Restrictions for using user-defined functions

But functions and expressions must be deterministic repeatable: given the same inputs, the function or expression must always provide the same results
user-defined functions cannot access tables or package variables

PRAGMA RESTRICT_REFERENCES(, RNDS, WNDS, RNPS, WNPS) CREATE FUNCTION DETERMINISTIC
37
Function-based indexes must have statistics
gathered before they can be used
System permissions
[ GLOBAL ] QUERY REWRITE needed to CREATE or ALTER REBUILD
Parameter QUERY_REWRITE_ENABLED = TRUE to

use
38
Descending indexes
Implemented as a form of function-based
indexes
CREATE INDEX DESC data values in index leaf blocks are sorted in descending order must have statistics before it will be utilized not usable with rule-based optimizer system permission QUERY REWRITE or GLOBAL QUERY REWRITE needed to CREATE or ALTER REBUILD parameter QUERY_REWRITE_ENABLED not necessary to use
39
Index compression
Since Oracle8i, compression has been available on
B*Tree indexes:
Reduces repeated storage of prefix column data values Index compression [ COMPRESS [ #-prefix-cols ] | NOCOMPRESS ] UNIQUE: default #-prefix-cols is (#-cols)-1 NONUNIQUE: default #-prefix-cols is #-cols Same syntax for IOT in USING clause as for index compression DML is supported on compressed indexes, but it becomes much slower Index compression is best used for read-mostly or readonly situations Mixing of compressed and uncompressed index partitions is possible
40
Solutions for B*Tree indexes
Dont assume that there is anything wrong with a B*Tree index

Prove it first!!!
To address sparse index structures

ALTER INDEX [ REBUILD | COALESCE ]
To address block contention on INSERTs

ALTER INDEX REVERSE
To address uneven data distribution

Gather column-level statistics or histograms
To address low cardinality

Use bitmap indexes or no indexes at all
41
Additional groovy stuff

Function-based, descending, and compressed indexes
Technical Session #549
Slides, paper, and scripts downloadable from http://www.SageLogix.com and http://www.EvDBT.com/papers.htm Email: tim@sagelogix.com
42

Understanding Indexes

Încărcat de

Informații document

Descriere originală:

Titlu original

Drepturi de autor

Formate disponibile

Partajați acest document

Partajați sau inserați document

Opțiuni de partajare

Vi se pare util acest document?

Este necorespunzător acest conținut?

Drepturi de autor:

Formate disponibile

Understanding Indexes

Încărcat de

Drepturi de autor:

Formate disponibile

International Oracle Users Group Live 2004

Technical Session #549:

Imagined or mythical issues

Some interesting options for indexing

B*Tree index architecture

B*Tree index architecture

B*Tree index architecture

B*Tree index architecture

All blocks are filled up to PCTFREE

B*Tree index architecture

B*Tree index architecture

If optimizing for sequential data values

B*Tree index architecture

Overflowing uses space more efficiently in a heap

Grows like this

Splitting anticipates back-filling in a sorted data

B*Tree index architecture

If the last index entry inserted is not the

B*Tree index architecture

If this algorithm iterates recursively all the

B*Tree index architecture

Indexes do not shrink automatically when

Issues with B*Tree indexes

B*Tree indexes optimize:

As a result, issues with B*Tree indexes are:

Instead, indexes can become sparsely

Sparseness simply means that more I/O is

Rebuild or coalesce are two possible solutions for index sparseness

Make certain using ANALYZE INDEX VALIDATE STRUCTURE

To coalesce space within an index:

NOLOGGING (no redo generated)

Slower than ALTER INDEX REBUILD

But COALESCE is implicitly an ONLINE operation

But, is this really a problem?

Uneven distribution of data values

By default, the Oracle cost-based optimizer

Real-life often invalidates this assumption

Uneven distribution of data values

Populates data dictionary views

Uneven distribution of data values

If the CBO fails to choose an index

If the CBO chooses an index

Illustrating the importance of data

Illustrating the importance of data

Illustrating the importance of data

Illustrating the importance of data

Illustrating the importance of data

Illustrating the importance of data

Low data cardinality

Low data cardinality

Still, no matter how compact they are

Low data cardinality

Low data cardinality

For multiple concurrently-executing DML changes

Restrictions for using user-defined functions

user-defined functions cannot access tables or package variables

Parameter QUERY_REWRITE_ENABLED = TRUE to

Solutions for B*Tree indexes

Dont assume that there is anything wrong with a B*Tree index

To address sparse index structures

To address block contention on INSERTs

To address uneven data distribution