
Module 7: Analyze Secondary Index Criteria

After completing this module, you will be able to:
Choose columns as candidate Secondary Indexes.
Describe Composite Secondary Indexes.
Analyze Change Rating and Value Access.

Accessing Rows
SELECT {expression} FROM tablename

Returns the value(s) from the table(s) for display or processing. The row(s) must be physically read first.
UPDATE tablename SET columns = {expression}

Changes one or more column values to new values. The row(s) must be physically located (read) first.
DELETE FROM tablename

Removes rows from a table. The row(s) must be physically located (read) first.
Any of the above SQL statements can contain a WHERE clause.

Values in the WHERE clause tell the RDBMS what set of rows to act on. Without a WHERE clause, all rows participate in the operation. Limiting the number of rows the RDBMS must handle improves throughput.

Row Selection
WHERE clause conditions that may use indexing if available*:

colname = value
colname IS NULL
colname IN (subquery)
colname IN (explicit list of values)
t1.col_x = t1.col_y
t1.col_x = t2.col_x
condition1 AND condition2
condition1 OR condition2
colname = ANY, SOME or ALL

* Access methods for the above depend on whether the column(s) are indexed, type of index, and selectivity of the index.

WHERE clause conditions that typically cause a Full Table Scan:

non-equality comparisons
colname IS NOT NULL
colname NOT IN (explicit list of values)
colname NOT IN (subquery)
colname BETWEEN ... AND ...
Join condition1 OR Join condition2
t1.col_x [ computation ] = value
t1.col_x [ computation ] = t1.col_y
INDEX (colname)
SUBSTRING (colname)
SUM, MIN, MAX, AVG, COUNT, DISTINCT, ANY, ALL
NOT (condition1)
col1 || col2 = value
colname LIKE ...
missing a WHERE clause

The following functions affect output only, not base row selection:

GROUP BY, HAVING, WITH, WITH ... BY ..., ORDER BY, UNION, INTERSECT, EXCEPT

Poor relational models severely limit physical design choices and generally force more Full Table Scans.

Secondary Index Considerations



Secondary Indexes consume disk space for their subtables.
INSERTs, DELETEs, and UPDATEs (sometimes) cost double the I/Os.
Choose Secondary Indexes on frequently used set selections.
Avoid choosing Secondary Indexes with volatile data values.
Secondary Index use is based on an Equality search.
Weigh the impact on Batch Maintenance and OLTP applications.
USI changes are Transient Journaled. NUSI changes are not.
A NUSI may have multiple rows per value.
The Optimizer may not use a NUSI if it is too weakly selective.
Remove or drop NUSIs that are not used.
NUSIs are generally useful in decision support applications.
Data demographics change over time. Revisit ALL index (Primary and Secondary) choices regularly. Make sure they are still serving you well.

Composite Secondary Indexes


Don't include a column that doesn't improve selectivity (e.g., CITY vs. CITY + STATE).
Composite Indexes are used only if ALL columns have explicit values.
Composite USIs may be used for PK enforcement.
Avoid composite NUSIs other than for covering indexes.
Multiple single column NUSIs support bit mapping combinations.
COLLECT STATISTICS on all NUSIs to allow the Optimizer to use them.
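The rule that a composite index is used only when every one of its columns has an explicit value can be sketched as a simple set test. This is a simplified model of the rule as stated above, not the Optimizer's actual logic; the function name and the column names are hypothetical.

```python
def composite_index_usable(index_columns, equality_columns):
    """Simplified model: a composite index qualifies only when every
    column of the index is supplied with an explicit equality value."""
    return set(index_columns) <= set(equality_columns)


# Hypothetical composite index on (city, state).
print(composite_index_usable(["city", "state"], ["city"]))
print(composite_index_usable(["city", "state"], ["city", "state", "zip"]))
```

The first call models a query that supplies only CITY, so the composite index does not qualify. The second supplies both index columns (plus an extra one), so it does.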

Secondary Index Candidate Guidelines


All Primary Index candidates are Secondary Index candidates.
Columns that are not Primary Index candidates must also be considered as NUSI candidates.
A guideline to use in initially selecting NUSI candidates is the following:

(typical rows per value / rows in table) < (typical row size / typical block size)

Example: Assume the typical rows per value is 2000 and the table has 100,000 rows.

If typical row size is 480 bytes and typical block size is 48,000 bytes, then
Is 2000 / 100,000 < 480 / 48,000? Is .02 < .01? No, therefore this column is not a NUSI candidate.

If typical row size is 4800 bytes and typical block size is 48,000 bytes, then
Is 2000 / 100,000 < 4800 / 48,000? Is .02 < .10? Yes, therefore this column is a NUSI candidate.
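The guideline arithmetic can be checked with a few lines of Python. This is a minimal sketch of the comparison described above; the function name is hypothetical, and it covers only the initial screening test, not the Optimizer's costing.

```python
def is_nusi_candidate(typical_rows_per_value, rows_in_table,
                      typical_row_size, typical_block_size):
    """Initial NUSI screening: the fraction of the table returned per value
    must be smaller than the fraction of a block occupied by one row."""
    fraction_of_table = typical_rows_per_value / rows_in_table
    fraction_of_block = typical_row_size / typical_block_size
    return fraction_of_table < fraction_of_block


# The two cases from the example: 2000 rows per value in a 100,000-row table.
print(is_nusi_candidate(2000, 100_000, 480, 48_000))   # .02 < .01 is false
print(is_nusi_candidate(2000, 100_000, 4800, 48_000))  # .02 < .10 is true
```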

Exercise 3 Sample
On the following pages, there are sample tables with typical rows per value demographics. Indicate ALL possible Secondary Index candidates (USI and NUSI). Later exercises will guide your final choices.

Secondary Index Candidate Guidelines:
All PI candidates are Secondary Index candidates.
Assume 48K blocks and rows are < 480 bytes.
Column(s) are a NUSI candidate if
(typical rows per value / rows in table) < (typical row size / typical block size)

EXAMPLE: 6,000,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change  PI/SI
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
A       PK,SA       0       1M      50M    6M        1         0         1           0       UPI, USI
B       -           2.6K    0       0      700K      12        5         7           1       NUPI, NUSI
C       FK,NN       0       1K      5K     150K      300       0         50          5       NUPI, NUSI
D       NN,ND       500K    0       0      6M        1         0         1           3       UPI, USI
E       -           0       0       0      8         800K      0         700K        0
F       -           0       0       0      1.5M      9         725K      3           4       NUSI?
G       -           0       0       0      1.5M      725K      5         3           4       NUSI?
H       -           52      0       0      700       10K       10K       9K          9       NUSI
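As a rough check, the NUSI guideline can be applied to the Typical Rows/Value figures of the sample table. The sketch below rests on several assumptions: "48K blocks" is read as 48,000 bytes, the row size is taken at its stated upper bound of 480 bytes, and the eight data columns are labeled A through H in order. It screens on the guideline only and ignores change rating and access frequency.

```python
ROWS_IN_TABLE = 6_000_000
ROW_SIZE = 480        # assumption: upper bound from "rows are < 480 bytes"
BLOCK_SIZE = 48_000   # assumption: "48K blocks" read as 48,000 bytes

# Typical Rows/Value per column, copied from the sample table
# (column letters A-H are assumed to follow the order of the data rows).
typical_rows_per_value = {
    "A": 1, "B": 7, "C": 50, "D": 1,
    "E": 700_000, "F": 3, "G": 3, "H": 9_000,
}


def passes_guideline(rows_per_value):
    """True when rows per value / rows in table < row size / block size."""
    return rows_per_value / ROWS_IN_TABLE < ROW_SIZE / BLOCK_SIZE


nusi_candidates = [col for col, rpv in typical_rows_per_value.items()
                   if passes_guideline(rpv)]
print(nusi_candidates)
```

Under these assumptions only column E (700K of 6M rows per value) fails the test, which agrees with E being the one column without a PI/SI entry in the sample.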

Exercise 3 Choosing SI Candidates


ENTITY 1: 1,000,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change  PI/SI
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
A       PK,UA       50K     10M     10M    1M        1         0         1           0       UPI, USI
B       -           0       0       0      950K      2         0         1           3       NUPI, NUSI
C       -           0       0       0      3K        400       0         325         2       NUPI, NUSI
D       -           0       0       0      2.5K      350       0         300         1       NUPI, NUSI
E       -           0       0       0      200K      3         500K      2           1       NUSI?
F       -           0       0       0      9K        100       0         90          1       NUPI, NUSI

Exercise 3 Choosing SI Candidates (cont.)


ENTITY 2: 100,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change  PI/SI
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
G       PK,SA       0       100M    100M   100K      1         0         1           0       UPI, USI
H       -           360     0       0      1K        200       0         100         0       NUPI?, NUSI
I       -           12      0       0      90K       2         4K        1           9       NUSI
J       -           12      0       0      12        10K       0         8K          1
K       -           4       0       0      5         24K       0         20K         2
L       -           0       0       0      1.8K      60        0         50          0       NUPI, NUSI

Exercise 3 Choosing SI Candidates (cont.)


DEPENDENT: 500,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change  PI/SI
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
A       PK,FK,SA    0       700K    1M     200K      3         0         1           0
M       -           0       0       0      5         200K      0         50K         0
N       -           0       0       0      9K        75        0         50          3       NUPI, NUSI
O       -           0       0       0      100K      2         390K      1           1
P       NN,ND       0       0       0      500K      1         0         1           0       UPI, USI
Q       -           0       0       0      200K      2         150K      1           1

Other PI/SI candidates as listed: UPI, NUPI, USI, NUSI, NUSI?, NUSI?

Exercise 3 Choosing SI Candidates (cont.)


ASSOCIATIVE 1: 3,000,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
A       PK,FK,SA    182     0       0      1M        5         0         3           0
G       -           0       8M      300M   100K      50        0         30          0
R       -           0       0       0      150       21K       0         19K         0
S       -           0       0       0      8K        400       0         350         0

PI/SI candidates as listed: UPI, NUPI, USI, NUSI, NUPI, NUSI, NUSI, NUPI, NUSI

Exercise 3 Choosing SI Candidates (cont.)


ASSOCIATIVE 2: 1,000,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
A, M    PK,FK       0       7M      8M     500K      3         0         1           0
G       -           0       25K     200K   100K      150       0         8           0
T       -           0       0       0      5.6K      180       0         170         0
U       -           0       0       0      750       1350      0         1000        0

PI/SI candidates as listed: UPI, NUPI, USI, NUSI, NUPI, NUSI, NUPI, NUSI, NUSI

Exercise 3 Choosing SI Candidates (cont.)


HISTORY: 8,000,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
A       PK,FK,SA    10M     300M    2.4B   1M        18        0         3           0
DATE    -           1.5K    0       0      730       18K       0         17K         0
D       -           0       0       0      N/A       N/A       N/A       N/A         N/A
E       -           0       0       0      N/A       N/A       N/A       N/A         N/A
F       -           0       0       0      N/A       N/A       N/A       N/A         N/A

PI/SI candidates as listed: UPI, NUPI, USI, NUSI, NUSI

Change Rating
Change Rating indicates how often values in a column are UPDATEd:

0 = column values never change.
10 = column changes with every write operation.

PK columns are always 0.
Historical data columns are always 0.
Data that does not normally change = 1.
Update tracking columns = 10.
All other columns are rated 2 - 9.

Base Primary Index choices on columns with very stable data values:
A change rating of 0 - 2 is reasonable.

Base Secondary Index choices on columns with fairly stable data values:
A change rating of 0 - 5 is reasonable.
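The two change rating thresholds can be expressed as a small lookup. This is a minimal sketch under the ranges stated above (0 - 2 for Primary Indexes, 0 - 5 for Secondary Indexes); the function and constant names are hypothetical.

```python
# Highest "reasonable" change rating per index kind, from the ranges above.
MAX_CHANGE_RATING = {"UPI": 2, "NUPI": 2, "USI": 5, "NUSI": 5}


def stable_enough(index_kind, change_rating):
    """True when the column's change rating is within the reasonable
    range for the given kind of index."""
    if index_kind not in MAX_CHANGE_RATING:
        raise ValueError(f"unknown index kind: {index_kind}")
    if not 0 <= change_rating <= 10:
        raise ValueError("change rating must be between 0 and 10")
    return change_rating <= MAX_CHANGE_RATING[index_kind]


print(stable_enough("NUPI", 3))  # too volatile for a Primary Index
print(stable_enough("NUSI", 3))  # acceptable for a Secondary Index
```

A column rated 3 illustrates the difference: it is outside the range for a Primary Index but still inside the range for a Secondary Index.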

Value Access
Value Access:
How often a column appears with an equality value. For example:
WHERE column_name = hardcoded_value
WHERE column_name = substitutable_value

Value Access Frequency:
How often annually all known transactions access rows from the table through this column.

The above demographics result from Activity Modeling.
Index choices determine if the rows will be accessed via index or Full Table Scan (FTS).

Secondary indexes can be created at any time:
CREATE INDEX involves a FTS, sorting, and index subtable writes.

Low Value Access Frequency:
Secondary Index overhead may cost more than doing the FTS.

Exercise 4 Sample
On the following pages, there are sample tables with change rating and value access demographics. Eliminate Index candidates based on change rating and value access. Later exercises will guide your final choices.

Change Rating Guidelines:
PI change rating 0 - 2.
SI change rating 0 - 5.

Value Access Guideline:
NUSI Value Access > 0

EXAMPLE: 6,000,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change  PI/SI
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
A       PK,SA       0       1M      50M    6M        1         0         1           0       UPI, USI
B       -           2.6K    0       0      700K      12        5         7           1       NUPI, NUSI
C       FK,NN       0       1K      5K     150K      300       0         50          5       NUPI, NUSI
D       NN,ND       500K    0       0      6M        1         0         1           3       UPI, USI
E       -           0       0       0      8         800K      0         700K        0
F       -           0       0       0      1.5M      9         725K      3           4       NUSI?
G       -           0       0       0      1.5M      725K      5         3           4       NUSI?
H       -           52      0       0      700       10K       10K       9K          9       NUSI
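The elimination step of this exercise can be sketched as a filter over a column's index candidates. The sketch applies only the three guidelines as literally stated for this exercise (PI change rating 0 - 2, SI change rating 0 - 5, NUSI Value Access > 0). It deliberately ignores Join Access and every other consideration, so it is an illustration of the rules rather than the exercise's answer key; the function name is hypothetical.

```python
def surviving_candidates(candidates, change_rating, value_access):
    """Keep only the index candidates that satisfy the stated guidelines.
    Join Access is NOT considered here (simplifying assumption)."""
    kept = []
    for kind in candidates:
        if kind in ("UPI", "NUPI"):
            ok = change_rating <= 2           # PI change rating 0 - 2
        else:
            ok = change_rating <= 5           # SI change rating 0 - 5
            if kind == "NUSI":
                ok = ok and value_access > 0  # NUSI Value Access > 0
        if ok:
            kept.append(kind)
    return kept


# Second row of the example: Value Access 2.6K, Change Rating 1.
print(surviving_candidates(["NUPI", "NUSI"], change_rating=1, value_access=2600))
# Last row of the example: Value Access 52, Change Rating 9.
print(surviving_candidates(["NUSI"], change_rating=9, value_access=52))
```

The first call keeps both candidates. The second returns an empty list because a change rating of 9 is outside the 0 - 5 range for a Secondary Index.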

Exercise 4 Eliminating Index Candidates


ENTITY 1: 1,000,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change  PI/SI
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
A       PK,UA       50K     10M     10M    1M        1         0         1           0       UPI, USI
B       -           0       0       0      950K      2         0         1           3       NUPI, NUSI
C       -           0       0       0      3K        400       0         325         2       NUPI, NUSI
D       -           0       0       0      2.5K      350       0         300         1       NUPI, NUSI
E       -           0       0       0      200K      3         500K      2           1       NUSI?
F       -           0       0       0      9K        100       0         90          1       NUPI, NUSI

Exercise 4 Eliminating Index Candidates (cont.)


ENTITY 2: 100,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change  PI/SI
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
G       PK,SA       0       100M    100M   100K      1         0         1           0       UPI, USI
H       -           360     0       0      1K        200       0         100         0       NUPI?, NUSI
I       -           12      0       0      90K       2         4K        1           9
J       -           12      0       0      12        10K       0         8K          1
K       -           4       0       0      5         24K       0         20K         2
L       -           0       0       0      1.8K      60        0         50          0       NUPI, NUSI

Other PI/SI candidate as listed: NUSI

Exercise 4 Eliminating Index Candidates (cont.)


DEPENDENT: 500,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
A       PK,FK,SA    0       700K    1M     200K      3         0         1           0
M       -           0       0       0      5         200K      0         50K         0
N       -           0       0       0      9K        75        0         50          3
O       -           0       0       0      100K      2         390K      1           1
P       NN,ND       0       0       0      500K      1         0         1           0
Q       -           0       0       0      200K      2         150K      1           1

PI/SI candidates as listed: NUPI, USI, NUSI, NUSI, NUSI, USI, NUSI?, UPI, UPI, NUPI

Exercise 4 Eliminating Index Candidates (cont.)


ASSOCIATIVE 1: 3,000,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
A       PK,FK,SA    182     0       0      1M        5         0         3           0
G       -           0       8M      300M   100K      50        0         30          0
R       -           0       0       0      150       21K       0         19K         0
S       -           0       0       0      8K        400       0         350         0

PI/SI candidates as listed: UPI, NUPI, USI, NUSI, NUPI, NUSI, NUSI?, NUPI, NUSI

Exercise 4 Eliminating Index Candidates (cont.)


ASSOCIATIVE 2: 1,000,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
A, M    PK,FK       0       7M      8M     500K      3         0         1           0
G       -           0       25K     200K   100K      150       0         8           0
T       -           0       0       0      5.6K      180       0         170         0
U       -           0       0       0      750       1350      0         1000        0

PI/SI candidates as listed: UPI, NUPI, USI, NUSI, NUPI, NUSI, NUPI, NUSI, NUSI

Exercise 4 Eliminating Index Candidates (cont.)


HISTORY: 8,000,000 Rows

Column  PK/FK       Value   Join    Join   Distinct  Max Rows  Max Rows  Typical     Change
                    Access  Access  Rows   Values    /Value    /NULL     Rows/Value  Rating
A       PK,FK,SA    10M     300M    2.4B   1M        18        0         3           0
DATE    -           1.5K    0       0      730       18K       0         17K         0
D       -           0       0       0      N/A       N/A       N/A       N/A         N/A
E       -           0       0       0      N/A       N/A       N/A       N/A         N/A
F       -           0       0       0      N/A       N/A       N/A       N/A         N/A

PI/SI candidates as listed: UPI, NUPI, USI, NUSI, NUSI

Review Questions
1. In the case of a NUPI, the best way to avoid a duplicate row check is to ________.
a. use set tables
b. compare data values byte-by-byte within a Row Hash in order to ensure uniqueness.
c. assign a USI on the PK to enforce uniqueness on a NUPI table.
d. use the NOT NULL constraint on the column

2. Why should you avoid creating composite NUSIs? What is a better method?
______________________________________________________
______________________________________________________

3. Which SQL statement does not require the row to be physically read first? ____
a. INSERT
b. SELECT
c. UPDATE
d. DELETE

Module 7: Review Question Answers


1. In the case of a NUPI, the best way to avoid a duplicate row check is to ________.
a. use set tables
b. compare data values byte-by-byte within a Row Hash in order to ensure uniqueness.
c. assign a USI on the PK to enforce uniqueness on a NUPI table.
d. use the NOT NULL constraint on the column
Answer: c

2. Why should you avoid creating composite NUSIs? What is a better method?
Composite indexes are utilized only if explicit values are supplied for each and every column in the index. It is better to define several single-column NUSIs in order to possibly utilize NUSI bit mapping.

3. Which SQL statement does not require the row to be physically read first? ____
a. INSERT
b. SELECT
c. UPDATE
d. DELETE
Answer: a (SELECT, UPDATE, and DELETE must all physically locate the row first)
