Planning For SQL Server® 2012 Indexing

Module 5
Planning for
SQL Server 2012
Indexing
Module Overview
Core Indexing Concepts
Data Types and Indexes
Single Column and Composite Indexes
Lesson 1: Core Indexing Concepts

How SQL Server Accesses Data
The Need for Indexes
Index Structures
Selectivity, Density and Index Depth
Index Fragmentation
Demonstration 1A: Viewing Index Fragmentation
How SQL Server Accesses Data

Table Scan
SQL Server reads all table pages
Index
SQL Server uses index pages to find rows
The Need for Indexes

ANSI SQL does not mention indexes
Generally considered to be external to the logical data model
All queries can be executed without indexes
Performance might be improved by the presence of indexes
Some constraints are implemented via indexes
Indexes are used to make constraints efficient but in theory

could be implemented in other ways
Analogy: Physical library holding books
Index by author is useful
Additional indexes would also be useful
Index Structures
Indexes are commonly based on tree structures
Not just a binary tree as nodes can have more than two
children
Top node is called the root node; bottom level nodes are
called leaf nodes
Selectivity, Density and Index Depth

Selectivity
A measure of how many rows are returned compared to the

total number of rows
High selectivity means a small number of rows when related to

the total number of rows
Density
A measure of the lack of uniqueness of data in the table
High density indicates a large number of duplicates
Index Depth
Number of levels within the index
Common misconception that indexes are deep
Index Fragmentation
How does fragmentation occur?
SQL Server reorganizes index pages when data

modifications cause index pages to split
Types of fragmentation:
Internal pages are not full
External pages are out of logical sequence
Detecting fragmentation
SQL Server Management Studio Index Properties
System function - sys.dm_db_index_physical_stats
Demonstration 1A: Viewing Index Fragmentation

In this demonstration, you will see how to identify
fragmented indexes
Lesson 2: Data Types and Indexes

Numeric Index Data
Character Index Data
Date-Related Index Data
GUID Index Data
BIT Index Data
Indexing Computed Columns
Numeric Index Data

Indexes with numeric keys work efficiently
Many values fit in a small number of index pages
Sorts and comparisons are quick
Exact numeric types are most efficient
Integer types are the most efficient
INT and BIGINT commonly used
Approximate data types (float and real) much less efficient
Character Index Data

Character data types are much less efficient when used in
index keys
Character values tend to be much larger than numeric
values
Even short character values are slow to compare unless
binary comparisons are being made
Most SQL Server applications use collations other than binary
Rules for collations need to be applied whenever comparisons

are made
Date-Related Index Data

Date data types are generally good candidates for index
keys
Very commonly used in business applications
Only slightly less efficient than integer data

Size of the data will be important
date more efficient than datetime
GUID Index Data

Becoming very common in new business applications
Moderate efficiency
Size is 128 bits or 16 bytes
Comparison performance is reasonable
Problems arise
Randomness of generation causes fragmentation problems
Very common problem in many current applications
BIT Index Data

BIT columns have only two possible values
BIT columns are efficient as index keys
Common misconception that BIT columns are not useful in
indexes
Many valid usage scenarios exist
Filtered indexes often useful with BIT column indexes
Indexing Computed Columns

Computed columns are based on expressions
Values are typically derived from other columns
Indexing the computed value rather than any underlying
values can be useful
Can assist with improving performance on poorly-designed

databases
Example: a column that is used to hold values that should

have been held in separate columns
Persisted computed columns
Avoid the overhead of computing values each time they are

selected
Are computed during INSERT and UPDATE statements
Lesson 3: Single Column and Composite Indexes

Single Column vs. Composite Indexes
Ascending vs. Descending Indexes
Index Statistics
Demonstration 3A: Viewing Index Statistics
Single Column vs. Composite Indexes

Indexes are not always constructed on a single column
Multi-column indexes are called "composite" indexes
Composite indexes are often useful
Tend to be more useful than single column indexes in most

typical business applications
Having an index sorted first by customer then by order date

makes it very easy to find orders for a particular customer on
a particular date
A query might involve multiple search predicates
Two columns together might be selective while neither is

selective on its own
Index on A,B is not the same as an index on B,A
Typically index the most restrictive column first
Ascending vs. Descending Indexes

Indexes could be constructed in ascending or descending
order
In general, for single column indexes, both are equally
useful
Each layer of a SQL Server index is double-linked (ie: linked in

both directions)
SQL Server can start at either end and work towards the other
end
Each component of a composite index can be ascending or
descending
Might be very useful for avoiding sort operations
Index Statistics
SQL Server needs to have knowledge of the layout of the
data in a table or index before it optimizes and executes

queries
Needs to create a reasonable plan for executing the query
Important to know the usefulness of each index
Selectivity is the most important metric
By default, SQL Server automatically creates statistics on
indexes
Can be disabled
Recommendation is to leave auto-creation and auto-update

enabled
Demonstration 3A: Viewing Index Statistics

In this demonstration, you will see how to work with index
statistics
Lab 5: Planning for SQL Server Indexing

Exercise 1: Explore existing index statistics
Challenge Exercise 2: Design column orders for indexes
(Only if time permits)
Logon information
Virtual machine
10776A-MIA-SQL1
User name
AdventureWorks\Administrator
Password
Pa$$w0rd
Estimated time: 45 minutes
Lab Scenario
You have been asked to explain the concept of index
statistics and selectivity to a new developer. You will explore
the statistics available on an existing index and determine
how selective some sample queries would be.
One of the company developers has provided you with a list
of the most important queries that will be executed by the
new marketing management system. Depending upon how
much time you have available, you need to determine the
best column orders for indexes to support each query.
Complete as many as possible within the allocated time. In
later modules, you will consider how these indexes would
be implemented. Each query is to be considered in isolation
in this exercise.
Lab Review
Which types of queries would most likely lead to widely-
differing query plans?
If you have an equality predicate and a LIKE predicate in
your most important query, which predicate would you try

to satisfy as the first column of a composite index?
Module Review and Takeaways

Review Questions
Best Practices

Planning For SQL Server® 2012 Indexing

Încărcat de

Informații document

Titlu original

Drepturi de autor

Formate disponibile

Partajați acest document

Partajați sau inserați document

Opțiuni de partajare

Vi se pare util acest document?

Este necorespunzător acest conținut?

Drepturi de autor:

Formate disponibile

Planning For SQL Server® 2012 Indexing

Încărcat de

Drepturi de autor:

Formate disponibile

Module 5

Lesson 1: Core Indexing Concepts

How SQL Server Accesses Data

SQL Server reads all table pages

SQL Server uses index pages to find rows

The Need for Indexes

Generally considered to be external to the logical data model

All queries can be executed without indexes

Performance might be improved by the presence of indexes

Some constraints are implemented via indexes

Indexes are used to make constraints efficient but in theory

Analogy: Physical library holding books

Index by author is useful

Additional indexes would also be useful

called leaf nodes

Selectivity, Density and Index Depth

A measure of how many rows are returned compared to the

High selectivity means a small number of rows when related to

A measure of the lack of uniqueness of data in the table

High density indicates a large number of duplicates

Number of levels within the index

Common misconception that indexes are deep

SQL Server reorganizes index pages when data

Internal pages are not full

External pages are out of logical sequence

SQL Server Management Studio Index Properties

System function - sys.dm_db_index_physical_stats

Demonstration 1A: Viewing Index Fragmentation

Lesson 2: Data Types and Indexes

Numeric Index Data

Many values fit in a small number of index pages

Sorts and comparisons are quick

Exact numeric types are most efficient

Integer types are the most efficient

INT and BIGINT commonly used

Approximate data types (float and real) much less efficient

Character Index Data

Character values tend to be much larger than numeric

Even short character values are slow to compare unless

binary comparisons are being made

Most SQL Server applications use collations other than binary

Rules for collations need to be applied whenever comparisons

Date-Related Index Data

Very commonly used in business applications

Only slightly less efficient than integer data

date more efficient than datetime

GUID Index Data

Size is 128 bits or 16 bytes

Comparison performance is reasonable

Randomness of generation causes fragmentation problems

Very common problem in many current applications

BIT Index Data

Many valid usage scenarios exist

Filtered indexes often useful with BIT column indexes

Indexing Computed Columns

Values are typically derived from other columns

Indexing the computed value rather than any underlying

values can be useful

Can assist with improving performance on poorly-designed

Example: a column that is used to hold values that should

Persisted computed columns

Avoid the overhead of computing values each time they are

Are computed during INSERT and UPDATE statements