Indexing for Performance by Kimberly Tripp

This is a course I've been studying through Pluralsight:

SQL Server: Indexing for Performance by Kimberly L. Tripp

This course will teach you how to correctly choose indexes for your environment. You'll also learn how to understand index internals, how indexes are used, and much more.

I've placed related resources here for myself:

j learning; cd sql_server\indexing_for_performance

Overview

Two stages:

Query Tuning

i.e. you tune individual queries and come up with decent indexes for them in isolation

Server Tuning

consolidate indexes for all typical queries.
This way you reduce or avoid:
- unused indexes
- duplicate indexes
- too-similar indexes

General tips (throughout course)

When running queries to be analysed:
- Turn on statistics io
```
SET STATISTICS IO ON;
```
- Turn on show plan (i.e. "show actual execution plan" button -- Kimberly calls it 'Show Plan')
To inspect indexes:

Run the 'DMV' @IndexId

Kimberly uses

[sys].[dm_db_index_physical_stats]

...so much she just refers to it as 'the DMV')

	-- Now, use the DMV (adding index ID) to review all indexes:
	SELECT [index_id] AS [ID]
		, [index_depth] AS [D]
			, [index_level] AS [L]
			, [record_count] AS [Rows]
			, [page_count] AS [Pages]
			, [avg_page_space_used_in_percent] AS [Page:Percent Full]
			, [min_record_size_in_bytes] AS [Row:MinLen]
			, [max_record_size_in_bytes] AS [Row:MaxLen]
			, [avg_record_size_in_bytes] AS [Row:AvgLen]
	FROM [sys].[dm_db_index_physical_stats]
			(DB_ID (N'EmployeeCaseStudy')					-- Database ID
			, OBJECT_ID (N'EmployeeCaseStudy.dbo.Employee') -- Object ID
			, NULL											-- Index ID
			, NULL											-- Partition ID
			, 'DETAILED');									-- Mode
	GO

DBCC Ind

EXEC ('DBCC IND ([EmployeeCaseStudy], ''[dbo].[Employee]'', 1)');
EXEC ('DBCC IND ([EmployeeCaseStudy], ''[dbo].[Employee]'', 2)');
EXEC ('DBCC IND ([EmployeeCaseStudy], ''[dbo].[Employee]'', 3)');

Show the statistics:

DBCC SHOW_STATISTICS ('Employee', 'EmployeeZipRange1FilteredIX');

When is index used

Non clustered non covering Indexes are only used when selectivity < about 1.5 %
- i.e. If the query planner, expects your query to returns 2% or more of the table it would rather scan the table than seek the non c index.
- This is because the cost of many bookmark lookups (from index leaf to actual table (which are random access access)) are slower than the even larger number of sequential reads when scanning the table.
- And note Statistics must be up to date for those percentage estimates <2% to be good enough.
```
--[sys].[dm_db_index_physical_stats]

SELECT *
FROM sys.dm_db_index_physical_stats
		(db_id(), object_id('Charge'), 1, NULL, 'DETAILED')
go
```

When don't you want a covering index?

Highly selective queries (returning a handful of records) do not need to be covered - as there are very few bookmark lookups performed.
Select * does not need to be covered as u can't cover it anyway.

Filtered index: a useful example

Imagine We only query error message when status = error

So we have a filtered index on status = error and also error message column (particularly if it's "left only" comparisons, i.e. like 'x%' not like %x%)

So small you can afford to include extra columns
Sql server will not query across two filtered sets (and join them) -

eg where status in ('error','warning') (assuming a filtered index for status 'error' and a filtered index for status 'warning'....

Sql server won't combine those and will just scan the table instead.
One trick is to change the query to be:

...where status = 'error' ... union all ... where status = 'warning' as then it will combine the two sets but you have to be repetitive to do that

So you want the filtered index to be clearly the "filtered index of choice" for this query.

Should have automated statistics updating maintenance routine at a time when system is less busy.

UPDATE STATISTICS [dbo].[Employee];

Index consolidation

Unused indexes - get rid of those obviously.
Duplicates obviously only need 1. But check no one is using hints to target one of the dupes by name.

Kimberly's script for finding duplicates (She may have a newer one by now? This one still looks like it would work on all but columnstore indexes?)

Similar - may make a "super index" and find it's still useful but less space maintenance easier to cache and stay in cache etc.

Kimberly has a replacement for sp_helpindex - I think this is her latest version:

USE THIS: sp_helpindex

External References

Indexing for Performance Finding the Right Balance PDF — a 2004 pdf of earlier version of the course.
USE THIS: sp_helpindex - Kimberly' replacement for sp_helpindex