Michael J. Swart

March 23, 2011

Last Word On Scans

So last week I talked about all the different ways that Microsoft uses the word scans when in SQL Server. I got together a quick reference guide that describes scans terminology in terms of the SQL Server performance counters (because those counters were granular and trustworthy). But I didn’t really go into as much detail as I could have when I dealt with query plans. Or as Rob Farley prefers execution plans

Sorting Out Execution Plan Scans

Showplan Operators That Read From Tables

So here are the execution plan operators that access data from tables. I give a short description of each operator and I explain how that affect the performance counters (Full/Range/Probe). I won’t explain what each one means, I’ll let others do that. But if you know a bit about internals, the icons are actually quite well designed.

Table Scan
Table Scan (Showplan Operator Documentation)
This operator corresponds with the Full Scans/sec performance counter. I actually don’t encounter table scans very often and that’s mostly because most of the tables I deal with have clustered indexes defined. You’ll only ever see table scans on heaps.
Clustered Index Scan
Clustered Index Scan (Showplan Operator Documentation)
This operator corresponds with the Full Scans/sec performance counter. This is your run-of-the-mill full scan on a clustered index. Notice the brackets in the showplan icon. Those brackets are supposed to indicate a clustered index. Contrast this with the nonclustered index scan:
Nonclustered Index Scan
Nonclustered Index Scan (Showplan Operator Documentation)
This operator corresponds with the Full Scans/sec performance counter. And this is a full scan of a nonclustered index. You’d probably see this if the nonclustered index contains all the columns to select. It’s easier to scan because it’s a narrower index than the clustered index.
Clustered Index Seek
Clustered Index Seek (Showplan Operator Documentation)
This operator corresponds with the Probe Scans/sec performance counter when the clustered index is unique (as with a primary key) and the “seek predicate” includes the key columns needed to return at most one record. But if that’s not the case, then this operator will count towards the performance counter Range Scans/sec.
Nonclustered Index Seek
Nonclustered Index Seek (Showplan Operator Documentation)
Exactly like its Clustered Index counterpart, this operator corresponds with the Probe Scans/sec performance counter when SQL Server can determine that when it looks up the requested row it is sure to get at most one row (i.e. the index is unique etc…) Otherwise it counts towards the performance counter Range Scans/sec.
Key Lookup
Key Lookup (Showplan Operator Documentation)
This operator is also known as a bookmark lookup operator. It always counts towards the Probe Scans/sec performance counter. It’s interesting that even though it gets a single record at a time, this operator is often seen as a symptom of poor performance. That’s because the number of executions can get big. Many executions can kill the performance of the query. If you focus on the performance counters, you’ll notice that each of these executions will count towards the Probe Scans performance counter.
RID Lookup
RID Lookup (Showplan Operator Documentation)
Just like the key lookup, this lookup counts towards the Probe Scans/sec performance counter. I’ve written about this operator before in an article called Get Rid of RID Lookups.

Update On Scan Quick Reference

In the meantime, thanks to Kendra Little’s script to search system tables, I discovered one more dmv that I had overlooked last week, sys.dm_db_index_operational_stats, and so I include it here:
Scan Terminology

August 31, 2009

Get Rid of RID Lookups

Filed under: Technical Articles — Tags: , , — Michael J. Swart @ 6:42 am

RID Lookup
Takeaway: If you see an unexpected RID lookup in an execution plan, consider adding a clustered index as a db schema improvement.

What is a RID lookup?

You may come across the RID lookup operator as part of an execution plan. With Management Studio, RID lookups are displayed with the following icon (sort of … I fancied it up a little).

A RID Lookup is a lookup into a heap using a Row ID. The Row is included with entries in a non-clustered index in order to find the rest of a table’s data in a heap. (Remember, with a heap, the table data is stored unordered so a Row ID is needed for the correlation).

The RID Lookup’s official docs can be read here at Books Online. But it’s light on the details and it references a more thorough explanation of lookups in general which can be found at Craig Freedman’s blog here.

They’re bad… or at least not good.

When you find an RID Lookup in a query plan, it’s a symptom. It indicates a database schema that breaks several rules of thumb. Those rules-of-thumb are:

  • Each table should have a clustered index (of course there are exceptions but we’re dealing with rules-of-thumb here).
  • A non-clustered index has been created indicating that someone somewhere identified an ordering on one or more columns that made sense for that data.
  • There is at least one query (i.e. the one that generated the RID Lookup) that needs columns that are not covered by the non-clustered index.

These three points mean a wasted opportunity. A clustered index should have been created. A heap plus a non-clustered index take about the same space as a clustered index. At least logically: the non-clustered pages map to the clustered index’s index nodes and the heap’s pages map to the clustered index’s leaf nodes.

If possible, definitely consider modifying the schema to include a clustered index (After assessing risks of course).

Really? Always?

Well of course no respectable blogger writes in absolutes and Grant Fritchley has a video here that discusses RID lookups further. He says that when the RID lookup only deals with one row, that the impact is negligible. True, but had the table been designed with a clustered index, the performance would have been no worse.

Powered by WordPress