Michael J. Swart

December 5, 2012

Well That Wasn’t Obvious

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 12:00 pm

I’ve discovered that non-intuitive lessons stick in the brain the best. I share three examples that I either learned recently (or at least re-learned recently).

Uniqueifiers on Indexes That I Thought Shouldn’t Need Them

Do you remember what a Uniquifier is? SQL Server uses these hidden values on rows to keep non-clustered indexes in sync with their non-unique clustered indexes. Think of this questionable schema:

create table Actors
(
	ActorId int identity not null,
	LastName nvarchar(100) not null,
	FirstName nvarchar(100) not null,
	AgentId int null,
	primary key nonclustered (ActorId)
)
 
create clustered index IX_Actors
	on Actors(LastName, FirstName)

Then add some data:

insert Actors (LastName, FirstName, AgentId)
values ('Douglas', 'Michael', 3),
       ('Douglas', 'Michael', 4),
       ('Keaton', 'Michael', 5),
       ('Douglas', 'Kirk', 4)

I can demonstrate the uniquifier value here because of the non-unique clustered index:

select sys.fn_PhysLocFormatter (%%physloc%%) as [physical rid], *
from Actors -- in my case, this shows page 231
 
DBCC TRACEON (3604);
DECLARE @dbid int = DB_ID(); 
DBCC PAGE(@dbid, 1, 231, 3); 
 
/* showing (among other things):
...
Slot 1 Offset 0x60 Length 51
UNIQUIFIER = 0                       
LastName = Douglas                   
FirstName = Michael                  
ActorId = 1                          
AgentId = 3                          
KeyHashValue = (16035ff378a3)        
 
Slot 2 Offset 0x93 Length 55
UNIQUIFIER = 1                       
LastName = Douglas                   
FirstName = Michael                  
ActorId = 2                          
AgentId = 4                          
KeyHashValue = (16035e22af06)        
...
*/

Cool! Just like real life (See Michael Douglas (I – XXVI) at IMDB). This uniquifier value behaves very much like the roman numerals you see at IMDB.

Here’s the part that wasn’t obvious… what if I chose a clustered index whose columns include the primary key…

drop index IX_Actors ON Actors;
create clustered INDEX IX_Actors
	ON Actors(LastName, FirstName, ActorId)

That should be unique right? The index includes the primary key column (ActorId) that guarantees it. So SQL Server shouldn’t need the uniquifier right?

But it does! 

SQL Server doesn’t count on the index being unique unless you say so. I actually discovered the above while using Kendra Little’s sp_BlitzIndex, a tool so underexposed it’s pasty.

UPDATE Modifies Column Values Only Once

What does the following code produce?

-- set up test table 
declare @test TABLE (id int, value int);
declare @source TABLE (id int, increment int);
 
insert @test (id, value)
values (1, 0), (2, 0)
 
insert @source (id, increment)
values (1, 10),(1, 20),(1, 30),(2, 100),(2, 100),(2, 100)
 
-- "sum": add each value in the source to the existing value
update @test 
set value = value + increment
from @test t
join @source s
    on s.id = t.id
 
-- check the results
select id, value
from @test
/* The results:
id          value
----------- -----------
1           10
2           100
*/

Surprised? I think this example (or one very much like it) surprises those of us who started our careers as programmers. Many of us follow up this lesson with the related “UPDATE statements don’t have GROUP BY clauses” and then the lesson “How do I use CTEs?”

Indexed Views Don’t Support Max/Min Aggregates

Indexed views support a couple aggregate functions like COUNT_BIG() and SUM(). And with some trickery you can calculate AVG() and STDEV(). But SQL Server restricts the use of MAX() and MIN() in indexed views despite how useful they’d be.

It might help to understand why. SQL Server maintains indexed views as physical database objects and it can maintain aggregate values like COUNT() and SUM() by processing only the changing rows in the base table while safely ignoring the rest of the table. But I can’t say the same for MAX() or MIN(). If SQL Server supported MAX() and MIN() in indexed views, then when you delete the row in the base table that represents the MAX() value, SQL Server would have to search the rest of the base table to find the new MAX() value.

Check out the microsoft.connect feature suggestion Expand aggregate support in indexed views (MIN/MAX). Aaron Bertrand created this Connect suggestion and I like it because it shows how effective constructive feedback can be. I like it because of its description, comments and the useful workaround. The Microsoft team even gave some insight into how they almost included this feature in SQL 2008. This connect item only seems to be lacking an E.T.A. so go and cast your vote!

Bonus Content

I didn’t draw any illustrations this week so I’m including some bonus content (admittedly written by others):

Non-obvious Things From Twitter Friends

Request for Tweets: What thing in sql surprised you? Something not obvious. For me it was how the TRY and CATCH didn't always.

 I love how the log shipping "use default data and log folders" ... doesn't

how about how INSERT <tbl> VALUES (1),(2),(3) limits you to 1000 entries

Rolling time-window filtered index. The cool surprise - building each new index uses the existing one, so it's really quick.

Trivia

  • Did you know Michael Keaton (born Michael Douglas) changed his name to avoid confusion with that other guy? Pretty wise.
  • Jes Schultz Borland reminded me recently that good writers use more active verbs. I took that advice to heart and turned the writing of this article into an exercise. I avoided using words like is, was or are here and I think it turned out pretty well.

7 Comments »

  1. Heh– I promise to expose sp_blitzindex in public more. I’ve done a bit of a soft launch with it to collect initial feedback and additional ideas. That’s gone super well, so I’m going to webcast it next week! Thanks for the lovely link, and glad you checked it out!

    Comment by Kendra Little — December 5, 2012 @ 12:14 pm

  2. I thought so Kendra, it’s a great tool, even from the start. I think it’s ready for more exposure… who knows, maybe a nice farmers tan would do no harm.

    Comment by Michael J. Swart — December 5, 2012 @ 12:20 pm

  3. “what if I chose a clustered index whose columns include the primary key. So SQL Server shouldn’t need the uniquifier right?” => No, because the PK can be dropped afterwards. Then, you’d have a CI without uniquifier although one is needed.

    Comment by tobi — December 6, 2012 @ 11:02 am

  4. [...] Well That Wasn’t Obvious - Super interesting lessons from Michael J. Swart (Blog|Twitter) and community chums. [...]

    Pingback by Something for the Weekend - SQL Server Links 06/12/12 — December 7, 2012 @ 2:49 pm

  5. Re: Trivia – Even MORE Trivia!

    Screen Actors Guild regulations have long required that all members have “unique working names”. That’s why Michael Keaton couldn’t use his birth name (Michael Douglas) and why Michael J. Fox had to use the “J.” (there was already a “Michael Fox” – who ever heard of that guy?).

    Comment by Jason — December 10, 2012 @ 11:16 am

  6. Thanks Jason! I did not know that. That’s cool and a bit bizarre.

    More and more trivia: Wikipedia tells me that the “J” in Michael J. Fox’s name doesn’t stand for anything. Michael J. Fox was born Michael Andrew Fox but took the J as “an homage to actor Michael J. Pollard”. For what it’s worth Michael J. Swart is short for Michael John Swart.

    Comment by Michael J. Swart — December 10, 2012 @ 11:23 am

  7. [...] Well That Wasn’t Obvious - Super interesting lessons from Michael J. Swart (Blog|Twitter) and community chums. [...]

    Pingback by Something for the Weekend - SQL Server Links 07/12/12 • John Sansom — May 21, 2013 @ 9:03 am

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress