Michael J. Swart

January 25, 2012

Rebuild Your Indexes Online (When You Can)

So I was recently reading SQL Server’s blog, specifically the article Customers using SQL Server 2012 today! and some brave businesses are already using and enjoying some of SQL Server 2012 features that help their databases stay available. I’m excited about the new features too. For example, index rebuilds have been improved. We can now rebuild indexes online that include “blob” columns (like nvarchar(max), image, etc…). This means that (almost) every index can be rebuilt without requiring a table lock for the entire operation. And that’s good news for availability!

This wasn’t the case in earlier versions. In earlier versions, you couldn’t rebuild an index online if it included columns that were large strings. If you tried, you would get this message:

/*
An online operation cannot be performed for index 'ix_t1_1' because the 
index contains column '<column name>' of data type text, ntext, image, varchar(max), 
nvarchar(max), varbinary(max), xml, or large CLR type. For a non-clustered 
index, the column could be an include column of the index. For a clustered 
index, the column could be any column of the table. If DROP_EXISTING is 
used, the column could be part of a new or old index. The operation must be 
performed offline. 
*/

Logic to determine when ONLINE=ON is supported

So that means there’s one more thing to check when finding out whether you can rebuild an index on-line. The logic now goes like this. You can rebuild an index online if:

  • You’re using Enterprise Edition or higher
  • And you’re using SQL Server 2012 or later
  • Or you’re using SQL Server 2008 or earlier
  • And your index is clustered and the table contains no blob columns
  • Or your index is non-clustered and the index includes no blob columns

Hmm… You can’t really put parentheses in a bullet point list. Here try this flow chart:

Clear as mud

Hmm… that really doesn’t clear things up too well either. And I haven’t even mentioned partitioned indexes.

Just Try It

Maybe you’re like me, you may have to deal with multiple versions, multiple editions, and multiple tables and their indexes. Instead of wading through that logic above, just TRY it and let SQL Server figure it out. Here’s a sproc which takes a table name and index name and tries to rebuild the index online. If it can’t, it builds it offline.

create procedure s_TryRebuildOnlineOtherwiseOffline
(
	@schema sysname = 'dbo',
	@tablename sysname,
	@indexname sysname
)
as
begin
	set @schema = QUOTENAME(@schema);
	set @tablename = QUOTENAME(@tablename);
	set @indexname = QUOTENAME(@indexname);
 
	declare @sqlRebuild nvarchar(max)
	set @sqlRebuild = N'ALTER INDEX ' + @indexname + ' ON ' + @schema + '.' + @tablename + ' REBUILD';
	declare @sqlRebuildOnline nvarchar(max)
	set @sqlRebuildOnline = @sqlRebuild + ' WITH (ONLINE=ON)';
 
	begin try
		EXEC sp_executesql @sqlRebuildOnline;
		print @sqlRebuildOnline;
	end try
	begin catch
		EXEC sp_executesql @sqlRebuild;
		print @sqlRebuild;
	end catch
end
go

Alternatively, you can adjust the script to REORGANIZE an index when it can’t rebuild it online.

By the way, if you use Ola Hallengren’s maintenance scripts, he’s already taken all this logic into account! If you write your own maintenance scripts, feel free to incorporate what I have here.

January 13, 2012

Thank you notes…

Filed under: SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 12:00 pm

(It’s been a hectic week. The good news is that I’ve got loads of topics to write about. The bad news is that I’ve got no time to do it! Illustrations will make a return next week too because I’m finally getting that Adobe Illustrator license I’ve had my eye on. Maybe I’ll also include a retroactive Jimmy Fallon illustration).

Aaron Bertrand maintains the definitive list of free SQL Server tools. I want to give a shout out to just a couple of them. Without these tools, this week would have been a lot rougher at work for our team.

sp_whoisactive

Thank you Adam Machanic,
For making a lightweight sproc to help me see at a glance the activity on a SQL Server database. (Also kudos to the parameters @get_plans and @find_block_leaders).

If you’re unfamiliar with this stored procedure, start here.

Plan Explorer

Thank you SQL Sentry,
For making a tool that helps me understand everything about a query plan is doing. (Also kudos to the Expressions tab and the Parameters tab).

Plan Explorer presents SQL Server query plans just a bit nicer than SQL Server Management Studio does. When looking at a problematic query plan, I want to go from looking at a plan to a solution as quick as possible. For me this week, it would be safe to say that Plan Explorer cut that analysis time in half.

If you’re unfamiliar with this tool, start here.

Dynamic Management Views

Thank you Microsoft (SQL-Server-2005-and-later-DMVs),
For tracking statistics about query executions and making it possible to examine the history of the db activity without having to trace server activity (so extra kudos to dm_exec_query_stats, dm_exec_cached_plans and dm_exec_query_plan).

I don’t know whether to call these views free. They’re available with any database at SQL Server 2005 (or later). So if you’re using SQL Server, then you can use these DMVs.

If you’re unfamiliar with these views, start here.

December 8, 2011

CXPACKET? What’s That? And What’s Next?

Filed under: SQLServerPedia Syndication,Technical Articles — Tags: , , — Michael J. Swart @ 8:54 pm

Takeaway: Is the wait type CXPACKET bothering you? It means parallelism. Here’s how to find the queries that are giving you this grief.

The Wait Type CXPACKET

Since 2005, SQL Server provides dynamic management views (DMVs). These views help monitor database health and they’re the first place I go to when troubleshoot problems. For example the view sys.dm_os_wait_stats gives an aggregated way to tell you where the waits are. Often you’ll come across the wait type CXPACKET. Microsoft explains:

“[This wait type] Occurs when trying to synchronize the query processor exchange iterator.”

Hmmm… I know what those words mean individually… Any way, if you keep searching, you’ll see others explain: “CXPACKET means parallelism”. We’re getting a bit closer now. If a query is run with multiple threads, and one thread completes before the others, then SQL Server will report the wait type CXPACKET for that thread.

It’s like several chefs are making soup. If one chef’s only job is to add the cilantro, there’s going to be some waiting involved. So is the extra help worth the trouble?

SQL Server thought it could use more cooks for a particular “recipe”… Maybe it’s time to make simpler recipes. (The answer is almost always yes, especially for OLTP systems). SQL Server doesn’t use parallelism often. It only comes up with a multi-threaded query plan when it needs to, i.e. when the plan is estimated to be costly enough. It’s usually those queries with the high estimated cost that need to be fixed, the CXPACKET wait type is a symptom of that.

Those High Cost Queries

So which queries are those? Again, I like to go to the DMVs. The one I like to look at is sys.dm_exec_query_stats except that it doesn’t have a flag called uses_parallelism. For the definitive way to find parallel queries, you have to scan and parse all the query plans that are cached using sys.dm_exec_query_plan.

But that takes for ever so I don’t recommend it. But recently I found out that we don’t really need it: Parallel queries are interesting, they’re one of the only kind of queries that can use more CPU time than actual elapsed time. So if that’s true, then we just need to examine these queries:

select * from sys.dm_exec_query_stats 
where total_elapsed_time < total_worker_time
	-- total_worker_time is measured to the microsecond, but is accurate to the millisecond so:
	and ( ( total_worker_time – total_elapsed_time ) / execution_count ) * 1.0 > 1000 -- avg difference is at least 1 ms

But it might not be true, for example, a parallel query might be suffer from blocking too long. Long enough that the elapsed time might be longer than the total amount of CPU time. That’s an interesting question: How good an indication of parallelism is the ratio between elapsed time and CPU time?

(total_elapsed_time < total_cpu_time) Vs. uses_parallelism

I’m lucky. I’ve got a large set of query data that can help me.

This chart needs a little explanation. Pick a dot in the chart above. It represents a set of queries whose ratio of cpu to elapsed time are the same. Look at the value on the x-axis to see what that common ratio is. Say that it’s 1.5, this means that that dot represents a set of queries whose elapsed time is exactly 50% greater than its cpu time. Now look at its value on the Y axis. You might see 2%. That’s means that 2% of those queries use parallelism.

I expected a slope, not a waterfall and this is what that means to me. Knowing that a query’s worker time is greater than its elapsed time is a great indication of parallelism. And knowing by how much doesn’t change that.

CXPACKET? Here’s What’s Next

This query lists the top 20 worst parallel queries cached in your database.

-- TOP 20 parallel queries (by CPU)
SELECT TOP (20)
    [Total CPU] = total_worker_time,
	[Total Elapsed Time] = total_elapsed_time,
	[Execution Count] = execution_count,
    [Average CPU in microseconds] = cast(total_worker_time / (execution_count + 0.0) as money),
    [DB Name] = DB_NAME(ST.dbid),
    [Object Name] = OBJECT_NAME(ST.objectid, ST.dbid),
    [Query Text] = (SELECT [processing-instruction(q)] = CASE 
            WHEN [sql_handle] IS NULL THEN ' '
            ELSE (SUBSTRING(ST.TEXT,(QS.statement_start_offset + 2) / 2,
                (CASE 
                        WHEN QS.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX),ST.text)) * 2
                        ELSE QS.statement_end_offset
                        END - QS.statement_start_offset) / 2))
            END
			FOR XML PATH(''), type),
    [Query Plan] = qp.query_plan
FROM sys.dm_exec_query_stats QS
CROSS APPLY sys.dm_exec_sql_text([sql_handle]) ST
CROSS APPLY sys.dm_exec_query_plan ([plan_handle]) QP
WHERE total_elapsed_time < total_worker_time
	AND ( ( total_worker_time - total_elapsed_time ) / execution_count ) * 1.0 > 1000 -- average difference is more than a millisecond
ORDER BY total_worker_time DESC

Some interesting things about this query:

  • I’ve written it for SQL Server Management Studio. In SSMS, you can click on the query and the query plan to explore these queries further.
  • I’ve added a filter to weed out queries whose average cpu time is more than a millisecond. I’m sure you won’t miss those stats. Microsoft reports CPU time in microseconds, but it’s only accurate to milliseconds. So you can get false-positives if you don’t ignore the blazing fast queries. That’s why I added that filter.

November 28, 2011

SQL Fingertips

Filed under: SQLServerPedia Syndication,Technical Articles,Tongue In Cheek — Tags: , , — Michael J. Swart @ 11:36 pm

They Might Be Giants (TMBG) released an album called Apollo 18. It has 38 tracks and it runs for 42 minutes. It’s an album that will always remind me of my friend and first University roommate (Hey, what’s up Chris!). So the average track is about a minute long and this is due to a set of 21 tracks collectively called “fingertips”. Most of these fingertips are five to fifteen seconds long. Some are a bit longer. These fingertips are probably not good enough to be put on an album on their own. But collectively, they make a really cool album (especially on shuffle mode).

The List

So when I found myself with a collection of tiny tips that might not qualify as a blog post on their own, I thought of that album and a microsecond later, we’ve got SQL Fingertips!

So here they are, in no particular order.

  • Stackoverflow tip: Use the bounty feature without being afraid of the hit to your reputation. A bounty offered on a decent question usually attracts enough attention to get your question up-votes. And unless it’s a crummy question, those up-votes will earn you back the reputation points you spent.
  • T-SQL tip: You can use the following script as a template to be able to run a single query on multiple databases and have it return a single result set:
    DECLARE @temp TABLE (
    	ServerName sysname,
    	DBName sysname,
    	ProductCount int
    );
     
    insert @temp (ServerName, DBName, ProductCount)
    exec master..sp_MSforeachdb N'use [?];
    	if exists (select 1 from sys.tables where name = ''Product'' and SCHEMA_NAME(schema_id) = ''Production'')
    		exec sp_executesql N''
    			select @@servername, db_name(), Count(1) as [count]
    			from Production.Product''';
     
    select * from @temp;
  • SQL Server Management Studio tip: GO can take a parameter. If you write GO 100, your batch will be executed 100 times. Many people know this but I always here a couple “wow”s each time I demonstrate it. And here’s a couple more notes about that batch separator.
    • Don’t forget to SET ROWCOUNT OFF.
    • The batch separator doesn’t have to be GO. It’s configurable! I used to set it to ENGAGE for a while, but the joke got old and I switched it back.
  • Blog tip: If you like following blogs or people, but maybe you’re falling out of love with your RSS reader. No worries. You can get any RSS feed at all sent straight to your inbox.  There are lots of RSS-to-inbox websites out there. The one I recommend is FeedMyInbox.com. So, for example, to get my own blog delivered to your inbox, start here.
  • Data → Eye → Brain, A Data Visualization tip: Data visualizations (charts, diagrams, etc..) help us gain knowledge and understand data that maybe we couldn’t understand otherwise (See Anscombe’s Quartet). And for large and complex datasets, the best visualizations are usually custom designed and interactive. There’s a lot of power in customized visualizations: See what David McCandless and Miriah Meyers have to say.
  • Twitter DBA Tip #1: I put a call on twitter asking people to tweet the one database tip they thought more people should know. The first one comes from @BrentO: “Don’t leave your database in full recovery mode if you’re not doing t-log backups.” He’s right. Full recovery mode usually implies a backup strategy that includes transaction log backups. (And full or differential backups are not enough either).
  • Twitter DBA Tip #2: This one’s from @wnylibrarian“If using SQLExpress & no SQLAgent, try using SQLCMD in a batchfile.” I’m glad Gary tweeted this one. I’ve come up against this problem in my own work. It turns out that when downgrading editions of SQL Server, often one of the first things that you miss is SQL Agent. If you follow Gary’s advice, it’s not a hurdle, it’s only an inconvenience.
  • Twitter DBA Tip #3: Next I heard from @DaveH0ward who wants to remind people that “You need to size your data /log files and set autogrowth amounts appropriately.” I think a lot of people don’t know that the default auto-growth settings for SQL Server databases are a bad idea. Tim Ford warns us of dire consequences for not following this tip.
  • Twitter DBA Tip #4: Finally I heard from @SQLSandwhiches (aka Adam Mikolaj). He shared a pet peeve “I’ve seen so many servers that don’t have logs to auto cycle. Don’t you love opening a log file with 300,000 entries?” Adam wrote about his experience earlier this year (including a solution of course).
  • MixedTape Fingertip Tip: Say you find yourself back in the nineties making a mixed tape and you find that you have a minute left to fill on one side of the tape. They Might Be Giants to the rescue! Pick several fingertips from their Apollo 18 album and optimize every second of that mixed tape! My personal favorite is the 45 second Minimum Wage off of their other album Flood.

So how many of these SQL Fingertips did you know? Have you got a small tip? Share it in the comments. Have you got a whole bunch of small tips? Write your own SQL Fingertip post. I’d be especially grateful to read a set of Business Intelligence fingertips.

November 16, 2011

Be Quick About Reading The Data You Asked For

So last week, I mentioned that if an application takes it’s time processing data returned from a query it executed, it will not cause ADO.net to time out. This week I’m going to qualify that and on the way, I’ll explain a problem associated with “lazy” readers. I’ll talk about what lazy readers look like from SQL Server’s point of view. And I’ll talk about a potential pitfall with the C# keyword “yield“.

So say you’ve written an application that accesses the database using ADO.net’s SqlDataReader. As you process the rows returned from the server, you might take a while between one Read() call and the next. If that happens, it can lead to problems. In other words:

If…

  • Your application has issued a query using a DataReader,
  • And the amount of data returned is more than 8 kilobytes (the size of an SNI packet),
  • And your application takes its time processing rows (i.e. time between Read() calls),

Then…

  • Your database query is still active!
  • And your transaction is still open! (Remember that even if you don’t use an explicit BEGIN TRANSACTION, SQL Server’s AUTOCOMMIT mode means that each query is it’s own mini-transaction and that transaction isn’t over yet!)
  • Memory used by the database for your query is still in use
  • And resource locks that were taken to process your query are still held!

But you might get lucky…

  • For example, you might be using READ COMMITTED SNAPSHOT isolation level (which is “READ COMMITTED with more awesome” according to Kendra Little). Because even though your query is still using the same number of resources, there’s a smaller risk of seeing concurrency issues.
  • Or if you’re using the default isolation level READ COMMITTED (which you probably are) SQL Server might have given up some of the locks that have been acquired while processing your rows. In this case, the number of locks held at any time is fairly small. And you might not see any problems (at first).

Or you might not be so lucky…

  • More likely, your query will be doing something that’s not simple. It could be using a sort, or parallelism or a particularly involved join. In that case, the query will hold onto the locks it uses without releasing them.
  • Then you might have concurrency issues.
  • Concurrency problems lead to deadlocks and excessive blocking.
  • And that can lead to…

Timeouts!
All because of the processing  on the client side took too long.

But How Can You Tell?
How do you know if this is a problem in your case?

From The Server Side

Look at the wait stats that are collected with sys.dm_os_wait_stats system view. (Or even better, look at Glenn Berry’s DMV query that reports on the same thing). Now take a look at the top wait statistics. If SQL Server is waiting on a client to process data, you’ll see ASYNC_NETWORK_IO as one of the top wait statistics. One interesting note about SQL Server 2012. I noticed that you might see this wait stat instead: PREEMPTIVE_OS_WAITFORSINGLEOBJECT.

That tells you whether this has been a problem since the last server restart. But if you want to know if the database is waiting on some client right now, you can use sp_whoisactive by Adam Machanic. (You could also query sys.dm_exec_requests, or in a pinch sys.sysprocesses). Again, if you see a query waiting on things like ASYNC_NETWORK_IO or PREEMPTIVE_OS_WAITFORSINGLEOBJECT, then that query is waiting for a lazy client application to read its data.

From The Client Side

Make sure your DataReaders are quick. You can tell pretty easily most of the time. Check out these two examples:

IEnumerable<int> ReturnASum()
{
	int sum = 0;
	using (SqlConnection conn = new SqlConnection("blah blah"))
	using(SqlCommand cmd = new SqlCommand(sqlStatement, conn))
	{
		conn.Open();
		using (SqlDataReader dr = cmd.ExecuteReader())
		{
			while (dr.Read())
			{
				sum += dr.GetInt32(0);
			}
		}
	}
	return sum;
}

Pretty straightforward, but compare that to the next example. This next example might be a problem:

void SendDataIntoSpace()
{
	using (SqlConnection conn = new SqlConnection("blah blah"))
	using(SqlCommand cmd = new SqlCommand(sqlStatement, conn))
	{
		conn.Open();
		using (SqlDataReader dr = cmd.ExecuteReader())
		{
			while (dr.Read())
			{
				SendToMarsRover(dr.GetInt32(0));
			}
		}
	}
}

Sometimes it can be tricky to tell whether code will be processed quickly or not. If your data rows are yielded to the calling procedure, then the processing speed is no longer under your control. In this next example, we can’t tell by inspection how quickly the data will be processed. To do that, we have to look at all the places where this method is called.

IEnumerable<int> LetSomeoneElseReadDataAtTheirOwnPace()
{
	using (SqlConnection conn = new SqlConnection("blah blah"))
	using(SqlCommand cmd = new SqlCommand(sqlStatement, conn))
	{
		conn.Open();
		using (SqlDataReader dr = cmd.ExecuteReader())
		{
			while (dr.Read())
			{
				yield return dr.GetInt32(0);
			}
		}
	}
}

A good data access layer shouldn’t make the rest of the application care about how fast they process rows. If you’re familiar with yield already, then you have to weigh memory benefits of yield against concurrency problems and make an informed decision (good luck!)

November 6, 2011

Insights into “Timeout Expired”

Filed under: SQLServerPedia Syndication,Technical Articles — Tags: , , , , — Michael J. Swart @ 8:53 pm
Timeout Expired. The timeout period elapsed prior to the completion
of the operation or the server is not responding

Many of you readers will recognize this error message. This error message is telling you that a query took too long to execute and caused this exception to be thrown. And unless you’ve changed the CommandTimeout property on your SqlCommand to a different value, then “too long” means longer than 30 seconds.

Logically, this means that if a query takes longer than 30 seconds to execute, the client will do two things:

  1. Tell the server to cancel the SQL Batch and
  2. Throw an exception to be handled by the application.

So in a diagram, that looks something like this:

Logical sequence diagrams

But that’s not the whole picture. Specifically, it doesn’t help me answer these questions:

  • So if the query has 30 seconds, does the stopwatch start when I send the query to the server?
  • Does the  stopwatch stop when I get the first row?
  • What if the client spends time processing data instead of fetching rows from the server (like when saving data to disk). Is that “on the clock”?
  • Why do I sometimes see timeouts on queries after more than 30 seconds?
  • Why do I sometimes see queries not time out even though they take longer than 30 seconds?

Microsoft documentation clears some of this up in SqlCommand.CommandTimeout Property. They explain:

Note: This property is the cumulative time-out for all network reads during command execution or processing of the results. A time-out can still occur after the first row is returned, and does not include user processing time, only network read time.

Using that note, we know that processing time is not “on the clock”. The timer is only running down when the client is waiting for a response from the server.

Tabular Data Stream (TDS)

We can answer the rest of the questions if we experiment a bit and if we understand exactly how the server and client communicate with eachother. Tabular Data Stream (TDS) is the conversation protocol that Microsoft uses to let database clients and servers communicate. It’s a set of messages and responses. TDS lets servers authenticate clients, it lets clients send SQL to the servers and it lets servers return data to clients. So if the previous diagram is more of a logical diagram, then here’s a diagram of the same simple query as a TDS conversation:

TDS Sequence Diagram

See how the row data response has been split into three packets? In testing I see that these packets can be 8 kilobytes in size so these packets usually contain several rows.

Now here’s a possible TDS conversation with a timeout:

TDS Sequence Diagram (with timeout)

Although the official docs say the time is cumulative, in practice, I only see timeouts when 30 seconds has elapsed between packets regardless of wait times for previous packets.

So based on this, what do we know?

  • The timeout stopwatch starts ticking when the query is sent to the server. It also restarts when the client needs data from the next packet of row data.
  • The stop watch stops when the client receives a packet from the server.
  • If the client takes a long time processing data sent from the server, this will not impact timeouts (I didn’t know this fact until recently).
  • Queries can take longer than the defined timeout period without timing out in situations like these:
    1. if either the client is processing data slowly or
    2. the server is responding slowly but steadily (i.e. without gaps between packets that exceed the timeout period).
  • In short, a timeout always represents server unresponsiveness as seen by the client.

Other Resources

October 26, 2011

Where Are Your Popular Joins?

Filed under: SQLServerPedia Syndication,Technical Articles — Tags: , , , — Michael J. Swart @ 8:00 pm

So you know a lot about your databases right? You’re familiar with their schemas and tables and the queries that run on them. Personally I use sys.dm_exec_query_stats to understand what the most popular queries are.

But I recently started wondering about popular table joins.

I was wondering: “What tables in my database are most commonly joined together?” I already have a pretty good idea based on the data model. But I wanted to find out if the popular queries are in sync with my understanding. Unfortunately there’s no system view called sys.dm_exec_join_stats. The whole reason that I was curious is that I wanted to find a set of common table joins whose queries might be improved with a indexed view.

So I wrote something that gives me a bit of an idea. It’s a query that looks at cached query plans and counts nested loop joins (multiplied by execution count).

USE tempdb;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
BEGIN TRANSACTION
 
;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
select 
	cp.usecounts as numberOfJoins,
	seeknodes.query('.') as plansnippet
into #my_joins
from sys.dm_exec_cached_plans cp
cross apply sys.dm_exec_query_plan(cp.plan_handle)
	as qp
cross apply query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple/QueryPlan//SeekKeys/Prefix[@ScanType="EQ"]') 
	as seeks(seeknodes)
where seeknodes.exist('./RangeColumns/ColumnReference[1]/@Database') = 1
	and seeknodes.exist('./RangeExpressions/ScalarOperator/Identifier/ColumnReference[1]/@Database') = 1;
 
WITH XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/2004/07/showplan' as p1)
select sum(numberOfJoins) as [Number Of Joins],
	myValues.lookupTable + '(' + myValues.lookupColumn + ')' as lookupColumn,
	myValues.expressionTable + '(' + myValues.expressionColumn + ')' as expressionColumn
from #my_joins
cross apply plansnippet.nodes('./p1:Prefix/p1:RangeColumns/p1:ColumnReference[1]')
	as rangeColumns(rangeColumnNodes)
cross apply plansnippet.nodes('./p1:Prefix/p1:RangeExpressions/p1:ScalarOperator/p1:Identifier/p1:ColumnReference[1]')
	as rangeExpressions(rangeExpressionNodes)
cross apply (
	select
		rangeColumnNodes.value('@Database', 'sysname') as lookupDatabase, 
		rangeColumnNodes.value('@Schema', 'sysname') as lookupSchema,
		rangeColumnNodes.value('@Table', 'sysname') as lookupTable,
		rangeColumnNodes.value('@Column', 'sysname') as lookupColumn,
		rangeExpressionNodes.value('@Database', 'sysname') as expressionDatabase, 
		rangeExpressionNodes.value('@Schema', 'sysname') as expressionSchema,
		rangeExpressionNodes.value('@Table', 'sysname') as expressionTable,
		rangeExpressionNodes.value('@Column', 'sysname') as expressionColumn	
	) as myValues
where myValues.expressionTable != myValues.lookupTable
group by myValues.lookupTable, myValues.lookupColumn, myValues.expressionTable, myValues.expressionColumn
order by SUM(numberOfJoins) desc;
 
rollback;

Some caveats:

  • Parsing xml takes a lot of time and a lot of CPU (the subtree cost is huge and execution time is measured in seconds or minutes)
  • It’s only useful on a system that is used a lot (as opposed to a dev database).
  • It only reports statistics about queries that are found in cached plans. So the stats are only relevant since the last server restart
  • It only counts loop joins (not hash or merge joins)
  • If you want, you can adjust the query to include schema and database names in the results

I hope you find it useful. This query gives hints for further investigation into potential indexed views. It worked well for me and so I thought it was worth sharing. I ran this query against a server I know well and I was surprised at some of the results. Good luck.

October 19, 2011

Secret Santa as a Puzzle

Filed under: SQLServerPedia Syndication,Technical Articles — Tags: , , , — Michael J. Swart @ 12:00 pm

So every year, the adults in my family pick names to do a gift exchange. I guess lots of people do this and they call it Secret Santa.

In my family there are five couples that participate and we draw names from a hat. Inevitably, someone will draw their own name or the name of their spouse and everyone puts the names back in the hat. It usually takes us quite a few tries before we get it right.

This past thanksgiving (Canadian) we went through this again. It’s fun and we always give the person who picked their own name a hard time. But I got to thinking about this as a puzzle:

“How can we exchange names fairly and secretly without the do-overs?”

Well, let’s look at this a few ways:

  • As a mathie.
  • As a computer techie.
  • From a practical point of view.

As a Mathie

I graduated from University of Waterloo with a BMath with a major in CS and C&O*. But since graduating, I’ve focused only on CS (Computer Science) things. The mathie in me wonders: “What are the odds that we pass the hat around successfully?”

I thought long and hard about it but couldn’t crack that question. My textbooks were no help and there were no professors on hand to answer, so I asked http://math.stackexchange.com for help. Hoping that I’d have just as much success with that site as I do with stackoverflow.com. I was not disappointed. Here’s the question I asked:

Five couples draw names from a hat. If a person draws their own name, or the name of their spouse, all the names go back in a hat and names are re-drawn. Using a computer, I know that the probability of this happening is 1 – (440192 / 10!) or about 88%. What’s a general expression for n couples?

And a stackexchange user joriki gave a brilliant answer:

By assigning a letter to each couple, this can be reduced to the problem of finding the number an of anagrams of a word with n different letters, each occurring twice, with no letters fixed. The desired number of permutations is then 2nan, since each of the n couples can be assigned in two ways to the two instances of its letter. Wikipedia mentions this problem as a generalized derangement problem. The general formula given for a word with numbers n1, … ,nr of r different letters is

where Pk is the k-th Laguerre polynomial. In the present case, r=n and ni=2, so we only need the second Laguerre polynomial, which is P2(x)=(1/2)(x2-4x+2). The n factors of (1/2) cancel with the n factors of 2, so the probability of success is

where (2n)! counts the total number of permutations. For n=5, Wolfram|Alpha gives 440192/(10!), as you calculated.

I think it’s a beautiful answer and for the first time in a long time, I missed doing Math. So this math problem and the probabilities are now well understood. But it doesn’t save any time picking names at Thanksgiving does it? Let’s look at it from another point of view.

As a Computer Techie

Using C#
Okay, I understand this subject much better. A C# program is easy to write. In fact, I wrote a quick one and it looks something like this:

static void Main()
{
	List<string> names = new List<string>() {"Mike", "Leanne", "Dave S",
		"Cindy", "Marianne", "Dave B", "Linda", "Matt",
		"Lisa", "Dan" };
 
	List<int> picks = new List<int>() { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
 
	while (!IsValid(picks))
	{
		Shuffle(ref picks);
	}
 
	PrintResults(picks, names);
}

You can look at the full program here. It works, but it doesn’t feel like the right solution to the problem: The results aren’t secret for one thing. We’d need a trusted third party to distribute the results. Also notice the while loop, we don’t know for certain if/when the while loop will break, so this program seems a little sloppy.

Using SQL
So now here’s a subject I’m really really comfortable with. I used an approach that my friend Paul Santos explained to me when we talked about this problem. His approach is this:

  • Drawing from a hat is a permutation of 10 names. So generate all the permutations
  • Filter out the invalid permutations
  • Pick a random permutation from the list of valid ones and report that

So here’s me showing all the permutations:

with OneToTen AS
(
	SELECT n
	FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) as N(n)
)
SELECT A.n, B.n, C.n, D.n, E.n, F.n, G.n, H.n, I.n, J.n
FROM OneToTen A
JOIN OneToTen B
	ON B.n NOT IN (A.n)
JOIN OneToTen C
	ON C.n NOT IN (A.n, B.n)
JOIN OneToTen D
	ON D.n NOT IN (A.n, B.n, C.n)
JOIN OneToTen E
	ON E.n NOT IN (A.n, B.n, C.n, D.n)
JOIN OneToTen F
	ON F.n NOT IN (A.n, B.n, C.n, D.n, E.n)
JOIN OneToTen G
	ON G.n NOT IN (A.n, B.n, C.n, D.n, E.n, F.n)
JOIN OneToTen H
	ON H.n NOT IN (A.n, B.n, C.n, D.n, E.n, F.n, G.n)
JOIN OneToTen I
	ON I.n NOT IN (A.n, B.n, C.n, D.n, E.n, F.n, G.n, H.n)
JOIN OneToTen J
	ON J.n NOT IN (A.n, B.n, C.n, D.n, E.n, F.n, G.n, H.n, I.n)
OPTION (FORCE ORDER, MAXDOP 1)

To filter out the invalid permutations, I add a where clause. To report a random permutation, I order the results by newid() and select the top 1 row:

with OneToTen AS
(
	SELECT n
	FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) as N(n)
)
SELECT TOP (1) A.n, B.n, C.n, D.n, E.n, F.n, G.n, H.n, I.n, J.n
FROM OneToTen A
JOIN OneToTen B
	ON B.n NOT IN (A.n)
JOIN OneToTen C
	ON C.n NOT IN (A.n, B.n)
JOIN OneToTen D
	ON D.n NOT IN (A.n, B.n, C.n)
JOIN OneToTen E
	ON E.n NOT IN (A.n, B.n, C.n, D.n)
JOIN OneToTen F
	ON F.n NOT IN (A.n, B.n, C.n, D.n, E.n)
JOIN OneToTen G
	ON G.n NOT IN (A.n, B.n, C.n, D.n, E.n, F.n)
JOIN OneToTen H
	ON H.n NOT IN (A.n, B.n, C.n, D.n, E.n, F.n, G.n)
JOIN OneToTen I
	ON I.n NOT IN (A.n, B.n, C.n, D.n, E.n, F.n, G.n, H.n)
JOIN OneToTen J
	ON J.n NOT IN (A.n, B.n, C.n, D.n, E.n, F.n, G.n, H.n, I.n)
WHERE A.n NOT IN (1, 2)
	AND B.n NOT IN (1, 2)
	AND C.n NOT IN (3, 4)
	AND D.n NOT IN (3, 4)
	AND E.n NOT IN (5, 6)
	AND F.n NOT IN (5, 6)
	AND G.n NOT IN (7, 8)
	AND H.n NOT IN (7, 8)
	AND I.n NOT IN (9, 10)
	AND J.n NOT IN (9, 10)
ORDER BY NEWID()
OPTION (FORCE ORDER, MAXDOP 1)

So I’m really happy with this. But only as a solution to a puzzle. The only thing it has going for it is that it’s guaranteed to halt. It doesn’t take less cpu than the C# program and we’re further away from telling relatives who they’re meant to buy presents for!

A Practical Point Of View

In real life, using authentication schemes and trusted third parties is a great way to bring any holiday party to a halt Mr. Buzz Killington (No more stuffing for you). So what kind of practical things can you do to make things easier or avoid do-overs? Here’s a couple ideas:

  • There are many many websites that are built just for this problem: SecretSanta.com, Elfster, and DrawNames.com. (I’m not making those up!) I think there’s even a facebook or iPhone app for that. I haven’t looked too closely at these so I can’t vouch for them.
  • You could write people’s names on an old deck of cards. Shuffle those cards and deal them out. You don’t avoid any problems this way – I mean, you won’t avoid getting your own name – but shuffling and dealing is quicker than passing around a hat.
  • If you pick your own name, replace it and pick again. The exchange won’t be 100% secret, but most people don’t care. This is what most Secret Santas are like.
  • Status quo. Do nothing. This is probably what my family will stick with. We’ll continue to pass around the hat and give people a hard time for picking their own names.

Have you got any other ideas? If you do a gift exchange with your family, what does your family do?

*~ CS and C&O are short for “Computer Science and Combinatorics and Optimization”, my first year at University was spent learning to pronounce that correctly.

October 13, 2011

Power View Demo Good, Not Great

So I caught the keynote speech online on the first day of the PASS Summit conference (2011). There were a couple announcements but no big ones. I remember this time last year, many were expecting a release this week. But the big release isn’t until next year. We can now start calling SQL Server codename Denali by its real name, SQL Server 2012. No surprises there. We also learned that the project codename Crescent will be called Power View.

the Power View Demo


You can actually watch the whole keynote here. There’s a demo of Power View near the end. This demo was presented by (new) Microsoft Technical Fellow Amir Netz. He did a good job of presenting what Power View can do. Power View is – among other things – a data visualization tool. Amir gave an updated version of last year’s demo where he explored movie and box-office data.
In Amir’s demo, we learned that:

  • Eddie Murphy voices Donkey in Shrek
  • Alan Rickman plays Professor Snape
  • John Wayne acted a lot.

Basically movie trivia. Amir demonstrated that it’s easy to dig into the data to any depth we want. After the talk, Ted Kummert thanked Amir and called the demo fun. It was a good demo, but it wasn’t a great demo. A lot of it was interesting, Amir and his team are clearly passionate about the product, but the best reactions on twitter could be summed up as “neat!” rather than “wow!”

Compare that to Hans Rosling

Hans Rosling gave a TED talk in 2006 called “Debunking third-world myths with the best stats you’ve ever seen” Several people on Twitter compared this TED talk to today’s keynote.

The data technology that Hans uses is something he developed called Gapminder. On the surface, it’s the same as Power View. As Hans speaks, we learn:

  • By most metrics, countries we think of as third-world are equivalent to 1960s U.S.A.
  • The term “developing country” carries a meaning with most people that’s not useful today.
  • There’s tremendous variation in regions (like sub-Saharan Africa, Arab states, etc…). A distinction that we rarely make.

Now that’s a great talk. Hans is a statistician first (not a B.I. expert). He demonstrates the power of data to illuminate topics and solve problems.

The Difference

So what takes Hans Rosling from good to great? What’s the difference between today’s keynote demo and the TED talk?

  • Both presenters are passionate and enthusiastic. There’s no faulting them there.
  • Only during the TED talk did I imagine myself solving problems. (Like using data to improve software quality… Lean-manufacturing style!)
  • There’s a difference in the importance of the data used in the examples…

A Parallel in the Teaching of Poetry

I’m reminded of something from the world of poetry actually. A poet, James Fenton, wrote a textbook called: An Introduction To English Poetry. In it, he gives advice to poetry teachers about the best examples of poetry to give poets-in-training. For illustration, Fenton talks about a particular form of poetry, the villanelle. What’s a villanelle? It’s a 19 line-poem with a very particular form including many repetitive lines. Here are two examples (from the public domain):

Do Not Go Gentle into that Dark Night
by Dylan Thomas

Do not go gentle into that good night,
Old age should burn and rage at close of day;
Rage, rage against the dying of the light.

Though wise men at their end know dark is right,
Because their words had forked no lightning they
Do not go gentle into that good night.

Good men, the last wave by, crying how bright
Their frail deeds might have danced in a green bay,
Rage, rage against the dying of the light.

Wild men who caught and sang the sun in flight,
And learn, too late, they grieved it on its way,
Do not go gentle into that good night.

Grave men, near death, who see with blinding sight
Blind eyes could blaze like meteors and be gay,
Rage, rage against the dying of the light.

And you, my father, there on the sad height,
Curse, bless me now with your fierce tears, I pray.
Do not go gentle into that good night.
Rage, rage against the dying of the light.

A Dainty Thing’s the Villanelle
by William Ernest Henley

A Dainty thing’s the Villanelle,
Sly, musical, a jewel in rhyme,
It serves its purpose passing well.

A double-clappered silver bell
That must be made to clink in chime,
A dainty thing’s the Villanelle;

And if you wish to flute a spell,
Or ask a meeting ‘neath the lime,
It serves its purpose passing well.

You must not ask of it the swell
Of organs grandiose and sublime–
A dainty thing’s the Villanelle;

And, filled with sweetness, as a shell
Is filled with sound, and launched in time,
It serves its purpose passing well.

Still fair to see and good to smell
As in the quaintness of its prime,
A dainty thing’s the Villanelle,
It serves its purpose passing well.

You don’t have to be an English major to tell which one of those poems is inspiring, and which one isn’t. Fenton argues that “…if you start from Thomas’s villanelle as a model, you will be setting your sights much higher than if you start from Henley.” And he’s quite right, people take cues from examples like these.

Thinking About Examples

Back to B.I. demos. Not every example in every talk in every conference has to change the world. But if you’re introducing a powerful new tool like Power View in a keynote, maybe “nifty” or “neat” isn’t the impact you’re aiming for. I wonder if there’s a missed opportunity here for Microsoft to inspire. I wonder what Amir Netz’s demo with Hans Rosling’s data would have looked like? In that hypothetical scenario, I bet Power View users who saw that demo would then be more likely to start projects that make people say “Wow!” instead of simply “Neat!”

September 8, 2011

Mythbusting: Concurrent Update/Insert Solutions

Filed under: SQLServerPedia Syndication,Technical Articles — Tags: , , , — Michael J. Swart @ 8:00 am

I’ve recently come across a large number of methods that people use to avoid concurrency problems when programming an Insert/Update query. Rather than argue with some of them. I set out to test out how valid each method by using experiment! Here’s the myth:

Does Method X really perform Insert/Update queries concurrently, accurately and without errors?

… where X is one of a variety of approaches I’m going to take. And just like the Discovery Channel show Mythbusters, I’m going to call each method/procedure/myth either busted, confirmed or plausible based on the effectiveness of each method.
Jamie Hyneman and Adam Savage of Mythbusters

Actually, feel free to follow along at home (on development servers). Nothing here is really dangerous.

Here’s what my stored procedure should do. The stored procedure should look in a particular table for a given id. If a record is found, a counter field on that record is incremented by one. If the given id is not found, a new record is inserted into that table. This is the common UPSERT scenario.

I want to be able to do this in a busy environment and so the stored procedure has to co-operate and play nicely with other concurrent processes.

The Setup

The set up has two parts. The first part is the table and stored procedure. The stored procedure will change for each method, but here’s the setup script that creates the test database and test table:

/* Setup */
if DB_ID('UpsertTestDatabase') IS NULL
    create database UpsertTestDatabase
go
use UpsertTestDatabase
go
 
if OBJECT_ID('mytable') IS NOT NULL
    drop table mytable;
go
 
create table mytable
(
    id int,
    name nchar(100),
    counter int,
    primary key (id),
    unique (name)
);
go

The second thing I need for my setup is an application that can call a stored procedure many times concurrently and asynchronously. That’s not too hard. Here’s the c-sharp program I came up with: Program.cs. It compiles into a command line program that calls a stored procedure 10,000 times asynchronously as often as it can. It calls the stored procedure 10 times with a single number before moving onto the next number This should generate 1 insert and 9 updates for each record.

Method 1: Vanilla

The straight-forward control stored procedure, it simply looks like this:

/* First shot */
if OBJECT_ID('s_incrementMytable') IS NOT NULL
drop procedure s_incrementMytable;
go
 
create procedure s_incrementMytable(@id int)
as
declare @name nchar(100) = cast(@id as nchar(100))
 
begin transaction
if exists (select 1 from mytable where id = @id)
	update mytable set counter = counter + 1 where id = @id;
else
	insert mytable (id, name, counter) values (@id, @name, 1);
commit
go

It works fine in isolation, but when run concurrently using the application, I get primary key violations on 0.42 percent of all stored procedure calls! Not too bad. The good news is that this was my control scenario and now I’m confident that there is a valid concurrency concern here. And that my test application is working well.

Method 2: Decreased Isolation Level

Just use NOLOCKS on everything and all your concurrency problems are solved right?

if OBJECT_ID('s_incrementMytable') IS NOT NULL
	drop procedure s_incrementMytable;
go
 
create procedure s_incrementMytable(@id int)
as
declare @name nchar(100) = cast(@id as nchar(100))
 
set transaction isolation level read uncommitted
begin transaction
if exists (select 1 from mytable where id = @id)
	update mytable set counter = counter + 1 where id = @id;
else 
	insert mytable (id, name, counter) values (@id, @name, 1);
commit
go

I find out that there are still errors that are no different than method 1. These primary key errors occur on 0.37 percent of my stored procedure calls. NOLOCK = NOHELP in this case.

Method 3: Increased Isolation Level

So let’s try to increase the isolation level. The hope is that the more pessimistic the database is, the more locks will be taken and held as they’re needed preventing these primary key violations:

if OBJECT_ID('s_incrementMytable') IS NOT NULL
	drop procedure s_incrementMytable;
go
 
create procedure s_incrementMytable(@id int)
as
declare @name nchar(100) = cast(@id as nchar(100))
 
set transaction isolation level serializable
begin transaction
if exists (select 1 from mytable where id = @id)
	update mytable set counter = counter + 1 where id = @id;
else 
	insert mytable (id, name, counter) values (@id, @name, 1);
commit
go

Bad news! Something went wrong and while there are no primary key violations, 82% of my queries failed as a deadlock victim. A bit of digging tells me that several processes have gained shared locks and are also trying to convert them into exclusive locks… Deadlocks everywhere

Method 4: Increased Isolation + Fine Tuning Locks

Hmm… What does stackoverflow have to say about high concurrency upsert? A bit of research on Stackoverflow.com lead me to an excellent post by Sam Saffron called Insert or Update Pattern For SQL Server. He describes what I’m trying to do perfectly. The idea is that when the stored procedure first reads from the table, it should grab and hold a lock that is incompatible with other locks of the same type for the duration of the transaction. That way, no shared locks need to be converted to exclusive locks. So I do that with a locking hint.:

if OBJECT_ID('s_incrementMytable') IS NOT NULL
	drop procedure s_incrementMytable;
go
 
create procedure s_incrementMytable(@id int)
as
declare @name nchar(100) = cast(@id as nchar(100))
 
set transaction isolation level serializable
begin transaction
if exists (select 1 from mytable with (updlock) where id = @id)
	update mytable set counter = counter + 1 where id = @id;
else 
	insert mytable (id, name, counter) values (@id, @name, 1);
commit
go

Zero errors! Excellent! The world makes sense. It always pays to understand a thing and develop a plan rather than trial and error.

Method 5: Read Committed Snapshot Isolation

I heard somewhere recently that I could turn on Read Committed Snapshot Isolation. It’s an isolation level where readers don’t block writers and writers don’t block readers by using row versioning (I like to think of it as Oracle mode). I heard I could turn this setting on quickly and most concurrency problems would go away. Well it’s worth a shot:

ALTER DATABASE UpsertTestDatabase
SET ALLOW_SNAPSHOT_ISOLATION ON
 
ALTER DATABASE UpsertTestDatabase
SET READ_COMMITTED_SNAPSHOT ON
go
 
if OBJECT_ID('s_incrementMytable') IS NOT NULL
	drop procedure s_incrementMytable;
go
 
create procedure s_incrementMytable(@id int)
as
declare @name nchar(100) = cast(@id as nchar(100))
 
begin transaction
if exists (select 1 from mytable where id = @id)
	update mytable set counter = counter + 1 where id = @id;
else 
	insert mytable (id, name, counter) values (@id, @name, 1);
commit
go

Ouch! Primary key violations all over the place. Even more than the control! 23% of the stored procedure calls failed with a primary key violation. And by the way, if I try this with Snapshot Isolation, I not only get PK violations, I get errors reporting “Snapshot isolation transaction aborted due to update conflict”. However, combining method 4 with snapshot isolation once again gives no errors. Kudos to Method 4!

Other Methods

Here are some other things to try (but I haven’t):

  • Avoiding concurrency issues by using Service Broker. If it’s feasible, just queue up these messages and apply them one at a time. No fuss.
  • Rewrite the query above as: UPDATE…; IF @@ROWCOUNT = 0 INSERT…; You could try this, but you’ll find this is almost identical with Method 1.

So How Are We Going To Call This One?

So here are the results we have:

Concurrency Method Status Notes
Method 1: Vanilla Busted This was our control. The status quo is not going to cut it here.
Method 2: Decreased Isolation Level Busted NOLOCK = NOHELP in this case
Method 3: Increased Isolation Level Busted Deadlocks! Strict locking that is used with the SERIALIZABLE isolation level doesn’t seem to be enough!
Method 4: Increased Isolation + Fine Tuning Locks Confirmed By holding the proper lock for the duration of the transaction, I've got the holy grail. (Yay for StackOverflow, Sam Saffron and others).
Method 5: Read Committed Snapshot Isolation Busted While RCSI helps with most concurrency issues, it doesn't help in this
particular case.
Other Methods: Service Broker Plausible Avoid the issue and apply changes using a queue. While this would work, the architectural changes are pretty daunting
Update! (Sept. 9, 2011) Other Methods: MERGE statement Busted See comments section
Update! (Feb. 23, 2012) Other Methods: MERGE statement + Increased Isolation Confirmed With a huge number of comments suggesting this method (my preferred method), I thought I’d include it here to avoid any further confusion


Don’t try to apply these conclusions blindly to other situations. In another set of circumstances who knows what the results would be. But test for yourself. Like a good friend of mine likes to say: “Just try it!”

So that’s the show. If you have a myth you want busted. Drop me a line, or leave me a message in the comments section. Cheers ’til next time.

« Newer PostsOlder Posts »

Powered by WordPress