Michael J. Swart

November 28, 2011

SQL Fingertips

They Might Be Giants (TMBG) released an album called Apollo 18. It has 38 tracks and it runs for 42 minutes. It’s an album that will always remind me of my friend and first University roommate (Hey, what’s up Chris!). So the average track is about a minute long and this is due to a set of 21 tracks collectively called “fingertips”. Most of these fingertips are five to fifteen seconds long. Some are a bit longer. These fingertips are probably not good enough to be put on an album on their own. But collectively, they make a really cool album (especially on shuffle mode).

The List

So when I found myself with a collection of tiny tips that might not qualify as a blog post on their own, I thought of that album and a microsecond later, we’ve got SQL Fingertips!

So here they are, in no particular order.

  • Stackoverflow tip: Use the bounty feature without being afraid of the hit to your reputation. A bounty offered on a decent question usually attracts enough attention to get your question up-votes. And unless it’s a crummy question, those up-votes will earn you back the reputation points you spent.
  • T-SQL tip: You can use the following script as a template to be able to run a single query on multiple databases and have it return a single result set:
    DECLARE @temp TABLE (
    	ServerName sysname,
    	DBName sysname,
    	[Row Count] int
    );
     
    declare @schemaname sysname = 'Production';
    declare @tablename sysname = 'Product';
    declare @sql nvarchar(max) = N'use [?];
    	if exists (select 1 from sys.tables where name = ''' + @tablename + N''' and SCHEMA_NAME(schema_id) = ''' + @schemaname + N''')
    		exec sp_executesql N''
    			select @@servername, db_name(), Count(1) as [count]
    			from ' + QUOTENAME(@tablename) + '''';
     
    insert @temp (ServerName, DBName, [Row Count])
    exec master..sp_MSforeachdb @sql;
    select * from @temp;
  • SQL Server Management Studio tip: GO can take a parameter. If you write GO 100, your batch will be executed 100 times. Many people know this but I always here a couple “wow”s each time I demonstrate it. And here’s a couple more notes about that batch separator.
    • Don’t forget to SET ROWCOUNT OFF.
    • The batch separator doesn’t have to be GO. It’s configurable! I used to set it to ENGAGE for a while, but the joke got old and I switched it back.
  • Blog tip: If you like following blogs or people, but maybe you’re falling out of love with your RSS reader. No worries. You can get any RSS feed at all sent straight to your inbox.  There are lots of RSS-to-inbox websites out there. The one I recommend is FeedMyInbox.com. So, for example, to get my own blog delivered to your inbox, start here.
  • Data → Eye → Brain, A Data Visualization tip: Data visualizations (charts, diagrams, etc..) help us gain knowledge and understand data that maybe we couldn’t understand otherwise (See Anscombe’s Quartet). And for large and complex datasets, the best visualizations are usually custom designed and interactive. There’s a lot of power in customized visualizations: See what David McCandless and Miriah Meyers have to say.
  • Twitter DBA Tip #1: I put a call on twitter asking people to tweet the one database tip they thought more people should know. The first one comes from @BrentO: “Don’t leave your database in full recovery mode if you’re not doing t-log backups.” He’s right. Full recovery mode usually implies a backup strategy that includes transaction log backups. (And full or differential backups are not enough either).
  • Twitter DBA Tip #2: This one’s from @wnylibrarian“If using SQLExpress & no SQLAgent, try using SQLCMD in a batchfile.” I’m glad Gary tweeted this one. I’ve come up against this problem in my own work. It turns out that when downgrading editions of SQL Server, often one of the first things that you miss is SQL Agent. If you follow Gary’s advice, it’s not a hurdle, it’s only an inconvenience.
  • Twitter DBA Tip #3: Next I heard from @DaveH0ward who wants to remind people that “You need to size your data /log files and set autogrowth amounts appropriately.” I think a lot of people don’t know that the default auto-growth settings for SQL Server databases are a bad idea. Tim Ford warns us of dire consequences for not following this tip.
  • Twitter DBA Tip #4: Finally I heard from @SQLSandwhiches (aka Adam Mikolaj). He shared a pet peeve “I’ve seen so many servers that don’t have logs to auto cycle. Don’t you love opening a log file with 300,000 entries?” Adam wrote about his experience earlier this year (including a solution of course).
  • MixedTape Fingertip Tip: Say you find yourself back in the nineties making a mixed tape and you find that you have a minute left to fill on one side of the tape. They Might Be Giants to the rescue! Pick several fingertips from their Apollo 18 album and optimize every second of that mixed tape! My personal favorite is the 45 second Minimum Wage off of their other album Flood.

So how many of these SQL Fingertips did you know? Have you got a small tip? Share it in the comments. Have you got a whole bunch of small tips? Write your own SQL Fingertip post. I’d be especially grateful to read a set of Business Intelligence fingertips.

November 16, 2011

Be Quick About Reading The Data You Asked For

So last week, I mentioned that if an application takes it’s time processing data returned from a query it executed, it will not cause ADO.net to time out. This week I’m going to qualify that and on the way, I’ll explain a problem associated with “lazy” readers. I’ll talk about what lazy readers look like from SQL Server’s point of view. And I’ll talk about a potential pitfall with the C# keyword “yield“.

So say you’ve written an application that accesses the database using ADO.net’s SqlDataReader. As you process the rows returned from the server, you might take a while between one Read() call and the next. If that happens, it can lead to problems. In other words:

If…

  • Your application has issued a query using a DataReader,
  • And the amount of data returned is more than 8 kilobytes (the size of an SNI packet),
  • And your application takes its time processing rows (i.e. time between Read() calls),

Then…

  • Your database query is still active!
  • And your transaction is still open! (Remember that even if you don’t use an explicit BEGIN TRANSACTION, SQL Server’s AUTOCOMMIT mode means that each query is it’s own mini-transaction and that transaction isn’t over yet!)
  • Memory used by the database for your query is still in use
  • And resource locks that were taken to process your query are still held!

But you might get lucky…

  • For example, you might be using READ COMMITTED SNAPSHOT isolation level (which is “READ COMMITTED with more awesome” according to Kendra Little). Because even though your query is still using the same number of resources, there’s a smaller risk of seeing concurrency issues.
  • Or if you’re using the default isolation level READ COMMITTED (which you probably are) SQL Server might have given up some of the locks that have been acquired while processing your rows. In this case, the number of locks held at any time is fairly small. And you might not see any problems (at first).

Or you might not be so lucky…

  • More likely, your query will be doing something that’s not simple. It could be using a sort, or parallelism or a particularly involved join. In that case, the query will hold onto the locks it uses without releasing them.
  • Then you might have concurrency issues.
  • Concurrency problems lead to deadlocks and excessive blocking.
  • And that can lead to…

Timeouts!
All because of the processing  on the client side took too long.

But How Can You Tell?
How do you know if this is a problem in your case?

From The Server Side

Look at the wait stats that are collected with sys.dm_os_wait_stats system view. (Or even better, look at Glenn Berry’s DMV query that reports on the same thing). Now take a look at the top wait statistics. If SQL Server is waiting on a client to process data, you’ll see ASYNC_NETWORK_IO as one of the top wait statistics. One interesting note about SQL Server 2012. I noticed that you might see this wait stat instead: PREEMPTIVE_OS_WAITFORSINGLEOBJECT.

That tells you whether this has been a problem since the last server restart. But if you want to know if the database is waiting on some client right now, you can use sp_whoisactive by Adam Machanic. (You could also query sys.dm_exec_requests, or in a pinch sys.sysprocesses). Again, if you see a query waiting on things like ASYNC_NETWORK_IO or PREEMPTIVE_OS_WAITFORSINGLEOBJECT, then that query is waiting for a lazy client application to read its data.

From The Client Side

Make sure your DataReaders are quick. You can tell pretty easily most of the time. Check out these two examples:

IEnumerable<int> ReturnASum()
{
	int sum = 0;
	using (SqlConnection conn = new SqlConnection("blah blah"))
	using(SqlCommand cmd = new SqlCommand(sqlStatement, conn))
	{
		conn.Open();
		using (SqlDataReader dr = cmd.ExecuteReader())
		{
			while (dr.Read())
			{
				sum += dr.GetInt32(0);
			}
		}
	}
	return sum;
}

Pretty straightforward, but compare that to the next example. This next example might be a problem:

void SendDataIntoSpace()
{
	using (SqlConnection conn = new SqlConnection("blah blah"))
	using(SqlCommand cmd = new SqlCommand(sqlStatement, conn))
	{
		conn.Open();
		using (SqlDataReader dr = cmd.ExecuteReader())
		{
			while (dr.Read())
			{
				SendToMarsRover(dr.GetInt32(0));
			}
		}
	}
}

Sometimes it can be tricky to tell whether code will be processed quickly or not. If your data rows are yielded to the calling procedure, then the processing speed is no longer under your control. In this next example, we can’t tell by inspection how quickly the data will be processed. To do that, we have to look at all the places where this method is called.

IEnumerable<int> LetSomeoneElseReadDataAtTheirOwnPace()
{
	using (SqlConnection conn = new SqlConnection("blah blah"))
	using(SqlCommand cmd = new SqlCommand(sqlStatement, conn))
	{
		conn.Open();
		using (SqlDataReader dr = cmd.ExecuteReader())
		{
			while (dr.Read())
			{
				yield return dr.GetInt32(0);
			}
		}
	}
}

A good data access layer shouldn’t make the rest of the application care about how fast they process rows. If you’re familiar with yield already, then you have to weigh memory benefits of yield against concurrency problems and make an informed decision (good luck!)

November 7, 2011

T-SQL Tuesday #024: Procedures “By The Numbers”

Filed under: SQLServerPedia Syndication,Tongue In Cheek — Tags: , , , — Michael J. Swart @ 10:12 pm

T-SQL Tuesday Logo
I’m super excited about this month’s T-SQL Tuesday, for a couple reasons. First of all, this month’s T-SQL Tuesday is hosted by Brad Schulz. Brad is a guy that I got to interview a couple years ago. Second, is that the topic is Sprox ‘n’ Funx (Stored Procedures and Functions). Which I like because the topic is nice and technical.

I started out thinking that I could treat the question “Are stored procedures a good thing?” But that issue is a contentious one and it would take a book to treat that issue properly (So maybe later)

Today, I’m giving some attention to some interesting numbers related to stored procedures (the format is taken from Harper’s Index a popular feature in Harper’s magazine)

Stored Procedures By The Numbers

250: The largest size in megabytes allowed for the source text of a stored procedure.

2,100: The largest number of parameters allowed for a stored procedure.

32: The maximum number of nested levels allowed for stored procedures.

1,598: The number of mentions in sys.messages that contain the word procedure.

858: The number of mentions for function.

389,000: The number of Google results for should use stored procedures.

29,500: The number of Google results reported for should not used stored procedures.

1 in 866: The odds that a random word taken from MichaelJSwart.com is procedure.

1 in 879: The odds that a random word taken from the 2011 State of the Union address is budget.

18: The number of participating blogs in Brad Schulz’ T-SQL Tuesday on the subject of stored procedures and functions.

1875: Earliest year I could find a published mention for the word sproc*

* ~  Lives of the Irish Saints (1875) mentions a St. Sproc daughter of Colum. I am not making that up!

November 6, 2011

Insights into “Timeout Expired”

Filed under: SQLServerPedia Syndication,Technical Articles — Tags: , , , , — Michael J. Swart @ 8:53 pm
Timeout Expired. The timeout period elapsed prior to the completion 
of the operation or the server is not responding

Many of you readers will recognize this error message. This error message is telling you that a query took too long to execute and caused this exception to be thrown. And unless you’ve changed the CommandTimeout property on your SqlCommand to a different value, then “too long” means longer than 30 seconds.

Logically, this means that if a query takes longer than 30 seconds to execute, the client will do two things:

  1. Tell the server to cancel the SQL Batch and
  2. Throw an exception to be handled by the application.

So in a diagram, that looks something like this:

Logical sequence diagrams

But that’s not the whole picture. Specifically, it doesn’t help me answer these questions:

  • So if the query has 30 seconds, does the stopwatch start when I send the query to the server?
  • Does the  stopwatch stop when I get the first row?
  • What if the client spends time processing data instead of fetching rows from the server (like when saving data to disk). Is that “on the clock”?
  • Why do I sometimes see timeouts on queries after more than 30 seconds?
  • Why do I sometimes see queries not time out even though they take longer than 30 seconds?

Microsoft documentation clears some of this up in SqlCommand.CommandTimeout Property. They explain:

Note: This property is the cumulative time-out for all network reads during command execution or processing of the results. A time-out can still occur after the first row is returned, and does not include user processing time, only network read time.

Using that note, we know that processing time is not “on the clock”. The timer is only running down when the client is waiting for a response from the server.

Tabular Data Stream (TDS)

We can answer the rest of the questions if we experiment a bit and if we understand exactly how the server and client communicate with eachother. Tabular Data Stream (TDS) is the conversation protocol that Microsoft uses to let database clients and servers communicate. It’s a set of messages and responses. TDS lets servers authenticate clients, it lets clients send SQL to the servers and it lets servers return data to clients. So if the previous diagram is more of a logical diagram, then here’s a diagram of the same simple query as a TDS conversation:

TDS Sequence Diagram

See how the row data response has been split into three packets? In testing I see that these packets can be 8 kilobytes in size so these packets usually contain several rows.

Now here’s a possible TDS conversation with a timeout:

TDS Sequence Diagram (with timeout)

Although the official docs say the time is cumulative, in practice, I only see timeouts when 30 seconds has elapsed between packets regardless of wait times for previous packets.

So based on this, what do we know?

  • The timeout stopwatch starts ticking when the query is sent to the server. It also restarts when the client needs data from the next packet of row data.
  • The stop watch stops when the client receives a packet from the server.
  • If the client takes a long time processing data sent from the server, this will not impact timeouts (I didn’t know this fact until recently).
  • Queries can take longer than the defined timeout period without timing out in situations like these:
    1. if either the client is processing data slowly or
    2. the server is responding slowly but steadily (i.e. without gaps between packets that exceed the timeout period).
  • In short, a timeout always represents server unresponsiveness as seen by the client.

Other Resources

Powered by WordPress