Michael J. Swart

April 27, 2016

You Can’t Force Query Plans If They Use TVPs With PKs

Filed under: Miscellaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 12:24 pm

Have you ever played “Fortunately/Unfortunately”? It’s a game where players alternately give good news and bad news. It goes something like this:

Databases such as SQL Server make it easy to retrieve sets of data.
Unfortunately, it’s kind of awkward to send sets of data to SQL Server.
Fortunately, table-valued parameters (TVPs) make this easier.
Unfortunately, queries that use TVPs often suffer from non-optimal query plans.
Fortunately, creating primary key or unique key constraints gives us a way to index table types.
Unfortunately, those constraints prevent any kind of plan forcing.
Fortunately, SQL Server 2014 lets us create named indexes for table types which lets us force plans if we need to.


Let’s break this down:

Sending Sets of Data to SQL Server is Awkward

It always has been. Originally, developers were forced to send a CSV string to SQL Server and write a do-it-yourself function to split the string into a set of values.

  • In 2005, Microsoft introduced XML and CLR, which let developers shred or split strings in new ways.
  • In 2008, Microsoft introduced table-valued parameters.
  • In 2014, they introduced In-Memory TVPs.
  • In 2016, there’s the new STRING_SPLIT() function.

So there are more options now than there ever have been, and they each have their own issues.
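For example, with the new function, splitting a CSV string finally becomes a one-liner (a minimal sketch; STRING_SPLIT() needs SQL Server 2016 and database compatibility level 130):

-- split a comma-separated string into a set of integer values
select cast(value as int) as id
from STRING_SPLIT(N'10,20,30', N',');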

Aaron Bertrand explores some of those performance issues in STRING_SPLIT() in SQL Server 2016. It’s a specific use-case where he focuses on duration. In our case, we focus on aggregated system load like worker time or reads so we don’t necessarily value parallel query plans. But I love his methods. He gives us tools that let us evaluate our own situation based on our own criteria.

I’m going to focus on TVPs, which are the most natural method of sending sets of data to SQL Server, at least from a syntax point of view.
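For the unfamiliar, the pattern looks like this (a minimal sketch; the type, the procedure, and the dbo.THINGS table are made up for illustration):

-- define a table type, then accept a set of ids as a single parameter
create type dbo.IdSet as table ( id int primary key );
go
create procedure dbo.s_GetThings
    @Ids dbo.IdSet readonly -- TVP parameters must be declared READONLY
as
    select t.*
    from dbo.THINGS t -- hypothetical table
    join @Ids i
        on i.id = t.id;
go
-- the client (or T-SQL) fills the table parameter and makes one call
declare @ids dbo.IdSet;
insert @ids (id) values (1), (2), (3);
exec dbo.s_GetThings @Ids = @ids;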

Indexes on Table Types

Table-valued parameters are implemented using table types. Before SQL Server 2014, the only way to index a table type was to define a primary key or a unique key on it like this:

create type dbo.TypeWithPK 
    as table ( id int primary key );

The syntax for CREATE TYPE prevents us from naming our primary key, and this turns out to be important. Every time I define and use a table variable, SQL Server dynamically generates a name for the primary key. So when I look at the plan for

declare @ids dbo.TypeWithPK;
select * from @ids

I see that it has a primary key with a dynamically generated name, in this case [@ids].[PK__#A079849__3213E83FDB6D7A43].

As I’ll show later, this dynamically generated name prevents any kind of query plan forcing. But as of SQL Server 2014, we can include indexes in our table type definitions. More importantly, we can name those indexes:

create type dbo.TypeWithIndex 
    as table ( id int index IX_TypeWithIndex );
go
declare @ids dbo.TypeWithIndex;
select * from @ids;

This time the table variable has an index named [@ids].[IX_TypeWithIndex], which is what we expect.

Plan Forcing is Not Allowed For TVPs with PKs

Where does plan forcing fit in your tool belt? I’ve never used plan forcing as a permanent solution to a problem. But when I see a query that often suffers from suboptimal query plan choices, I look to plan guides to give me some stability while I work on fixing and deploying a permanent solution.

Plan forcing in SQL Server involves specifying a plan for a particular query. But the primary key name for a table variable is different every time, so the specified query plan is never going to match. In other words, SQL Server is never going to use your query plan, because your plan includes index [@ids].[PK__#A079849__3213E83FDB6D7A43], but the query it’s compiling has a differently named index like [@ids].[PK__#AA02EED__3213E83FAF123E51].

If you try, this is what that failure looks like:

USE PLAN
If you try to use the USE PLAN query hint, you’ll get error 8712:

Msg 8712, Level 16, State 0, Line 15
Index '@ids.PK__#B305046__3213E83F57A32F24', specified in the USE PLAN hint, does not exist. Specify an existing index, or create an index with the specified name.
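For reference, the attempt looks something like this (a sketch; the plan XML is a placeholder for a full plan captured earlier):

declare @ids dbo.TypeWithPK;
select * from @ids
option (USE PLAN N'<ShowPlanXML ...> ... </ShowPlanXML>'); -- placeholder XML
-- fails with error 8712 because the index name baked into the plan no longer exists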

Plan Guides
If you try to force the plan by creating a plan guide, you’ll also see message 8712:

select pg.name as planGuideName, v.*
from sys.plan_guides pg
cross apply sys.fn_validate_plan_guide(pg.plan_guide_id) v
-- Index '@ids.PK__#BA711C0__3213E83F44A3F2C8', specified in the USE PLAN hint, does not exist. Specify an existing index, or create an index with the specified name.
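For reference, such a plan guide would have been created with something like this (a sketch; the procedure name, statement, and plan XML are placeholders):

exec sp_create_plan_guide
    @name = N'ForceTVPPlan',
    @stmt = N'select * from @ids',
    @type = N'OBJECT',
    @module_or_batch = N'dbo.s_MyProcedure', -- hypothetical procedure
    @params = NULL,
    @hints = N'OPTION (USE PLAN N''<ShowPlanXML ...> ... </ShowPlanXML>'')'; -- placeholder XML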

Query Store
And if you try to force a plan using SQL Server 2016’s Query Store, you’ll see this:

select plan_id, last_force_failure_reason_desc
from sys.query_store_plan
where is_forced_plan = 1
-- last_force_failure_reason_desc = 'NO_INDEX'
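For context, the forcing attempt itself is a single call (the ids are placeholders you’d look up in sys.query_store_plan):

-- force a particular plan for a particular query
exec sp_query_store_force_plan @query_id = 42, @plan_id = 7;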

Summary

When defining table variables, avoid primary key or unique key constraints. Opt instead for named indexes if you’re using SQL Server 2014 or later. Otherwise, be aware that plan forcing is limited to queries that don’t use these table variables.

April 20, 2016

Are You Programming In The Database?

Typically, T-SQL is not the best platform for programming (understatement). If you have many procedures that call other procedures, that’s a signal that you might be programming in the database.

Find out using this query:

select 
    OBJECT_SCHEMA_NAME(p.object_id) as schemaName, 
    OBJECT_NAME(p.object_id) as procedureName,
    count(*) as [calls to other procedures]	
from sys.procedures p
cross apply sys.dm_sql_referenced_entities(schema_name(p.schema_id) + '.' + p.name, 'OBJECT') re
where re.referenced_entity_name in (select name from sys.procedures)
group by p.object_id
order by count(*) desc;

In AdventureWorks, we get a short list of procedures along with their counts of calls to other procedures.

To drill down into those results, use this query:

select distinct
    QUOTENAME(OBJECT_SCHEMA_NAME(p.object_id)) + '.' 
        + QUOTENAME(OBJECT_NAME(p.object_id)) [This procedure...], 
    QUOTENAME(OBJECT_SCHEMA_NAME(p_ref.object_id)) + '.' 
        + QUOTENAME(OBJECT_NAME(p_ref.object_id)) [... calls this procedure]
from sys.procedures p
cross apply sys.dm_sql_referenced_entities(schema_name(p.schema_id) + '.' + p.name, 'OBJECT') re
join sys.procedures p_ref
	on re.referenced_entity_name = p_ref.name
order by 1,2

which gives pairs of procedures: each caller alongside the procedure it calls.

AdventureWorks seems just fine to me: only four instances of procedures calling procedures. I looked at the database I work with most. Hundreds of procedures (representing 15% of the procedures) call other procedures. On the other end of the spectrum is Stack Overflow; I understand that they don’t use stored procedures at all.

April 11, 2016

Tackle WRITELOG Waits Using the Transaction Log and Extended Events

Takeaway: WRITELOG waits are associated with a busy or slow transaction log. To tackle these waits, we need to measure transaction log activity. I describe a lightweight way to examine transaction log usage for busy OLTP systems.


Start with Microsoft’s Advice: I’m not going to introduce the topic of transaction log performance. Microsoft’s SQL Customer Advisory Team already provides a great introduction with Diagnosing Transaction Log Performance Issues and Limits of the Log Manager. Their advice includes watching the “Log Bytes Flushed/sec” performance counter found in the “SQL Server:Databases” object.
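If you’d rather query that counter than watch it in perfmon, it’s also exposed in sys.dm_os_performance_counters (a sketch; the raw value is cumulative, so sample it twice and take the difference):

-- which databases are flushing the most log bytes?
select instance_name as databaseName, cntr_value as logBytesFlushed
from sys.dm_os_performance_counters
where counter_name = 'Log Bytes Flushed/sec' -- includes a _Total row for the whole instance
order by cntr_value desc;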

Reactive Efforts: If you’re curious about transaction log activity for open transactions, Paul Randal has a script at Script: open transactions with text and plans.

Spiky Activity: It’s not too difficult to find infrequent activities that write a lot of data to the transaction log; activities like data warehouse ETLs, or index rebuilds. Use a trace or extended events to look for statements with large values for “writes”.

Scalability of OLTP Workloads

WRITELOG waits are a scalability challenge for OLTP workloads under load. Chris Adkin has a lot of experience tuning SQL Server for high-volume OLTP workloads, so I’m going to follow his advice when he writes that we should minimize the amount of logging generated. And because I can’t improve what I can’t measure, I wonder: what’s generating the most logging? OLTP workloads are characterized by frequent tiny transactions, so I want to measure that activity without filters, but with as little impact on the system as possible. That’s my challenge.

Getting #SQLHelp

So I asked Twitter. And I got some great advice from Erin Stellato.
Erin also pointed out that the UI warns you that it’s a very high volume event.

Combining fn_dblog With Extended Events

So to avoid that kind of volume, I got the idea to read straight from the transaction log and combine that with a lighter extended events session to get the SQL text. The transaction_id captured by the extended events session corresponds to the XAct ID column in fn_dblog.

Here’s how that went:

The Script
The details for this script are kind of fussy, but it all comes together in a solution that won’t drag a server down. Care is still recommended; start with 10 seconds and go from there.

declare @Duration varchar(10) = '00:00:10';
declare @FileSize varchar(10) = '5'; -- in megabytes
 
-- create session
DECLARE @CreateSessionSQL nvarchar(max) = N'
    CREATE EVENT SESSION query_writes ON SERVER 
    ADD EVENT sqlserver.sp_statement_completed ( 
        SET collect_statement=(0)
        ACTION(sqlserver.transaction_id, sqlserver.database_name)
        WHERE sqlserver.transaction_id > 0
          AND sqlserver.database_name = ''' + DB_NAME() + N''')
    ADD TARGET package0.asynchronous_file_target(
      SET filename = N''query_writes.xel'',
          max_file_size = ' + @FileSize + N',
          max_rollover_files = 1)
    WITH (
        STARTUP_STATE=ON,
        EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
        TRACK_CAUSALITY=OFF)';
exec sp_executesql @CreateSessionSQL;
 
ALTER EVENT SESSION query_writes ON SERVER
    STATE=START;
 
-- get the latest lsn for current DB
declare @xact_seqno binary(10);
declare @xact_seqno_string_begin varchar(50);
exec sp_replincrementlsn @xact_seqno OUTPUT;
set @xact_seqno_string_begin = '0x' + CONVERT(varchar(50), @xact_seqno, 2);
set @xact_seqno_string_begin = stuff(@xact_seqno_string_begin, 11, 0, ':')
set @xact_seqno_string_begin = stuff(@xact_seqno_string_begin, 20, 0, ':');
 
-- wait for the specified duration
waitfor delay @Duration;
 
-- get the latest lsn for current DB
declare @xact_seqno_string_end varchar(50);
exec sp_replincrementlsn @xact_seqno OUTPUT;
set @xact_seqno_string_end = '0x' + CONVERT(varchar(50), @xact_seqno, 2);
set @xact_seqno_string_end = stuff(@xact_seqno_string_end, 11, 0, ':')
set @xact_seqno_string_end = stuff(@xact_seqno_string_end, 20, 0, ':');
 
-- Stop the session
ALTER EVENT SESSION query_writes ON SERVER
    STATE=STOP;
 
-- read from transaction log
select 
    max([Xact ID]) as transactionId,
    max([Transaction Name]) as transactionName, 
    sum([Log Record Length]) as logSize,
    count(*) as [logRowCount]
into #TLOGS
from fn_dblog(@xact_seqno_string_begin, @xact_seqno_string_end) f
group by [Transaction Id]
 
-- read from session data
CREATE TABLE #SessionData (
    id int identity primary key,
    XEXml xml NOT NULL    
)
 
INSERT #SessionData(XEXml)
SELECT CAST(fileData.[event_data] as xml)
FROM sys.fn_xe_file_target_read_file ( 'query_writes*xel', null, null, null) fileData;
 
-- find minimum transactionId captured by xes 
-- (almost always the first one, depending on luck here)
declare @minTXFromSession bigint;
select TOP (1) @minTXFromSession = S.XEXml.value(
    '(/event/action[(@name=''transaction_id'')]/value)[1]', 'bigint')
from #SessionData S;
 
WITH SD AS
(
    SELECT 
        S.XEXml.value(
            '(/event/action[(@name=''transaction_id'')]/value)[1]', 'bigint') as transactionId,
        S.XEXml.value(
            '(/event/data[(@name=''object_id'')]/value)[1]', 'bigint') as objectId
    FROM #SessionData S
)
SELECT 
    ISNULL(T.transactionName, 'Unknown') as transactionTypeName, 
    OBJECT_NAME(S.objectid) as ObjectName,
    SUM(T.logsize) as totalLogSizeBytes,
    SUM(T.logRowCount) as totalLogRowCount,
    COUNT(*) as executions
FROM #TLOGS T
LEFT JOIN (SELECT DISTINCT * FROM SD) S
    ON T.transactionId = S.transactionId
WHERE T.transactionId >= @minTXFromSession
GROUP BY T.transactionName, S.objectId
ORDER BY SUM(T.logsize) DESC
 
-- clean up
DROP EVENT SESSION query_writes ON SERVER;
DROP TABLE #TLOGS
DROP TABLE #SessionData

Sample Results

Here’s what the results look like: an aggregated view of all transaction log activity in the database for the 10-second sample.

Notes

  • Notice that the session is database specific. That’s because transaction logs are database specific. To help focus on the right database, use the “Log Bytes Flushed/sec” performance counter found in the “SQL Server:Databases” object.
  • Also notice that I’m tracking ObjectIds. That’s because we use procedures quite heavily. You may want to adapt this code to use query_hash instead. In both cases, collecting the statement text is not recommended.
  • The sample of data is limited by the size of the extended events target file or the duration variable, whichever is smaller.
  • @sqL_handLe pointed out to me that reading the log using fn_dblog will prevent the transaction log from truncating. Reading from the transaction log can be very tricky to do efficiently. Luckily we can use the sp_replincrementlsn trick to get LSN parameter values for fn_dblog.

March 23, 2016

Microsoft Fixed My Biggest SQL Server Pet Peeve (Two Years Ago)

Filed under: Miscellaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 8:00 am

Takeaway: The best way to avoid tempdb GAM and PFS contention caused by table-valued parameters (TVPs) is to use Memory-Optimized Table Variables.

This is my last post about tempdb. You can’t believe how wonderful it is to type that.*


SQL Server 2014 can eradicate tempdb contention and I didn’t even know it. Just yesterday I was saying that if I could have my one SQL Server wish, it would be to use table-valued parameters at high frequency without suffering from tempdb latch contention on GAM or PFS.

Then, I saw Jos de Bruijn’s article Improving temp table and table variable performance using memory optimization. He pointed out how In-Memory OLTP can alleviate excessive tempdb use. The piece that I was most interested in involves SQL Server 2014’s Memory-Optimized Table Parameters. Jos describes them as a great solution for “workloads that intensively use TVPs”.

So now, in addition to a table type like this:

CREATE TYPE dbo.BIGINTSET AS TABLE
    ([Value] BIGINT NOT NULL INDEX IX_BIGINTSET);

I can add an additional memory optimized type like this:

CREATE TYPE dbo.BIGINTSET_OLTP AS TABLE
    ([Value] BIGINT NOT NULL INDEX IX_BIGINTSET_OLTP)
    WITH (MEMORY_OPTIMIZED=ON);

Any procedures that use the new table type BIGINTSET_OLTP will not touch tempdb.

It seems pretty easy, doesn’t it? In this example, the reason I don’t just replace the existing type is that I want to be very deliberate about choosing memory-optimized parameters. By adding a new table type, I can make that choice in each procedure or query on a case-by-case basis.

To see whether this technique avoids hitting tempdb as advertised, I wrote this demo.

A Complete Demo

use master
GO
 
IF ( DB_ID('MyTVPTest') IS NOT NULL )
BEGIN
    ALTER DATABASE MyTVPTest SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE MyTVPTest;
END
GO
 
CREATE DATABASE MyTVPTest;
 
ALTER DATABASE MyTVPTest 
    ADD FILEGROUP MyTVPTest_MOD CONTAINS MEMORY_OPTIMIZED_DATA;
 
ALTER DATABASE MyTVPTest 
    ADD FILE (name='MyTVPTest_MOD1', filename='c:\sqldata\MyTVPTest_MOD1') 
    TO FILEGROUP MyTVPTest_MOD;
GO
 
use MyTVPTest;
GO
 
CREATE TYPE dbo.MyMOIntSet AS TABLE
    ([Value] INT NOT NULL INDEX IX_MyMOIntSet)
    WITH (MEMORY_OPTIMIZED=ON);
 
CREATE TYPE dbo.MyIntSet AS TABLE
    ([Value] INT NOT NULL INDEX IX_MyIntSet);
GO
 
CREATE TABLE dbo.MyTable
(
    id int not null primary key
);
 
INSERT dbo.MyTable (id)
SELECT top 1000 ROW_NUMBER() OVER (ORDER BY (SELECT 1))
FROM sys.messages;
GO
 
CREATE PROCEDURE dbo.s_FetchMyTableRowsWithFilter
    @Filter MyIntSet READONLY
AS
 
    SELECT MT.id
    FROM dbo.MyTable MT
    JOIN @Filter FT 
        ON FT.[Value] = MT.id;
GO
 
CREATE PROCEDURE dbo.s_FetchMyTableRowsWithMOFilter
    @Filter MyMOIntSet READONLY
AS
 
    SELECT MT.id
    FROM dbo.MyTable MT
    JOIN @Filter FT 
        ON FT.[Value] = MT.id;
GO
 
-- Call procedure using Memory Optimized table type
DECLARE @F_MO MyMOIntSet;
INSERT @F_MO ([Value]) VALUES (1), (3), (5);
EXEC dbo.s_FetchMyTableRowsWithMOFilter @F_MO;
 
-- Call procedure using regular table type
DECLARE @F MyIntSet;
INSERT @F ([Value]) VALUES (1), (3), (5);
EXEC dbo.s_FetchMyTableRowsWithFilter @F;

Look at the last two queries. The first one invokes a procedure that uses a memory-optimized table variable. The other one invokes a different procedure that uses a standard table variable. Then look at what’s going on in tempdb’s transaction log (using techniques found here). With the standard table variable, I see plenty of logging activity in tempdb’s log.

But the memory-optimized version avoids all tempdb activity. It’s not just the logging activity that’s avoided; tempdb is simply not touched in this case.

Caveats

Notice a few things:

  1. This mainly addresses tempdb contention experienced by workloads with extremely frequent TVP use (thousands per second).
  2. In SQL Server 2014, memory-optimized table variables preclude the use of parallel queries.
  3. As the feature name suggests, these table variables live in memory.

Luckily, the high frequency described by the first caveat means that the queries I use must be as lightweight as possible, so I don’t mind the second or third caveat so much. Aaron Bertrand dives deeper into the performance of memory-optimized table variables at Hekaton with a twist: In-memory TVPs – Part 3.

In the past, my colleagues and I have been incredibly frustrated at this bottleneck. A bottleneck that could not be fixed, not even with dollars. And so I’m looking forward to this feature more than any other single feature introduced in 2014 or 2016. We vendors are often constrained to use only those features present in every SQL Server version we support. In my case, it will be a while before we de-support SQL Server 2012 but that day will be like Christmas.

* I’ve written before about TVPs and the different kinds of contention caused by frequent use of TVPs:

Table Valued Parameters, A Short, Complete Example
PAGELATCH_EX Contention on 2:1:103
Follow up on Ad Hoc TVP contention
Three More Tricky Tempdb Lessons
What’s Going On Inside Tempdb?
Avoid Frequent use of TVPs With Wide Rows
Troubleshooting Tempdb, a Case Study

February 25, 2016

Look at Blocked Process Reports Collected With Extended Events

Filed under: Miscellaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 9:25 am


I just met a friend at a SQL Saturday who let me know that he recognizes my name because it was attached to a project I wrote five years ago: the Blocked Process Report Viewer. I was impressed and glad to know that it’s still used. So I decided to update it for 2016.

I’ve extended the Blocked Process Report Viewer to look at blocked process reports that were collected with an extended events session.


Where To Find It

The code for the procedure is where it always is, on CodePlex: https://sqlblockedprocesses.codeplex.com/


How To Use It

The viewer can consume blocked process report events captured by any extended events session, as long as that session has a target of ring_buffer or event_file. For example, if you set up your extended events session the way Jeremiah Peschka did in Finding Blocked Processes and Deadlocks using SQL Server Extended Events, then you would use the viewer like this:

exec sp_blocked_process_report_viewer
    @Source = 'blocked_process', -- the name that Jeremiah gave to his xe session
    @Type = 'XESESSION';


The value of the blocked process report viewer is that it organizes all the blocked processes into “episodes”. Each episode has a lead blocker, which is the process at the front of the traffic jam. Look at that process closely.

Feedback

Let me know how it goes. Tell me if there are any errors or performance issues. I’m especially interested in the performance of the viewer when given extended events sessions that use ring_buffer targets.

February 17, 2016

Future Proofing for Concurrency; Blocked Process Reports Are Not Enough

Filed under: Miscellaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 8:00 am

Takeaway: When assessing concurrency, performance testing is the primary way to ensure that a system can handle enough load concurrently. Monitoring blocked process reports does not give enough warning to anticipate blocking issues.

Logical Contention

Capacity planning is difficult for DBAs who expect growth. Will there be enough CPU, memory, or I/O to serve the anticipated load? But one category falls outside those three: logical contention.

Logical contention is a problem where excessive blocking causes throughput to suffer. It would be great to get advance warning. One essential strategy is to make use of the blocked process report. The problem is that blocked process reports are an alarm metric, not a gauge metric. In other words, the blocked process report can indicate when there is a problem, but it is poor at giving advance notice.

A Simple Concurrency Simulator

Check out this Concurrency Simulator.

It simulates how well a server can handle tasks that use pessimistic locking (tasks that cannot be run in parallel because they block each other). It’s amazing to see how nice things seem at 6,000 tasks per hour and how quickly things go wrong at 7,000 tasks per hour. Unfortunately, this scenario is all too familiar. Episodes of excessive blocking tend to be either nonexistent or very painful, with no in-between.

Graphically that looks like this:
(chart: how blocking and wait times escalate as the task rate increases)

You’ll notice that even at a five-second threshold, the blocked process report gives us little warning. Blocked process reports don’t warn of approaching trouble; they indicate that trouble is here.

How Useful Are Blocked Process Reports Then?

Continue monitoring blocked processes. Knowing that you’ve got concurrency issues is preferable to not knowing. Just keep in mind that it can’t replace performance testing.


Other animations and interactive media by Michael J Swart

February 3, 2016

You’re Probably Taking Sort Order For Granted Somewhere

Filed under: Miscellaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 8:00 am

As many people will point out, you can’t depend on the order of results without using the ORDER BY clause. So it’s easy to say “Simple! Don’t write code that expects unsorted data to be sorted”. But it’s very difficult to be careful everywhere all the time.

Remember that this is an application problem and is not a SQL problem. We only get into trouble when applications (or people) expect results to be sorted when they’re not. So unless you have a tiny application, or a huge amount of discipline, it’s likely that there is some part of your application that assumes sorted results when it shouldn’t.

Here’s a method I used that attempts to identify such areas, exposing those assumptions. It involves reversing indexes.

If you don’t ask for results to be ordered, they may still appear to be ordered: SQL Server returns results in whatever way is convenient, and that is often some index order. So if the indexes are reversed, the idea is that whatever order is convenient for SQL Server will be reversed too.

Which results are ordered on purpose and which are ordered by luck?

It’s impossible to tell. But after the indexes are reversed:

It’s now apparent.

Reverse the Indexes On A Dev Box

First, use this PowerShell script to generate some SQL. It’s adapted from a Stack Overflow answer by Ben Thul to “How to script primary key constraints”.

[string]$dbname = "Adventureworks2012";
[string]$server = ".";
 
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.SMO") | out-null
$SMOserver = New-Object ('Microsoft.SqlServer.Management.Smo.Server') -argumentlist $server
$db = $SMOserver.databases[$dbname]
 
"" > drop_indexes.sql
"" > create_indexes.sql
"" > drop_foreign_keys.sql
"" > create_foreign_keys.sql
 
$option_drop = new-object Microsoft.SqlServer.Management.Smo.ScriptingOptions;
$option_drop.ScriptDrops = $true;
foreach ($table in $db.Tables) {
	foreach ($index in $table.Indexes) {
		$index.Script( $option_drop ) >> drop_indexes.sql
		$index.Script() >> create_indexes.sql
	}
	foreach ($foreign_key in $table.ForeignKeys) {
		$foreign_key.Script( $option_drop ) >> drop_foreign_keys.sql
		$foreign_key.Script() >> create_foreign_keys.sql
	}
}

Now follow these steps. Use the generated SQL to:

  1. Drop the foreign keys
  2. Drop the indexes
  3. Open up the create_indexes.sql script in an editor and swap ASC for DESC and vice versa
  4. Create the reversed indexes
  5. Recreate the foreign keys

That’s it! Now unsorted results will be returned in whatever order is convenient for SQL Server, which should be the opposite of the original order.

Test

Remember how these ORDER BY assumptions are human or application problems? It’s time to bring them into this process. Test your applications/reports manually, or if you’re fortunate enough to have them, run your automated tests.

I’m one of the fortunate ones. I have access to a suite of automated tests that includes thousands of integration tests. In my case, roughly one percent of them failed after this experiment. Colleagues’ reactions varied, ranging from “That many?” to “That few?”

This experiment cannot identify all ordering assumptions, but it has a good chance of identifying many.

Fix

First let me give some advice on how not to fix this. Don’t begin by indiscriminately sprinkling ORDER BY clauses throughout all your queries. I found the best approach is to handle each problem on a case-by-case basis.

Here are some approaches:

  • Fix the test. For automated tests, sometimes the test itself assumed an order. This is an easy case to deal with.
  • Order results in the app. If you’re using C#, try using Linq’s Enumerable.OrderBy. And if you’re using some other language or reporting framework, you should be able to sort there too.
  • Order in SQL. If necessary, order your results using the ORDER BY clause (see the sketch below).
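When you do order in SQL, make the sort deterministic by including a unique column as a tiebreaker; otherwise rows that tie on the sort key can still swap places between executions. A minimal sketch against a hypothetical table:

select OrderId, CustomerId, OrderDate
from dbo.ORDERS -- hypothetical table
order by OrderDate desc, OrderId; -- the unique OrderId breaks ties deterministically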

Happy ordering!

January 24, 2016

Two Scripts That Help You Monitor Deadlocks with Extended Events

I want to use extended events to store recent deadlock graphs, and I want to use an asynchronous file target instead of the existing system_health session. I don’t like the system_health target for a few reasons: it’s too slow to query, it rolls over too quickly, and it disappears after server restarts.

So I searched the web for a solution, and when I couldn’t find one, I wrote my own, tested it, and decided to blog about it.

Guess what? Apparently I “reinvented the wheel”. The extended events session I created is equivalent to one that Jeremiah Peschka wrote two years ago in Finding Blocked Processes and Deadlocks using SQL Server Extended Events. The embarrassing thing is that in Jeremiah’s article, he references a tool I wrote. And the first comment was written by yours truly.

So go read Jeremiah’s article; it’s really well written. What follows is my solution. The only difference is that mine focuses only on deadlocks, while Jeremiah’s covers both deadlocks and blocked processes.

Create The Session

Here’s the session that I use. It

  • has five rollover files, so that a couple of server restarts don’t lose any recent deadlock graphs,
  • uses an asynchronous_file_target, which I prefer over the ring buffer,
  • and cleans itself up over time, so I don’t need a maintenance job to remove ancient data.

DECLARE @ExtendedEventsTargetPath sysname = 'Change this string to something like "D:\XEvents\Traces"';
 
DECLARE @SQL nvarchar(max) = N'
CREATE EVENT SESSION [capture_deadlocks] ON SERVER 
ADD EVENT sqlserver.xml_deadlock_report( ACTION(sqlserver.database_name) ) 
ADD TARGET package0.asynchronous_file_target(
  SET filename = ''' + @ExtendedEventsTargetPath + N'\capture_deadlocks.xel'',
      max_file_size = 10,
      max_rollover_files = 5)
WITH (
    STARTUP_STATE=ON,
    EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY=15 SECONDS,
    TRACK_CAUSALITY=OFF
    )
 
ALTER EVENT SESSION [capture_deadlocks] ON SERVER
    STATE=START';
 
exec sp_executesql @SQL;

Query the Results

Oh great. Now I’ve got to dig through several files. That’s a lot of work.
… but not if you have this query:

declare @filenamePattern sysname;
 
SELECT @filenamePattern = REPLACE( CAST(field.value AS sysname), '.xel', '*xel' )
FROM sys.server_event_sessions AS [session]
JOIN sys.server_event_session_targets AS [target]
  ON [session].event_session_id = [target].event_session_id
JOIN sys.server_event_session_fields AS field 
  ON field.event_session_id = [target].event_session_id
  AND field.object_id = [target].target_id	
WHERE
    field.name = 'filename'
    and [session].name= N'capture_deadlocks'
 
SELECT deadlockData.*
FROM sys.fn_xe_file_target_read_file ( @filenamePattern, null, null, null) 
    as event_file_value
CROSS APPLY ( SELECT CAST(event_file_value.[event_data] as xml) ) 
    as event_file_value_xml ([xml])
CROSS APPLY (
    SELECT 
        event_file_value_xml.[xml].value('(event/@name)[1]', 'varchar(100)') as eventName,
        event_file_value_xml.[xml].value('(event/@timestamp)[1]', 'datetime') as eventDate,
        event_file_value_xml.[xml].query('//event/data/value/deadlock') as deadlock	
  ) as deadlockData
WHERE deadlockData.eventName = 'xml_deadlock_report'
ORDER BY eventDate

October 6, 2015

Don’t Abandon Your Transactions

Filed under: Miscellaneous SQL,SQL Scripts,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 11:59 am

About eight years ago, Dan Guzman wrote a post called Use Caution with Explicit Transactions in Stored Procedures. In it, he talks about error handling and transactions, specifically with respect to the XACT_ABORT setting.

XACT_ABORT

Microsoft’s docs for XACT_ABORT are pretty clear. The setting determines whether “SQL Server automatically rolls back the current transaction when a statement raises an error”.

And in nearly every scenario I can think of that uses a transaction, this automatic rollback is the desired behavior. The problem is that it’s not the default behavior. That leads to Dan Guzman’s advice: he strongly recommends that SET XACT_ABORT ON be included “in all stored procedures with explicit transactions unless you have a specific reason to do otherwise.”
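In practice, that looks something like this (a minimal sketch with a hypothetical procedure and table; real procedures would layer TRY/CATCH or other error handling on top):

create procedure dbo.s_TransferAmount -- hypothetical procedure
    @FromId int, @ToId int, @Amount int
as
    set nocount on;
    set xact_abort on; -- any error or client timeout now rolls back the whole transaction

    begin transaction
        update dbo.ACCOUNTS set Balance = Balance - @Amount where AccountId = @FromId; -- dbo.ACCOUNTS is made up for illustration
        update dbo.ACCOUNTS set Balance = Balance + @Amount where AccountId = @ToId;
    commit transaction
go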

What Could Go Wrong?

When a statement inside a transaction fails (for whatever reason) and XACT_ABORT is set to off, then…

  • That transaction is abandoned.
  • Any locks taken during that transaction are still held.
  • Even if you close the connection from the application, .NET’s connection pooling will keep that connection alive and the transaction on SQL Server stays open.
  • Fortunately, if someone reuses the same database connection from the connection pool, the old transaction will be rolled back.
  • Unfortunately developers can’t count on that happening immediately.
  • Abandoned transactions can cause excessive blocking leading to a concurrency traffic jam.
  • Also, abandoned transactions can interfere with downstream solutions. Specifically ones that depend on the transaction log. Transaction logs can grow indefinitely. Replication solutions can suffer. If RCSI is enabled, the version store can get out of hand.

Some (or all) of those things happened to us last week.

Steps To Take

Here are some things you can do:

Do you have abandoned transactions right now?
It’s not too hard to identify these abandoned transactions:

-- do you have abandoned transactions?
select p.spid, s.text as last_sql
from sys.sysprocesses p
cross apply sys.dm_exec_sql_text(p.sql_handle) s
where p.status = 'sleeping'
and p.open_tran > 0

Also if you use sp_whoisactive, you can identify these processes as those with a sleeping status and at least one open transaction. But there’s a trick I use to identify these quickly. The sql_text value in the output of sp_whoisactive will typically begin with CREATE PROCEDURE. When I see that, I know it’s time to check whether this connection is sleeping or not.

SET XACT_ABORT ON
Follow Dan Guzman’s advice to include SET XACT_ABORT ON in all stored procedures with explicit transactions.
You can actually find the procedures in your database that need a closer look:

-- find procedures that could suffer from abandoned transactions
SELECT * 
FROM sys.procedures 
where OBJECT_DEFINITION(object_id) like '%BEGIN TRAN%'
and OBJECT_DEFINITION(object_id) not like '%XACT_ABORT%'
order by name

Set XACT_ABORT ON server-wide
If you choose, you can set the default value for all connections to your server. You can do that in Management Studio’s Server Properties dialog, or via a script:

-- turn the server's xact_abort default on
declare @user_options_value bigint;
select @user_options_value = cast(value as bigint)
from sys.configurations 
where name = 'user options';
set @user_options_value = @user_options_value | 0x4000; 
exec sp_configure N'user options', @user_options_value;
RECONFIGURE WITH OVERRIDE;
 
GO

-- (if necessary) turn the server's xact_abort default off
declare @user_options_value bigint;
select @user_options_value = cast(value as bigint)
from sys.configurations 
where name = 'user options';
set @user_options_value = @user_options_value & 0x3fff; 
exec sp_configure N'user options', @user_options_value;
RECONFIGURE WITH OVERRIDE;

Code Review

I love code reviews. They’re more than just a tool for improving quality. They’re learning opportunities and teaching opportunities for all involved.

Last week, I invited readers to have a look at a procedure in a post called Code Review This Procedure. I was looking for anyone to suggest turning on XACT_ABORT as a best practice. It’s a best practice where I work, but things like this slip through. We should have caught this not just during testing, but during development. It’s obvious with hindsight, but I wanted to determine how obvious it was without that hindsight. I guess it was pretty subtle; XACT_ABORT was not mentioned once. That’s either because the setting is not often used by most developers, or because it is easily overlooked.

But here are some other thoughts that readers had:

Concurrency
Many people pointed at concurrency and transaction isolation levels as a problem. It turns out that concurrency is very hard to get right and nearly impossible to verify by inspection. In fact, one of my favorite blog posts is about getting concurrency right. It’s called Mythbusting: Concurrent Update/Insert Solutions. The lesson here: just try it.

Cody Konior (blog) submitted my favorite comment. Cody writes “I often can’t disentangle what the actual impact of various isolation levels would be so I go a different route; which is to create a quick and dirty load test”. I can’t determine concurrency solely by inspection either, which is why I never try. Cody determined that after hammering this procedure, it never failed.

He’s entirely right. Concurrency is done correctly here. Ironically, most of the fixes suggested in other people’s code reviews actually introduced concurrency issues like deadlocks or primary key violations.

People also suggested that blocking would become excessive. It turns out that throughput does not suffer either. My testing framework still managed to process 25,000 batches per second on my desktop without error.

Validating inputs
Some people pointed out that if NULL values or other incorrect values were passed in, then a foreign key violation could be thrown, and they suggested that the procedure should validate the inputs. But what then? If there’s a problem, there are two choices. Choice one: raise no error and exit quietly, which is not ideal. Choice two: raise a new error, which is not a significant improvement over the existing implementation.

Avoiding the transaction altogether
It is possible to rewrite this procedure without using an explicit transaction. Without the explicit transaction, there’s no chance of abandoning it. And no chance of encountering the trouble that goes with abandoned transactions. But it’s still necessary to worry about concurrency. Solutions that use single statements like MERGE or INSERT...WHERE NOT EXISTS still need SERIALIZABLE and UPDLOCK.
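For example, the insert-if-missing logic from that procedure could collapse into a single statement like this (a sketch based on the EVENT_TICKETS schema from the Code Review This Procedure post below; the locking hints are what keep it correct under concurrency):

insert dbo.EVENT_TICKETS
    ( EventId, VenueSeatId, PurchaserId, PurchaseMethodId, PurchaseDate )
select @EventId, @VenueSeatId, @PurchaserId, @PurchaseMethodId, SYSDATETIME()
where not exists (
    -- the hints keep the "check" and the "insert" atomic under concurrency
    select 1
    from dbo.EVENT_TICKETS with (updlock, serializable)
    where EventId = @EventId
      and VenueSeatId = @VenueSeatId
);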

Error handling
I think Aaron Mathison (blog) nailed it; I’m just going to quote his review entirely:

Since your EVENT_TICKETS table has required foreign keys (evidenced by NOT NULL on all columns with foreign key references) the proc should be validating that the input parameter values exist in the foreign key tables before trying to insert into EVENT_TICKETS. If it doesn’t find any one of them it should throw an error and gracefully rollback the transaction and return from the proc.

The way it’s designed currently I think you could get an error on inserting to EVENT_TICKETS that would fail the proc and leave the transaction open.

October 1, 2015

Code Review This Procedure

Filed under: SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 10:28 am

Earlier this week we encountered a web-site outage caused by a database procedure. I’m going to blog about that problem (in detail) in a post scheduled for next week. But before I publish that post, I want to know how subtle or obvious the problem was. It seems obvious to me now, but I have the benefit of hindsight. I wonder whether we could have avoided this during the code review stage of development.

So before I publish the details, I invite you to do a code review of this procedure in the comment section.

The Procedure

Here’s the procedure. It suffers from the same thing that burned us this week. Do you see any issues with it? Tell me in the comment section.

CREATE PROCEDURE dbo.s_EVENT_TICKETS_GetOrCreate (
  @EventId BIGINT,
  @VenueSeatId BIGINT,
  @PurchaserId BIGINT,
  @PurchaseMethodId BIGINT
)
AS
  SET NOCOUNT ON;
 
  DECLARE @pid BIGINT;
  DECLARE @pmid BIGINT;
  DECLARE @dt DATETIME2
 
  SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
  BEGIN TRAN
 
    -- If the row exists, grab details about the purchaser
    SELECT
      @pid = PurchaserId,
      @pmid = PurchaseMethodId,
      @dt = PurchaseDate
    FROM dbo.EVENT_TICKETS WITH (UPDLOCK)
    WHERE EventId = @EventId
      AND VenueSeatId = @VenueSeatId;
 
    IF ( @pid IS NULL )
    BEGIN
 
      -- The row doesn't exist, insert the row
      SET @dt = SYSDATETIME()
 
      INSERT INTO dbo.EVENT_TICKETS 
        ( EventId, VenueSeatId, PurchaserId, PurchaseMethodId, PurchaseDate )
      VALUES 
        ( @EventId, @VenueSeatId, @PurchaserId, @PurchaseMethodId, @dt );
 
      SELECT @pid = @PurchaserId,
             @pmid = @PurchaseMethodId;
    END
 
  COMMIT TRAN
 
  -- return details about the purchaser
  SELECT 
    @pid as PurchaserId,
    @pmid as PurchaseMethodId,
    @dt as PurchaseDate;

The Schema

Here’s a subset of the table definitions that this procedure is meant to use.

CREATE TABLE dbo.[EVENTS]
(
  EventId BIGINT NOT NULL IDENTITY,
  CONSTRAINT PK_EVENTS 
    PRIMARY KEY (EventId)
  -- etc...
);
 
CREATE TABLE dbo.VENUE_SEATS
(
  VenueSeatId BIGINT NOT NULL IDENTITY,
  CONSTRAINT PK_VENUE_SEATS 
    PRIMARY KEY (VenueSeatId)
  -- etc...
);
 
CREATE TABLE dbo.PURCHASERS
(
  PurchaserId BIGINT NOT NULL IDENTITY,
  CONSTRAINT PK_PURCHASERS 
    PRIMARY KEY (PurchaserId)
  -- etc...
);
 
CREATE TABLE dbo.PURCHASE_METHODS
(
  PurchaseMethodId BIGINT NOT NULL IDENTITY,
  CONSTRAINT PK_PURCHASE_METHODS 
    PRIMARY KEY (PurchaseMethodId)
  -- etc...
);
 
CREATE TABLE dbo.EVENT_TICKETS
(
  EventId BIGINT NOT NULL,
  VenueSeatId BIGINT NOT NULL,
  PurchaserId BIGINT NOT NULL,
  PurchaseMethodId BIGINT NOT NULL,
  PurchaseDate DATETIME2 NOT NULL,
  CONSTRAINT PK_EventId 
    PRIMARY KEY CLUSTERED (EventId, VenueSeatId),
  CONSTRAINT FK_EVENT_TICKETS_EVENTS
    FOREIGN KEY (EventId) REFERENCES dbo.[EVENTS] (EventId),
  CONSTRAINT FK_EVENT_TICKETS_VENUE_SEATS 
    FOREIGN KEY (VenueSeatId) REFERENCES dbo.VENUE_SEATS (VenueSeatId),
  CONSTRAINT FK_EVENT_TICKETS_PURCHASERS 
    FOREIGN KEY (PurchaserId) REFERENCES dbo.PURCHASERS (PurchaserId),
  CONSTRAINT FK_EVENT_TICKETS_PURCHASE_METHODS 
    FOREIGN KEY (PurchaseMethodId) REFERENCES dbo.PURCHASE_METHODS (PurchaseMethodId)
);
GO