Michael J. Swart

November 21, 2016

Do You Have Check Constraints On Your Start and End Dates?

Filed under: Miscelleaneous SQL,SQL Scripts,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 10:13 am

Adam Machanic tweeted this advice last week:

Are you missing any of these check constraints? Run this query to check.
This query looks for any columns in the same table that begin with “Start” and “End”. It then looks for check constraints that reference both these columns. If it doesn’t find them, it suggests a check constraint.

WITH StartColumnNames AS
(
    SELECT object_id,
           column_id,
           name AS column_name
      FROM sys.columns
     WHERE name like 'start%'
),
EndColumnNames AS
(
    SELECT object_id,
           column_id,
           name AS column_name
      FROM sys.columns
     WHERE name like 'end%'
)
SELECT t.object_id,            
       OBJECT_SCHEMA_NAME(t.object_id) AS [schema_name], 
       t.[name] AS table_name,
       s.column_name AS start_column,
       e.column_name AS end_column,
       N'ALTER TABLE ' + QUOTENAME(OBJECT_SCHEMA_NAME(t.object_id)) + N'.' + QUOTENAME(t.name) + 
           N' ADD CONSTRAINT ' + 
           QUOTENAME(N'CK_' + t.name + N'_' + s.column_name + N'_' + e.column_name) + 
           N' CHECK (' + QUOTENAME(s.column_name) + N' <= ' + QUOTENAME(e.column_name) + N');' as check_suggestion
  FROM StartColumnNames s
  JOIN EndColumnNames e
       ON s.object_id = e.object_id
       AND s.column_id <> e.column_id
       AND REPLACE(s.column_name, 'start', 'end') = e.column_name        
  JOIN sys.tables t
       ON t.object_id = s.object_id
 WHERE NOT EXISTS 
       (
           SELECT *
             FROM sys.check_constraints c
             JOIN sys.sql_expression_dependencies start_dependency
                  ON start_dependency.referencing_id = c.object_id
                  AND start_dependency.referenced_id = t.object_id
                  AND start_dependency.referenced_minor_id = s.column_id
             JOIN sys.sql_expression_dependencies end_dependency
               ON end_dependency.referencing_id = c.object_id
              AND end_dependency.referenced_id = t.object_id
              AND end_dependency.referenced_minor_id = e.column_id
            WHERE c.parent_object_id = t.object_id
       )

Caveats

Don’t blindly run scripts that you got from some random guy’s blog. Even if that someone is me. That’s terribly irresponsible.

But this query may be useful if you do want to look for a very specific, simple kind of constraint that may match your business specs. These constraints are just suggestions and may not match your business rules. For example, when I run this query on Adventureworks, I get one “missing” check constraint for HumanResources.Shift(StartTime, EndTime) and when I look at the contents of the Shift table, I get this data:
[Screenshot: contents of HumanResources.Shift]
Notice that I can’t create a constraint on this table because of the night shift. The constraint doesn’t make sense here.

Creating constraints on existing tables may take time if the table is huge. Locks may be held on that table for an uncomfortably long time.

Of course if your table has data that would violate the constraint, you can’t create it. But now you have to make some other choices. You can correct or delete the offending data or you can add the constraint with NOCHECK.
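
If you go the WITH NOCHECK route, here’s a minimal sketch of what that looks like against a hypothetical dbo.Tasks table; the existing rows aren’t validated, so the constraint is created as not trusted:

-- hypothetical table and column names; existing rows are not validated
ALTER TABLE dbo.Tasks WITH NOCHECK
    ADD CONSTRAINT CK_Tasks_StartDate_EndDate
    CHECK ([StartDate] <= [EndDate]);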

September 23, 2016

Build Your Own Tools

There are so many ways to look inside SQL Server. New extended events and dynamic management views are introduced every version. But if you want to collect something that’s unavailable, with a little bit of creativity, you can create your own tools.

Inspired by Richie Rump’s latest project, I’m going to describe a tool I wrote and the process I took to get there.

The Problem To Tackle

I had a deadlock graph I wanted to tackle, but I was having trouble reproducing it. I needed to know more about the queries involved, but the query plans were no longer in cache. So here’s the problem:

Can I collect the execution plans that were used for the queries involved in a deadlock graph?


I want to use that information to reproduce – and ultimately fix – the deadlock.

The Right Tool For the Job

If you don’t have enough data to get to the root cause of an issue, put something in place for next time.

Can I Use Out-Of-The-Box Extended Events?
I’m getting used to extended events and so my first thought was “Is there a query plan field I can collect with the deadlock extended event?” There is not. Which isn’t too surprising. Deadlock detection is independent of any single query.

So How Do I Get To The Execution Plans?
So when I look at a deadlock graph, I can see there are sql_handles. Given that, I can grab the plan_handle and then the query plan from the cache, but I’m going to need to collect it automatically at the time the deadlock is generated. So I’m going to need:

  • XML shredding skills
  • Ability to navigate DMVs to get at the cached query plans
  • A way to programmatically respond to deadlock graph events (like an XE handler or a trigger)

Responding Automatically to Extended Events
This is when I turned to #sqlhelp. And sure enough, six minutes later, Dave Mason helped me out:

I had never heard of Event Notifications, so after some googling, I discovered two things. The first thing is that I can only use Event Notifications with DDL or SQLTrace events rather than the larger set of extended events. Luckily deadlock graphs are available in both. The second thing is that Event Notifications aren’t quite notifications the way alerts are. They’re a way to push event info into a Service Broker queue. If I want automatic actions taken on Service Broker messages, I have to define and configure an activation procedure to process each message. In pictures, here’s my plan so far:

[Diagram: my plan so far]

Will It Work?
I think so. I’ve had a lot of success creating my own tools in the past, such as

Also, looking at the different pieces, they’re probably robust enough to do what I need them to do. One surprise involved security concerns for activated procedures that need to access system views.

The Deadlock Logging Script

Here’s the tool!

Create the Database

USE master;
 
IF (DB_ID('DeadlockLogging') IS NOT NULL)
BEGIN
    ALTER DATABASE DeadlockLogging SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE DeadlockLogging;
END
 
CREATE DATABASE DeadlockLogging WITH TRUSTWORTHY ON;
GO

Create the Service Broker Objects
I’ve never used Service Broker before, so a lot of this comes from examples found in Books Online.

use DeadlockLogging;
 
CREATE QUEUE dbo.LogDeadlocksQueue;
 
CREATE SERVICE LogDeadlocksService
    ON QUEUE dbo.LogDeadlocksQueue 
    ([http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]);
 
CREATE ROUTE LogDeadlocksRoute
    WITH SERVICE_NAME = 'LogDeadlocksService',
    ADDRESS = 'LOCAL';
 
-- add server level notification
IF EXISTS (SELECT * FROM sys.server_event_notifications WHERE [name] = 'LogDeadlocks')
    DROP EVENT NOTIFICATION LogDeadlocks ON SERVER;
 
DECLARE @SQL NVARCHAR(MAX);
SELECT @SQL = N'
    CREATE EVENT NOTIFICATION LogDeadlocks 
    ON SERVER 
    FOR deadlock_graph -- name of SQLTrace event type
    TO SERVICE ''LogDeadlocksService'', ''' + CAST(service_broker_guid as nvarchar(max))+ ''';'
FROM sys.databases 
WHERE [name] = DB_NAME();
EXEC sp_executesql @SQL;
GO

The dynamic SQL is used to fetch the service broker GUID of the newly created database and embed it in the CREATE EVENT NOTIFICATION statement.
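
For reference, the statement that the dynamic SQL builds ends up looking something like this; the GUID below is only a placeholder for the real service_broker_guid:

-- the GUID is a placeholder; the script substitutes the new database's service_broker_guid
CREATE EVENT NOTIFICATION LogDeadlocks
ON SERVER
FOR deadlock_graph
TO SERVICE 'LogDeadlocksService', '00000000-0000-0000-0000-000000000000';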

A Place to Store Deadlocks

-- Create a place to store the deadlock graphs along with query plan information
CREATE SEQUENCE dbo.DeadlockIdentity START WITH 1;
 
CREATE TABLE dbo.ExtendedDeadlocks 
(
    DeadlockId bigint not null,
    DeadlockTime datetime not null,
    SqlHandle varbinary(64),
    StatementStart int,
    [Statement] nvarchar(max) null,
    Deadlock XML not null,
    FirstQueryPlan XML
);
 
CREATE CLUSTERED INDEX IX_ExtendedDeadlocks 
    ON dbo.ExtendedDeadlocks(DeadlockTime, DeadlockId);
GO
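
Once deadlocks start arriving, a simple query like this (just a sketch) shows the most recent ones along with whatever statements and plans were still in cache:

-- most recent deadlocks first, one row per statement involved
SELECT TOP (50)
    DeadlockId,
    DeadlockTime,
    [Statement],
    Deadlock,       -- the deadlock graph
    FirstQueryPlan  -- the cached plan, if it was found
FROM dbo.ExtendedDeadlocks
ORDER BY DeadlockTime DESC, DeadlockId DESC;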

The Procedure That Processes Queue Messages

CREATE PROCEDURE dbo.ProcessDeadlockMessage
AS
  DECLARE @RecvMsg NVARCHAR(MAX);
  DECLARE @RecvMsgTime DATETIME;
  SET XACT_ABORT ON;
  BEGIN TRANSACTION;
 
    WAITFOR ( 
        RECEIVE TOP(1)
            @RecvMsgTime = message_enqueue_time,
            @RecvMsg = message_body
        FROM dbo.LogDeadlocksQueue
    ), TIMEOUT 5000;
 
    IF (@@ROWCOUNT = 0)
    BEGIN
      ROLLBACK TRANSACTION;
      RETURN;
    END
 
    DECLARE @DeadlockId BIGINT = NEXT VALUE FOR dbo.DeadlockIdentity;
    DECLARE @RecsvMsgXML XML = CAST(@RecvMsg AS XML);
    DECLARE @DeadlockGraph XML = @RecsvMsgXML.query('/EVENT_INSTANCE/TextData/deadlock-list/deadlock');
 
    WITH DistinctSqlHandles AS
    (
        SELECT DISTINCT node.value('@sqlhandle', 'varchar(max)') as SqlHandle
        FROM @RecsvMsgXML.nodes('//frame') AS frames(node)            
    )
    INSERT ExtendedDeadlocks (DeadlockId, DeadlockTime, SqlHandle, StatementStart, [Statement], Deadlock, FirstQueryPlan)
    SELECT @DeadlockId,
        @RecvMsgTime,
        qs.sql_handle,
        qs.statement_start_offset,
        [statement],
        @DeadlockGraph, 
        qp.query_plan
    FROM DistinctSqlHandles s
    LEFT JOIN sys.dm_exec_query_stats qs 
        on qs.sql_handle = CONVERT(VARBINARY(64), SqlHandle, 1) 
    OUTER APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
    OUTER APPLY sys.dm_exec_sql_text (CONVERT(VARBINARY(64), SqlHandle, 1)) st
    OUTER APPLY ( 
      SELECT SUBSTRING(st.[text],(qs.statement_start_offset + 2) / 2,
          (CASE 
             WHEN qs.statement_end_offset = -1  THEN LEN(CONVERT(NVARCHAR(MAX), st.text)) * 2
             ELSE qs.statement_end_offset + 2
             END - qs.statement_start_offset) / 2)) as sqlStatement([statement]);
 
    -- clean up old deadlocks
    DECLARE @limit BIGINT
    SELECT DISTINCT TOP (500) @limit = DeadlockId 
    FROM ExtendedDeadlocks 
    ORDER BY DeadlockId DESC;
    DELETE ExtendedDeadlocks 
    WHERE DeadlockId < @limit;
 
  COMMIT
 
GO

Activating the Procedure

ALTER QUEUE dbo.LogDeadlocksQueue
    WITH ACTIVATION
    ( STATUS = ON,
      PROCEDURE_NAME = dbo.ProcessDeadlockMessage,
      MAX_QUEUE_READERS = 1,
      EXECUTE AS SELF
    );
GO
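
Before waiting for a real deadlock, a couple of quick checks (a sketch) confirm that the notification is registered and that the queue will activate the procedure:

-- is the server-level event notification in place?
SELECT name, service_name
FROM sys.server_event_notifications
WHERE name = 'LogDeadlocks';
 
-- is activation wired up on the queue?
SELECT name, is_activation_enabled, activation_procedure, max_readers
FROM DeadlockLogging.sys.service_queues
WHERE name = 'LogDeadlocksQueue';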

Clean Up
And when you’re all done, this code will clean up this whole experiment.

use master;
 
if (db_id('DeadlockLogging') is not null)
begin
    alter database DeadlockLogging set single_user with rollback immediate 
    drop database DeadlockLogging
end
 
if exists (select * from sys.server_event_notifications where name = 'LogDeadlocks')
    DROP EVENT NOTIFICATION LogDeadlocks ON SERVER;

July 27, 2016

Do you Use CLR in SQL Server?

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 11:11 am

We don’t use CLR assemblies in SQL Server. For us, programming in the database means that maybe “you’re doing it wrong”. But there have been rare circumstances where I’ve wondered about what the feature can do for us.

For example, creating a CLR assembly to do string processing for a one-time data migration might be preferable to writing regular SQL using SQL Server’s severely limited built-in functions that do string processing.

Deployment Issues

I’ve always dismissed CLR as part of any solution because the deployment story was too cumbersome. We enjoy some really nice automated deployment tools. To create an assembly, SQL Server needs to be able to access the dll. And all of a sudden our deployment tools need more than just a connection string, the tools now need to be able to place a file where SQL Server can see it… or so I thought.

Deploy Assemblies Using Bits

CREATE ASSEMBLY supports specifying a CLR assembly as bits: a binary literal that can be written in regular T-SQL. The full method is described in Deploying CLR Database Objects. In practice, the CREATE ASSEMBLY statement looks something like:

CREATE ASSEMBLY [MyAssembly]
FROM 0x4D5A900003000000040000... -- truncated binary literal
WITH PERMISSION_SET = SAFE

This completely gets around the need for deployments to use the file system. I was unaware of this option until today.
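
If you’re wondering how to produce that hex string in the first place, one approach (a sketch, with a hypothetical path, run somewhere that can read the dll) is to let SQL Server format the file as a binary literal:

-- the path is hypothetical; run this where the dll is accessible, such as a build machine
SELECT '0x' + CONVERT(varchar(max), BulkColumn, 2) AS assembly_bits
FROM OPENROWSET(BULK N'C:\build\MyAssembly.dll', SINGLE_BLOB) AS dll;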

Your Experience

So what’s your experience? My mistaken assumptions kept me from evaluating CLR properly. I wonder if anyone is in the same position I was in, and whether this accounts for the generally low adoption of CLR in SQL Server. Answer this survey: which option best describes you?


External Link to Survey Monkey survey.

Update July 29: Here are the answers so far:
[Chart: survey answers]

July 20, 2016

Simplified Order Of Operations

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 8:00 am

I recently learned that when combining multiple operators in a SQL expression, AND has a higher precedence than OR but & has the same precedence as |. I expected the precedence rules for the logical operators to be consistent with the bitwise operators.

Even Stephen Sondheim seemed to struggle with this.

AND is Always Evaluated Before OR

SELECT 'TRUE' 
WHERE (1 = 1) OR (1 = 1) AND (1 = 0)
-- returns TRUE
 
SELECT 'TRUE' 
WHERE (1 = 0) AND (1 = 1) OR (1 = 1) 
-- returns TRUE

& and | are Evaluated Left To Right

SELECT 1 | 1 & 0
-- returns 0
 
SELECT 0 & 1 | 1
-- returns 1

Here Are The Official Docs

Here’s what Microsoft says about SQL Server’s Operator Precedence.

  1. ~ (Bitwise NOT)
  2. * (Multiply), / (Division), % (Modulo)
  3. + (Positive), – (Negative), + (Add), + (Concatenate), – (Subtract), & (Bitwise AND), ^ (Bitwise Exclusive OR), | (Bitwise OR)
  4. =, >, <, >=, <=, <>, !=, !>, !< (Comparison operators)
  5. NOT
  6. AND
  7. ALL, ANY, BETWEEN, IN, LIKE, OR, SOME
  8. = (Assignment)

Practical Subset

I have a book on my shelf called Practical C Programming published by O’Reilly (the cow book) by Steve Oualline. I still love it today because although I don’t code in C any longer, the book remains a great example of good technical writing.

That book has some relevance to SQL today. Instead of memorizing the full list of operators and their precedence, Steve gives a practical subset:

  1. * (Multiply), / (Division)
  2. + (Add), – (Subtract)
  3. Put parentheses around everything else.
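
Applied to the earlier bitwise example, a quick sketch of that advice in practice:

-- relying on left-to-right evaluation
SELECT 1 | 1 & 0;   -- (1 | 1) & 0 = 0
 
-- parentheses leave nothing to chance
SELECT 1 | (1 & 0); -- returns 1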

July 7, 2016

Prioritize This List of Issues (Results)

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 8:00 am

Earlier this week I asked people to help me out by prioritizing a list of issues. I was surprised by the number of people who participated. I think I missed an opportunity to crowd-source a large part of my job by not including my real issues.

Results

Thanks for participating. After the results started coming in, I realized that my question was a bit ambiguous. Does first priority mean tackle an issue first? Or does a higher numbered issue mean a higher priority? I clarified the question and took that into account for entries that picked sproc naming conventions as top priority.

Votes

The results were cool. I expected a variety of answers but I found that most people’s priorities lined up pretty nicely.

For example, even though I wrote a list of issues all with different severity, there were three issues that stood out as most critical: Corrupted databases, a SQL injection vulnerability and No automated backups. Keeping statistics up to date seemed to be the most important non-critical issue.

But there was one issue whose rankings showed a lot of variety: index fragmentation. I personally placed this one second last. I got a friend to take the survey and got to hear him explain his choices. He wanted to tackle index fragmentation early because it’s so easily fixable. “It’s low hanging fruit right? Just fix it and move on.”

My friend also pointed out that this technique would work well as an interview technique. Putting priorities in an order is important but even better is that it invites so much discussion about the reasons behind the order.

Speaking of which, go check out Chuck Rummel’s submission. He wins the prize for most thorough comment on my blog.

My Priorities

Here they are:

  • Corrupted database – serving data is what databases are supposed to do
  • No automated backups – protect that data from disasters
  • A SQL injection vulnerability – protect the data from unauthorized users
  • Stale statistics – serve data efficiently
  • Cursors – a common cause of performance issues, but I’d want to be reactive
  • GUID identifiers – meh
  • NOLOCK hints – meh
  • Developers use a mix of parameterized SQL and stored procedures – It’s not a performance concern for me
  • Fragmented indexes – supposedly better performance?
  • Sprocs prefixed with “sp_” – aesthetics?

July 5, 2016

Prioritize This List of Issues

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 9:14 am

“One thing at a time / And that done well / Is a very good thing / As any can tell”

But life isn’t always that easy is it? I spend a lot of my workday juggling priorities. And I want to compare what I think to others. So I wrote a survey which explores the importance people place on different SQL Server issues. It’s easy to say avoid redundant indexes. But does it follow that it’s more important to clean up redundant indexes before rewriting cursors?

The List

Prioritize this list from greatest concern to least. So if an item appears above another item, then you would typically tackle that issue first.

  • Corrupted database
  • A SQL injection vulnerability
  • Stale statistics
  • Fragmented indexes
  • Developers use a mix of parameterized SQL and stored procedures
  • Sprocs prefixed with “sp_”
  • Cursors
  • GUID identifiers
  • NOLOCK hints
  • No automated backups

I want to hear what you think. Submit your prioritized list in the comments, or by using this survey: https://www.surveymonkey.com/r/MV9F9YT

I’ll be posting my own answers on Thursday, July 7, 2016.

Update: I’ve shared the results. Prioritize This List Of Issues (Results)

May 30, 2016

One SSMS Improvement You Might Have Missed

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 2:36 pm

Takeaway: Undocked query windows in SSMS are now top-level windows.

SSMS Release Cycle

As you may know, SQL Server Management Studio (SSMS) now has its own release cycle, independent of SQL Server’s release cycle. This means the Microsoft team who work on SSMS get to release as often as they like. And it looks like they are: they’ve released five times so far in 2016.

Many of the changes are small changes, and many of them don’t impact me, but I noticed one cool change that I’d like to draw more attention to.

Undocked Query Windows are Spiffier

The March 2016 Refresh (13.0.13000.55 Changelog) updates SSMS to use the new Visual Studio 2015 shell. Part of that change means that undocked windows are now top-level windows.

Top-level windows are windows without parents, so the undocked window is not a child window of the main SSMS window (but it is part of the same process). And so it gets its own space in the task bar and participates in alt+tab when you switch between windows.

Also these undocked windows can be a collection of query windows. Compare the new style with the old style.

Old Style, limit of one query window:

[Screenshot: old-style undocked query window]

New Style, many query windows:

[Screenshot: new-style undocked query windows]

If you’re a multitasking Developer or DBA who works with SSMS a lot, I think you’ll like this new feature. Undocked query windows now feel like real windows.

Remember SSMS is free (even though SQL Server is not). If you want to download the latest version of SSMS, you can do that here.

April 27, 2016

You Can’t Force Query Plans If They Use TVPs With PKs

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 12:24 pm

Have you ever played “Fortunately/Unfortunately”? It’s a game where players alternately give good news and bad news. It goes something like this:

Databases such as SQL Server make it easy to retrieve sets of data.
Unfortunately, it’s kind of awkward to send sets of data to SQL Server.
Fortunately, table-valued parameters (TVPs) make this easier.
Unfortunately, queries that use TVPs often suffer from non-optimal query plans.
Fortunately, creating primary key or unique key constraints gives us a way to index table types.
Unfortunately, those constraints prevent any kind of plan forcing.
Fortunately, SQL Server 2014 lets us create named indexes for table types which lets us force plans if we need to.


Let’s break this down:

Sending Sets of Data to SQL Server is Awkward

It always has been. Originally, developers were forced to send a CSV string to SQL Server and write a do-it-yourself function to split the string into a set of values.

  • In 2005, Microsoft introduced XML and CLR which let developers shred or split strings in new ways,
  • In 2008, Microsoft introduced table-valued parameters,
  • In 2014, they introduced In-Memory TVPs,
  • In 2016, there’s a new STRING_SPLIT() function

So there are more options now than there ever have been, and they each have their own issues.

Aaron Bertrand explores some of those performance issues in STRING_SPLIT() in SQL Server 2016. It’s a specific use-case where he focuses on duration. In our case, we focus on aggregated system load like worker time or reads so we don’t necessarily value parallel query plans. But I love his methods. He gives us tools that let us evaluate our own situation based on our own criteria.

I’m going to focus on TVPs, which are the most natural method of sending sets of data to SQL Server from a syntax point of view.

Indexes on Table Types

Table-valued parameters are implemented using table types. Before SQL Server 2014, the only way to index a table type was to define a primary key or a unique key on it like this:

create type dbo.TypeWithPK 
    as table ( id int primary key );

The syntax for CREATE TYPE prevents us from naming our primary key, and this turns out to be important. Every time I define and use a table variable, SQL Server will dynamically generate a name for the primary key. So when I look at the plan for

declare @ids dbo.TypeWithPK;
select * from @ids

I see that it has a primary key named [@ids].[PK__#A079849__3213E83FDB6D7A43]:
[Screenshot: execution plan showing the generated primary key name]

As I’ll show later, this dynamically generated name prevents any kind of query plan forcing. But as of SQL Server 2014, we can include indexes in our table type definitions. More importantly, we can name those indexes:

create type dbo.TypeWithIndex 
    as table ( id int index IX_TypeWithIndex );
go
declare @ids dbo.TypeWithIndex;
select * from @ids;

This has an index named [@ids].[IX_TypeWithIndex], which is what we expect.

Plan Forcing is Not Allowed For TVPs with PKs

Where does plan forcing fit in your tool belt? For me, I’ve never used plan forcing as a permanent solution to a problem, but when I see a query that often suffers from suboptimal query plan choices, I look to plan guides to give me some stability while I work at fixing and deploying a permanent solution.

Plan forcing in SQL Server involves specifying a plan for a particular query. But the primary key name for a table variable is always different so the specified query plan is never going to match. In other words SQL Server is never going to use your query plan because your plan includes index [@ids].[PK__#A079849__3213E83FDB6D7A43], but the query it’s compiling has a differently named index like [@ids].[PK__#AA02EED__3213E83FAF123E51].

If you try, this is what that failure looks like:

USE PLAN
If you try to use the USE PLAN query hint, you’ll get error 8712:

Msg 8712, Level 16, State 0, Line 15
Index '@ids.PK__#B305046__3213E83F57A32F24', specified in the USE PLAN hint, does not exist. Specify an existing index, or create an index with the specified name.

Plan Guides
If you try to force the plan by creating a plan guide, you’ll also see message 8712:

select *
from sys.plan_guides
cross apply fn_validate_plan_guide(plan_guide_id)
-- Index '@ids.PK__#BA711C0__3213E83F44A3F2C8', specified in the USE PLAN hint, does not exist. Specify an existing index, or create an index with the specified name.
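
For context, creating that plan guide from cache might look like the sketch below (the guide name is hypothetical, and it assumes the query’s plan is still cached); validating it afterward with the query above is what surfaces error 8712:

-- a sketch: capture the cached plan for the TVP query and turn it into a plan guide
DECLARE @plan_handle varbinary(64), @offset int;
 
SELECT TOP (1)
       @plan_handle = qs.plan_handle,
       @offset = qs.statement_start_offset
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
WHERE st.[text] LIKE N'%TypeWithPK%'; -- one (hypothetical) way to find the statement
 
EXEC sp_create_plan_guide_from_handle
    @name = N'ForceTVPPlan', -- hypothetical guide name
    @plan_handle = @plan_handle,
    @statement_start_offset = @offset;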

Query Store
And if you try to force a plan using SQL Server 2016’s Query Store, you’ll see this:

select plan_id, last_force_failure_reason_desc
from sys.query_store_plan
where is_forced_plan = 1
-- last_force_failure_reason_desc = 'NO_INDEX'

Summary

When defining table variables, avoid primary key or unique key constraints. Opt instead for named indexes if you’re using SQL Server 2014 or later. Otherwise, be aware that plan forcing is limited to queries that don’t use these table variables.

April 20, 2016

Are You Programming In The Database?

Typically, T-SQL is not the best platform for programming (understatement). If you have many procedures that call other procedures, that’s a signal that you might be programming in the database.

Find out using this query:

select 
    OBJECT_SCHEMA_NAME(p.object_id) as schemaName, 
    OBJECT_NAME(p.object_id) as procedureName,
    count(*) as [calls to other procedures]	
from sys.procedures p
cross apply sys.dm_sql_referenced_entities(schema_name(p.schema_id) + '.' + p.name, 'OBJECT') re
where re.referenced_entity_name in (select name from sys.procedures)
group by p.object_id
order by count(*) desc;

In Adventureworks, we see this result:
[Screenshot: query results]

To drill down into those results, use this query:

select distinct
    QUOTENAME(OBJECT_SCHEMA_NAME(p.object_id)) + '.' 
        + QUOTENAME(OBJECT_NAME(p.object_id)) [This procedure...], 
    QUOTENAME(OBJECT_SCHEMA_NAME(p_ref.object_id)) + '.' 
        + QUOTENAME(OBJECT_NAME(p_ref.object_id)) [... calls this procedure]
from sys.procedures p
cross apply sys.dm_sql_referenced_entities(schema_name(p.schema_id) + '.' + p.name, 'OBJECT') re
join sys.procedures p_ref
	on re.referenced_entity_name = p_ref.name
order by 1,2

which gives results like this:
[Screenshot: drill-down query results]

Adventureworks seems just fine to me. Only four instances of procedures calling procedures. I looked at the database I work with most. Hundreds of procedures (representing 15% of the procedures) call other procedures. On the other end of the spectrum is Stackoverflow. I understand that they don’t use stored procedures at all.

April 11, 2016

Tackle WRITELOG Waits Using the Transaction Log and Extended Events

Takeaway: WRITELOG waits are associated with a busy or slow transaction log. To tackle these waits, we need to measure transaction log activity. I describe a lightweight way to examine transaction log usage for busy OLTP systems.

Tackle WRITELOG

Start with Microsoft’s Advice: I’m not going to introduce the topic of transaction log performance. Microsoft’s SQL Customer Advisory Team already provides a great introduction with Diagnosing Transaction Log Performance Issues and Limits of the Log Manager. Their advice includes watching the “Log Bytes Flushed/sec” performance counter found in the “SQL Server:Databases” object.
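
If you prefer to sample that counter from T-SQL, something like this sketch works; keep in mind the value is cumulative, so take two samples and compare the difference:

-- cumulative counter value; sample twice and diff to get a per-second rate
SELECT instance_name AS database_name,
       cntr_value AS log_bytes_flushed_total
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%:Databases%'
  AND counter_name = 'Log Bytes Flushed/sec';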

Reactive Efforts: If you’re curious about transaction log activity for open transactions, Paul Randal has a script at Script: open transactions with text and plans.

Spiky Activity: It’s not too difficult to find infrequent activities that write a lot of data to the transaction log; activities like data warehouse ETLs, or index rebuilds. Use a trace or extended events to look for statements with large values for “writes”.
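
As a sketch of that approach, an extended events session filtered on the writes column might look like this (the session name and threshold are made up):

-- capture statements that report a lot of writes (threshold is arbitrary)
CREATE EVENT SESSION big_writers ON SERVER
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (sqlserver.sql_text, sqlserver.database_name)
    WHERE writes > 1000 )
ADD TARGET package0.event_file ( SET filename = N'big_writers.xel' );
-- ALTER EVENT SESSION big_writers ON SERVER STATE = START;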

Scalability of OLTP Workloads

WRITELOG waits are a scalability challenge for OLTP workloads under load. Chris Adkin has a lot of experience tuning SQL Server for high-volume OLTP workloads, so I’m going to follow his advice when he writes that we should minimize the amount of logging generated. And because I can’t improve something I can’t measure, I wonder: what’s generating the most logging? OLTP workloads are characterized by frequent tiny transactions, so I want to measure that activity without filters, but I want to have as little impact on the system as I can. That’s my challenge.

Getting #SQLHelp

So I asked twitter. And I got some great advice from Erin Stellato:
[Screenshot: #sqlhelp conversation with Erin Stellato]
Erin also pointed out that the UI warns you that it’s a very high volume event.

Combining fn_dblog With Extended Events

So to avoid that kind of volume, I got the idea to read straight from the transaction log and combine that with a lighter extended events session to get the SQL text. The transaction_id captured by the extended events session corresponds to the XAct ID column in fn_dblog.

Here’s how that went:

The Script
The details for this script are kind of fussy, but it all comes together in a solution that won’t drag a server down. Care is still recommended; start with 10 seconds and go from there.

declare @Duration varchar(10) = '00:00:10';
declare @FileSize varchar(10) = '5'; -- in megabytes
 
-- create session
DECLARE @CreateSessionSQL nvarchar(max) = N'
    CREATE EVENT SESSION query_writes ON SERVER 
    ADD EVENT sqlserver.sp_statement_completed ( 
        SET collect_statement=(0)
        ACTION(sqlserver.transaction_id, sqlserver.database_name)
        WHERE sqlserver.transaction_id > 0
          AND sqlserver.database_name = ''' + DB_NAME() + N''')
    ADD TARGET package0.asynchronous_file_target(
      SET filename = N''query_writes.xel'',
          max_file_size = ' + @FileSize + N',
          max_rollover_files = 1)
    WITH (
        STARTUP_STATE=ON,
        EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
        TRACK_CAUSALITY=OFF)';
exec sp_executesql @CreateSessionSQL;
 
ALTER EVENT SESSION query_writes ON SERVER
    STATE=START;
 
-- get the latest lsn for current DB
declare @xact_seqno binary(10);
declare @xact_seqno_string_begin varchar(50);
exec sp_replincrementlsn @xact_seqno OUTPUT;
set @xact_seqno_string_begin = '0x' + CONVERT(varchar(50), @xact_seqno, 2);
set @xact_seqno_string_begin = stuff(@xact_seqno_string_begin, 11, 0, ':')
set @xact_seqno_string_begin = stuff(@xact_seqno_string_begin, 20, 0, ':');
 
-- wait for the specified duration
waitfor delay @Duration;
 
-- get the latest lsn for current DB
declare @xact_seqno_string_end varchar(50);
exec sp_replincrementlsn @xact_seqno OUTPUT;
set @xact_seqno_string_end = '0x' + CONVERT(varchar(50), @xact_seqno, 2);
set @xact_seqno_string_end = stuff(@xact_seqno_string_end, 11, 0, ':')
set @xact_seqno_string_end = stuff(@xact_seqno_string_end, 20, 0, ':');
 
-- Stop the session
ALTER EVENT SESSION query_writes ON SERVER
    STATE=STOP;
 
-- read from transaction log
select 
    max([Xact ID]) as transactionId,
    max([Transaction Name]) as transactionName, 
    sum([Log Record Length]) as logSize,
    count(*) as [logRowCount]
into #TLOGS
from fn_dblog(@xact_seqno_string_begin, @xact_seqno_string_end) f
group by [Transaction Id]
 
-- read from session data
CREATE TABLE #SessionData (
    id int identity primary key,
    XEXml xml NOT NULL    
)
 
INSERT #SessionData(XEXml)
SELECT CAST(fileData.[event_data] as xml)
FROM sys.fn_xe_file_target_read_file ( 'query_writes*xel', null, null, null) fileData;
 
-- find minimum transactionId captured by xes 
-- (almost always the first one, depending on luck here)
declare @minTXFromSession bigint;
select TOP (1) @minTXFromSession = S.XEXml.value(
    '(/event/action[(@name=''transaction_id'')]/value)[1]', 'bigint')
from #SessionData S;
 
WITH SD AS
(
    SELECT 
        S.XEXml.value(
            '(/event/action[(@name=''transaction_id'')]/value)[1]', 'bigint') as transactionId,
        S.XEXml.value(
            '(/event/data[(@name=''object_id'')]/value)[1]', 'bigint') as objectId
    FROM #SessionData S
)
SELECT 
    ISNULL(T.transactionName, 'Unknown') as transactionTypeName, 
    OBJECT_NAME(S.objectid) as ObjectName,
    SUM(T.logsize) as totalLogSizeBytes,
    SUM(T.logRowCount) as totalLogRowCount,
    COUNT(*) as executions
FROM #TLOGS T
LEFT JOIN (SELECT DISTINCT * FROM SD) S
    ON T.transactionId = S.transactionId
WHERE T.transactionId >= @minTXFromSession
GROUP BY T.transactionName, S.objectId
ORDER BY SUM(T.logsize) DESC
 
-- clean up
DROP EVENT SESSION query_writes ON SERVER;
DROP TABLE #TLOGS
DROP TABLE #SessionData

Sample Results

Here’s an example of what the results would look like. It’s an aggregated view of all transaction log activity in a database for 10 seconds.
[Screenshot: sample results]

Notes

  • Notice that the session is database specific. That’s because transaction logs are database specific. To help focus on the right database, use the “Log Bytes Flushed/sec” performance counter found in the “SQL Server:Databases” object.
  • Also notice that I’m tracking ObjectIds. That’s because we use procedures quite heavily. You may want to adapt this code to use query_hash instead (a sketch follows these notes). In both cases, collecting the statement text is not recommended.
  • The sample of data is limited by the size of the extended events target file or the duration variable, whichever is smaller.
  • @sqL_handLe pointed out to me that reading the log using fn_dblog will prevent the transaction log from truncating. Reading from the transaction log can be very tricky to do efficiently. Luckily we can use the sp_replincrementlsn trick to get LSN parameter values for fn_dblog.
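
Here’s a minimal sketch of that query_hash variation (just the session definition, not the full script above); the XML shredding in the procedure would then read the query_hash action instead of object_id:

-- a sketch: collect query_hash rather than object_id
CREATE EVENT SESSION query_writes_by_hash ON SERVER
ADD EVENT sqlserver.sp_statement_completed (
    SET collect_statement = (0)
    ACTION (sqlserver.transaction_id, sqlserver.query_hash)
    WHERE sqlserver.transaction_id > 0 )
ADD TARGET package0.event_file (
    SET filename = N'query_writes_by_hash.xel', max_file_size = 5, max_rollover_files = 1 );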
