Michael J. Swart

July 27, 2016

Do you Use CLR in SQL Server?

Filed under: Miscellaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 11:11 am

We don’t use CLR assemblies in SQL Server. For us, programming in the database means that maybe “you’re doing it wrong”. But there have been rare circumstances where I’ve wondered about what the feature can do for us.

For example, creating a CLR assembly to do string processing for a one-time data migration might be preferable to writing regular SQL using SQL Server’s severely limited built-in functions that do string processing.

Deployment Issues

I’ve always dismissed CLR as part of any solution because the deployment story was too cumbersome. We enjoy some really nice automated deployment tools. To create an assembly, SQL Server needs to be able to access the dll. And all of a sudden our deployment tools need more than just a connection string, the tools now need to be able to place a file where SQL Server can see it… or so I thought.

Deploy Assemblies Using Bits

CREATE ASSEMBLY supports specifying a CLR assembly using bits, a bit stream that can be specified using regular T-SQL. The full method is described in Deploying CLR Database Objects. In practice, the CREATE ASSEMBLY statement looks something like:

CREATE ASSEMBLY [MyAssembly]
FROM 0x4D5A900003000000040000... -- truncated binary literal
WITH PERMISSION_SET = SAFE

This completely gets around the need for deployments to use the file system. I was unaware of this option until today.
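In case it’s useful, here’s a sketch of one way to generate that statement. The dll path and assembly name are placeholders; the idea is to read the dll as a blob on any machine where you do have file access, then run the generated statement against the target server, which never needs to see the file system:

```sql
-- Build the CREATE ASSEMBLY ... FROM 0x... statement from a local dll.
-- Path and assembly name are hypothetical.
DECLARE @bits varbinary(max) = (
    SELECT BulkColumn
    FROM OPENROWSET(BULK N'C:\temp\MyAssembly.dll', SINGLE_BLOB) AS dll
);

SELECT N'CREATE ASSEMBLY [MyAssembly] FROM '
     + CONVERT(nvarchar(max), @bits, 1) -- style 1 renders the 0x... hex literal
     + N' WITH PERMISSION_SET = SAFE;' AS deploymentScript;
```

Paste the resulting deploymentScript into your deployment tool and it runs over a plain connection string.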

Your Experience

So what’s your experience? My mistaken assumptions kept me from evaluating CLR properly. I wonder whether anyone else is in the same position I was in, and whether that accounts for the generally low adoption of CLR in SQL Server. Answer this survey: which option best describes you?


External Link to Survey Monkey survey.

Update (July 29): Here are the answers so far:
[chart: survey answers]

July 20, 2016

Simplified Order Of Operations

Filed under: Miscellaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 8:00 am

I recently learned that when combining multiple operators in a SQL expression, AND has a higher precedence than OR but & has the same precedence as |. I expected the precedence rules for the logical operators to be consistent with the bitwise operators.

Even Stephen Sondheim seemed to struggle with this.

AND is Always Evaluated Before OR

SELECT 'TRUE' 
WHERE (1 = 1) OR (1 = 1) AND (1 = 0)
-- returns TRUE
 
SELECT 'TRUE' 
WHERE (1 = 0) AND (1 = 1) OR (1 = 1) 
-- returns TRUE

& and | are Evaluated Left To Right

SELECT 1 | 1 & 0
-- returns 0
 
SELECT 0 & 1 | 1
-- returns 1

Here Are The Official Docs

Here’s what Microsoft says about SQL Server’s Operator Precedence.

  1. ~ (Bitwise NOT)
  2. * (Multiply), / (Division), % (Modulo)
  3. + (Positive), – (Negative), + (Add), + (Concatenate), – (Subtract), & (Bitwise AND), ^ (Bitwise Exclusive OR), | (Bitwise OR)
  4. =, >, <, >=, <=, <>, !=, !>, !< (Comparison operators)
  5. NOT
  6. AND
  7. ALL, ANY, BETWEEN, IN, LIKE, OR, SOME
  8. = (Assignment)

Practical Subset

I have a book on my shelf called Practical C Programming published by O’Reilly (the cow book) by Steve Oualline. I still love it today because although I don’t code in C any longer, the book remains a great example of good technical writing.

That book has some relevance to SQL today. Instead of memorizing the full list of operators and their precedence, Steve gives a practical subset:

  1. * (Multiply), / (Division)
  2. + (Add), – (Subtract)
  3. Put parentheses around everything else.
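Applied to the earlier logical-operator examples, Steve’s advice looks like this trivial sketch. Both queries are equivalent, but the second one spells out the intent:

```sql
-- Relying on precedence (AND evaluated before OR):
SELECT 'TRUE' WHERE (1 = 1) OR (1 = 1) AND (1 = 0);

-- Same expression, with the intent made explicit by parentheses:
SELECT 'TRUE' WHERE (1 = 1) OR ((1 = 1) AND (1 = 0));
```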

July 7, 2016

Prioritize This List of Issues (Results)

Filed under: Miscellaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 8:00 am

Earlier this week I asked people to help me out prioritizing a list of issues. I was surprised by the number of people who participated. I think I missed an opportunity to crowd-source a large part of my job by including my real issues.

Results

Thanks for participating. After the results started coming in, I realized that my question was a bit ambiguous. Does first priority mean tackle an issue first? Or does a higher numbered issue mean a higher priority? I clarified the question and took that into account for entries that picked sproc naming conventions as top priority.

[chart: votes]

The results were cool. I expected a variety of answers but I found that most people’s priorities lined up pretty nicely.

For example, even though I wrote a list of issues all with different severity, there were three issues that stood out as most critical: Corrupted databases, a SQL injection vulnerability and No automated backups. Keeping statistics up to date seemed to be the most important non-critical issue.

But there was one issue with a lot of variety: index fragmentation. I personally placed it second last. I got a friend to take the survey and I got to hear him explain his choices. He wanted to tackle index fragmentation early because it’s so easily fixable. “It’s low hanging fruit right? Just fix it and move on.”

My friend also pointed out that this exercise would work well in interviews. Putting priorities in order is useful, but even better is that it invites so much discussion about the reasons behind the order.

Speaking of which, go check out Chuck Rummel’s submission. He wins the prize for most thorough comment on my blog.

My Priorities

Here they are:

  • Corrupted database – serving data is what databases are supposed to do
  • No automated backups – protect that data from disasters
  • A SQL injection vulnerability – protect the data from unauthorized users
  • Stale statistics – serve data efficiently
  • Cursors – a common cause of performance issues, but I’d want to be reactive
  • GUID identifiers – meh
  • NOLOCK hints – meh
  • Developers use a mix of parameterized SQL and stored procedures – It’s not a performance concern for me
  • Fragmented indexes – supposedly better performance?
  • Sprocs prefixed with “sp_” – aesthetics?

July 5, 2016

Prioritize This List of Issues

Filed under: Miscellaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 9:14 am

“One thing at a time / And that done well / Is a very good thing / As any can tell”

But life isn’t always that easy is it? I spend a lot of my workday juggling priorities. And I want to compare what I think to others. So I wrote a survey which explores the importance people place on different SQL Server issues. It’s easy to say avoid redundant indexes. But does it follow that it’s more important to clean up redundant indexes before rewriting cursors?

The List

Prioritize this list from greatest concern to least. So if an item appears above another item, then you would typically tackle that issue first.

  • Corrupted database
  • A SQL injection vulnerability
  • Stale statistics
  • Fragmented indexes
  • Developers use a mix of parameterized SQL and stored procedures
  • Sprocs prefixed with “sp_”
  • Cursors
  • GUID identifiers
  • NOLOCK hints
  • No automated backups

I want to hear what you think. Submit your prioritized list in the comments, or by using this survey: https://www.surveymonkey.com/r/MV9F9YT

I’ll be posting my own answers on Thursday, July 7, 2016.

Update: I’ve shared the results. Prioritize This List Of Issues (Results)

June 21, 2016

T-SQL Tuesday #079 Roundup: It’s 2016!

Filed under: Miscellaneous SQL,SQLServerPedia Syndication — Michael J. Swart @ 7:54 pm

The invite post

Twenty Snazzy Bloggers!

There’s always some anxiety when throwing a party. Wondering whether it will be a smash. Well I had nothing to worry about with the twenty bloggers who participated last week. You guys hit it out of the park!
[image: all the participants]

The Round Up

Let’s get to it:

Rob Farley
1. Rob Farley (@rob_farley) SQL Server 2016 Temporal Table Query Plan Behaviour
Rob digs into the query optimizer to highlight an interesting plan choice when temporal tables are involved. So extra care is needed when indexing and testing features that use temporal tables.
What I Thought: I really enjoy Rob’s posts that dig into query plans and the optimizer in general.

Did you know that Rob has participated in every single T-SQL Tuesday? Even the rotten months when there’s like only three participants. Rob’s one of them.

Ginger Grant
2. Ginger Grant (@DesertIsleSQL) Creating R Code to run on SQL Server 2016
Ginger writes about how to get started coding with the R language.
What I Thought: If you want to do the kind of analysis that R enables then bookmark her website! Her blog is becoming a real resource when it comes to working with R and SQL Server.

Ewald Cress
3. Ewald Cress (@sqlOnIce) Unsung SQLOS: the 2016 SOS_RWLock
Have you ever had to troubleshoot spinlock bottlenecks? You have my sympathies. Spinlocks are meant to be internal to SQL Server, something that Microsoft worries about. Ewald writes about new SOS_RWLOCK improvements.
What I Thought: Do you like talks by Bob Ward? Do you like to dig into SQL Server’s internals? Does inspecting SQL Server debuggers and callstacks sound like a fun evening? This post is for you.

Russ Thomas
4. Russ Thomas (@SQLJudo) When the Memory is All Gone
Russ hasn’t posted to his blog recently because he’s been busy creating In-Memory OLTP courseware. So I’m glad that Russ is taking a break from that, returning to blogging, and writing about … In-Memory OLTP.
What I Thought: I like Russ’s style. He has to keep things professional and respectable for Pluralsight, but on his own blog he gets to talk about sandwiches, ulcers and tempdb.

Guy Glantser
5. Guy Glantser (@guy_glantser) We Must Wait for SP1 before Upgrading to SQL Server 2016
To be clear, Guy talks about why the wait-for-sp1 advice doesn’t apply.
What I Thought: One sentence Guy wrote struck a chord with me: “I have never agreed with this claim, but in my first years as a DBA I didn’t have enough experience and confidence to claim otherwise, so I just assumed those people know what they’re talking about.” Wow, I recognize that feeling. All I can say is that when you get past that feeling, that’s a really, really good sign.

Patrick Keisler
6. Patrick Keisler (@patrickkeisler) SQL Server 2016 Launch Discovery Day (aka Hackathon)
Patrick gives a recap on a SQL Server hackathon called “SQL Server 2016 Launch Discovery Day”.
What I Thought: Wow, read the whole post, but if you can’t, at least check out the challenges and the scoring criteria. For example, the first challenge is to answer “From how far away do attendees travel to SQL Saturday? Are there any geographic trends to the distances traveled?” How would you approach this?

Justin Goodwin
7. Justin Goodwin (@SQL_Jgood) Use the Query Store In SQL Server 2016 to Improve HammerDB Performance
Justin describes query regressions he found when running benchmark workloads against SQL Server 2014 and SQL Server 2016. Then, in 2016, he shows how to use the Query Store to fix them!
What I Thought: Fantastic post. And probably the best organized post of the month.

Kenneth Fisher
8. Kenneth Fisher (@sqlstudent144) Comparing two query plans
My friend Ken introduces the SSMS feature of comparing query plans.
What I Thought: I have to admit that after reading his post I tried the feature out last Tuesday, and I’ve found myself using it a few times since. It’s a valuable tool for database developers. Thanks Ken!

Andy Mallon
9. Andy Mallon (@Amtwo) COMPRESS()ing LOB data
Andy wrote about the new functions COMPRESS and DECOMPRESS, which compress and decompress data using the gzip algorithm. Andy gives an example of how you would use COMPRESS and what situations make the best use of this feature.
What I Thought: I did not know about this. It’s a feature that I will use (once we adopt 2016).
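For the curious, here’s a minimal sketch of the round trip (SQL Server 2016+). Note that DECOMPRESS returns varbinary, so you cast back to the original type yourself:

```sql
-- COMPRESS gzips a value; DECOMPRESS reverses it.
DECLARE @original   varchar(max)   = REPLICATE(CAST('A' AS varchar(max)), 10000);
DECLARE @compressed varbinary(max) = COMPRESS(@original);

SELECT DATALENGTH(@original)   AS originalBytes,   -- 10000
       DATALENGTH(@compressed) AS compressedBytes, -- far smaller for repetitive data
       CAST(DECOMPRESS(@compressed) AS varchar(max)) AS roundTrip;
```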

Erik Darling
10. Erik Darling (Erik Darling) Availability Groups, Direct Seeding and You
Erik introduces direct seeding for Availability Groups. Direct seeding lets DBAs avoid a step when launching a new replica.
What I Thought: Apparently if you’re a DBA this is a CoolThing™. I kind of lost my taste for replication solutions – specifically transactional replication on flaky networks – in 2004. It’s nice to see that eleven years later, Microsoft is still working on making it easier to “move data from here to there”. (On the sincerity-sarcasm meter, that lies somewhere in the middle)

Deb Melkin
11. Deb Melkin (@dgmelkin) Temporal Tables
Temporal tables – not to be confused with temp tables – are the subject of Deb’s post. Earlier Rob Farley warned us to be careful about indexes on temporal tables. In Deb’s post, we’re warned about the precision of datetime2 columns in temporal tables.
What I Thought: Thanks for writing Deb, I like your perspective and how you walk us through your thinking. Very compelling.
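For reference, a minimal system-versioned (temporal) table looks something like this sketch. The table and column names here are made up; note the datetime2 period columns, the ones whose precision Deb’s post discusses:

```sql
-- Minimal temporal table, SQL Server 2016+. Names are hypothetical.
CREATE TABLE dbo.Widget
(
    WidgetId  int          NOT NULL PRIMARY KEY,
    Name      nvarchar(50) NOT NULL,
    ValidFrom datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   datetime2 GENERATED ALWAYS AS ROW END   NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.WidgetHistory));
```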

Chrissy LeMaire
12. Chrissy LeMaire (@cl) So Why is sp_help_revlogin Still a Thing?
Chrissy LeMaire, PowerShell MVP, writes about how it’s still hard to migrate logins using only SQL Server. However, there’s an easy solution using PowerShell.
What I Thought: Chrissy is one of the bloggers who used my “It’s 2016, why is X still a thing?” writing prompt. Chrissy also points out how rapidly PowerShell is growing to better support DBAs. It turns into a very exciting story.

Aaron Bertrand
13. Aaron Bertrand (@AaronBertrand) This Weird Trick Still Missing From SQL Server 2016
Fellow Canadian and all-around good guy Aaron also used the “It’s 2016, why is X still (not) a thing?” format. As developers, we’d like to be able to use Developer edition and deploy to Standard edition confidently, without worrying that we’ve accidentally used some feature only available in Enterprise edition.
What I Thought: Aaron’s absolutely right and I’ve been burned by this missing feature in the past. He links to two connect items and asks us to vote on them and leave business impact comments. I encourage you to do that too.

Robert Pearl
14. Robert Pearl (@PearlKnows) It’s SQL Server 2016 – stupid
Robert writes about SQL Server 2016 launch activities and three features he’s excited about, temporal tables, query store and better default tempdb configuration.
What I Thought: It appears that NYC was the place to be when SQL Server 2016 launched. Maybe I’ll keep my calendar clear for SQL Server 2018.

Steve Jones
15. Steve Jones (@way0utwest) Row Level Security
Row Level Security is Steve’s topic. He describes what it is, how it’s used and how he’ll never have to implement that feature from scratch again.
What I Thought: Steve is optimistic about the feature and I am too. I think I was first introduced (in a way) to the idea behind the feature when I tried to look at sys.procedure definitions as a data reader.
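If you haven’t seen the feature, a Row-Level Security setup has two parts: a predicate function and a security policy that binds it to a table. This is only a sketch; the Sales table and SalesRep column are hypothetical:

```sql
-- Predicate function: a row is visible when its SalesRep matches the current user.
CREATE FUNCTION dbo.fn_SalesFilter (@SalesRep sysname)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN
    SELECT 1 AS allowed WHERE @SalesRep = USER_NAME();
GO

-- Security policy: apply the predicate as a filter on the (hypothetical) table.
CREATE SECURITY POLICY dbo.SalesPolicy
    ADD FILTER PREDICATE dbo.fn_SalesFilter(SalesRep) ON dbo.Sales
    WITH (STATE = ON);
```

Once the policy is on, every query against dbo.Sales is filtered transparently, which is exactly the part Steve is glad he’ll never implement from scratch again.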

Lori Edwards
16. Lori Edwards (@loriedwards) It’s 2016
Lori writes a good introduction to temporal tables.
What I Thought: You know how when you visit documentation on a SQL Server keyword or function, you have to scroll to the bottom of the page to see an example? Not so with Lori’s introduction, her examples are up front and excellent.

Mike Fal
17. Mike Fal (@Mike_Fal) SQL 2016 Direct Seeding
Mike Fal is the second one to mention direct seeding for Availability Groups. Nice! This promises to be a good feature for DBAs.
What I Thought: In Erik’s post, he mentioned the lack of SSMS support and pointed to a connect item about it. Mike Fal, recognizing the issue, tackled it with PowerShell. Good job.

Taiob Ali
18. Taiob Ali (@SqlWorldWide) JSON Support in SQL 2016
New JSON support! I was glad that someone wrote about it. And Taiob’s not just a JSON user. His workplace uses MongoDB as well.
What I Thought: Thanks Taiob! At first I noticed your website’s domain, “SQLWorldWide”, and wondered whether you chose that name based on Microsoft’s new sample database Wide World Importers, but then I noticed you’ve been blogging for a while. If you know SQL Server and want to know what JSON is all about, this post is for you. If you’re a developer who uses JSON and wants to know how to incorporate it into a SQL Server solution, this post is for you too.
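As a taste of the feature, here’s a small sketch of OPENJSON (SQL Server 2016+, database compatibility level 130), which shreds a JSON array into a rowset with typed columns:

```sql
-- Shred a JSON array into rows with typed columns.
DECLARE @json nvarchar(max) = N'[{"id":1,"name":"Ann"},{"id":2,"name":"Bob"}]';

SELECT j.id, j.name
FROM OPENJSON(@json)
WITH (id int '$.id', name nvarchar(50) '$.name') AS j;
```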

Kennie Nybo Pontoppidan
19. Kennie Nybo Pontoppidan (@KennieNP) Temporal Tables are Just System History Tables (For Now)
Kennie writes about temporal tables. Wow, this seems to be a popular feature this year. He explains how relevant this feature is when compared with more traditional ways to implement type 2 history tables (slowly changing dimensions).
What I Thought: Kennie mentions watching Chris Date and warehouse scenarios, so it’s nice to have the perspective of someone familiar with relational theory and also with data warehouses.

Riley Major
20. Riley Major (@RileyMajor) Let me count the ways…
Riley: I’d like a numbers table please
Microsoft: SQL Azure Features? You got it.
Riley: No, I said I’d like a…
Microsoft: MORE AZURE COMING UP!

Thanks again everyone. Have a wonderful summer and a wonderful 2016!

May 31, 2016

Some Thoughts On Logos

Filed under: Miscellaneous SQL,SQLServerPedia Syndication — Michael J. Swart @ 10:28 am

So today is the last Tuesday in May which means that next Tuesday is the first Tuesday in June. On that day, you can expect me to invite all SQL bloggers to participate in June’s T-SQL Tuesday. So I’m thinking about my invite post: What will be the topic? What illustration will I include?

The T-SQL Tuesday Logo

When thinking about an illustration to include, I began to look more closely at the T-SQL Tuesday logo:

[image: the T-SQL Tuesday logo]

The logo includes a cylinder which is the standard way to represent a database (did you ever wonder why?). That’s what ties “T-SQL” to the logo.

But I want to point out something that not a lot of people notice. If you look really closely, you can see that the grid is actually a calendar for some month and the second Tuesday is highlighted. And that’s what ties “Tuesday” to the logo. Here, I’ll blow it up a bit:

[image: the logo’s calendar grid, enlarged]

But the resolution makes it hard to read or notice so as an exercise (and for my invite post illustration), I recreated the logo:

[image: the recreated logo]

Another Take on the Logo

I happen to sit near some really cool graphic designers. And after some discussions about what makes a good logo, I came up with this:
[image: an alternate logo]

Now don’t get too excited, it’s definitely not Machanic-approved. And I won’t be using this logo, it’s just an exercise.
But here are some of my thoughts.

  • It gets away from gradients, a recent trend in logos, and I kept it as uncomplicated as possible.
  • I stuck with blue (or Cyan actually). Microsoft seems to do that with Azure for example and there’s no sense in changing that.
  • I dropped the tie with Tuesday. When I think of T-SQL Tuesday, I think of databases and blogging, not the day of the week.
  • It’s meant to remind you of ERDs. Join diagrams are such a visual thing already, and they’re closer to what we deal with day to day than the stereotypical cylinder.

So… watch for the invite post in one week!

May 30, 2016

One SSMS Improvement You Might Have Missed

Filed under: Miscellaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 2:36 pm

Takeaway: Undocked query windows in SSMS are now top-level windows.

SSMS Release Cycle

As you may know, SQL Server Management Studio (SSMS) now has its own release cycle, independent of SQL Server’s release cycle. This means the Microsoft team who works on SSMS now gets to release as often as they like. And it looks like they are: they’ve released five times so far in 2016.

Many of the changes are small changes, and many of them don’t impact me, but I noticed one cool change that I’d like to draw more attention to.

Undocked Query Windows are Spiffier

The March 2016 Refresh (13.0.13000.55 Changelog) updates SSMS to use the new Visual Studio 2015 shell. Part of that change means that undocked windows are now top-level windows.

Top-level windows are windows without parents, so the undocked window is not a child window of the main SSMS window (though it is part of the same process). It gets its own space in the taskbar and participates in alt+tab when you switch between windows.

Also these undocked windows can be a collection of query windows. Compare the new style with the old style.

Old Style, limit of one query window:

[screenshot: old-style undocked window]

New Style, many query windows:

[screenshot: new-style undocked windows]

If you’re a multitasking Developer or DBA who works with SSMS a lot, I think you’ll like this new feature. Undocked query windows now feel like real windows.

Remember SSMS is free (even though SQL Server is not). If you want to download the latest version of SSMS, you can do that here.

April 27, 2016

You Can’t Force Query Plans If They Use TVPs With PKs

Filed under: Miscellaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 12:24 pm

Have you ever played “Fortunately/Unfortunately”? It’s a game where players alternately give good news and bad news. It goes something like this:

Databases such as SQL Server make it easy to retrieve sets of data.
Unfortunately, it’s kind of awkward to send sets of data to SQL Server.
Fortunately, table-valued parameters (TVPs) make this easier.
Unfortunately, queries that use TVPs often suffer from non-optimal query plans.
Fortunately, creating primary key or unique key constraints gives us a way to index table types.
Unfortunately, those constraints prevent any kind of plan forcing.
Fortunately, SQL Server 2014 lets us create named indexes for table types which lets us force plans if we need to.

[illustration: on the other hand]

Let’s break this down:

Sending Sets of Data to SQL Server is Awkward

It always has been. Originally, developers were forced to send a CSV string to SQL Server and write a do-it-yourself function to split the string into a set of values.

  • In 2005, Microsoft introduced XML and CLR which let developers shred or split strings in new ways,
  • In 2008, Microsoft introduced table-valued parameters,
  • In 2014, they introduced In-Memory TVPs,
  • In 2016, there’s a new STRING_SPLIT() function

So there are more options now than there ever have been, and they each have their own issues.

Aaron Bertrand explores some of those performance issues in STRING_SPLIT() in SQL Server 2016. It’s a specific use-case where he focuses on duration. In our case, we focus on aggregated system load like worker time or reads so we don’t necessarily value parallel query plans. But I love his methods. He gives us tools that let us evaluate our own situation based on our own criteria.

I’m going to focus on TVPs which is the most natural method of sending sets of data to SQL Server from a syntax point of view.
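As a reminder of what that pattern looks like, here’s a sketch of a table type and a procedure that accepts it. The procedure and the dbo.Things table are hypothetical:

```sql
-- A table type, and a procedure that takes it as a read-only parameter.
CREATE TYPE dbo.IdList AS TABLE ( id int NOT NULL );
GO

CREATE PROCEDURE dbo.GetThings
    @ids dbo.IdList READONLY
AS
    SELECT t.*
    FROM dbo.Things t
    JOIN @ids i ON t.id = i.id;
```

From the application side, the client fills a structured parameter of type dbo.IdList and calls the procedure, no string splitting required.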

Indexes on Table Types

Table-valued parameters are implemented using table types. Before SQL Server 2014, the only way to index a table type was to define a primary key or a unique key on it like this:

create type dbo.TypeWithPK 
    as table ( id int primary key );

The syntax for CREATE TYPE prevents us from naming our primary key, and this turns out to be important. Every time I define and use a table variable, SQL Server dynamically generates a name for the primary key. So when I look at the plan for

declare @ids dbo.TypeWithPK;
select * from @ids

I see that it has a primary key named [@ids].[PK__#A079849__3213E83FDB6D7A43]:
[screenshot: query plan showing the generated primary key name]

As I’ll show later, this dynamically generated name prevents any kind of query plan forcing. But as of SQL Server 2014, we can include indexes in our table type definitions. More importantly, we can name those indexes:

create type dbo.TypeWithIndex 
    as table ( id int index IX_TypeWithIndex );
go
declare @ids dbo.TypeWithIndex;
select * from @ids;

This one has an index named [@ids].[IX_TypeWithIndex], which is what we expect.

Plan Forcing is Not Allowed For TVPs with PKs

Where does plan forcing fit in your tool belt? For me, I’ve never used plan forcing as a permanent solution to a problem, but when I see a query that often suffers from suboptimal query plan choices, I look to plan guides to give me some stability while I work at fixing and deploying a permanent solution.

Plan forcing in SQL Server involves specifying a plan for a particular query. But the primary key name for a table variable is always different so the specified query plan is never going to match. In other words SQL Server is never going to use your query plan because your plan includes index [@ids].[PK__#A079849__3213E83FDB6D7A43], but the query it’s compiling has a differently named index like [@ids].[PK__#AA02EED__3213E83FAF123E51].

If you try, this is what that failure looks like:

USE PLAN
If you try to use the USE PLAN query hint, you’ll get error 8712:

Msg 8712, Level 16, State 0, Line 15
Index '@ids.PK__#B305046__3213E83F57A32F24', specified in the USE PLAN hint, does not exist. Specify an existing index, or create an index with the specified name.

Plan Guides
If you try to force the plan by creating a plan guide, you’ll also see message 8712:

select *
from sys.plan_guides
cross apply fn_validate_plan_guide(plan_guide_id)
-- Index '@ids.PK__#BA711C0__3213E83F44A3F2C8', specified in the USE PLAN hint, does not exist. Specify an existing index, or create an index with the specified name.

Query Store
And if you try to force a plan using SQL Server 2016’s Query Store, you’ll see this:

select plan_id, last_force_failure_reason_desc
from sys.query_store_plan
where is_forced_plan = 1
-- last_force_failure_reason_desc = 'NO_INDEX'

Summary

When defining table variables, avoid primary key or unique key constraints. Opt instead for named indexes if you’re using SQL Server 2014 or later. Otherwise, be aware that plan forcing is limited to queries that don’t use these table variables.

April 20, 2016

Are You Programming In The Database?

Typically, T-SQL is not the best platform for programming (understatement). If you have many procedures that call other procedures, that’s a signal that you might be programming in the database.

Find out using this query:

select 
    OBJECT_SCHEMA_NAME(p.object_id) as schemaName, 
    OBJECT_NAME(p.object_id) as procedureName,
    count(*) as [calls to other procedures]	
from sys.procedures p
cross apply sys.dm_sql_referenced_entities(schema_name(p.schema_id) + '.' + p.name, 'OBJECT') re
where re.referenced_entity_name in (select name from sys.procedures)
group by p.object_id
order by count(*) desc;

In AdventureWorks, we see this result:
[results: procedures and their counts of calls to other procedures]

To drill down into those results, use this query:

select distinct
    QUOTENAME(OBJECT_SCHEMA_NAME(p.object_id)) + '.' 
        + QUOTENAME(OBJECT_NAME(p.object_id)) [This procedure...], 
    QUOTENAME(OBJECT_SCHEMA_NAME(p_ref.object_id)) + '.' 
        + QUOTENAME(OBJECT_NAME(p_ref.object_id)) [... calls this procedure]
from sys.procedures p
cross apply sys.dm_sql_referenced_entities(schema_name(p.schema_id) + '.' + p.name, 'OBJECT') re
join sys.procedures p_ref
	on re.referenced_entity_name = p_ref.name
order by 1,2

which gives results like this:
[results: which procedures call which]

AdventureWorks seems just fine to me: only four instances of procedures calling procedures. Then I looked at the database I work with most. Hundreds of procedures (representing 15% of the procedures) call other procedures. On the other end of the spectrum is Stack Overflow; I understand that they don’t use stored procedures at all.

April 11, 2016

Tackle WRITELOG Waits Using the Transaction Log and Extended Events

Takeaway: WRITELOG waits are associated with a busy or slow transaction log. To tackle these waits, we need to measure transaction log activity. I describe a lightweight way to examine transaction log usage for busy OLTP systems.

Tackle WRITELOG

Start with Microsoft’s Advice: I’m not going to introduce the topic of transaction log performance. Microsoft’s SQL Customer Advisory Team already provides a great introduction with Diagnosing Transaction Log Performance Issues and Limits of the Log Manager. Their advice includes watching the “Log Bytes Flushed/sec” performance counter found in the “SQL Server:Databases” object.

Reactive Efforts: If you’re curious about transaction log activity for open transactions, Paul Randal has a script at Script: open transactions with text and plans.

Spiky Activity: It’s not too difficult to find infrequent activities that write a lot of data to the transaction log; activities like data warehouse ETLs, or index rebuilds. Use a trace or extended events to look for statements with large values for “writes”.
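If a trace isn’t handy, one rough alternative (my own shortcut, not part of the CAT team’s advice) is to check the plan cache for the statements with the most writes. Note that logical writes count buffer-pool page writes, not log bytes, so treat this as a proxy:

```sql
-- Statements in the plan cache, ordered by total logical writes.
SELECT TOP (10)
    qs.total_logical_writes,
    qs.execution_count,
    SUBSTRING(st.text,
        qs.statement_start_offset / 2 + 1,
        CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(st.text)
            ELSE (qs.statement_end_offset - qs.statement_start_offset) / 2 + 1
        END) AS statementText
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY qs.total_logical_writes DESC;
```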

Scalability of OLTP Workloads

WRITELOG waits are a scalability challenge for OLTP workloads under load. Chris Adkin has a lot of experience tuning SQL Server for high-volume OLTP workloads, so I’m going to follow his advice when he writes that we should minimize the amount of logging generated. And because I can’t improve something I can’t measure, I wonder: what’s generating the most logging? OLTP workloads are characterized by frequent tiny transactions, so I want to measure that activity without filters, but with as little impact on the system as possible. That’s my challenge.

Getting #SQLHelp

So I asked twitter. And I got some great advice from Erin Stellato:
[screenshot: #SQLHelp conversation on twitter]
Erin also pointed out that the UI warns you that it’s a very high volume event.

Combining fn_dblog With Extended Events

So to avoid that kind of volume, I got the idea to read straight from the transaction log and combine that with a lighter extended events session to get the SQL text. The transaction_id captured by the extended events session corresponds to the XAct ID column in fn_dblog.

Here’s how that went:

The Script
The details for this script are kind of fussy, but it all comes together in a solution that won’t drag a server down. Care is still recommended; start with 10 seconds and go from there.

declare @Duration varchar(10) = '00:00:10';
declare @FileSize varchar(10) = '5'; -- in megabytes
 
-- create session
DECLARE @CreateSessionSQL nvarchar(max) = N'
    CREATE EVENT SESSION query_writes ON SERVER 
    ADD EVENT sqlserver.sp_statement_completed ( 
        SET collect_statement=(0)
        ACTION(sqlserver.transaction_id, sqlserver.database_name)
        WHERE sqlserver.transaction_id > 0
          AND sqlserver.database_name = ''' + DB_NAME() + N''')
    ADD TARGET package0.asynchronous_file_target(
      SET filename = N''query_writes.xel'',
          max_file_size = ' + @FileSize + N',
          max_rollover_files = 1)
    WITH (
        STARTUP_STATE=ON,
        EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
        TRACK_CAUSALITY=OFF)';
exec sp_executesql @CreateSessionSQL;
 
ALTER EVENT SESSION query_writes ON SERVER
    STATE=START;
 
-- get the latest lsn for current DB
declare @xact_seqno binary(10);
declare @xact_seqno_string_begin varchar(50);
exec sp_replincrementlsn @xact_seqno OUTPUT;
set @xact_seqno_string_begin = '0x' + CONVERT(varchar(50), @xact_seqno, 2);
set @xact_seqno_string_begin = stuff(@xact_seqno_string_begin, 11, 0, ':')
set @xact_seqno_string_begin = stuff(@xact_seqno_string_begin, 20, 0, ':');
 
-- wait for the specified duration
waitfor delay @Duration;
 
-- get the latest lsn for current DB
declare @xact_seqno_string_end varchar(50);
exec sp_replincrementlsn @xact_seqno OUTPUT;
set @xact_seqno_string_end = '0x' + CONVERT(varchar(50), @xact_seqno, 2);
set @xact_seqno_string_end = stuff(@xact_seqno_string_end, 11, 0, ':')
set @xact_seqno_string_end = stuff(@xact_seqno_string_end, 20, 0, ':');
 
-- Stop the session
ALTER EVENT SESSION query_writes ON SERVER
    STATE=STOP;
 
-- read from transaction log
select 
    max([Xact ID]) as transactionId,
    max([Transaction Name]) as transactionName, 
    sum([Log Record Length]) as logSize,
    count(*) as [logRowCount]
into #TLOGS
from fn_dblog(@xact_seqno_string_begin, @xact_seqno_string_end) f
group by [Transaction Id]
 
-- read from session data
CREATE TABLE #SessionData (
    id int identity primary key,
    XEXml xml NOT NULL    
)
 
INSERT #SessionData(XEXml)
SELECT CAST(fileData.[event_data] as xml)
FROM sys.fn_xe_file_target_read_file ( 'query_writes*.xel', null, null, null) fileData;
 
-- find minimum transactionId captured by xes 
-- (almost always the first one, depending on luck here)
declare @minTXFromSession bigint;
select TOP (1) @minTXFromSession = S.XEXml.value(
    '(/event/action[(@name=''transaction_id'')]/value)[1]', 'bigint')
from #SessionData S;
 
WITH SD AS
(
    SELECT 
        S.XEXml.value(
            '(/event/action[(@name=''transaction_id'')]/value)[1]', 'bigint') as transactionId,
        S.XEXml.value(
            '(/event/data[(@name=''object_id'')]/value)[1]', 'bigint') as objectId
    FROM #SessionData S
)
SELECT 
    ISNULL(T.transactionName, 'Unknown') as transactionTypeName, 
    OBJECT_NAME(S.objectid) as ObjectName,
    SUM(T.logsize) as totalLogSizeBytes,
    SUM(T.logRowCount) as totalLogRowCount,
    COUNT(*) as executions
FROM #TLOGS T
LEFT JOIN (SELECT DISTINCT * FROM SD) S
    ON T.transactionId = S.transactionId
WHERE T.transactionId >= @minTXFromSession
GROUP BY T.transactionName, S.objectId
ORDER BY SUM(T.logsize) DESC
 
-- clean up
DROP EVENT SESSION query_writes ON SERVER;
DROP TABLE #TLOGS
DROP TABLE #SessionData

Sample Results

Here’s an example of what the results would look like. It’s an aggregated view of all transaction log activity in a database for 10 seconds.
[table: example results]

Notes

  • Notice that the session is database specific. That’s because transaction logs are database specific. To help focus on the right database, use the “Log Bytes Flushed/sec” performance counter found in the “SQL Server:Databases” object.
  • Also notice that I’m tracking ObjectIds. That’s because we use procedures quite heavily. You may want to adapt this code to use query_hash instead. In both cases, collecting the statement text is not recommended.
  • The sample of data is limited by the size of the extended events target file or the duration variable, whichever is smaller.
  • @sqL_handLe pointed out to me that reading the log using fn_dblog will prevent the transaction log from truncating. Reading from the transaction log can be very tricky to do efficiently. Luckily we can use the sp_replincrementlsn trick to get LSN parameter values for fn_dblog.