Michael J. Swart

June 15, 2011

Enforcing Business Rules Vs. Avoiding Triggers: Which Is Better?

Filed under: SQLServerPedia Syndication, Technical Articles — Michael J. Swart @ 12:00 pm

Takeaway: I like to develop database stuff using rules of thumb. When two rules of thumb are at odds, what should I do? In this post, I explore the advice “it depends” with an example.

I first give the two rules of thumb:

Rule Of Thumb #1: Avoid Triggers


I don’t like triggers much and I avoid them whenever possible. It’s just that they

  • hide so much that might go wrong,
  • are a huge source of frustration when debugging problem queries,
  • are often a symptom of sloppy database design, and
  • are too often written on the assumption that the inserted virtual table contains a single row.

Rule Of Thumb #2: Enforce Business Rules In The Database


It’s good to enforce business rules at the database:

  • Databases often outlive the applications that are built against them.
  • It’s too difficult to enforce the rules in only a single place (e.g. the application); the effort needed to keep the data clean becomes unsustainable.
  • They can help development. It shouldn’t need saying that if foreign keys throw errors during development, don’t get rid of them! They’re doing their job!

The Example: No Overlapping Intervals

This is where the two rules of thumb collide, because not all business rules are easy to implement. Take this example: the case of preventing overlapping intervals. It’s a common scenario that comes up

  • when scheduling resources (planes at a gate, weddings at a hall, meetings in rooms)
  • when enforcing one thing at a time (terms of a particular office, factory equipment status)

To be fair, Fonzie had no problems with dating two ladies at once.

To restrict overlapping intervals, the usual business rule enforcement tricks don’t work:

  • They cannot be enforced easily with database table design (foreign keys don’t apply because the constraint involves only a single table; see the comments).
  • They can’t be enforced easily with checks because a check constraint only enforces rules on values in a single record. But the overlapping interval restriction is a restriction based on two records.
  • And unique indexes can’t help either.

But the constraint we want to enforce could be implemented decently with a trigger. Consider this table that tracks what I call shifts:

USE tempdb
GO
 
CREATE TABLE SHIFTS (
	shiftId int identity primary key,
	beginTime datetime2 not null,
	endTime datetime2 not null,
	check (beginTime < endTime )
)
CREATE UNIQUE INDEX IX_SHIFTS 
	ON SHIFTS(beginTime)
	INCLUDE(endTime)
GO

We’ve already enforced that shifts go forward in time with a check and that no two shifts will start at the same time with a unique index. Here’s a trigger which will enforce that no two shifts overlap:

CREATE TRIGGER t_NoOverlappingShifts 
   ON  SHIFTS
   AFTER UPDATE, INSERT
AS 
 
IF EXISTS (
	--violation: the next shift starts before this one ends
	SELECT 1
	FROM inserted
	CROSS APPLY (
		SELECT TOP 1 beginTime
		FROM SHIFTS 
		WHERE beginTime > inserted.beginTime
		ORDER BY beginTime ASC
	) AS NextShift(beginTime)
	WHERE NextShift.beginTime < inserted.endTime
)
BEGIN
	RAISERROR ('Error: Shifts cannot overlap.', 16, 1)
	ROLLBACK TRANSACTION
END
 
IF EXISTS (
	--violation: the previous shift ends after this one starts
	SELECT 1
	FROM inserted
	CROSS APPLY (
		SELECT TOP 1 endTime
		FROM SHIFTS 
		WHERE beginTime < inserted.beginTime
		ORDER BY beginTime DESC
	) AS PrevShift(endTime)
	WHERE PrevShift.endTime > inserted.beginTime
)
BEGIN
	RAISERROR ('Error: Shifts cannot overlap.', 16, 2)
	ROLLBACK TRANSACTION
END
GO

The trigger performs decently and it does its job, but it’s not suited for every situation (e.g. bulk load operations).
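
Here’s a quick sanity check of the trigger, starting from an empty SHIFTS table (the dates are made up):

INSERT SHIFTS (beginTime, endTime)
VALUES ('2011-06-15 09:00', '2011-06-15 12:00')	-- succeeds
 
-- overlaps the first shift, so the trigger raises "Shifts cannot overlap" and rolls back
INSERT SHIFTS (beginTime, endTime)
VALUES ('2011-06-15 11:00', '2011-06-15 13:00')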

My Opinion


So I ask myself the question: Is it good to create the trigger in this case? I’m going to say that this trigger is good and that the value of enforcing business rules here outweighs the value of avoiding triggers because

  • the trigger doesn’t have any “side effects”: besides sometimes failing, the data in the database won’t be different than if the trigger didn’t exist,
  • the trigger handles inserts and updates of multiple rows correctly, and
  • this trigger is not a bandaid for sloppy database design (like those maintenance triggers that keep redundant data in sync).

But I’d like to hear what you think in the comments.

P.S. It occurs to me that maybe you don’t even have the same rules of thumb that I mentioned. So what do you think? triggers: good or bad?

June 8, 2011

Forget About PIVOT, You Don’t Need It

Filed under: SQLServerPedia Syndication, Technical Articles — Michael J. Swart @ 12:00 pm

Takeaway: Until recently, I was a huge T-SQL PIVOT fan. But the more I think about it, the more I realize that it’s probably better to pivot data anywhere else than to pivot data in SQL Server. I talk about easier and better alternatives to PIVOT. Remember, PIVOT is a T-SQL keyword introduced in SQL Server 2005 that lets you return results as pivot tables or, more commonly, as cross tabulations.

SQL Server 2000 and Earlier

I remember the first time I came across the need for pivot tables. I knew what I needed the data to look like, but I didn’t know what they were called. It didn’t matter; PIVOT wasn’t yet a T-SQL keyword and I couldn’t have made use of it anyway.

I was asked to write a query that returned data about the number of widgets and their categories. That seems like no problem right? Except I was asked to return the data in a two dimensional grid (with widgets as rows and categories as columns). It was a perfect use case for PIVOT! Except that it wasn’t yet available and so I was forced to write something complicated in T-SQL that used subqueries, CASE statements and other crazy stuff.

The resulting query was at least 100 lines. It looked complicated but it did the trick. I was actually congratulated on my SQL skillz (although in hindsight I shouldn’t have been).
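
For what it’s worth, the pre-PIVOT pattern boils down to conditional aggregation. Here’s a sketch (the widget and category names are invented; this isn’t the original query):

SELECT WidgetName,
	SUM(CASE WHEN CategoryName = 'Category A' THEN WidgetCount ELSE 0 END) AS [Category A],
	SUM(CASE WHEN CategoryName = 'Category B' THEN WidgetCount ELSE 0 END) AS [Category B],
	SUM(CASE WHEN CategoryName = 'Category C' THEN WidgetCount ELSE 0 END) AS [Category C]
FROM WidgetCounts
GROUP BY WidgetName

Multiply that by a few dozen categories and some subqueries and you can see how it balloons to 100 lines.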

But it got me to thinking. Why was that so hard? I used to think that any reasonable English question can be translated into simple SQL. But here was a big fat hairy counter-example. I eventually came up with an answer: It’s because the requested data has information about different sets of data in the same row.

SQL Server 2005 and Later

Fast forward a few years. I was learning about all the new stuff that SQL Server 2005 could do. And I came across the new PIVOT keyword. I knew immediately what it could do for me. I knew immediately how it was used. And I used it when the results called for it. I never remembered the syntax because I knew I could always look it up.

Eventually I gained a reputation as a SQL know-it-all (whether I deserved it or not) and I started fielding database questions. If any colleagues were struggling to return a cross-table, it was easy to recognize what they were doing and easier to point them to the online docs for PIVOT.
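
In case the syntax has slipped your mind too, here’s a minimal sketch using the same invented widget data as above:

SELECT WidgetName, [Category A], [Category B], [Category C]
FROM (
	SELECT WidgetName, CategoryName, WidgetCount
	FROM WidgetCounts
) AS src
PIVOT (
	SUM(WidgetCount)
	FOR CategoryName IN ([Category A], [Category B], [Category C])
) AS pvt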

Or Just Let The App Deal With It

But I realized recently that it’s really not necessary. At the database level, I probably don’t have to pivot this. I can ask the question: Do I really need to deliver the data that way – pre-pivoted? Heck No! I can delegate that stuff. It’s actually a lot easier to pivot this data almost anywhere else besides SQL Server.

So in essence I’m giving myself (and you!) permission to forget something: T-SQL’s PIVOT syntax.

Pivot Inside Excel

Say the data is a one-time-only query and you want to include a cross table in a spreadsheet or email. Well Excel’s pivot feature turns out to be dead simple. This pivot functionality is also found in Open Office’s Calc and any other spreadsheet application built this century. Just a couple extra tips:

  • Using the “format as table” feature can save you a couple of clicks.
  • Find the “Create Pivot Table” feature under the Insert tab.
  • If you’re still having any trouble, I’m sure there’s tons more help here.

Pivot Inside a .Net App

Okay, so say you’re writing some C# and you have a DataTable that you wish were more pivot-y. Ask and ye shall receive:

DataTable Pivot( DataTable dt, DataColumn pivotColumn, DataColumn pivotValue ) {
	// find primary key columns 
	//(i.e. everything but pivot column and pivot value)
	DataTable temp = dt.Copy();
	temp.Columns.Remove( pivotColumn.ColumnName );
	temp.Columns.Remove( pivotValue.ColumnName );
	string[] pkColumnNames = temp.Columns.Cast<DataColumn>()
		.Select( c => c.ColumnName )
		.ToArray();
 
	// prep results table
	DataTable result = temp.DefaultView.ToTable(true, pkColumnNames).Copy();
	result.PrimaryKey = result.Columns.Cast<DataColumn>().ToArray();
	dt.AsEnumerable()
		.Select(r => r[pivotColumn.ColumnName].ToString())
		.Distinct().ToList()
		.ForEach (c => result.Columns.Add(c, pivotValue.DataType));
 
	// load it
	foreach( DataRow row in dt.Rows ) {
		// find row to update
		DataRow aggRow = result.Rows.Find(
			pkColumnNames
				.Select( c => row[c] )
				.ToArray() );
		// the aggregate used here is LATEST 
		// adjust the next line if you want (SUM, MAX, etc...)
		aggRow[row[pivotColumn.ColumnName].ToString()] = row[pivotValue.ColumnName];
	}
 
	return result;
}

If you know the shape of your data ahead of time, you could have coded this more easily by hard coding column names. But what I’ve given here is a general procedure (It works on most test cases. Error handling and extra testing are left as an exercise for the reader 🙂 ).

Pivot Inside Other B.I. Clients

Most other clients I can think of have even easier pivot features:

  • Reporting Service’s matrix control is a pivot table.
  • Integration Service’s pivot transformation is easy and handy.
  • Analysis Service’s cubes are nothing but multi-dimensional pivot tables.
  • Other Business Intelligence software is all over this. If your B.I. software can’t do cross-tables or pivot tables, I would ask for your money back.

You don’t need T-SQL’s PIVOT at all. Forget that stuff…

… Except When You Do Need T-SQL’s PIVOT

Okay, maybe I’m a bit over-zealous there. Here are some reasons why you might want to use T-SQL’s PIVOT. They’re not great reasons but I present them anyway. You’ll need to know this stuff:

  • when studying for certifications, interviews or other tests,
  • when the application or data destination you’re working with is not suited to pivoting (although nothing comes to mind). Your choice is then PIVOT using T-SQL or not at all.
  • when the data’s destination is the SSMS results pane. That’s fair.
  • when you don’t have control over the app, or the app is too difficult to modify and it’s expecting a cross-table for data.

So for the most part, you can forget about PIVOT syntax. Next week maybe I’ll talk about remembering the syntax for the MERGE statement. In my opinion, SQL’s MERGE statement is a statement worth remembering, but that’s another story.

June 1, 2011

A Round-up of Round-ups and other Meta Things

Filed under: Data Cartoons, Miscelleaneous SQL, SQLServerPedia Syndication — Michael J. Swart @ 12:00 pm

This article is very meta.

No, I’m not talking about the friendly Mehta family a few doors down. I’m talking about the prefix Meta. In particular I’ve noticed a few SQL Blog post aggregator sites and I wanted to aggregate the aggregators, review the reviews and round up the round-ups.

But How Do I Use This Post?

I wrote this post because a friend of mine wanted to keep up with what’s new in the industry. When I showed him my list of RSS feeds, it was a bit overwhelming. So I wrote this post to be an easier and less overwhelming intro to the online SQL community. If you’re in the same boat, just follow these steps:

  1. Start by following the weekly round ups (see below). At the very beginning, it’s a very good place to start.
  2. Then if you’re still curious and your free time allows, subscribe to some of the collection feeds (see below). This gives a large variety of decent quality content.
  3. Create a category called Elite in your RSS reader and promote or tag your favorite writers to that category.

Without further ado

Weekly SQL Article Roundups

These weekly recaps of what’s new in SQL Server are probably the highest concentration of quality links for the week. I would recommend any one of them for anyone who only has 15 to 30 minutes to spare a week but wants to keep an eye on the industry.

  • Database Weekly (website | newsletter) A weekly newsletter put on by the folks at SQL Server Central (Or Red Gate or Simple Talk; they’re all friends). It’s probably the most popular weekly round up for SQL Server and well deserved. It’s got at least a couple dozen links a week. It seems like a lot, but they’re broken out into categories so it’s manageable and very comprehensive.
  • Something For The Weekend (website | email | rss) By John Sansom (good guy). He usually has a dozen or so handpicked articles that he posts on Fridays. If you’re in the UK, he’ll also tell you about local user group events.
  • Things Brent, Jeremiah, Kendra and Tim Liked This Week (website | email) Another weekly newsletter that sprouted up about a month ago. It’s not as much of a round up as the other two in this list; It’s four people selecting three or four of their favorite links weekly. It’s only been a little over a month, but these links are cool (concentrated cool) and don’t tend to overlap with the other round-ups above.

Blog Collections

If you thought that Ulysses was a nice light read, you remember Moby Dick as a quick fable, you’ve accidentally called the phone book a pamphlet, and your favorite drink is fire-hose water, then these are for you.

These are aggregate feeds of syndicated bloggers. Why subscribe to hundreds of SQL Bloggers separately when you can subscribe to most of them in one feed? With these feeds, you’ve got tons of articles to read and will likely not have enough time to read all of them. But if you eat\breathe\sleep SQL Server, these are for you.

  • SQLServerPedia (website | rss) Hosted by Quest software, they’ve got about 100 bloggers and I’m proud to be one of them. It’s a syndicated list, which means that each article actually lives on its own blogger’s site. But the articles are collected by SQLServerPedia’s website and rss feed. It’s a very diverse group of bloggers and together they average about 10 to 20 articles a day.
  • SQLBlog (website | rss) Run by Adam Machanic and Peter DeBetta. These blogs are popular and they’ve been around a while. They’ve got about 30-ish active bloggers under their wing and the quality is top notch. There are a number of things that make this site stand out.
    • Quality, You can trust the information in these articles.
    • Focus, Most of the articles are technical. Occasionally there are a few posts entitled I’ll be speaking at some user group you don’t care about. But those are easy to skip.
    • Active, The comment section is lively. Because the articles are hosted at SQL Blog, the comments are too.
  • SQLServerCentral (rss) This has a couple posts a day and includes articles from Simple Talk. The volume makes keeping up with this feed manageable. The quality of each article is first class. I believe these articles are really polished because they’re reviewed by tech editors. My only gripes are that …
    • The feed only includes a one paragraph teaser so that you have to click through to read the articles.
    • Often the teaser has no indication of who the author is. So when the teaser article is something generic like Make the most of your Database it’s tempting to simply mark-as-read.

There are lots of others and I know I’m missing a few (SQLTeam, SQLskills, SSWUG etc…) you can check them out if you’re a fire-hose water connoisseur. (Holy cow, I just spelled connoisseur correctly without spell check!!)

Michael, What Else Have You Got?

If none of the above information is new to you, good job, you’re well ahead of the curve. This post still has something new for you though: a “meta” joke (a joke about “meta”, not a joke about jokes).

The humour here is all Karen Lopez (@DataChick), I just supplied the pixels.

May 25, 2011

Another Advantage of Consultants

Filed under: Miscelleaneous SQL, SQLServerPedia Syndication — Michael J. Swart @ 12:00 pm

Takeaway: Veering away from technical content for a bit, I talk about the advantages of hiring an outside consultant to help with technical problems.

(I’m a full time employee and I have been since University. So when I push the advantages of hiring consultants, I’m not trying to sell anything. )

Many consultants market themselves by blogging and training. If you follow any SQL Server bloggers, there’s probably a fair number of them that are consultants. So they’re the first ones who will tell you all the reasons you can and should hire them.

But Michael Crichton or Steven Spielberg can warn you against hiring the wrong one. Hiring the wrong consultant or contractor can prove deadly:

John Hammond's Fatal Mistake

But enough about that.

Why Do People Hire Consultants?

The Obvious

  • To solve technical problems of course!
  • You want to hire someone with expertise that you don’t currently have on staff.  (But not on a permanent basis)

The Not So Obvious

  • Having more exposure to the industry, they are in a  better position to spot things that are unusual or non-standard. They’ve got a more objective second pair of eyes.
  • They’re usually really well connected. Imagine they’re a SQL Internals expert, but you’ve got a Business Intelligence problem. They usually know how to find good help.
  • Somehow you’ve got extra budget and if you don’t spend it, your group will be punished next year with a smaller budget. (That situation always struck me as weird).
  • You’ve got an idea or solution and you know it’s the right solution. But you can’t implement it because you just don’t (yet) have the pull at your company. Consultants can be your hired clout.

The Surprising

My company recently engaged Microsoft’s CAT team to come talk to us. It’s not that there are any fires we want to put out. We’re just looking for the best way to continue to grow and support our systems for the next 5 to 10 years.

I was really excited to talk to SQL Server experts for three days. To make the most of the consultation, we booked a meeting room for the entire time and got some of our most experienced people on hand to participate. We talked about lots of things, like scalability and about all the little headaches and issues that we face on a day to day basis.

Just stop there for a second. Imagine that same scenario in your workplace but now take out the consultant. What have you got now?

You’ve got your best and most experienced people taking three days to discuss the largest headaches and issues facing your team.  THAT’S GOLD JERRY, GOLD! Now add in a consultant as a mediator and it works even smoother! It worked for us. There was actually a point during the engagement where the consultant (Hi Chuck!) was simply standing at the white board writing down pros and cons of various solutions that our team had come up with.

And when your consultant/mediator starts contributing good ideas that you haven’t thought of yet? Well, that’s really, really fun.

Conclusion: Consultants 1, Problems 0. Just don’t let them network your dinosaur park.

 

May 18, 2011

Okay, You’ve Found Blocking, Now What?

Takeaway: You’ve detected, analyzed and understood blocked processes that have happened on your machine. What can you do to avoid this blocking in the future?

Those who know me well, know I’ve started a project that helps database professionals analyze the blocked process reports that SQL Server produces. I’ve talked about it before:

Barry White endorses my software ... probably

What Next?

But I got an excellent comment from Wallace Houston. He writes:

“What I’m looking for is “what to do” when blocks are already identified.  I want to try to prevent them.  In other words, how to modify my code in such a way as to avoid as many blocks as possible.”

That’s an excellent question. Wallace went on to talk about a “chain reaction” of blocking where everything locked up. Man, I’ve been there and it can be disastrous. It sometimes seems like the only thing to do is to kill the lead blocker’s process. And that feels like giving up.

So after giving it some thought, I came up with this (slightly reworded from my reply to Wallace). These are actions you can take to avoid blocking in the future.

This is Next:

There’s a lot you can do to avoid excessive blocking. I’ve learned very recently that people’s ideas of excessive blocking vary a lot. In my environment, I look for blocking longer than 10 seconds. Other people watch for a threshold of 10 minutes!
Either way,
  • If at all possible, tune the lead blocker. If a lead blocker can take its lock and let it go immediately, then there’s no more blocking.
  • Avoid transactions if you don’t need them (but don’t be afraid of them if you do).
  • Pick the nicest isolation level for your transaction. Read committed is nicer than repeatable read is nicer than serializable.
  • If you can get away with it, maybe put the query that’s the lead blocker in a transaction that uses “snapshot isolation” (see the sketch after this list). It uses a bit of tempdb resources, but it’s awesome for concurrency problems.
  • There’s table lock hints (readpast, nolock, holdlock) for more targeted locking, but really you want to understand who’s doing what before you go down that road.
  • I’m not ashamed to say that I’ve used isolation level “read uncommitted” which is equivalent to the NOLOCK table hint. Especially when the caller can tolerate the very very rare cases of inconsistency that might result. NOLOCK is easy and you can’t argue with results – it works – but snapshot isolation is preferred because it is guaranteed to be consistent (if not current).
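
For reference, here’s a minimal sketch of what that snapshot isolation option looks like (the database name and query are made up):

ALTER DATABASE MyAppDB SET ALLOW_SNAPSHOT_ISOLATION ON
GO
 
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
BEGIN TRANSACTION
	-- readers under snapshot isolation see a consistent, versioned view of the data
	-- and don't block (or get blocked by) writers
	SELECT COUNT(*) FROM dbo.Orders
COMMIT
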
I remember once I had a table, let’s call it T, whose columns a, b, c, d, and e got queried a lot. But column x got updated a lot and we had blocking issues. The solution was to split the table T into T1 (a,b,c,d,e) and T2 (x) with a foreign key from T2 to T1. We then updated the queries that used it and got seriously reduced contention. A rough sketch of that kind of split is below.
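
Something along these lines (the column types are invented for illustration):

CREATE TABLE T1 (
	id int identity primary key,
	a int, b int, c int, d int, e int
)
CREATE TABLE T2 (
	id int primary key references T1(id),
	x int
)

Readers of a through e touch only T1, while the frequent updates to x take their locks in T2.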

Progress on my Blocked Process Report Viewer

And for those that are curious, my Blocked Process Report (BPR) Viewer is coming along nicely.

Lately I’ve

  • fixed a few bugs
  • added some documentation (which you’ve already seen if you read this blog).
  • added some SQL Profiler template files and SQL Trace scripts to help collect BPR
  • added a quick and dirty test suite.

It’s pretty much good to go and ready to be released as version 1.0. But I still want to add features that make analysis easier.

Stay tuned and happy block busting!

May 4, 2011

When To Use Blocked Processes Reports

I introduced the SQL Server Blocked Process Report Viewer a couple weeks ago and I realize that I took something for granted. Not everyone has practice watching for Blocked Process Reports and not everyone understands how or when they can use my utility to help troubleshoot concurrency issues.

Steps For Finding Concurrency Problems

Remember: When it comes to concurrency problems, you don’t have to guess what’s wrong!!!

And I explain it flowchart style!

(Flowchart: Wait Statistics, Who Is Active, SQL Trace, SQL Server Blocked Process Report Viewer, Event Notifications, The Future – Tracking Blocking in Denali)
  • Checking for LCK_M_XX waits Paul Randal has a great script that interprets the data in the DMV sys.dm_os_wait_stats. I like this script because it helps when all else fails; it’s a great starting point for understanding where the system’s bottlenecks are.
  • Using sp_WhoIsActive Adam Machanic wrote Who Is Active as a tricked-out version of sp_who and sp_who2. I recommend it because it is a great view into what’s active on your server right now. And that includes blocked processes and other concurrency issues. (i.e. For any acute problem go there. For chronic concurrency problems, come back here.)
  • Using SQL Trace You might know this as Profiler. Capture a trace with the “Blocked Process Report” event which is located in the Errors and Warnings event list. But don’t forget! You first have to decide what excessive blocking means for your system and configure the blocked process threshold accordingly (see the configuration sketch after this list). I’ve learned very recently that people’s ideas of excessive blocking vary a lot. In my own environment, I often look for blocking longer than 10 seconds. Other people use a threshold of 10 minutes!
  • Analyzing Traces With Blocked Process Report Viewer This is the tool I wrote that I hope you find useful. Right now it tells you who the lead blocker is. And I hope to expand the features into analysis soon.
  • Configuring Server for Event Notifications I’m really not too familiar with this option and don’t use it much. As an alternative you can also use WMI queries and events mapped to a sql agent job (Thanks Vincent Salard, for that tip).
  • Using Extended Events Once Denali Arrives Jonathan Kehayias knows extended events backwards and forwards. In his blog post here, he describes how in the next version of SQL Server, the blocked process report will be traceable using extended events.
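
For reference, the blocked process threshold is an advanced sp_configure option. A sketch with a 10 second threshold (pick a number that matches your own definition of excessive):

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO
EXEC sp_configure 'blocked process threshold', 10	-- in seconds; 0 turns the event off
RECONFIGURE
GO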

An Ounce of Prevention

In an extremely timely post, Kendra Little writes about understanding locks in It’s a Lock. Following many of the same steps, you can understand what your app is doing beforehand and avoid blocking problems from the start (e.g. understanding locks held during schema changes).

Next Week

  • I’ll be releasing SQL Server Blocked Process Report Viewer. (Right now we’re only in beta and I’m also open to suggestions about a new name for the tool)
  • Understanding blocking is the first step. Next week I’ll talk about what happens after analysis. I’ll write about the steps I’ve taken and had success with.

April 27, 2011

Woo Hoo… Microsoft Certification Exam 70-451

Filed under: SQLServerPedia Syndication, Technical Articles — Michael J. Swart @ 12:00 pm

So I passed the Microsoft Certification Exam 70-451. I know, I know. This blog post seems like a rerun. Wasn’t it only two weeks ago that I passed this exam? What gives?

Actually no. Two weeks ago, I passed the 70-433. That was a prerequisite for this one.  To recap:

  • Pass the 70-433 exam and you’re a MCTS (Microsoft Certified Technology Specialist) in SQL Server 2008 Database Development.
  • Pass the 70-451 exam and you’re a MCITP (Microsoft Certified IT Professional) Database Developer 2008.

(As long as those titles seem, it’s not longer than the name of my bachelor’s degree written out. Ask me about it some time.)

My Experience

The motivation behind getting the certification was simple. I was asked to get it for work:

I was asked “Can you get the cert in a month?”

“Umm… let’s see. Yep, the testing center has seats available, I’ll give it a try.”

So I did. Other people have different motivations for certification. I’m not sure if I would have gotten these without some prompting. But on the other side, now that I’ve gotten it, I’m glad I did.

But the preparation for these exams meant a ton of studying. I’m pretty confident in my skills, but I realize that the happy ending was not guaranteed. So that meant work. If I was going to fail, it wasn’t going to be because I didn’t prepare enough. (Remember that in the Hare and the Tortoise the real moral of the story is not slow and steady… but actually don’t get cocky).

So I was nervous. I went into the test and started answering questions. About 15 questions in I realized that I was going to be okay and I relaxed a bit. I finished with time to spare and reviewed the questions I was fuzzy on. I clicked submit and BAM, I passed with a score of 914. They tell you your mark immediately.

Celebrating

In an article of his, Brent Ozar reminded me recently about Scott Adams’ champagne moments. In that article he talks about how important an event has to be before it’s worth popping open a bottle of champagne. Like Brent, I have a low threshold for these “champagne-worthy” moments and Monday was one of those days.

I was surprised at how glad I was to get the certification. And that’s worth recognizing.

Two champagne glasses captioned: "I have my moments... I recognize them when I can"

 

The Study Guide I Used

So I created a study guide for myself earlier and what’s shown here is a straight dump of that; I include it as-is. That is, I don’t provide links and some of the resources you might not have access to.

I started out making the study guide a list of things to do, and not a list of things to learn. (Just like homework or an assignment).

Designing a Database Strategy (13 percent)

  • Create SQL Server agent jobs, steps & schedules (through ui and sp_)
  • Set up DB Mail and use. Notify self when a job is done.
  • Create a linked server (through ui and sp_)
  • Review normalization articles
  • Create an application that uses the Entity Framework
  • Security Strategies. Put the following terms somewhere into security triples: (principal, object, permission)
    • Application Roles
    • Schema Ownership
    • Execution context
    • Windows vs SQL Authentication
    • Permissions and DB Roles
  • Security Strategies.
    • Create and implement a demonstration of the ideas listed above, both with the UI and with t-sql.
  • Service Broker
    • Go through tutorial, review rusanu’s asynchronous job service broker.
    • Look through the following list for ideas not yet covered
      • Services,
      • Contracts
      • Activation
      • Routes
      • Message types
      • Queues
      • Remote service binding
      • Priorities
    • For ideas not yet covered, create and implement a demonstration of the ideas listed, both with the UI and with t-sql

Designing Database Tables (16 percent)

  • Implement and use data types from C# application
    • Geography: Implement an app that tells you how close the nearest x is.
    • Geometry: Review “Birth of Venus”
    • HierarchyId: Review
      • Can you get value of current node?
    • Date, time, datetime2, datetimeoffset
      • Watch Kendra Little’s 24HOP session on dates.
    • Varbinary (max) (Wasn’t this around in 2005?)
    • Filestream ughh…
  • Size the data types. Review each.
  • In-row data blobs: create a table that stores blob data in row by default, and another that stores it out of row by default. (What is the unspecified default?)
  • Create some persisted computed columns.  Review Determinism, precision etc…
  • Data integrity: enforce some constraints in the following ways:
    • Dml triggers,
    • Pk
    • Fk
    • Check
    • Unique
    • Null/not null
    • Default (???)

Designing Programming Objects (17 percent)

  • Stored Procedures
    • Create procedure exercising different execution contexts (EXECUTE AS)
    • Review TVP example (a bare-bones sketch is at the end of this section), also create a sproc to pass tables between t-sql and sproc
    • Write sproc that returns data (using RETURN?), using OUTPUT param, using result sets. Are there others?
    • Look at interaction of RECOMPILE (table level vs. sproc level)
    • Look for Error handling examples, start with Rusanu’s asynch script. Look for TRY CATCH examples. Update sp_blockedprocess* with error handling
  • Views
    • Create views again, CTEs
    • What’s a partitioned view.
    • Review with check option again (it means updating views shouldn’t violate the view criteria)
    • Read BOL on WITH SCHEMABINDING
  • Functions
    • What permissions are there on udfs? flavors of security here: GRANT, DENY, REVOKE versus EXECUTE AS
    • What is schemabinding on udf?
    • what are the three (four?) kinds of functions
    • Review inline table-valued functions vs. views, multi-statement table-valued functions, determinism
  • CLR Stuff (This was my favorite part of studying if not the most useful)
    • Implement a regex “match” in CLR
    • Implement “Concatenate” aggregate in CLR
    • Implement “first” aggregate in CLR (if possible)
    • Implement “times table” sproc in CLR (gives times table as resultset from 1 to 12)
    • Explore PERMISSION_SET.
    • Implement Regex as a clr UDType
  • Look out for Bobby Tables:
    • QUOTENAME,
    • PARSENAME,
    • too bad we can’t use sys.fn_quotefourpartname
    • Especially look out for EXEC or sp_executesql… distrust any parameter values!
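
Since a couple of the stored procedure items above mention table-valued parameters, here’s a bare-bones TVP sketch to come back to; every name in it is made up:

CREATE TYPE dbo.IdList AS TABLE (id int PRIMARY KEY)
GO
CREATE PROCEDURE dbo.GetWidgetsByIds
	@ids dbo.IdList READONLY
AS
	SELECT w.*
	FROM dbo.Widgets AS w	-- stand-in for a real table
	JOIN @ids AS i
		ON i.id = w.WidgetId
GO
DECLARE @myIds dbo.IdList
INSERT @myIds VALUES (1), (2), (3)
EXEC dbo.GetWidgetsByIds @ids = @myIds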

Designing a Transaction and Concurrency Strategy (14 percent)

  • Locking granularity levels (Something I’ve always avoided in the past.)
    • Review the different locking granularity hints.
    • Understand memory consumption of each.
  • Implicit transactions vs explicit transactions
    • Nested transactions, using XACT_ABORT, nesting levels, (damn, where’d I put my totem). Dive deep into these. Which severity of error is enough to kick you up a level? I know how TRY CATCH behaves; is this affected by aborted transactions?
    • Save points, (review sudoku solution)
    • Use SET IMPLICIT_TRANSACTION ON setting
  • Concurrency.
    • Read Kendra’s poster
    • Review my own series.
    • How are these affected by the rowversion datatype?

Designing an XML Strategy (8 percent)

  • Watch MCM video on xml by Bob Beauchemin

Designing Queries for Performance (17 percent)

  • Optimize and Tune
    • Review what’s sargable
    • Look at temp storage (find a definitive blog post on the difference between temp tables and table variables).
    • Find Itzik’s posts on sqlmag.com for GROUP BY  (grouping sets, cube, rollup)
  • Execution plans
    • Review my post on table access operators
    • Look at join operators,
    • Review those fuzzy ones (table spool, parallelism, others)
  • Row based vs Set based operations
    • Enumerate as many non-set based things (cursors, WHILE),
    • Do CTEs count? Do subqueries in SELECT or WHERE clauses count as non-set-based?
    • Look at batching methods, splitting implicit transactions (this is Rowcount(), top(x), while)

Designing Queries for Optimal Performance (15 percent)

It was at this point in creating my study guide (the last section!) that I discovered Eric Wisdahl (t|b) had created his own study guide earlier this year.

I have no better advice in this section than to point you to his posts http://ericwisdahl.wordpress.com/tag/70-451/.

I found those posts very valuable and I want to end this post by saying thank you Eric!

April 20, 2011

A New Way to Examine Blocked Process Reports

Solving concurrency problems is a large part of troubleshooting. Often solutions include tuning the blockers to minimize the blocked time or tweaking locks and isolation levels to make processes play nicely with each other. But to dig into the problem, you have to understand the blocking chain.

If you’re troubleshooting a concurrency problem that’s happening on your server right now then you can get information from the DMVs or even better, by using Adam Machanic’s Who Is Active stored procedure.

But what if the excessive blocking behavior is intermittent? Then the best strategy is to monitor the server and try to capture a SQL trace that includes the “blocked process report” event. I’ve had a lot of luck with that event; it can really tell you a story about excessive blocking. But I find that interpreting the trace can be tricky, especially when there’s a large blocking chain. Sorting through hundreds of events to find the lead blocker is not fun.

New and Free: sp_blocked_process_report_viewer

So I wrote a script! And I stuffed it in a stored procedure! Here’s the syntax (BOL-Style):
Syntax

sp_blocked_process_report_viewer [@Trace = ] 'TraceFileOrTable'
    [ , [ @Type = ] 'TraceType' ]

Arguments
[@Trace = ] ‘TraceFileOrTable’

    Is the name of the trace table or trace file that holds the blocked process reports

[@Type = ] ‘TraceType’

    Is the type of source referenced by TraceFileOrTable. Values can be TABLE, FILE or XMLFILE. The default is FILE.
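
For example, pointing it at a trace file looks like this (the path is just an example):

EXEC sp_blocked_process_report_viewer
	@Trace = N'C:\Traces\blocked_process.trc',
	@Type = N'FILE'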

Download it Now!

Go to the https://github.com/mjswart/sqlblockedprocesses site and download it. Once you’re there, click on the big green download button (as shown to the right) and you’ll have the stored procedure!

Here’s a sample output. It shows clearly who the lead blocker is:

A screenshot showing output for this sproc

Nicely organized, at least better than usual

I’m Promoting This Script to a Project

Although you still have to know how to read a blocked process report, this utility makes the structure of the blocking chain clear. I find this script useful for my own purposes. In fact I like it enough that I’m going to maintain it on github as: SQL Server Blocked Process Report Viewer

Let Me Know How It Goes

Run the script! Use it! Tell your friends. Tell me what you think of it (for once in my life, I’m seeking out criticism).

Going forward, I do have some plans for the script. There’s a number of things I eventually want to do with it:

  • Add error handling
  • Really make the sproc perform well (it’s already decent).
  • Develop a test suite (sample traces that exercise the procedure)
  • There’s an opportunity to look up object names, index names and sql text based on object ids if the traces belong to the local server.
  • A SQL Server Management Studio plugin. A treeview control would really be useful here. (This might be easier after Denali comes out)

I plan to do the work, but if you’re really keen and you want to pitch in, you’re welcome to. If you see any errors, let me know.

April 13, 2011

Whew… Microsoft Certification Exam 70-433

Filed under: SQLServerPedia Syndication, Technical Articles, Tongue In Cheek — Michael J. Swart @ 12:00 pm

Yesterday was April 12, 2011, and as far as I know mainstream support for SQL Server 2005 ended. See Aaron Bertrand’s A Quick Note About SQL Server 2005 Mainstream Support.

And if you care, this also means that the Microsoft Certifications that are tied to that product also expire. At least that’s what I gather from this Microsoft FAQ which says:

Q. How long will my certification be valid?
A. Today, most of our Microsoft Certified Technology Specialist (MCTS), Microsoft Certified IT Professional (MCITP), and Microsoft Certified Professional Developer (MCPD) exams retire when Microsoft discontinues mainstream support for the related technology…

But that’s just the exam; I’m not too sure about the certification. In any case it really doesn’t matter at all. I mean, the skills a person has with SQL Server do not diminish or become obsolete with the expiration of SQL Server 2005. SQL Server 2008 and other versions are still alive and kicking.

Bad News: So my MCTS (Microsoft Certified Technology Specialist) certification for SQL Server 2005 expired yesterday (don’t worry I’ll get over it).
Good News: My MCTS certification for SQL Server 2008 took effect last Friday when I passed the 70-433 exam (yaay!!). It’s my first step towards a new certification for myself, the MCITP (IT Professional).

It was a decent exam, I thought it was fair. There were a couple of things I would have changed:

  • I found at least four typos in the questions, and at least once I had to answer the question they meant to ask, not the question they actually asked.
  • Two thirds of the way through the exam, there was a power outage. No one could have foreseen it, but I was glad to discover that my questions weren’t lost and the time without electricity was not on-the-clock.

Section 5, My Nemesis

I did well in every section except one, Working With Additional SQL Server Components. In that section I got two questions correct out of six. I guess I should have given more focus to it while studying. To be honest, I didn’t actually work the plan I wrote out for myself for that section. I had heard scary things about some of the other sections and focused on them. But even so, two out of six is pretty rotten for a multiple choice test. Random guesses might have even fared better. A lucky monkey has a good chance (46%) of doing at least as well or better than I did simply by throwing darts randomly as a strategy for choosing answers.

A picture of a monkey holding a dart.
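
(For the curious: that 46% figure checks out if you assume four answer choices per question. The chance that random guessing gets at least two of six questions right is 1 - 0.75^6 - 6(0.25)(0.75)^5 ≈ 0.466, or roughly 46%.)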

My 70-433 Study Plan

Of course I can’t give details about the content of the exam for obvious reasons, but I do want to share the study plan I followed when studying for this exam. I wrote the plan based on Skills Measured as described by Microsoft. Everything I write here can be found online:

Tables and Views

  • What does with CHECK OPTION mean when creating views
  • What does with ENCRYPTION mean when creating views
  • what’s the syntax for fill factor?
  • write (without checking syntax) a nci with fill factor
  • rebuild an index (without checking syntax) with a fillfactor
  • create statistics without checking syntax
  • add a pk (without checking syntax)
  • add a fk with cascading deletes
  • disable/enable pk
  • disable/enable check
  • disable/enable fk
  • disable/enable unique
  • Create table with column that has filestream
  • create a spatial column (without checking)
  • What’s a structured vs semi-structured column?
  • name collation varieties (case sensitive etc…)
  • Create partitioned table (without checking syntax)
  • Split merged table into two, merge back together
  • Create empty table and switch into partition

Programming objects

  • Without checking syntax, pass in a tvp
  • without checking syntax, write sprocs with all varieties of “Execute as”
  • Without checking syntax, create sproc that specifies RECOMPILE
  • Without checking syntax, create sproc that specifies WITH ENCRYPTION
  • What permissions are there on sprocs? 2 flavors of security here: GRANT, DENY, REVOKE versus EXECUTE AS
  • Build Function without checking syntax
  • What permissions are there on udfs? flavors of security here: GRANT, DENY, REVOKE versus EXECUTE AS
  • What is schemabinding on udf?
  • what are the three (four?) kinds of functions
  • Explore flavours of triggers.
  • Create trigger that uses INSERTED/DELETED
  • Explore security on triggers, (EXECUTE AS)
  • Create DDL trigger.
  • Create DDL trigger that returns data.
  • Create CLR assembly.
  • Explore clr: SAFE/UNSAFE/EXTERNAL_ACCESS
  • Explore CLR: set trustworthy
  • Implement TRY CATCH RAISERROR
  • error handling, look at Rusanu’s service broker scripts, he’s an awesome error handler.

Query Fundamentals

  • Explore different DELETE syntax (i.e. DELETE A FROM A…)
  • Without checking syntax, use the OUTPUT clause
  • Without checking syntax, use the MERGE statement
  • Write a query that uses grouping sets.
  • What’s the difference between LEN and DATALENGTH
  • without checking syntax, use patindex and charindex

Additional Query techniques

  • write a subquery (correlated)
  • write a subquery (simple)
  • write a subquery (scalar)
  • write a subquery (list)
  • write a subquery (table valued)
  • Without checking syntax, Use rank
  • Without checking syntax, Use dense rank
  • Without checking syntax, Use row_number
  • Without checking syntax, Use ntile
  • Without checking syntax, Use row_number, and partition by
  • look through table hints to understand any unused ones
  • look through query hints to understand any unused ones
  • manage collations? how can they be managed.
  • what’s the diff between column/db/server collations
  • (international) define custom errors

Other SQL Components

  • Set up db mail
  • implement full text search
  • full text: Use CONTAINS
  • full text: Use CONTAINSTABLE
  • full text: Use FREETEXT
  • full text: Use FREETEXTTABLE
  • full text: Use STOPLIST
  • Use PowerShell and SMO to create a .NET cmdlet that demonstrates SMO
  • Take service broker tutorial, compare sample syntax to rusanu’s syntax
  • Implement change tracking
  • Create db audit spec
  • Use CHANGETABLE

XML Data

  • use each FOR XML type
  • without checking syntax, use OPENXML with sp_xml_preparedocument and sp_xml_removedocument
  • without checking syntax, shred xml.
  • review xpath syntax
  • what’s the difference between xquery and xpath?
  • Read through books online topic on xml

Gathering Perf Info

  • Use SHOWPLAN (SHOWPLAN_XML?) to look through a query
  • look at how paul (@sql_kiwi) displays query plan
  • Run through DTA, Prepare a workload <– remember options in DTA here!!!
  • Review catalog views. (Is this INFORMATION_SCHEMA?)

February 23, 2011

How Full is Fill Factor 100?

Filed under: SQLServerPedia Syndication, Technical Articles — Michael J. Swart @ 12:00 pm

Takeaway:

Question: For a clustered index on an identity column, is it okay to set the fill factor to 100?
Answer: Most likely, it depends on a lot of things.

Fill Factor

So today I’m talking about the FILL FACTOR setting that can be applied to indexes. Remember, Fill Factor is a percentage you can specify so that when indexes are built (or rebuilt) SQL Server leaves some free space in each data page to accommodate any new data that may come along.
A man at a restaurant saves room for dessert.
If more data is added to a page which doesn’t have enough room to accommodate it, then a page split occurs – a new page is created elsewhere and roughly half the rows get written to the new page. This leaves two pages roughly half full. So the goal of setting a Fill Factor properly is to prevent these page splits (because too many page splits impact performance).
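
As a refresher, the Fill Factor is applied when you create or rebuild an index; a sketch (the index and table names are made up):

CREATE INDEX IX_Widgets_Name
	ON dbo.Widgets(WidgetName)
	WITH (FILLFACTOR = 90)	-- leave roughly 10% of each leaf page free
GO
ALTER INDEX IX_Widgets_Name ON dbo.Widgets
	REBUILD WITH (FILLFACTOR = 90)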

Fill Factor of 100 on Clustered Indexes on Identity Columns

So there’s a couple places I’ve found that recommend a fill factor of 100 on indexes that begin with identity columns.

  • Dave Levy describes a process that includes the tip: “If the index is on an ever increasing value, like an identity column, then the fill factor is automatically 100.”
  • Pinal Dave gives the same advice: “If the index key column is an IDENTITY column, the key for new rows is always increasing and the index rows are logically added to the end of the index. In such a situation, you should set your FILLFACTOR to 100.”

This makes sense: if you always add rows to the end of an index, then you don’t need to save room in the middle, do you? Ahh… But what about UPDATE statements? UPDATE statements can add data into the middle of an index (especially a clustered index, which includes all fields). You might think that even updating a record so that it’s one byte larger than it used to be will cause a page split.

Fill Factor of 100 Still Has A Bit Of Wiggle Room

It turns out that there’s still a little bit of wiggle room. It’s very rare for pages to have zero bytes free, even if the index was built (or rebuilt) with Fill Factor 100. The reason is that data pages contain a whole number of records: if there’s space on a page, but not quite enough space for a whole record, then the page is considered full. This tiny space could in theory be used for updates that fit.

What Else to Consider

So one extra byte is rarely going to cause a page split. I would be comfortable recommending a Fill Factor of 100 for any index on an identity column. But before you trust me, there are some other things to consider:

  • The bit of wiggle room I mentioned above
  • Know your application! Most OLTP systems do far more Inserts than Updates. (Most OLAP systems do zero updates)
  • How many fields in the index are variable length? And how many of those get updated? Remember only variable length fields can change the size of a record. No variable length fields means an automatic Fill Factor 100.
  • SQL Server only pays attention to Fill Factor on Index Rebuilds (or on creation). It doesn’t maintain the fill factor space any other time. So ask yourself how often updates are applied to rows that are older than your last index rebuild. If it’s rare, then Fill Factor 100.
  • How’s your re-indexing strategy? If you REORGANIZE your indexes instead of REBUILD, the fill factor won’t make a difference at all (If so, better to stop reading this article and work on a comprehensive index maintenance strategy.)
  • Page splits don’t impact performance of seeks (just scans).
  • Page splits aren’t the end of the world. In terms of database health, fragmentation is like a bad cold.

There are probably even more things I’m missing. But you know what’s better than guessing? Measuring! Go use Dave Levy’s Fill Factor script to know exactly how Fill Factor is impacting your indexes.
