After surfing for a while I had trouble finding an example of an application that shows how to use table-valued parameters (TVPs) from beginning to end using C#, ADO.NET, and SQL Server 2008. The official docs covering TVPs are found at Table-Valued Parameters in SQL Server 2008 (ADO.NET).
So this is my own TVP example. It consists of a SQL script, a C# program, and a batch file that compiles and runs the program.
The DB Setup (a SQL Script)
Run this on your SQL Server 2008 (or later) database that you can test on:
use tempdb;
CREATE TYPE BigIntList
AS TABLE (i bigint)
GO
CREATE PROCEDURE ReturnEvenNumbers (@list BigIntList READONLY) AS
SELECT i
FROM @list
WHERE i % 2 = 0
GO
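Before moving to C#, you can sanity-check the type and procedure straight from T-SQL. A quick sketch:

```sql
-- Declare a variable of the new table type, fill it, and pass it to the proc
DECLARE @list BigIntList;
INSERT @list (i) VALUES (1), (2), (3), (4), (5), (6), (7);
EXEC ReturnEvenNumbers @list;
-- returns 2, 4, 6
```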
The C# program
Aptly called Program.cs, this is the definition of a program that calls the new procedure with a list of seven numbers and prints the list of numbers that comes back (i.e. the even numbers).
Edit the connection string here and then save this as Program.cs in some directory.
using System.Data.SqlClient;
using System.Data;
namespace TVParameterTest {
class Program {
static void Main( ) {
// build table
DataTable dt = new DataTable();
dt.Columns.Add( "i", typeof( long ) );
foreach( long l in new long[] {1,2,3,4,5,6,7} )
dt.LoadDataRow( new object[] { l }, true );
// build connection and command
SqlCommand cmd = new SqlCommand(
"ReturnEvenNumbers",
new SqlConnection( @"Data Source=.\sql2k8;Initial Catalog=tempdb;Integrated Security=True" ) );
cmd.Connection.Open();
cmd.CommandType = CommandType.StoredProcedure;
cmd.Parameters.Add( "list", SqlDbType.Structured).Value = dt;
// execute and output
SqlDataReader reader = cmd.ExecuteReader();
while( reader.Read() )
System.Console.WriteLine( reader[0].ToString());
cmd.Connection.Close();
}
}
}
Running it!
First make sure you’ve got Visual Studio on your machine, then you should be able to open a command prompt (or powershell!) and see this:
E:\Temp>csc program.cs
Microsoft (R) Visual C# 2008 Compiler version 3.5.30729.1
for Microsoft (R) .NET Framework version 3.5
Copyright (C) Microsoft Corporation. All rights reserved.
E:\Temp>Program.exe
2
4
6
The question was whether sp_executesql handles batch separators (i.e. GO). The answer is no, but I saw something interesting when I was finding out (sometimes a one-minute experiment is more effective than looking up behavior in Books Online).
I ran the following:
exec sp_executesql N'select 1
go
select 2
go'
Now, before reading further try to guess what the results are (or the error message).
I was surprised by these results:
/*
go
-----------
1
go
-----------
2
*/
This was unexpected: the query worked! But it treated the “go” keywords as column aliases. Of course if you were to do any one of the following:
Redefine your batch separator.
Use semicolons after the SELECT statements.
Add column aliases to these queries.
Then you do get the expected error:
/*
Msg 102, Level 15, State 1, Line 2
Incorrect syntax near 'go'.
Msg 102, Level 15, State 1, Line 4
Incorrect syntax near 'go'.
*/
So sp_executesql does not handle batch separators (i.e. GO).
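If you do need several statements in one sp_executesql call, separate them with semicolons instead of GO. A minimal sketch:

```sql
-- GO is a batch separator understood by client tools (sqlcmd, SSMS), not by T-SQL itself
EXEC sp_executesql N'SELECT 1; SELECT 2;';
```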
But seriously, I went to my first PASS Summit a year ago and I had a blast. As much as I’d like to be in Seattle this week, it’s just not in the cards for me. But I look forward to following all the news via blogs and twitter and I’ll even catch some of the keynotes courtesy of PASS.
Okay, so I consider myself relatively bright. With enough effort I can usually fix nine out of ten database problems that get thrown at me. But if (actually when) I get stuck, there’s a bunch of resources I rely on.
Maybe you’re in the same boat as me. Sometimes a technical problem comes along and there’s no one left around to ask for help. The buck stops with you. Either you crack the problem or it doesn’t get cracked. Often the people who want an answer or solution don’t care where you get it; they just want to clear away an obstacle that’s stopping them from doing what they want to do.
So here are my three favorite internet resources in no particular order.
Google
Okay, this is kind of an obvious one. But you probably don’t have to think too far back to remember when someone asked you a totally google-able question.
But I’m adding Google to this list because I’d be lost without it (plus it expands this list to three instead of two). It’s still valid though. Even if all you did was punch in an error code and examine the first solution that Google reports, you’d still look good.
Twitter
I asked a question on Twitter the other day and got great responses almost immediately. I showed a friend at work (Hi Scott) these answers and he said “Huh. Twitter is a great resource if you have enough followers.” *
But that’s just the thing! I explained that I don’t have many followers at all. And you don’t need any followers yourself either. The trick is to add the #sqlhelp tag to your tweet and bingo: there is often a large group of professionals just waiting to help. What’s in it for them? I’m not sure. My guess is that knowing they’ve helped someone out is a great feeling. (Altruism FTW!)
Anyway, the success of #sqlhelp is a direct result of the active SQL Server community out there. The hashtag #sqlhelp lives and dies based on participation level. There’s probably some unknown threshold of askers and answerers that’s needed for the resource to remain useful. Whatever it is, #sqlhelp is way beyond it! I counted about 150 questions in the last 24 hours.
Stack Overflow
So if you’ve got a question that needs some explaining, or if your question doesn’t fit into the 140-character limit that Twitter enforces, you’ve got stackoverflow.com.
Newsgroups have been around forever. And technical online forums have been around for a while too. But after discovering stackoverflow, I’ve never needed any other sites (Once you go stack…).
And again, it’s the high number of askers and answerers that makes this site so useful. No decent question goes unanswered longer than 20 minutes. And more often than not, your question has already been asked.
Did I Miss One?
So tell me where you look. Is there a favorite internet resource of yours that I missed?
I preemptively stick my tongue out at anyone who mentions MVP newsgroups.
* ~ By the way, that’s Scott in the photo; he agreed to sit for it (he’s the one on the left). Thanks, Scott.
This month it’s Sankar Reddy’s turn to host T-SQL Tuesday. I wish him luck! I know I had a blast last month. This month we’re busting SQL Server misconceptions wherever we find them.
So today I want to explain something about the practice of updating statistics as a technique to improve or maintain performance. It works and it works well, but often not for the reasons you think and I’m going to explain a simpler equivalent technique.
The Story
So you’ve been given the job of finding out why a particular stored procedure is performing poorly. It’s a complicated query plan and one of the first things you notice is that there’s hundreds of thousands of rows coming out of a particular table. After looking closer you figure out that the estimated number of rows is only supposed to be around thirty!
Remember that estimated row counts are based on what the database knows about the data and that info is stored in objects called statistics. As the table grows, these statistics can get out of date. So you try:
UPDATE STATISTICS dbo.MyTable WITH FULLSCAN
which solves the problem. (Good job, take a bow!)
Why It Works
Statistics are used to get an idea of how data is distributed. Based on the data distribution, SQL Server can make better choices on how it chooses query plans. If tables change and stats are out of date, the chosen query plans can be less than optimal. And bad plans have a bad habit of performing orders of magnitude worse than optimal plans.
Why It Really Works
I’ve found that out-of-date statistics can be responsible for bad query plans. But more frequently, I see that the parameter values that are used to compile the plan have just as big an effect.
SQL Server doesn’t just use statistics to estimate the number of rows it’s going to retrieve from a table. It also uses variable values and constants found in the query. And in the case of variables, it always uses the first variable values that it sees when the query gets compiled.
So for example, if I’m looking up a person in the phonebook by last name, the number of people I find is going to depend on whether that last name is common like “Martin” or unique like “Jingleheimer-Schmidt”. Based on the search argument, different plans can get picked, which affects database performance.
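To make that concrete, here’s a minimal sketch of the phonebook scenario (the People table and procedure are made up for illustration):

```sql
-- Hypothetical lookup procedure
CREATE PROCEDURE dbo.s_People_GetByLastName ( @LastName NVARCHAR(100) ) AS
SELECT PersonId, FirstName, LastName
FROM dbo.People
WHERE LastName = @LastName;
GO

-- The plan is compiled for the first value SQL Server sees ('Jingleheimer-Schmidt', ~1 row)
-- and then reused for 'Martin', which may match thousands of rows.
EXEC dbo.s_People_GetByLastName @LastName = N'Jingleheimer-Schmidt';
EXEC dbo.s_People_GetByLastName @LastName = N'Martin';
```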
So even if statistics are already up to date, the act of calling UPDATE STATISTICS can still have an effect. When you update statistics on a table, SQL Server throws out all the query plans that depend on that table (but see comments). Those queries get recompiled with the next parameter values that come along, and these may be different from the original values.
It’s the equivalent of giving SQL Server a good thumping, or jolt.
So if you’re getting ahead of me, you can see that updating stats won’t always work. But I’ve seen others use it as part of a trial-and-error approach to query tuning: when it works some of the time, it becomes part of your mental toolbelt of tricks to try.
Alternative Technique 1
So you’ve used STATS_DATE and you know that statistics are up to date and you want to give SQL Server the same “thump” to build a new query plan without updating stats. You can use:
sp_recompile @objname='MyTable'
Or even more targeted:
DBCC FREEPROCCACHE (plan_handle)
But I would only recommend this for acute performance problems; it by no means addresses the root cause.
Alternative Technique 2
If you decide that SQL Server needs help in picking its plan consistently you can use something more permanent. Use query hints like
OPTIMIZE FOR UNKNOWN
OPTIMIZE FOR (@LastName = 'Martin')
USE PLAN …
But this means you’re telling SQL Server you know better than it does. So make sure you understand every nook and cranny of your tables’ data distribution and query plans.
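For example, here’s roughly what those hints look like inside a procedure (the table and parameter names are made up):

```sql
-- Pin the compiled plan to a specific value...
SELECT PersonId, FirstName, LastName
FROM dbo.People
WHERE LastName = @LastName
OPTION ( OPTIMIZE FOR ( @LastName = N'Martin' ) );

-- ...or compile for the statistical "average" value regardless of what was sniffed:
-- OPTION ( OPTIMIZE FOR UNKNOWN )
```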
Conclusion
The misconception is that updating statistics has a greater effect on query performance than it actually does. The query plans that SQL Server chooses depend on both statistics and compiled parameter values.
Try to get away from a trial-and-error method of query tuning. Learn how to look at the following before making a plan:
The query plan handle and query plan for a poorly performing query.
Estimated row counts vs. actual row counts.
The parameter values that were used to compile a query plan.
When statistics were last updated for a particular index.
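That last check is easy to script. A sketch using STATS_DATE (substitute your own table name):

```sql
-- When were statistics last updated for each index on dbo.MyTable?
SELECT i.name AS index_name,
       STATS_DATE(i.object_id, i.index_id) AS stats_last_updated
FROM sys.indexes i
WHERE i.object_id = OBJECT_ID('dbo.MyTable');
```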
So, about a week ago, I had the honor of being your host for T-SQL Tuesday for the month of September. And an amazing 22 bloggers responded to that invitation. I want to thank each of the bloggers for the time they put into their contribution. We really had some high quality contributions this month.
And another thanks to Adam Machanic for letting me host this month.
Here are the participating blog posts:
1. Brad Schulz Little Known Index Facts
First to go live with his post was Brad Schulz, awesome as ever. He talks about Fill Factor percents and Unique Indexes. He shows his knack for writing in a way that, after having read his post, you’ll never forget what he explained. What Struck Me: Before he gets into the topic at hand, he throws his support behind team indexes (and takes a few shots at team indices). A must-read by itself; I pointed my family at the post and heard out-loud laughs.
2. Rob Farley (@Rob_Farley) Table? No such thing…
Rob Farley explains how there is no such thing as a table. Essentially he explains how a clustered index is the table. (Good on ya Rob). What Struck Me: Spoofing the Matrix:
“Don’t try to look up information in the table. That’s impossible. Only try to realize the truth.”
“What’s that?”
“There is no table.”
3. Pinal Dave (@pinaldave) Disabled Index and Index Levels and B-Tree
Pinal Dave tackles the question “What will be the status of the B-Tree structure when index is disabled?” What Struck Me: He investigates and explains the answer thoroughly. He also includes a personal touch with his post when he talks about a mentor Vinod Kumar.
4. Noel McKinney (@NoelMcKinney) Constraints & Unused Indexes
Noel (team indexes) explains how an unused index (an index with few reads & writes) can still be useful. What Struck Me: I like Noel’s writing style. His articles make me feel like he’s sitting right beside me explaining something at my computer. Maybe it’s his first-person narrative or the fact that he throws in his own experiences. Whichever it is it works.
5. Nicholas Cain (@SirSQL) Applying Indexes To Replicated Tables
Any time I read from someone who has tackled replication in a successful way, I’m immediately impressed. Nicholas explains how to extend a replication solution so that the subscriber has indexes that the publisher doesn’t. What Struck Me: A must-read for anyone who uses replication (or who is considering replication).
6. Richard Douglas (@SQLRich) Unused Indexes (Indices)
For the first time, Richard Douglas (Team Indices) contributes a post for T-Sql Tuesday. He explains how indexes are not all rainbows and unicorns all the time. He shines light on problem indices that don’t do anyone any good: Unused Indices. What Struck Me: With a working vocabulary that includes words like “amended” and “gracious”, Richard gives “credence” to the thought of switching to Team Indices.
7. Erin Stellato (@erinstellato) Desperately Seeking Seeks
Erin Stellato explains the steps she takes in her search for a seek (while simultaneously reminding us of 80’s movies). What Struck Me: What I like about Erin’s take is that she includes in her explanation an Oracle solution. Keeping other platforms in mind keeps everyone honest.
8. Michael J Swart (@MJSwart) Guts Of A Clustered Index
A post by yours truly with fancy Visio pictures. I try to explain what physically happens when a clustered index grows. What Struck Me: A frisbee.
9. Bob Pusateri (@SQLBob) Potty Chairs And Duplicate Indexes
Provides an embellished script that has been making the blog rounds to report on redundant indexes. Good job Bob! What Struck Me: A gold painted potty chair! I can’t describe it properly, you just have to see for yourself.
10. Jeremiah Peschka (@peschkaj) Indexes
Jeremiah takes a (deliberately) rambling look at indexes, b-trees and … erm… horology. What Struck Me: His post is the number one post here ranked in order of likeness to a daily show monologue. (And I love animated gif lolcats). It’s much appreciated.
11. Jen McCown (@MidnightDBA) Brief Intro To Indexes and INCLUDE
Jen pulls her post from the archives to give an introduction to indexes. What Struck Me: She explains how INCLUDED columns in an index are only included at the leaf levels! This is something I didn’t know until now. It reminds me of something mentioned by Brad Schulz’s post where Unique Indexes can also do without columns in the index nodes.
12. Gail Shaw (@SQLintheWILD) One wide index or multiple narrow indexes?
Gail writes an authoritative post on indexing strategy. Hers is one of the few contributions that address the question “Which index is best?” What Struck Me: I like her conclusion about the so-called “strategy” of having single-column indexes on each column of the table. Her post is a must-read for anyone who isn’t clear on this point.
13. Tom LaRock (@SQLRockstar) Big Mac Index
The SQL Rockstar explains indexes using a Big Mac analogy; it’s an analogy I guarantee no one has considered before (making this post legendary). What Struck Me: A Microsoft Connect Item on the Big Mac Index. Wow.
14. Diane McNurlan (@SQLDevGal) Top 10 Worst Indexing Practices
Another member of team indexes, Diane covered important index topics in such a clear way that her post can almost be used as a checklist (for creating indexes or querying them). What Struck Me: Diane is a very competent writer, and although I had a bunch of other posts to read for this round-up, I found myself going through more of her articles. And subscribing to her blog’s feed. We database developers have to stick together.
15. AJ Mendo (@SQLAJ) Indexes 101
In AJ’s own words, he talks “about what indexes are, why we can benefit from their use and some of the costs associated with using indexes.” What Struck Me: Along with the standard clustered vs. non-clustered, AJ gives a brief intro to some of the lesser known kinds of indexes: Fulltext, Spatial, XML etc…
16. Jason Strate (@StrateSQL) An Index On Indexing
Just like the cheesy 80’s sitcom clip show, Jason gives a list of his series on Index analysis. It’s a good sign though, only the best 80’s sitcoms ever got clip shows (Joanie Loves Chachi notwithstanding). What Struck Me: In fact in my own “cheesy clip show” post last year I ran out of clips and ended up linking to Jason’s site.
17. Dave Levy (@Dave_Levy) How is Fill Factor Impacting My Indexes?
Dave queries some DMVs to help you analyze how effective your fill factor settings are. What Struck Me: Okay, I have to call this one out. If you’re a dba, and you read only one index post this month make it mine, but if you have time to read two read Dave Levy’s. His fill factor script belongs in every DBA’s toolkit alongside queries that analyze things like missing index or fragmentation levels.
18. Jeremy Carter Cut Your Index Bloat
Jeremy explains the process he follows to identify and remove unused indexes. What Struck Me: Jeremy is probably the most underexposed blogger on this list. There are a ton of MVP participants this month, including honest-to-God book authors! Jeremy holds his own quite easily alongside them.
19. Andy Lohn (@SQLQuill) Partitioned Indexes and Data Types
Andy tackles index partitioning in this post. Specifically, he explains two methods of moving from non-partitioned indexes to partitioned indexes. What Struck Me: I like that this is not a hypothetical situation. Andy and his colleagues are facing this problem as we speak (or as I write), and they are testing to determine which method is faster. The results are not in yet, making his post a bit of a cliff-hanger.
20. Robert L Davis (@SQLSoldier) To Be or Not To Be (a B-tree)
Robert (aka SQL Soldier) explains how XML indexes are structured. What Struck Me: Oh the things you can do using the dedicated administrator connection!!
21. Aaron Nelson (@SQLvariant) Picture The Indexes
A great analogy for indexes is the phone book. Aaron Nelson runs with it. What Struck Me: The post and comments stretch that metaphor about as far as it can go.
22. Jason Brimhall (@sqlrnnr) Indexes And Blobs
Ending this month’s T-SQL Tuesday, Jason (team indexes) refurbishes an old script of his to report on BLOB index columns. What Struck Me: He mentions something that I’m surprised never got covered by anyone else this month: Blob’s make lousy indexes.
Encore!
(Update 9/20/2010) One late addition (sorry I missed it, there was no comment at the invite post).
23. Steve Jones (@WayOutWest) Remembering To Index
In Steve’s own words, he reminds us that “Indexing is important, but you can overdo it.” What Struck Me: He wrote about a third-party db he came across which included the most over-indexed table I’ve ever seen in my life.
So T-SQL Tuesday rolls around again. And this month it’s hosted by some handsome devil named Michael J Swart. I got to pick the topic, so you might expect great things. Well, we’ll see (no pressure) Here we go.
I want to show visually what’s happening on the disk when you create and grow a clustered index. I think this post will be most useful if you’re an application developer. You know what a table is and what a clustered index is, but beyond that, it’s not clear where SQL Server keeps your data physically. Hopefully you’ll soon find it clearer.
So I’m going to keep it simple. I’m going to show a simple table with a clustered index on its identity column (no partitions, no compression, one filegroup on one file). Remember, a clustered index defines the ordering of the table, so essentially the clustered index is the table. That’s good, because it means that in the example below I’ll only talk about one thing. You can call it either the table or the clustered index; it’s the same thing.
First Rule of Database Storage…
First thing you should know about database storage is that everything, I mean absolutely everything in the database is stored in pages. Each page is a chunk of storage on disk exactly 8 kilobytes in size, no more, no less. You’ll see that a page never contains records from different tables.
Create A Table
CREATE TABLE TEST (
id INT IDENTITY NOT NULL,
variableLength VARCHAR (MAX) NOT NULL DEFAULT (REPLICATE('#', 1000)),
CONSTRAINT pk_TEST PRIMARY KEY CLUSTERED (id)
);
Nothing happens on the disk here. There is no new space that is reserved in your database for this new table. Think of SQL Server as a lazy librarian. It’s not going to find shelves for books that don’t exist. But if you could look closely, you would see that there are changes to the system. SQL Server records in system tables that there’s a new table. Here’s one of the system tables:
New Row In A System Table
This particular system table describes the space that’s allocated to table TEST. It’s the system table you go to when you want to find where your table is located on disk. Right now no space is allocated until we create some data.
First Row
insert TEST default values;
Okay, something’s happening now. Two new pages are allocated for table TEST after just a single row.
Two New Pages
Page Addresses: Notice that pages always have a name (or address) that looks like 1:153. The first number is the file number in that database (in our example it will always be 1 because the database uses only one file). The second number here (153) is an incrementing counter that identifies the page number.
So a page with the address (1:153) is the 153rd page (or stretch of 8 kilobytes) in the 1st file of the current database.
The IAM page: The first page you see here is an Index Allocation Map. If database tables were books on shelves, the IAM page would be like a floor plan of the library. Paul Randal wrote the definitive web resource on this stuff, so it’s not something I’m going to cover any deeper.
The Data Page: This is where your data is. There’s a header that SQL Server uses for some context. And there’s also a footer that helps track the order of the rows on a page. But still, there’s roughly 8,000 bytes left on data pages for your actual data. In my example, the rows I create are pretty big (about 1 KB), so there’s space for about six more rows in this page. What happens when we try to add seven?
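If you want to count pages without peeking at internals, here’s a sketch using a documented DMV (available since SQL Server 2005):

```sql
-- How many 8 KB pages are allocated to each index of the TEST table?
SELECT i.name AS index_name, ps.used_page_count
FROM sys.dm_db_partition_stats ps
JOIN sys.indexes i
    ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE ps.object_id = OBJECT_ID('dbo.TEST');
```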
First Index Node
set nocount on
insert TEST default values;
GO 7
Okay, there are two data pages now. The first one, 1:153, got full, and SQL Server allocated a second one, 1:156, for the eighth row:
New Index Page
The Index Page: A new kind of page is created. If you’re looking for a single row and you know the id, you don’t start at the first data page. It’s better to look at the root page. Using the information in the root page, you know the address of the data page your row is in.
In fact, a page address and an integer Id column are pretty tiny fields, so there’s a lot of free space in this index page. There’s space in this page to index over 600 pages of data. That works out to over 4,000 rows (inside the 600 data pages). What happens when you create one too many?
More and More Rows
set nocount on
insert TEST (variableLength) default values;
GO 4347
New Index Level
There’s a new level!
Finding Row 42: So let’s say that you want to look up the row with Id = 42. First read the root page (1:791). This information tells you to look at page (1:155). You’re still not at a data page. So reading (1:155), you know to look at (1:160). And voilà, after three page reads, there’s your row.
B-Trees: Tables that are quite large or that have really wide indexes can have more and more levels. This structure is called a B-Tree (B for balanced). The idea is that no matter where rows are inserted or deleted, it’s not really difficult to shuffle page pointers around to keep the tree balanced.
See For Yourself
This is all stuff that you can see for yourself. The following are examples of commands I’ve used. They’re not documented, but they’re extremely useful when trying to understand this stuff.
SELECT pl.*
FROM sys.system_internals_allocation_units iau
JOIN sys.partitions p
ON iau.container_id = p.partition_id
CROSS APPLY sys.fn_PhysLocCracker(iau.root_page) as pl
WHERE p.object_id = OBJECT_ID('test')
AND p.index_id = 1 -- the clustered index
AND iau.type = 1 -- in row data
DBCC PAGE('sandbox', 1, 153, 3);
DBCC IND('sandbox', 'test',1);
So that’s it. There’s tons left to say about all this stuff (metric tons!) and I’m hoping that the other T-SQL Tuesday bloggers will cover some of it.
(Update 09/20/2010: Find the roundup here) September already! Summer holidays are over and that means no more lazing around. The second Tuesday is fast approaching and that means that T-SQL Tuesday is almost here. And this month I will be your humble host and the topic is Indexes.
A book's index is not that different from a database index.
Indexes
Indexes are strange things. You never need to explicitly create one to create a fully-functional database, but if you want a database to perform well, they’re indispensable.
And there are so many aspects to write about! Like internals, covering, clustered, xml, fulltext, b-trees, hints, maintenance of, included columns, filtered, redundant, missing and tons more.
For those not familiar with T-SQL Tuesday, it’s an idea hatched by Adam Machanic (Blog|Twitter). A monthly event, T-SQL Tuesday is a chance for any blogger to write and post an article on a single topic determined by the host (i.e. yours truly). Leave a comment here on that day and a day or two later, I’ll include your post in the round up.
Follow These Rules
The post must go live on your blog between 00:00 GMT Tuesday, September 14, 2010 and 00:00 GMT Wednesday, September 15, 2010.
In other words, set your SQL Server date, time and timezone properly and run this script:
IF GETUTCDATE() BETWEEN '20100914' AND '20100915'
SELECT 'You Can Post'
ELSE
SELECT 'Not Time To Post'
Your post has to link back to the hosting blog, and the link must be anchored from the logo (found above) which must also appear at the top of the post
Leave a comment here (below) or I won’t be able to find your post. I expect trackbacks to work properly, but if they don’t, check back here just in case and leave a comment if necessary.
We also encourage you to …
… include a reference to T-SQL Tuesday in the title of your post. (The more we bloggers advertise T-SQL Tuesday, the more T-SQL Tuesday readers we all get.)
… tweet using the hash tag #TSQL2sDay to follow links and other relevant conversations.
… consider hosting T-SQL Tuesday yourself. If you’re interested, let Adam Machanic know. If you’ve participated in two T-SQL Tuesdays previously and you don’t let your blog go stale (blog at least once a month for the last six months), then he’ll put you in the rotation.
And out of curiosity… Are you team indexes or are you team indices? Personally I’m team indexes, but I work with a guy who seems to be so team indices that I’ve caught him saying “indice” for the singular!
That’s it! Good luck! Can’t wait to see what you guys have in store.
So here’s my top ten list of things that I see written in SQL that I dislike or that irk me in some way. And yeah, in some cases, I would use the word hate.
Fury
They’re not necessarily things that go against best practices. But each one gets under my skin in some way and I thought I’d share that list at the risk of giving my enemies buttons to push.
So without further ado.
10. Code that’s Been Commented Out
It’s in every set of code I’ve ever worked with, and I’m sure you’ve seen it too. It’s not even SQL Server specific; you’ll see it in any code that changes over time. The code that gets commented out instead of deleted.
I’m guessing that the reasoning behind this is that some developers want to keep code around to make reverts easier, or to keep a sense of history of the code. But that’s what source control systems were built for!
9. The isDeleted Column
“Deleted records aren’t deleted. Look, they’re right there!”
You sometimes see this. Records that are marked as deleted with a flag. It’s sometimes needed to support an undelete feature but it still bugs me.
It’s also confusing when looking at a procedure called s_OBJECT_Delete(). Without looking at the definition, does this procedure delete the record, or just mark it as deleted?
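One way to keep the undelete feature without spreading that confusion everywhere is to hide the flag behind a view. A hedged sketch (the Orders table and its columns are made up for illustration):

```sql
-- Most code reads from the view and never sees soft-deleted rows
CREATE VIEW dbo.ActiveOrders AS
SELECT OrderId, OrderDate, CustomerId
FROM dbo.Orders
WHERE isDeleted = 0;
```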
CREATE TABLE MyTest
(
Id INT IDENTITY NOT NULL,
Value1 NVARCHAR (100) NOT NULL,
Value2 NVARCHAR (max),
CONSTRAINT PK_MyTest PRIMARY KEY (Id)
)
I can get SQL Server to script out the table for me later on and it will look like this:
/****** Object: Table [dbo].[MyTest] Script Date: 08/25/2010 19:08:29 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[MyTest](
[Id][int] IDENTITY(1,1) NOT NULL,
[Value1] [nvarchar](100) NOT NULL,
[Value2][nvarchar](max) NULL,
CONSTRAINT [PK_MyTest] PRIMARY KEY CLUSTERED
(
[Id]ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
All the added stuff is marked in green. This is stuff that is either optional, or that specifies an option that happens to be the default. This is okay, because SQL Server didn’t know if the table was created with options left as the default, or whether they were explicitly set. So the scripter spells it out.
And that’s totally fine. I just get peeved when some of this stuff makes it into checked-in object definitions.
7. Filter in the JOIN Clause Instead of the WHERE Clause
Look at this query:
SELECT p.name, COUNT(1)
FROM Sales.SalesOrderDetail sod
JOIN Production.Product p
ON sod.ProductID = p.ProductID
AND p.Name like 'Sport%'
GROUP BY p.Name
I’d rather see that filter on p.Name in a WHERE clause:
SELECT p.name, COUNT(1)
FROM Sales.SalesOrderDetail sod
JOIN Production.Product p
ON sod.ProductID = p.ProductID
WHERE p.Name like 'Sport%'
GROUP BY p.Name
I hate that format because it makes things impossible to cut and paste. But C# has this nifty string prefix @ that allows newlines inside string literals:
string sql = @"SELECT name, location
FROM MyTable
WHERE id = 12312
ORDER BY name";
5. Concatenating Variable Values Into SQL
Pretty self-explanatory. This practice puts a database in danger of SQL injection attacks.
(Always happy for an excuse to mention little Bobby Tables from xkcd)
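The fix is parameterization. A minimal T-SQL sketch (the same idea applies in client code through SqlParameter); the table and column names here are illustrative:

```sql
-- The hostile string travels as data, never as executable SQL text
DECLARE @sql NVARCHAR(200) =
    N'SELECT PersonId FROM dbo.People WHERE LastName = @LastName';
EXEC sp_executesql @sql,
    N'@LastName NVARCHAR(100)',
    @LastName = N'Robert''); DROP TABLE Students; --';
```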
4. Triggers that maintain data in other tables
This is a symptom of a database design that’s not normalized. Essentially, the same data is stored in two places. The development effort needed to maintain both tables properly is often more trouble than it’s worth.
Very closely related is the indexed view. But I have nothing but love for these.
3. Missing Foreign Keys
Missing foreign keys are more common than other problems because there’s no obvious symptom to react to:
A new table will support an application just as nicely with or without foreign keys.
Software testers will see no error messages coming from the database.
It’s often later on that data integrity becomes a problem, now you’ve got more things to worry about:
It’s confusing to have column names, alias names or table names that are also keywords. In some cases the syntax is allowed and what’s left is merely confusing.
I often see this pet peeve when columns are named the same as a particular type, like Date or Text. I’d rather see names like CreateDate, LogDate or CommentText. The following script can identify those:
SELECT c.TABLE_NAME, c.COLUMN_NAME, c.DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS c
JOIN sys.types t
ON t.name = c.COLUMN_NAME
ORDER BY c.TABLE_NAME, c.COLUMN_NAME
1. Column Names That End in a Digit
Now, I don’t mind stuff like AddressLine1 and AddressLine2. They serve a purpose in their own way.
But normally when I see this it smells like unnormalized database design (not denormalized; there’s a distinction). So this is a good litmus test to check whether tables are at least in first normal form.
I use this script to check:
select t.name as table_name,
c.name as column_name
from sys.tables t
join sys.columns c
on t.[object_id] = c.[object_id]
where c.name like '%[0-9]'
order by t.name, c.name
~ Thanks to xkcd.com and victorianweb.org for sharing their images