Ten Things I Hate to See in T-SQL

August 26, 2010

Filed under: SQLServerPedia Syndication,Technical Articles — Tags: "sql server", t-sql — Michael J. Swart @ 11:35 am

So here’s my top ten list of things that I see in written in SQL that I dislike or that irk me in some way. And yeah, in some cases, I would use the word hate.

Fury

They’re not necessarily things that go against best practices. But each one gets under my skin in some way and I thought I’d share that list at the risk of giving my enemies buttons to push.

So without further ado.

10. Code that’s Been Commented Out

It’s in every set of code I’ve ever worked with and I’m sure you’ve seen it too. It’s not even SQL Server specific, but you’ll see it in any code that changes over time. The code that get’s commented out instead of deleted.

I’m guessing that the reasoning behind this is that some developers want to keep code around to make reverts easier, or to keep a sense of history of the code. But that’s what source control systems were built for!

9. The isDeleted Column

“Deleted records aren’t deleted. Look, they’re right there!”

You sometimes see this. Records that are marked as deleted with a flag. It’s sometimes needed to support an undelete feature but it still bugs me.

It’s also confusing when looking at procedures called s_OBJECT_Delete(). Without looking at the definition, does this procedure delete the record, or just mark it as deleted?

8. Fluff in Object Definitions

When I create a table from scratch like:

CREATE TABLE MyTest
(
	Id INT IDENTITY NOT NULL,
	Value1 NVARCHAR (100) NOT NULL,
	Value2 NVARCHAR (max),
	CONSTRAINT PK_MyTest PRIMARY KEY (Id)
)

I can get SQL Server to script out the table for me later on and it will look like this:

/****** Object:  Table [dbo].[MyTest]    Script Date: 08/25/2010 19:08:29 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[MyTest](
	[Id] [int] IDENTITY(1,1) NOT NULL,
	[Value1] [nvarchar](100) NOT NULL,
	[Value2] [nvarchar](max) NULL,
 CONSTRAINT [PK_MyTest] PRIMARY KEY CLUSTERED
(
	[Id] ASC
) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, 
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, 
ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

All the added stuff is marked in green. This is stuff that is either optional, or that specifies an option that happens to be the default. This is okay, because SQL Server didn’t know if the table was created with options left as the default, or whether they were explicitly set. So the scripter spells it out.

And that’s totally fine. I just get peeved when some of this stuff makes it into checked-in object definitions.

7. Filter in the JOIN Clause Instead of the WHERE Clause

Look at this query:

SELECT p.name, COUNT(1)
FROM Sales.SalesOrderDetail sod
JOIN Production.Product p
	ON sod.ProductID = p.ProductID
	AND p.Name like 'Sport%'
GROUP BY p.Name

I’d rather see that filter on p.Name in a where clause:

SELECT p.name, COUNT(1)
FROM Sales.SalesOrderDetail sod
JOIN Production.Product p
	ON sod.ProductID = p.ProductID
WHERE p.Name like 'Sport%'
GROUP BY p.Name

Of course if you’re dealing with LEFT OUTER JOINS, then it matters where the filter is placed.

6. Concatenating Strings to Build SQL in C#

This is more of a C# gripe than a SQL gripe but it’s my list 🙂

Old C++ developers create strings filled with SQL to be passed to the database. But long statements would get broken up in code like:

string sql = "SELECT name, location " +
    "FROM MyTable " +
    "WHERE id = 12312 " +
    "ORDER BY name";

I hate that format because it makes things impossible to cut and paste. But C# has this nifty string prefix @ that allows newlines inside string literals:

string sql = @"SELECT name, location
    FROM MyTable
    WHERE id = 12312
    ORDER BY name";

5. Concatenating Variables Values Into SQL

Pretty self-explanatory. This practice puts a database in danger of SQL injection attacks.

(Always happy for an excuse to mention little Bobby Tables from xkcd)

4. Triggers that maintain data in other tables

This is a symptom of a database design that’s not normalized. Essentially, the same data is stored in two places. The development effort needed to maintain both tables properly is often more trouble than it’s worth.

Very closely related is the indexed view. But I have nothing but love for these.

3. Missing Foreign Keys

Missing foreign keys are more common than other problems because there’s no obvious symptom to react to:

A new table will support an application just as nicely with or without foreign keys.
Software testers will see no error messages coming from the database.

It’s often later on that data integrity becomes a problem, now you’ve got more things to worry about:

Adding the foreign key.
Cleaning up dirty data.
and Public Relations.

Here’s a post I wrote earlier on finding missing keys.

2. Identifiers That Are Also Reserved Key Words

It’s confusing to have column names, alias names or table names that are also keywords. In some cases the syntax is allowed and what’s left is merely confusing.

I often see this pet peeve when columns are named the same as a particular type. like Date or Text. I’d rather see names like CreateDate, LogDate or CommentText. The following script can identify those:

SELECT c.TABLE_NAME, c.COLUMN_NAME, c.DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS c
JOIN sys.types t
	ON t.name = c.COLUMN_NAME
ORDER BY c.TABLE_NAME, c.COLUMN_NAME

1. Column Names That End in a Digit

Now, I don’t mind stuff like AddressLine1 and AddressLine2. They serve a purpose in their own way.

But normally when I see this it smells like unnormalized database design (not denormalized, there’s a distinction). So this is a good litmus test to check to see whether tables are at least in first normal form.

I use this script to check:

select t.name as table_name,
	c.name as column_name
from sys.tables t
join sys.columns c
	on t.[object_id] = c.[object_id]
where c.name like '%[0-9]'
order by t.name, c.name

~ Thanks to xkcd.com and victorianweb.org for sharing their images

-- Comments (26)

26 Comments »

Ha ha great list Michael. Just thinking about some of these is making me angry 🙂

Comment by John Sansom — August 27, 2010 @ 4:55 am
Fantastic list Michael, number 8 (Fluff in Object Definitions) can be nasty. Once had to deal with developer who did work on his laptop with a different default collation than the one on production, and the collation would be included in his “fluff” that got checked in. Another of those “but it worked on my machine” gotchas!

Comment by Noel McKinney — August 28, 2010 @ 2:10 pm
Thanks for the comment Noel. That’s been my experience too. If you want a specific collation (like binary or whatever) then specify it. But if you want it to match the database’s default collation, then let it be. Just like you, this issue has cost me time (allocated to maintenance and support work).

In fact each of the gripes in this list has cost me some time. Some more than others.

Comment by Michael J. Swart — August 28, 2010 @ 3:32 pm
[…] be familiar with him. He is a very interesting blogger for sure. He recently wrote an article about Ten Things I hate to See in T-SQL; it was really fun, but the thing which caught my eyes was the subject of isDeleted Column. First […]

Pingback by SQL SERVER – Soft Delete – IsDelete Column – Your Opinion « Journey to SQL Authority with Pinal Dave — September 2, 2010 @ 9:31 pm
Very nice article sir….all the things u hate in T-Sql really deserve to be hated…thanks for the beautiful article

Comment by Sumit Thapar — September 3, 2010 @ 1:57 am
Thanks Sumit,
Have you got any things that I didn’t mention that bug you?

Comment by Michael J. Swart — September 3, 2010 @ 9:33 am
[…] Delete – IsDelete Column – Your Opinion – Pinal Dave gives his thoughts in response to a recent post from Michael J […]

Pingback by SQL Server Links and news the week 02/09/10 | John Sansom - SQL Server DBA in the UK — September 3, 2010 @ 9:39 am
Re number 10, I have yet to see a standard simple way of source controlling your stored procedures that doesn’t cost, so folks will source control their code but not their SPs.

Comment by Greg — September 5, 2010 @ 6:10 am
Hey Greg, What extra cost is there to putting procedures into source control that isn’t a cost for code?

Comment by Michael J. Swart — September 5, 2010 @ 9:11 am
Your code files live in your filesystem, so are easily kept in your source control eg tortoise svn,cvs, right click commit from windows explorer.

Your stored procs live in your database, so are easily kept in your source control by, er…

Comment by Greg — September 5, 2010 @ 2:23 pm
Hey Greg,

Change your point of view. You don’t put assemblies in source control either. Sproc scripts are to sprocs as code files are to assemblies.

You have to make a very conscious choice about where the definition of a thing lives. In my fifteen years with DBs, I’ve never seen the definition of a stored procedure live inside a db as part of a proper software dev life cycle.

In fact keeping a stored procedure definition only inside a database never occurred to me so I didn’t understand what extra costs you were talking about in your first comment.

Comment by Michael J. Swart — September 5, 2010 @ 5:30 pm
Hi Michael,

I appreciate the advice. All the books and courses I have done have never covered this particular situation (source controlling SQL server stored procs).

My background has been source control your code, have a development database instance containing SPs, which is backed up.

So how do you do it? Have a single file with all TSQL in the db? Have them split out into one file per SP? If the latter, how can you do a build in a single step?

Comment by Greg — September 5, 2010 @ 7:16 pm
That’s a great question Greg. It’s a topic that probably deserves its own blog post (look for it next week). But the short answer is: One file per SP, run separately, not in a single step, but all run together at once.

At least that’s my experience, because the dbs I’ve worked on have always been rolled out to many many servers (which has its own challenges).

But thanks for the comments. It’s a topic I’ve never given a lot of thought about until now.

Comment by Michael J. Swart — September 5, 2010 @ 8:56 pm
Informative article:)
How about the below point:
Column names with out follwing any naming conventions. Sometimes developers use mix of upper and small case letters.
ex: EmployeeName
Instead this can be designed as Employee_Name

Comment by Vijaya Kadiyala — September 7, 2010 @ 2:53 am
Not following naming conventions would have probably made the list (too bad it wasn’t a top 11 list).
But personally I don’t have any big problems with any naming convention as long as
1. It’s clear and doesn’t hinder understanding (following the self-documenting precept of agile).
and
2. Above all it’s consistent: A mix of naming conventions can’t be called a convention!

Comment by Michael J. Swart — September 7, 2010 @ 7:50 am
Hey guys, you might want to check out a relatively new tool from Red Gate for source control of your databases: http://www.red-gate.com/products/SQL_Source_Control/index.htm. Just downloaded it yesterday myself. It automatically scripts every object in your database, and currently supports committing database changes, and updating a “working copy” database to the latest changes from within Management Studio. You still have to manually check the history logs of your VCS for past changes though. But everything can now just live in the database with no extra manual steps on your part. This may be the easy way for db source control from now, and a simple answer to your question other Greg 😉 Seems to only support SVN and Team Foundation Server at this time though.

Comment by Greg Jacobs — September 7, 2010 @ 12:19 pm
In #1, I love the distinction between “unnormalized” and “denormalized”. The latter implies a deliberate action which can be debated on the merits. The former implies laziness.

Comment by Jeff Brumley — September 7, 2010 @ 1:29 pm
You got it in one guess Jeff! De-normalized also implies that the tables were at some point normalized.

Comment by Michael J. Swart — September 7, 2010 @ 1:48 pm
Michael,

Great article, please permit me to share this to our team.

Comment by Simhadri — September 8, 2010 @ 10:38 am
I agree with the majority of your things but I have a comment to point 9, The isDeleted Column:

Imagine a company where employees may post on an internal site. The site’s code tracks the employee. Employee data is in a separate table. Once the employee leaves the company his posts should be remain visible along with the name.
You cannot delete the employee’s record, so you must mark it as…as something: inactive, hasLeftTheCompany, isDeleted…etc.
I agree isDeleted is not the best name 🙂 …

I just wanted to highlight a point where it is better (or it’s actually a requirement) to mark inactive records in a database instead of deleting them. Sometimes the law does not allow you to delete such records for a specified period of time.

Again, I agree isDeleted is not the best name…

Regards,

Comment by S.E: — September 12, 2010 @ 3:14 pm
Thanks for your comment. I think your last statement is exactly the point I want to make, if you’re using isDeleted for the purposes you mention, then isDeleted is not the best name. Maybe call it IsDeactivated, or toBeArchived. To be clear, this is not a hard and fast rule.

But I have seen it as a lazy way to delete things when foreign keys are in the way.

But you also bring up another interesting topic. That of having posts depend on employees. Having a post depend on an employee when you want that post to live even without the employee is a challenge in db design.

Comment by Michael J. Swart — September 12, 2010 @ 3:24 pm
There is actually a full blog post spurred from this one just about the idea of the “soft delete” / “isDeleted” column (which was actually how I found this post in the first place). http://blog.sqlauthority.com/2010/09/03/sql-server-soft-delete-isdelete-column-your-opinion/ Good comments on there.

Comment by Greg Jacobs — September 12, 2010 @ 4:32 pm
Read your article for the first time. Great Article.

Comment by Muhammad Abbas — September 16, 2010 @ 6:19 am
Re:
So how do you do it? Have a single file with all TSQL in the db? Have them split out into one file per SP? If the latter, how can you do a build in a single step?

I have gotten in the habit of using Visual Studio to work on stored procs (although I start them in Management Studio for the quick, minimal documenting that it lets you do ;-). By having VS connected with a source repository, you can have the process of the Stored Proc script saving and retrieving from the repository pretty much be an automatic function of opening the script. (it ain’t “perfect” but “ittledoo”. 😉

Comment by Ralph Wilson — September 24, 2010 @ 11:54 pm
Thanks for the comment Ralph,
I’m learning that there’s more than one way to tackle this.

I only wish MS put some effort into SQL Server Management Studio. Their Solution Explorer has room for improvement.

Comment by Michael J. Swart — September 25, 2010 @ 10:13 am
[…] Ten Things I Hate to See in T-SQL – Are some of your pet peeves on the list? Find out what gets on Michael J Swart‘s goat. […]

Pingback by SQL Server Links and news the week 27/08/10 — March 9, 2012 @ 6:10 am