So here’s my top ten list of things that I see in written in SQL that I dislike or that irk me in some way. And yeah, in some cases, I would use the word hate.
They’re not necessarily things that go against best practices. But each one gets under my skin in some way and I thought I’d share that list at the risk of giving my enemies buttons to push.
So without further ado.
10. Code that’s Been Commented Out
It’s in every set of code I’ve ever worked with and I’m sure you’ve seen it too. It’s not even SQL Server specific, but you’ll see it in any code that changes over time. The code that get’s commented out instead of deleted.
I’m guessing that the reasoning behind this is that some developers want to keep code around to make reverts easier, or to keep a sense of history of the code. But that’s what source control systems were built for!
9. The isDeleted Column
“Deleted records aren’t deleted. Look, they’re right there!”
You sometimes see this. Records that are marked as deleted with a flag. It’s sometimes needed to support an undelete feature but it still bugs me.
It’s also confusing when looking at procedures called s_OBJECT_Delete(). Without looking at the definition, does this procedure delete the record, or just mark it as deleted?
8. Fluff in Object Definitions
When I create a table from scratch like:
CREATE TABLE MyTest ( Id INT IDENTITY NOT NULL, Value1 NVARCHAR (100) NOT NULL, Value2 NVARCHAR (max), CONSTRAINT PK_MyTest PRIMARY KEY (Id) )
I can get SQL Server to script out the table for me later on and it will look like this:
/****** Object: Table [dbo].[MyTest] Script Date: 08/25/2010 19:08:29 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[MyTest]( [Id] [int] IDENTITY(1,1) NOT NULL, [Value1] [nvarchar](100) NOT NULL, [Value2] [nvarchar](max) NULL, CONSTRAINT [PK_MyTest] PRIMARY KEY CLUSTERED ( [Id] ASC ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO
All the added stuff is marked in green. This is stuff that is either optional, or that specifies an option that happens to be the default. This is okay, because SQL Server didn’t know if the table was created with options left as the default, or whether they were explicitly set. So the scripter spells it out.
And that’s totally fine. I just get peeved when some of this stuff makes it into checked-in object definitions.
7. Filter in the JOIN Clause Instead of the WHERE Clause
Look at this query:
SELECT p.name, COUNT(1) FROM Sales.SalesOrderDetail sod JOIN Production.Product p ON sod.ProductID = p.ProductID AND p.Name like 'Sport%' GROUP BY p.Name
I’d rather see that filter on p.Name in a where clause:
SELECT p.name, COUNT(1) FROM Sales.SalesOrderDetail sod JOIN Production.Product p ON sod.ProductID = p.ProductID WHERE p.Name like 'Sport%' GROUP BY p.Name
Of course if you’re dealing with LEFT OUTER JOINS, then it matters where the filter is placed.
6. Concatenating Strings to Build SQL in C#
This is more of a C# gripe than a SQL gripe but it’s my list 🙂
Old C++ developers create strings filled with SQL to be passed to the database. But long statements would get broken up in code like:
string sql = "SELECT name, location " + "FROM MyTable " + "WHERE id = 12312 " + "ORDER BY name";
I hate that format because it makes things impossible to cut and paste. But C# has this nifty string prefix @ that allows newlines inside string literals:
string sql = @"SELECT name, location FROM MyTable WHERE id = 12312 ORDER BY name";
5. Concatenating Variables Values Into SQL
Pretty self-explanatory. This practice puts a database in danger of SQL injection attacks.
(Always happy for an excuse to mention little Bobby Tables from xkcd)
4. Triggers that maintain data in other tables
This is a symptom of a database design that’s not normalized. Essentially, the same data is stored in two places. The development effort needed to maintain both tables properly is often more trouble than it’s worth.
Very closely related is the indexed view. But I have nothing but love for these.
3. Missing Foreign Keys
Missing foreign keys are more common than other problems because there’s no obvious symptom to react to:
- A new table will support an application just as nicely with or without foreign keys.
- Software testers will see no error messages coming from the database.
It’s often later on that data integrity becomes a problem, now you’ve got more things to worry about:
- Adding the foreign key.
- Cleaning up dirty data.
- and Public Relations.
Here’s a post I wrote earlier on finding missing keys.
2. Identifiers That Are Also Reserved Key Words
It’s confusing to have column names, alias names or table names that are also keywords. In some cases the syntax is allowed and what’s left is merely confusing.
I often see this pet peeve when columns are named the same as a particular type. like Date or Text. I’d rather see names like CreateDate, LogDate or CommentText. The following script can identify those:
SELECT c.TABLE_NAME, c.COLUMN_NAME, c.DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS c JOIN sys.types t ON t.name = c.COLUMN_NAME ORDER BY c.TABLE_NAME, c.COLUMN_NAME
1. Column Names That End in a Digit
Now, I don’t mind stuff like AddressLine1 and AddressLine2. They serve a purpose in their own way.
But normally when I see this it smells like unnormalized database design (not denormalized, there’s a distinction). So this is a good litmus test to check to see whether tables are at least in first normal form.
I use this script to check:
select t.name as table_name, c.name as column_name from sys.tables t join sys.columns c on t.[object_id] = c.[object_id] where c.name like '%[0-9]' order by t.name, c.name
~ Thanks to xkcd.com and victorianweb.org for sharing their images