I’m participating in this month’s T-SQL Tuesday. The host – handsome genius – has asked us to argue against a popular opinion. So the opinion I’m tackling today is that
SELECT * should always be avoided.
I’ve heard that a lot. For example, it seems to be a popular question on Stack Overflow. The consensus is that
SELECT * is discouraged in favor of
SELECT col1, col2, col3 where you select the columns you need and no others.
I’m going to look at some of the reasons that are given to support this advice. I’m also going to qualify when it might not matter so much.
Unused columns cause excessive load on the database.
Touché. This is true,
SELECT * often causes SQL Server to use the clustered index because it’s usually the only covering one. This can waste an opportunity to use a more efficient index.
However, even when us developers follow the “Query only what you need” advice, our SELECT queries often do in fact need every single column. In this case,
SELECT * is fine.
Columns are returned in an unknown order
That’s true. The fear is that an application will contain a bug when it depends on a column order that is incorrect. Well simply don’t do that. Most applications are able to retrieve row values in the record set by column name rather than column position. Identify columns by name. Then the column order doesn’t matter.
You can’t search code for the use of a column
That’s an interesting one. Say I’ve got a
USERS table with a
Title column. This column is getting dusty and so we plan to deprecate it and ultimately remove it.
It would be great to search our code base and find where this column gets used. Presumably the code that contains
SELECT * would be missed in the search but any code that contains
SELECT ... Title would be found.
But how big a problem is this? The only trouble is when an application uses
SELECT * and then after it gets the results, it takes that data set and looks for the column
Title by name. If that’s the case, then you will probably find that instance of the
Title string in your code search.
The development effort is not really impacted (because
SELECT * doesn’t need to change). The testing effort is not impacted terribly either. I can’t think of a scenario where Ctrl+F
column name is the only thing standing in the way of a bug getting into production. If you can think of one, then it’s probably not the only quality issue that needs attention.
I don’t mind
SELECT * when I’m using every column and when I don’t depend on the order of the columns.