I have trouble with procedures that use SELECT *
. They are often not “Blue-Green safe“. In other words, if a procedure has a query that uses SELECT *
then I can’t change the underlying tables can’t change without causing some tricky deployment issues. (The same is not true for ad hoc queries from the application).
I also have a lot of procedures to look at (about 5000) and I’d like to find the procedures that use SELECT *
.
I want to maybe ignore SELECT *
when selecting from a subquery with a well-defined column list.
I also want to maybe include related queries like OUTPUT inserted.*
.
The Plan
- So I’m going to make a schema-only copy of the database to work with.
- I’m going to add a new dummy-column to every single table.
- I’m going to use
sys.dm_exec_describe_first_result_set_for_object
to look for any of the new columns I created
Any of my new columns that show up, were selected with SELECT *
.
The Script
use master; DROP DATABASE IF EXISTS search_for_select_star; DBCC CLONEDATABASE (the_name_of_the_database_you_want_to_analyze, search_for_select_star); ALTER DATABASE search_for_select_star SET READ_WRITE; GO use search_for_select_star; DECLARE @SQL NVARCHAR(MAX); SELECT @SQL = STRING_AGG( CAST( 'ALTER TABLE ' + QUOTENAME(OBJECT_SCHEMA_NAME(object_id)) + '.' + QUOTENAME(OBJECT_NAME(object_id)) + ' ADD NewDummyColumn BIT NULL' AS NVARCHAR(MAX)), N';') FROM sys.tables; exec sp_executesql @SQL; SELECT SCHEMA_NAME(p.schema_id) + '.' + p.name AS procedure_name, r.column_ordinal, r.name FROM sys.procedures p CROSS APPLY sys.dm_exec_describe_first_result_set_for_object(p.object_id, NULL) r WHERE r.name = 'NewDummyColumn' ORDER BY p.schema_id, p.name; use master; DROP DATABASE IF EXISTS search_for_select_star; |
Update
Tom from StraightforwardSQL pointed out a nifty feature that Microsoft has already implemented.
Not sure here, but doesn't dm_sql_referenced_entities.is_select_all achieve the same thing?
— Zikato (@Zikato25) August 9, 2021
Yes it does! You can use it like this:
select distinct SCHEMA_NAME(p.schema_id) + '.' + p.name AS procedure_name from sys.procedures p cross apply sys.dm_sql_referenced_entities( object_schema_name(object_id) + '.' + object_name(object_id), default) re where re.is_select_all = 1 |
Comparing the two, I noticed that my query – the one that uses dm_exec_describe_first_result_set_for_object – has some drawbacks. Maybe the SELECT * isn’t actually included in the first result set, but some subsequent result set. Or maybe the result set couldn’t be described for one of these various reasons
On the other hand, I noticed that dm_sql_referenced_entities has a couple drawbacks itself. It doesn’t seem to capture select statements that use `OUTPUT INSERTED.*` for example.
In practice though, I found the query that Tom suggested works a bit better. In the product I work most closely with, dm_sql_referenced_entities only missed 3 procedures that dm_exec_describe_first_result_set_for_object caught. But dm_exec_describe_first_result_set_for_object missed 49 procedures that dm_sql_referenced_entities caught!
[…] Michael J. Swart hunts for the real performance killer: […]
Pingback by Finding Procedures Using SELECT * – Curated SQL — August 10, 2021 @ 8:10 am