It’s not surprising that my blog’s syntax colorer has trouble with this statement. But SQL Server will run this statement without complaining. Management Studio doesn’t even show any red squiggly lines anywhere. The same statement without comments looks like this:
I want a program to remove comments from any valid SQL, and I want it to handle even this crazy example. Below, I describe a handy method that lets me do that.
Add using Microsoft.SqlServer.Management.TransactSql.ScriptDom; to your “usings”.
Then add this method to your class:
public string StripCommentsFromSQL( string SQL ) {
TSql110Parser parser = new TSql110Parser( true );
IList<ParseError> errors;
var fragments = parser.Parse( new System.IO.StringReader( SQL ), out errors );
// clear comments
string result = string.Join (
string.Empty,
fragments.ScriptTokenStream
.Where( x => x.TokenType != TSqlTokenType.MultilineComment )
.Where( x => x.TokenType != TSqlTokenType.SingleLineComment )
.Select( x => x.Text ) );
return result;
}
… and profit! This method works as well as I hoped, even on the given SQL example.
Why I Prefer This Method
There are a number of reasons. By using Microsoft’s own parser, I don’t have to worry about comments in strings, or strings in comments, which are problems for most T-SQL-only solutions. I also don’t have to worry about nested multiline comments, which can be a problem for regex solutions.
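To see why those cases matter, here’s a sketch (in Python, purely illustrative, and deliberately naive — this is a strawman regex stripper, not anyone’s real code) showing how regex approaches fall over:

```python
import re

def naive_strip(sql):
    # Strip /* ... */ blocks (non-greedy, so no support for nesting),
    # then strip -- line comments to the end of the line.
    sql = re.sub(r'/\*.*?\*/', '', sql, flags=re.DOTALL)
    sql = re.sub(r'--[^\n]*', '', sql)
    return sql

# The "--" here is inside a string literal, not a comment, but the
# regex can't tell the difference and truncates the statement:
print(naive_strip("SELECT '-- not a comment' AS col"))  # prints: SELECT '

# Nested comments break it too: the inner "*/" ends the match early.
print(naive_strip("a /* x /* nested */ y */ b"))  # prints: a  y */ b
```

A real parser tokenizes the statement first, so it knows whether a `--` sits inside a string literal before deciding it starts a comment.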
Did you know that there’s another SQL parsing library by Microsoft? It’s found at Microsoft.SqlServer.Management.SqlParser.Parser. This was the old way of doing things, and it’s not supported very well. I believe this library is mostly intended for use by features like Management Studio’s Intellisense. The ScriptDom library is better supported and easier to code with.
Let Me Know If You Found This Useful
Add comments below. Be warned though, if you’re a spammer, I will quickly remove your comments. I’ve had practice.
Last week, I explained different ways to generate concurrent activity. I have my favorites, but you may have your own. Pick whichever method you like; any of them will help you call the procedure s_DoSomething very often.
Now comes the task of defining the procedure s_DoSomething. It can be whatever you like depending on what functionality you want to test or exercise. I want to demonstrate some patterns that I follow when I define that procedure. Those patterns all start with …
the Basic Definition
To test the concurrency of a single procedure just call it:
use tempdb
GO
CREATE PROCEDURE dbo.s_DoSomething AS
EXEC Adventureworks2012.dbo.uspGetManagerEmployees 14;
Of course, I could have changed the framework to call my procedure directly but I don’t out of habit. I always leave s_DoSomething in tempdb hard-coded in the framework.
With Arbitrary Parameter Values
Often the procedures I want to test are defined with parameters. If variety is important, but the parameter values are not, then that’s when the random tricks come in:
ALTER PROCEDURE dbo.s_DoSomething AS
DECLARE @BusinessEntityId int;
SELECT TOP 1 @BusinessEntityId = BusinessEntityID
FROM AdventureWorks2012.HumanResources.Employee
ORDER BY newid();
EXEC AdventureWorks2012.dbo.uspGetEmployeeManagers @BusinessEntityId;
Calling More Than One Procedure
It’s as simple as calling one after the other. But sometimes I want the frequency of the calls “weighted”.
For example, I want DELETE, INSERT, and UPDATE statements each called 10% of the time. The remaining 70% of the time, I want to call a SELECT statement. Then I have something like:
ALTER PROCEDURE dbo.s_DoSomething AS
declare @r int = RAND() * 10;
IF (@r = 0)
-- delete 10% of the time
DELETE AdventureWorks2012.Person.BusinessEntity
WHERE BusinessEntityID = CAST(RAND()*1000 as INT);
IF (@r = 1)
-- insert 10% of the time
INSERT AdventureWorks2012.Person.BusinessEntity (rowguid)
VALUES (newid());
IF (@r = 2)
-- update 10% of the time
UPDATE AdventureWorks2012.Person.BusinessEntity
SET rowguid = newid()
WHERE BusinessEntityID = CAST(RAND()*1000 as INT);
IF (@r > 2)
-- select the rest of the time
SELECT BusinessEntityId, rowguid, ModifiedDate
FROM AdventureWorks2012.Person.BusinessEntity
WHERE BusinessEntityID = CAST(RAND()*1000 as INT);
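The weighting works because assigning RAND() * 10 to an int truncates to one of ten equally likely buckets, 0 through 9. Here’s a quick sanity check of that distribution (in Python rather than T-SQL, purely for illustration):

```python
import random

random.seed(42)  # fixed seed so the run is repeatable
counts = {"delete": 0, "insert": 0, "update": 0, "select": 0}
for _ in range(100_000):
    r = int(random.random() * 10)  # mirrors: declare @r int = RAND() * 10
    if r == 0:
        counts["delete"] += 1
    elif r == 1:
        counts["insert"] += 1
    elif r == 2:
        counts["update"] += 1
    else:
        counts["select"] += 1

# delete/insert/update each land near 10%; select lands near 70%.
print({k: round(v / 100_000, 2) for k, v in counts.items()})
```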
Counting Errors Concurrently
I want to track (server side) how often s_DoSomething fails. But I don’t want tracking to be a concurrency bottleneck itself. Here’s a cool trick for that:
First define these procedures:
create procedure dbo.s_IFailed as
go
create procedure dbo.s_IPassed as
go
alter procedure dbo.s_DoSomething as
begin try
declare @i int = rand() * 10;
select @i = 1 / @i -- might divide by 0!
exec dbo.s_IPassed;
end try
begin catch
exec dbo.s_IFailed;
end catch
This lets me use DMVs to monitor the success rate because I can check the execution count of my dummy procedures. For example,
exec sp_recompile 'dbo.s_IFailed'; -- reset counts from other tests.
exec sp_recompile 'dbo.s_IPassed';
go
/* run concurrent test here, or...*/
set nocount on;
exec dbo.s_DoSomething;
go 10000
select object_name(object_id), execution_count
from sys.dm_exec_procedure_stats
where object_name(object_id) in ('s_IFailed','s_IPassed')
/*
-- count
s_IPassed 9031
s_IFailed 969
*/
This relies on the DMV sys.dm_exec_procedure_stats, which was introduced in SQL Server 2008. It’s like a cheap, do-it-yourself performance counter.
Next
Next week I want to show a demo of this technique in action. I’ll be troubleshooting a common concurrency problem: the deadlock.
I’ve discovered that DBAs use different methods to accomplish what amounts to the same thing: Generating concurrent activity on a server. I want to explore a number of methods I’ve seen. For each method, I want to call one particular procedure, many times at once, and often… Oh, and for free.
Why are we doing this?
For Testing: The whole point is to generate concurrent activity on the server. Testing with this kind of activity is one of the few ways to catch resource contention issues before production.
For Demos: Concurrency issues are so common and so varied, that it’s not surprising how often we see demos that depend on this kind of generated database activity.
For Troubleshooting: This is my favorite reason. A concurrency issue can’t be fixed well unless it can be reproduced reliably on some test environment. That’s why the methods below have a spot on my troubleshooting toolbelt.
The whole idea is to get many workers active on SQL Server at once.
For each method below, look out for the place where I specify these “parameters”:
Number of virtual workers (50 in my example)
Number of iterations (for instance 10,000)
Connection string
Procedure to execute (s_DoSomething in my case)
One other thing to watch for is the overhead that the tool itself puts on the machine. Ideally, the method is lightweight enough to run from the same test environment that the SQL Server sits on. This means it’s best to handle iterations on SQL Server itself, which cuts down on the overhead associated with opening connections. So in most cases, instead of
exec dbo.s_DoSomething
I have
declare @i int = 0; while (@i < 10000) begin exec dbo.s_DoSomething; set @i+= 1; end
Notepad + DOS Method
I adapted this method from a clever trick I saw once. It was Paul Randal giving a demo on tempdb contention. You can find a video of that demo by visiting this newsletter.
It’s a simple idea. You have two batch files, the first is called Run.bat:
echo off
sqlcmd -S MYSERVER\MYINSTANCE -E -Q "set nocount on; declare @i int = 0; while (@i < 10000) begin exec tempdb.dbo.s_DoSomething; set @i+= 1; end" > NUL
exit
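The second batch file didn’t survive here. In this kind of setup it just launches the workers; a hypothetical reconstruction (the file name Launch.bat and the count of 50 are my assumptions) would look something like:

```batch
:: Launch.bat (hypothetical reconstruction): start 50 minimized workers
echo off
for /L %%i in (1,1,50) do start /min Run.bat
```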
The PowerShell Method
foreach($worker in 1..50) `
{ `
Start-Job `
-ScriptBlock `
{ `
Import-Module sqlps; `
Invoke-Sqlcmd `
-Query "set nocount on; declare @i int = 0; while (@i < 10000) begin exec tempdb.dbo.s_DoSomething; set @i+= 1; end" `
-ServerInstance "MYSERVER\MYINSTANCE" `
| Out-Null `
} `
}
Except that I don’t think I got it quite right. Whatever I changed makes this method unsuitable.
The script schedules 50 jobs, but it takes about a minute just to schedule them all. Once they’re scheduled, the jobs take time to start, and not enough of them run in parallel to generate the needed concurrent activity, so I give this method a “skip”. If you’re really comfortable with PowerShell, maybe you can get this to work faster. If you can, let me know.
C#
This is my preferred method. It started out as the program I used to test upsert concurrency at https://michaeljswart.com/go/upsert but a friend at work showed me that .net 4.5 has some nifty new asynchronous methods that make it look nicer, perform faster and weigh lighter.
using System;
using System.Linq;
using System.Data.SqlClient;
using System.Threading.Tasks;
namespace AConsoleApplication {
class Program {
static void Main( string[] args ) {
var cs = new SqlConnectionStringBuilder();
cs.DataSource = @"MYSERVER\MYINSTANCE";
cs.InitialCatalog = "tempdb";
cs.IntegratedSecurity = true;
cs.AsynchronousProcessing = true;
string connectionString = cs.ToString();
Task[] tasks = Enumerable.Range( 0, 50 )
.Select( i => RunCommand( connectionString ) )
.ToArray();
Task.WaitAll( tasks );
}
private static async Task RunCommand(
string connectionString
) {
string sql = @"s_DoSomething";
using( SqlConnection conn = new SqlConnection( connectionString ) ) {
await conn.OpenAsync();
SqlCommand cmd = new SqlCommand( sql, conn );
cmd.CommandType = System.Data.CommandType.StoredProcedure;
for( int i = 0; i < 10000; i++ ) {
try {
await cmd.ExecuteNonQueryAsync();
} catch( Exception ex ) {
Console.WriteLine( ex.Message );
}
}
}
}
}
}
HammerDB
HammerDB (originally HammerOra) is a free utility that allows users to run benchmarks against various environments. Although it was originally built for running benchmarks on Oracle, the utility now works on Windows and for SQL Server (hence the name change). I was first introduced to the utility by Kendra Little.
Follow these links to learn how to use the tool for its typical function, running benchmarks. Then, once you know how to do that, it’s a quick step to repurpose the tool for your own concurrent activity. For example, replace HammerDB’s generated script with this one:
#!/usr/local/bin/tclsh8.6
package require tclodbc 2.5.1
database connect odbc "DRIVER=SQL Server Native Client 11.0;SERVER=MYSERVER\\MYINSTANCE;PORT=1433;TRUSTED_CONNECTION=YES";
odbc "set nocount on; declare @i int = 0; while (@i < 10000) begin exec tempdb.dbo.s_DoSomething; set @i+= 1; end"
odbc disconnect
It’s steady and lightweight and works really well.
SQLQueryStress
SQLQueryStress is a tool written by Adam Machanic that you can download and install for free.
It’s fairly idiot proof and avoids a lot of the complexity of the other methods. For my own purposes, I want the utility to be lightweight. So I …
remember to turn off “Collect I/O Statistics”
remember to turn off “Collect Time Statistics”
set “Number of iterations” to one
iterate in the query window instead; i.e., modify the query to call s_DoSomething 10,000 times
My Ranking
C# console app (★★★★½): It performs fastest and uses the least resources. Also, because I’m biased, it gets a small bump by not suffering from NIH issues.
HammerDB (★★★): Very fast (second fastest), but the interface is clumsy and modifications need tclsh experience <yuck>. It’s best to use this for its intended purpose, as a benchmark tool.
Notepad and DOS (★★★): Quick and dirty and really fast. However, it’s still pretty clumsy, and interrupting a test partway through is difficult.
PowerShell (★½): Yuck. I couldn’t get two runs that looked the same, and it was near impossible to separate PowerShell CPU and memory pressure from SQL Server pressure.
SQLQueryStress (★★★★): It does the job well. It was a little difficult to interrupt a test, but it takes care to stay lightweight.
Next Week
I’ll show some typical ways I define s_DoSomething.
Takeaway: I define a metric which indicates a code-smell for overly complex views, procedures and functions: “FROM” count.
In the programming world, developers have defined metrics for how complex a piece of code is. For example, Visual Studio defines several metrics that are meant to give developers some idea about how complex their code is getting. These measures won’t be perfect because code complexity is not something that can be measured directly. But many think that these measures indicate complexity often enough to be useful.
Code Smells
That’s what a code smell is meant for. My own definition is that a code smell is an anti-rule-of-thumb <cough>cursors!</cough>. It doesn’t necessarily indicate something’s rotten, but like a bad smell, it’s worth checking into.
Measuring Complexity in SQL Modules
And here’s my idea about measuring complexity for SQL. It’s based on the premise that complex procedures or views will have many queries/subqueries and that most queries have FROM clauses. So what are my procedures/views/functions that may be too complex? Let’s find out:
DECLARE @Odor NVARCHAR(30) = N'FROM';
with
L0 as (select 1 as C union all select 1) --2 rows
,L1 as (select 1 as C from L0 as A, L0 as B) --4 rows
,L2 as (select 1 as C from L1 as A, L1 as B) --16 rows
,L3 as (select 1 as C from L2 as A, L2 as B) --256 rows
,L4 as (select 1 as C from L3 as A, L3 as B) --65536 rows
,Nums as (select row_number() over (order by (select 0)) as N from L4)
SELECT
OBJECT_SCHEMA_NAME(m.object_id) as SchemaName,
OBJECT_NAME(m.object_id) as ObjectName,
count(1) as OdorCount
FROM Nums
CROSS JOIN sys.sql_modules m
WHERE Nums.N < LEN(m.definition)
AND SUBSTRING(m.definition, Nums.N, LEN(@Odor)) = @Odor
GROUP BY m.object_id
ORDER BY count(1) desc, object_name(m.object_id) asc
I’ve found the metric slightly better than the simpler “procedure size” metric:
SELECT
OBJECT_SCHEMA_NAME(m.object_id) as SchemaName,
OBJECT_NAME(m.object_id) as ObjectName,
LEN(m.definition) as ModuleSize
FROM sys.sql_modules m
ORDER BY LEN(m.definition) desc, object_name(m.object_id) asc
Try it out on your own environments and let me know if it identifies the monster procedures you know are lurking in your database.
Thanks to Adam Machanic for the substring counting syntax.
Performance Improvement!
Update December 5, 2013: In the comments, George Mastros provided a simpler and faster version of this query which does the same thing:
DECLARE @Odor NVARCHAR(30) = N'FROM';
Select OBJECT_SCHEMA_NAME(object_id) As SchemaName,
OBJECT_NAME(object_id) As ObjectName,
(DataLength(definition) - DataLength(Replace(definition, @Odor, ''))) / DataLength(@Odor) As OdorCount
From sys.sql_modules
Order By OdorCount DESC;
This is much simpler and much faster. What’s extra interesting is that George’s query itself has an odor count of 2 while my original one had a count of 7. Thanks so much George!
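George’s query counts occurrences by measuring how much shorter the definition gets when the search string is removed. The same arithmetic is easy to check outside SQL (Python here, and the function name odor_count is just for illustration):

```python
def odor_count(text, odor):
    # Occurrences = (length removed by the replace) / (length of the search string).
    # Note: unlike REPLACE under a case-insensitive collation, str.replace
    # is case sensitive, so normalize the case first to match SQL's behavior.
    text = text.upper()
    odor = odor.upper()
    return (len(text) - len(text.replace(odor, ""))) // len(odor)

print(odor_count("select a from t1 join (select b FROM t2) x", "FROM"))  # 2
```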
Takeaway: This post is for me: a handy place to put some queries that I use often. If you find them useful too, that’s wonderful; bookmark this page with https://michaeljswart.com/go/Top20.
These queries will give the top 20 resource consumers for cached queries based on a few different metrics. I’ve posted queries like this before, and others have written many other versions of this query. All these queries are based on sys.dm_exec_query_stats.
But I’ve tweaked my own queries recently based on a couple things I’ve learned. So you could say that I gave my old queries a new coat of paint.
Here’s what I’ve added to these queries recently:
I added OPTION (RECOMPILE) to the query. It prevents these queries from showing up in their own results on quiet systems.
Joe Sack pointed out that query_hash can be used to aggregate queries that only differ by literal values. I’m doing that here. What’s extra awesome about this technique, is that it can also be used to aggregate the same queries that were executed in different procedures or in different databases. This is critical if we want to measure the impact of a single query on a server regardless of where the query is called.
I’m ignoring (for now) those queries with query_hash equal to 0x0. Again, Joe Sack points out that these are cursor fetches.
You may notice that the database name is sometimes null. It’s null when the query is not part of a stored procedure. I suppose if I were keen, I could pick out an arbitrary database name from the accompanying query plan.
Update September 27, 2013: Actually, in the comment section of this post, Ian Stirk gave me a really nice way to retrieve the database name from the DMVs. I’ve updated the queries below accordingly.
Remember that the stats found in sys.dm_exec_query_stats are only as good as what’s in cache. The cache is not a permanent store. It changes and it gets reset every server restart.
Top 20 Most Frequently Run Queries
These are queries that run often. Frequent queries can be vulnerable to concurrency problems.
;with frequent_queries as
(
select top 20
query_hash,
sum(execution_count) executions
from sys.dm_exec_query_stats
where query_hash <> 0x0
group by query_hash
order by sum(execution_count) desc
)
select @@servername as server_name,
coalesce(db_name(st.dbid), db_name(cast(pa.value AS INT)), 'Resource') AS [DatabaseName],
coalesce(object_name(st.objectid, st.dbid), '<none>') as [object_name],
qs.query_hash,
qs.execution_count,
executions as total_executions_for_query,
SUBSTRING(st.[text],(qs.statement_start_offset + 2) / 2,
(CASE
WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX), st.text)) * 2
ELSE qs.statement_end_offset + 2
END - qs.statement_start_offset) / 2) as sql_text,
qp.query_plan
from sys.dm_exec_query_stats qs
join frequent_queries fq
on fq.query_hash = qs.query_hash
cross apply sys.dm_exec_sql_text(qs.sql_handle) st
cross apply sys.dm_exec_query_plan (qs.plan_handle) qp
outer apply sys.dm_exec_plan_attributes(qs.plan_handle) pa
where pa.attribute = 'dbid'
order by fq.executions desc,
fq.query_hash,
qs.execution_count desc
option (recompile)
Top 20 I/O Consumers
Specifically logical reads and writes. Still my favorite metric.
;with high_io_queries as
(
select top 20
query_hash,
sum(total_logical_reads + total_logical_writes) io
from sys.dm_exec_query_stats
where query_hash <> 0x0
group by query_hash
order by sum(total_logical_reads + total_logical_writes) desc
)
select @@servername as servername,
coalesce(db_name(st.dbid), db_name(cast(pa.value AS INT)), 'Resource') AS [DatabaseName],
coalesce(object_name(st.objectid, st.dbid), '<none>') as [object_name],
qs.query_hash,
qs.total_logical_reads + total_logical_writes as total_io,
qs.execution_count,
cast((total_logical_reads + total_logical_writes) / (execution_count + 0.0) as money) as average_io,
io as total_io_for_query,
SUBSTRING(st.[text],(qs.statement_start_offset + 2) / 2,
(CASE
WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX),st.[text])) * 2
ELSE qs.statement_end_offset + 2
END - qs.statement_start_offset) / 2) as sql_text,
qp.query_plan
from sys.dm_exec_query_stats qs
join high_io_queries fq
on fq.query_hash = qs.query_hash
cross apply sys.dm_exec_sql_text(qs.sql_handle) st
cross apply sys.dm_exec_query_plan (qs.plan_handle) qp
outer apply sys.dm_exec_plan_attributes(qs.plan_handle) pa
where pa.attribute = 'dbid'
order by fq.io desc,
fq.query_hash,
qs.total_logical_reads + total_logical_writes desc
option (recompile)
Top 20 CPU Consumers
;with high_cpu_queries as
(
select top 20
query_hash,
sum(total_worker_time) cpuTime
from sys.dm_exec_query_stats
where query_hash <> 0x0
group by query_hash
order by sum(total_worker_time) desc
)
select @@servername as server_name,
coalesce(db_name(st.dbid), db_name(cast(pa.value AS INT)), 'Resource') AS [DatabaseName],
coalesce(object_name(st.objectid, st.dbid), '<none>') as [object_name],
qs.query_hash,
qs.total_worker_time as cpu_time,
qs.execution_count,
cast(total_worker_time / (execution_count + 0.0) as money) as average_CPU_in_microseconds,
cpuTime as total_cpu_for_query,
SUBSTRING(st.[text],(qs.statement_start_offset + 2) / 2,
(CASE
WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX),st.[text])) * 2
ELSE qs.statement_end_offset + 2
END - qs.statement_start_offset) / 2) as sql_text,
qp.query_plan
from sys.dm_exec_query_stats qs
join high_cpu_queries hcq
on hcq.query_hash = qs.query_hash
cross apply sys.dm_exec_sql_text(qs.sql_handle) st
cross apply sys.dm_exec_query_plan (qs.plan_handle) qp
outer apply sys.dm_exec_plan_attributes(qs.plan_handle) pa
where pa.attribute = 'dbid'
order by hcq.cpuTime desc,
hcq.query_hash,
qs.total_worker_time desc
option (recompile)
Top 20 Queries By Elapsed Time
Results found here, but not in the results of the other queries, usually suffer from things like excessive blocking or ASYNC_NETWORK_IO.
;with long_queries as
(
select top 20
query_hash,
sum(total_elapsed_time) elapsed_time
from sys.dm_exec_query_stats
where query_hash <> 0x0
group by query_hash
order by sum(total_elapsed_time) desc
)
select @@servername as server_name,
coalesce(db_name(st.dbid), db_name(cast(pa.value AS INT)), 'Resource') AS [DatabaseName],
coalesce(object_name(st.objectid, st.dbid), '<none>') as [object_name],
qs.query_hash,
qs.total_elapsed_time,
qs.execution_count,
cast(total_elapsed_time / (execution_count + 0.0) as money) as average_duration_in_microseconds,
elapsed_time as total_elapsed_time_for_query,
SUBSTRING(st.[text],(qs.statement_start_offset + 2) / 2,
(CASE
WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX),st.[text])) * 2
ELSE qs.statement_end_offset + 2
END - qs.statement_start_offset) / 2) as sql_text,
qp.query_plan
from sys.dm_exec_query_stats qs
join long_queries lq
on lq.query_hash = qs.query_hash
cross apply sys.dm_exec_sql_text(qs.sql_handle) st
cross apply sys.dm_exec_query_plan (qs.plan_handle) qp
outer apply sys.dm_exec_plan_attributes(qs.plan_handle) pa
where pa.attribute = 'dbid'
order by lq.elapsed_time desc,
lq.query_hash,
qs.total_elapsed_time desc
option (recompile)
Using These Queries
By the way, this post seems to be really popular, so I waive any copyright I have on these four queries. Copy them without attribution wherever you like. Profit if you can. Go nuts.
Takeaway: I tackled a problem recently and I present a version of the problem here as a challenge. The challenge is to figure out why a query and a stored procedure return different results even though the procedure is defined using the same query.
If no one cracks this one, I’ll leave the answer in the comments in a couple days.
The Setup
Okay, I use AdventureWorks2012 here. I define a table data type IDList and a stored procedure s_GetCustomerProductPurchases:
use Adventureworks2012
go
create type IDList as TABLE(id int);
go
CREATE PROCEDURE dbo.s_GetCustomerProductPurchases
@CustomerIds IDList READONLY,
@ProductIds IDList READONLY,
@SearchString NVARCHAR(1000)
AS
SET NOCOUNT ON
DECLARE @ResultList TABLE (
ProductId INT,
ProductName Name
);
insert @ResultList (ProductId, ProductName)
select distinct TOP (2000) p.ProductID, p.Name
from sales.salesorderdetail sod
join sales.salesorderheader soh
on soh.SalesOrderID = sod.SalesOrderID
join Production.Product p
on p.ProductID = sod.ProductID
join Sales.Customer c
on c.CustomerID = soh.CustomerID
join @CustomerIds CIds
on CIds.id = c.CustomerID
join @ProductIds PIds
on PIds.id = p.ProductID
where p.Name like '%' + @SearchString + '%'
and soh.[Status] = 5
IF @@ROWCOUNT = 0
SELECT 0 ProductId, 'No Matching Products' ProductName
ELSE
SELECT ProductId, ProductName
FROM @ResultList
ORDER BY ProductId
Executing the Query
Next, let’s run a query against Adventureworks2012. We define some variables that are used as filters in a query. The query populates a temporary results table, and then we select from it:
DECLARE
@ProductIds IDList,
@CustomerIds IDList,
@SearchString Name = 'Red';
insert @ProductIds (id) VALUES (707), (717), (727)
insert @CustomerIds (id) VALUES (11000), (11103), (11201)
SET NOCOUNT ON
DECLARE @ResultList TABLE (
ProductId INT,
ProductName Name
);
insert @ResultList (ProductId, ProductName)
select distinct TOP (2000) p.ProductId, p.Name
from sales.salesorderdetail sod
join sales.salesorderheader soh
on soh.SalesOrderID = sod.SalesOrderID
join Production.Product p
on p.ProductID = sod.ProductID
join Sales.Customer c
on c.CustomerID = soh.CustomerID
join @CustomerIds CIds
on CIds.id = c.CustomerID
join @ProductIds PIds
on PIds.id = p.ProductID
where p.Name like '%' + @SearchString + '%'
and soh.[Status] = 5
IF @@ROWCOUNT = 0
SELECT 0 ProductId, 'No Matching Products' ProductName
ELSE
SELECT ProductId, ProductName
FROM @ResultList
ORDER BY ProductId
These are the results I get:
ProductId  ProductName
707        Sport-100 Helmet, Red
Executing the Procedure
Now run the procedure, which contains the exact same query. The results that come back are different.
Why? That’s what I want you to figure out. Try to crack this one before looking at the comments.
SQL Fiddle
Have you used SQL Fiddle yet? It’s a site that allows SQL developers to set up different scenarios for use in online discussions. So if you don’t have Adventureworks2012 handy, I’ve set up two SQL Fiddles here and here. Your job then is to follow those two links and explain why the results are different.
So I was recently involved in an issue where a particular Unicode string was causing some problems in an application. The string was coming from a production database. I was asked to help reproduce this. It boiled down to this:
Can you write a SQL script which produces a string containing every possible Unicode character?
That’s every character from 0x0000 (NUL) all the way up to 0xFFFF (which isn’t even a legal Unicode character). If you want a crack at it, start with the code here:
declare @allNCHARS nvarchar(max);
/*
Your answer would fit here.
*/
UPDATE MYTABLE
SET Value = @allNCHARS
WHERE Id = 1;
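For comparison only, outside of T-SQL: in Python the target string is a one-liner, which at least makes it easy to see what the T-SQL script has to produce:

```python
# All 65,536 UTF-16 code units, 0x0000 (NUL) through 0xFFFF.
# (Python happily holds the surrogate range 0xD800-0xDFFF in an
# in-memory str; encoding that range to UTF-8 is another story.)
all_chars = "".join(chr(i) for i in range(0x10000))

print(len(all_chars))   # 65536
print(all_chars[0x41])  # A
```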
My own answer is in the comments. Maybe you can come up with something more elegant.
But it’s a handy string. If you’re in QA, it can help exercise a lot of different text issues (code pages, encodings, conversions etc…).
In my own case, the string actually helped me. I was able to reproduce the issue on a local developer machine. A colleague of mine then did more detective work and squashed that bug, so it was a bit of a tag team effort there.
I want to explain how I avoid RIGHT OUTER joins in favor of LEFT OUTER joins and how I avoid OUTER joins in favor of INNER joins.
Inner joins have no direction, but outer joins do, so we have three kinds of joins:
INNER JOIN (JOIN for short)
LEFT OUTER JOIN (LEFT JOIN for short)
RIGHT OUTER JOIN (RIGHT JOIN for short)
I leave out FULL OUTER JOINs for now because I never use them. And in fact, RIGHT OUTER JOINs can always be rewritten as LEFT OUTER JOINs, so in practice I use just two kinds of joins:
INNER JOIN
LEFT OUTER JOIN
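The rewrite is purely mechanical: swap the order of the two tables and flip the direction of the join. A minimal sketch, using hypothetical tables A and B:

```sql
-- These two queries are logically equivalent:
select *
from A
right join B
    on A.id = B.a_id;

select *
from B
left join A
    on A.id = B.a_id;
```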
Now, as a rule of thumb, inner joins are more efficient than outer joins, so it’s better to write queries that avoid outer joins where possible. Let’s get to an example. My goal here is to write a query that gives me a list of employees and their director (if any) based on this schema.
CREATE TABLE STAFF
(
Id NVARCHAR(20) NOT NULL PRIMARY KEY,
Name NVARCHAR(400) NOT NULL,
Department NVARCHAR(20),
Role NVARCHAR(20) NOT NULL
)
CREATE TABLE BOSSES
(
EmployeeId NVARCHAR(20)
REFERENCES STAFF(Id),
BossId nvarchar(20)
REFERENCES STAFF(Id),
PRIMARY KEY (EmployeeId, BossId)
)
BOSSES is a table that contains not just direct reports, but all direct and indirect reports (making it handy for this query).
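To make the examples below concrete, here’s some hypothetical sample data (my own, not from any real schema): Dave is a Director, Wanda reports to him, and Ingrid reports to no director at all.

```sql
insert STAFF (Id, Name, Department, Role)
values ('dave',   'Dave Smith',   'Sales', 'Director'),
       ('wanda',  'Wanda Jones',  'Sales', 'Rep'),
       ('ingrid', 'Ingrid Lee',   NULL,    'Contractor');

insert BOSSES (EmployeeId, BossId)
values ('wanda', 'dave');
```

With this data, the queries that follow should return Wanda paired with Dave, and Ingrid with a NULL Director.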
Using a RIGHT JOIN
The answer is fairly straightforward. I join the BOSSES table with the STAFF table to give me all the directors and their reports:
select S_Employee.Name, S_Boss.Name as Director
from STAFF S_Boss
join BOSSES B
on B.BossId = S_Boss.Id
join STAFF S_Employee
on B.EmployeeId = S_Employee.Id
where S_Boss.Role = 'Director'
But wait, this isn’t a complete list of employees. What about those in the company who don’t report to any director (not even indirectly)? This is where the right outer join comes in:
select S_Employee.Name, S_Boss.Name as Director
from STAFF S_Boss
join BOSSES B
on B.BossId = S_Boss.Id
RIGHT join STAFF S_Employee
on B.EmployeeId = S_Employee.Id
and S_Boss.Role = 'Director'
Notice that the S_Boss.Role filter can’t belong to the where clause any more. If it did, we’d lose director-less employees again and we’d be back where we started.
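To see why, here’s the broken variant with the filter left in the WHERE clause. The NULL values produced by the outer join fail the comparison, so the outer join’s extra rows get filtered right back out:

```sql
select S_Employee.Name, S_Boss.Name as Director
from STAFF S_Boss
join BOSSES B
    on B.BossId = S_Boss.Id
right join STAFF S_Employee
    on B.EmployeeId = S_Employee.Id
where S_Boss.Role = 'Director' -- NULL = 'Director' is not true,
                               -- so director-less employees vanish
```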
Using LEFT JOINS Only
That works… but for the sake of style, let’s only use left joins. I prefer using only left joins in this case because logically, the results are meant to be the set of employees. So I like to start with that list of employees and then join in the bosses if necessary. That’s why I start with the set of employees as the first table. The other joined tables aren’t the stars of this query; they’re only there to help provide that director attribute. So I have this:
select S_Employee.Name, S_Boss.Name as Director
from STAFF S_Employee
left join BOSSES B
on B.EmployeeId = S_Employee.Id
left join STAFF S_Boss
on B.BossId = S_Boss.Id
and S_Boss.Role = 'Director'
But notice that I’m now using two left joins… really, I only want the one outer join that I was using in the first example. It turns out I can do that:
Using Nested Joins
Well that just looks like this:
select S_Employee.Name, S_Boss.Name as Director
from STAFF S_Employee
left join BOSSES B
join STAFF S_Boss
on B.BossId = S_Boss.Id
and S_Boss.Role = 'Director'
on B.EmployeeId = S_Employee.Id
This is logically equivalent to the right join but it uses left joins instead and only uses one outer join. Great!
A lot about this example is only about style. Maybe you prefer the right join example and think that I’m breaking some grammar rule here. It feels like I’m splitting an infinitive or something. Let me know what your preference is.
Takeaway: I talk a bit about DBCC FREEPROCCACHE, using the plan_handle parameter to remove exactly one query plan from cache. I also explain why and when you might want to.
This is another post where I share a script of mine. It’s one of my favorite kinds of posts so I’ve created a new category for it: SQL Scripts
The Situation
Crisis mode: The database server is slow (all of a sudden) and the performance is unacceptable.
And sp_whoisactive (or other monitoring software) tells you that there’s a particular query that is running frequently. In this example, the query starts with “SELECT TOP (@ResultLimit) M.Name FROM MyTable M JOIN … etc“.
The query is called in exactly one place in your application (Oh man! A fix will take time to recompile, QA and deploy).
The query usually runs for a few minutes and uses up almost all the CPU.
You’ve collected the query plan and the query text.
But, if you run the query in SQL Server Management Studio, the query completes quickly.
Some Analysis
The query plan looks different when run in Management Studio
The production database must have picked a plan based on unlucky compile-time parameters.
Confirmed! When running the query in Management Studio using the OPTIMIZE FOR clause with similar parameters, you can reproduce the bad plan.
To buy some time, it would be good to have SQL Server recompile that plan.
This is what I want to explore in this post. What is the best way to get SQL Server to recompile that plan?
Before SQL Server 2008
How do we get SQL Server to recompile this plan? Before SQL Server 2008, there were a couple things I could do:
DBCC FREEPROCCACHE: Holy cow! That’s a bit extreme. We ask SQL Server to recompile all of its plans. (Missed opportunity: Illustration featuring Darth Sidious saying “Wipe them out. All of them.”)
UPDATE STATISTICS [tablename]: Find a table that is used in the query and have SQL Server update statistics on it. All plans using the table will be recompiled afterwards (including our troubled plan!) We might not actually need updated statistics; it’s the side effect of recompiled plans that we want here. I talked a bit about that in Updating Statistics Helps, But Not For The Reasons You Think.
EXEC sp_recompile [tablename]: This technique is similar to UPDATE STATISTICS in that plans are dropped. The benefit is that we don’t have all the overhead and time spent creating the statistics. It’s still not ideal though. It takes a schema modification lock and based on the server’s current busy state, that will mean at least a couple minutes of blocked processes.
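Side by side, the pre-2008 options look like this (dbo.MyTable stands in for any table used by the troubled query):

```sql
dbcc freeproccache;              -- extreme: flushes every plan on the instance
update statistics dbo.MyTable;   -- recompiles plans touching MyTable
                                 -- (plus the statistics overhead)
exec sp_recompile 'dbo.MyTable'; -- drops those plans too, but takes
                                 -- a schema modification lock
```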
Targeting a Single Query Plan
SQL Server 2008 and later allows you to take a single query plan out of cache using the DBCC FREEPROCCACHE command with a plan handle. This is a sharp-shooter technique compared to the alternatives.
I like this technique because it takes no important locks. Existing calls to this query will continue to execute and complete (using the old bad plan). But in the meantime, new calls to this query will use a recompiled plan based on the different (and hopefully better) parameters. If so you’ve just bought yourself time to find and fix this query correctly.
But you can’t just use FREEPROCCACHE without some preparation. Finding the plan handle is not straightforward. So I’ve got a script that only asks that you identify a query based on its text.
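For context, the manual version looks something like this: find the handle of the cached plan by its query text, then pass it to DBCC FREEPROCCACHE (the handle shown is a placeholder):

```sql
-- Find the handle of the cached plan for the problem query:
select cp.plan_handle, st.text
from sys.dm_exec_cached_plans cp
cross apply sys.dm_exec_sql_text(cp.plan_handle) st
where st.text like '%TOP (@ResultLimit)%';

-- Then evict just that one plan (substitute the handle from above):
-- dbcc freeproccache (0x0600...);
```

The script below wraps this up so you only have to identify the query text.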
The Script
First find out how bad your query is. I’m arbitrarily defining more than 10 active queries as bad. Change the query text here. Use enough text to uniquely identify the problem query. In my case I used “TOP (@ResultLimit)”.
-- Part 1:
-- Use this query to look for bad performing queries using "TOP (@ResultLimit)"
select
count(1) as [count of running queries, should be less than 10],
max(datediff(second, start_time, getdate())) as [longest running such query in seconds],
min(datediff(second, start_time, getdate())) as [shortest running such query in seconds]
from sys.dm_exec_requests er
cross apply sys.dm_exec_sql_text(er.sql_handle) st
where st.text like '%TOP (@ResultLimit)%'
and st.text not like '%80B82FDD-0297-4057-9AEB-D811F3B5C7DB%' -- filter self
GO
Then run this part to knock out that one query (remembering again to adjust the query text).
-- Part 2:
-- If the above indicates poor performance (i.e. many active MyTable queries),
-- use the following to clear bad plan from cache.
declare @count int = 0;
declare @planhandle varbinary(64)
select
@count = count(1),
@planhandle = max(plan_handle) -- any arbitrary plan
from sys.dm_exec_requests er
cross apply sys.dm_exec_sql_text(er.sql_handle) st
where st.text like '%TOP (@ResultLimit)%'
and st.text not like '%80B82FDD-0297-4057-9AEB-D811F3B5C7DB%' -- filter self
-- this query recompiles the one identified plan.
if (@count > 10)
--working
dbcc freeproccache (@planhandle);
GO
Hopefully You’ve Bought Yourself Time
The query is recompiled with a better plan and you can sit back (for now). Your query is still vulnerable to bad plans and you’ve got to fix that, but you’ve got time.
In practice I’ve used this script maybe three or four times successfully. Your mileage may vary.
To be honest, I can’t think of much that could go wrong other than it doesn’t solve your problem. In the worst case scenario:
Your analysis was wrong: the plan is recompiled and still performs poorly.
Or perhaps the script somehow picks the wrong plan to recompile. That’s not bad news either; it’s just news. I certainly can’t say the same for the other techniques that were available before version 2008.
Maybe expectations were set incorrectly. You propose this change, it doesn’t fix anything, and now it may look like you’re guessing. (This isn’t a technical issue, but it’s something to keep in mind.)
By the way, don’t take my word for it. You need to tell others that the risk is small and you have to back this claim up without my help. So you should understand what’s going on here.
So I’ve been working on an automation project that makes frequent use of foreign key metadata. I find myself writing queries for this data but I discovered that there’s no super-easy out-of-the-box view of foreign keys for me to use. Here are the ones I considered.
INFORMATION_SCHEMA views
INFORMATION_SCHEMA views that give me foreign key information include
INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS for the foreign key info,
INFORMATION_SCHEMA.KEY_COLUMN_USAGE for the referring columns,
INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE for the referenced columns
The first view here gives a list of foreign keys and you have to join to the other two tables in order to find the column names. But it’s a crummy solution. First of all, if the foreign key has multiple columns, there’s no real way to match the referring columns to the referenced columns.
The second problem is that we don’t see foreign keys that point to unique keys (as pointed out by Aaron Bertrand in his post The case against INFORMATION_SCHEMA views).
So that’s out. What else have we got?
Microsoft SQL Server system views
These views include
sys.foreign_keys
sys.foreign_key_columns
with support from sys.columns and sys.tables
These are the views I deserve, but not the views I need right now. The joins are just too annoying to remember and type each time.
Besides, the word “parent” used here changes with context. The parent table in a foreign key relationship owns the foreign key and does the pointing. But say I’m modeling a hierarchy. In the context of the data model, children records point to their parent records. The mental effort needed to keep these straight is not difficult, but it’s annoying.
My Own Views
So I’ve created my own; the goal is to simplify typing and minimize joins. I skip the word “parent” altogether and use “referrer” and “referenced”. Feel free to use and build on these.
Update June 7, 2012: I added columns ReferrerColumnCanBeNull to each view. I found I wanted it, so I added it here.
FOREIGN_KEYS
CREATE VIEW dbo.FOREIGN_KEYS
AS
SELECT SCHEMA_NAME(fk.schema_id) AS FKSchema ,
fk.name AS FK ,
SCHEMA_NAME(p.schema_id) AS ReferrerSchema ,
p.name AS Referrer ,
STUFF(CAST(( SELECT ',' + c.name
FROM sys.foreign_key_columns fkc
JOIN sys.columns c ON fkc.parent_object_id = c.object_id
AND fkc.parent_column_id = c.column_id
WHERE fkc.constraint_object_id = fk.object_id
ORDER BY fkc.constraint_column_id ASC
FOR
XML PATH('') ,
TYPE
) AS NVARCHAR(MAX)), 1, 1, '') AS ReferrerColumns ,
ISNULL(( SELECT TOP 1
1
FROM sys.foreign_key_columns fkc
JOIN sys.columns c ON fkc.parent_object_id = c.object_id
AND fkc.parent_column_id = c.column_id
WHERE fkc.constraint_object_id = fk.object_id
AND c.is_nullable = 1
), 0) AS ReferrerColumnsCanBeNull ,
SCHEMA_NAME(r.schema_id) AS ReferencedSchema ,
r.name AS Referenced ,
STUFF(CAST(( SELECT ',' + c.name
FROM sys.foreign_key_columns fkc
JOIN sys.columns c ON fkc.referenced_object_id = c.object_id
AND fkc.referenced_column_id = c.column_id
WHERE fkc.constraint_object_id = fk.object_id
ORDER BY fkc.constraint_column_id ASC
FOR
XML PATH('') ,
TYPE
) AS NVARCHAR(MAX)), 1, 1, '') AS ReferencedColumns ,
fk.delete_referential_action_desc AS deleteAction ,
fk.update_referential_action_desc AS updateAction ,
fk.object_id AS FKId ,
p.object_id AS ReferrerId ,
r.object_id AS ReferencedId
FROM sys.foreign_keys fk
JOIN sys.tables p ON p.object_id = fk.parent_object_id
JOIN sys.tables r ON r.object_id = fk.referenced_object_id
GO
FOREIGN_KEY_COLUMNS
CREATE VIEW dbo.FOREIGN_KEY_COLUMNS
AS
SELECT SCHEMA_NAME(fk.schema_id) AS FKSchema ,
fk.name AS FK ,
SCHEMA_NAME(p.schema_id) AS ReferrerSchema ,
p.name AS Referrer ,
pc.name AS ReferrerColumn ,
pc.is_nullable AS ReferrerColumnCanBeNull ,
SCHEMA_NAME(r.schema_id) AS ReferencedSchema ,
r.name AS Referenced,
rc.name AS ReferencedColumn ,
fk.object_id AS FKId ,
fkc.constraint_column_id AS FKColumnId ,
p.object_id AS ReferrerId ,
r.object_id AS ReferencedId
FROM sys.foreign_keys fk
JOIN sys.foreign_key_columns fkc ON fkc.constraint_object_id = fk.object_id
JOIN sys.tables p ON p.object_id = fk.parent_object_id
JOIN sys.columns pc ON fkc.parent_object_id = pc.object_id
AND fkc.parent_column_id = pc.column_id
JOIN sys.tables r ON r.object_id = fk.referenced_object_id
JOIN sys.columns rc ON fkc.referenced_object_id = rc.object_id
AND fkc.referenced_column_id = rc.column_id
GO
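Once both views are created, the queries against them stay short. A couple of examples (the table and constraint names here are hypothetical):

```sql
-- Every foreign key that points at dbo.Customers:
SELECT FK, ReferrerSchema, Referrer, ReferrerColumns
FROM dbo.FOREIGN_KEYS
WHERE ReferencedSchema = 'dbo'
  AND Referenced = 'Customers';

-- Column-by-column detail for one foreign key:
SELECT ReferrerColumn, ReferencedColumn
FROM dbo.FOREIGN_KEY_COLUMNS
WHERE FK = 'FK_Orders_Customers'
ORDER BY FKColumnId;
```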