Michael J. Swart

October 26, 2011

Where Are Your Popular Joins?

So you know a lot about your databases right? You’re familiar with their schemas and tables and the queries that run on them. Personally I use sys.dm_exec_query_stats to understand what the most popular queries are.

But I recently started wondering about popular table joins.

I was wondering: “What tables in my database are most commonly joined together?” I already have a pretty good idea based on the data model. But I wanted to find out if the popular queries are in sync with my understanding. Unfortunately there’s no system view called sys.dm_exec_join_stats. The whole reason that I was curious is that I wanted to find a set of common table joins whose queries might be improved with a indexed view.

So I wrote something that gives me a bit of an idea. It’s a query that looks at cached query plans and counts nested loop joins (multiplied by execution count).

USE tempdb;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
BEGIN TRANSACTION
 
;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
select 
	cp.usecounts as numberOfJoins,
	seeknodes.query('.') as plansnippet
into #my_joins
from sys.dm_exec_cached_plans cp
cross apply sys.dm_exec_query_plan(cp.plan_handle)
	as qp
cross apply query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple/QueryPlan//SeekKeys/Prefix[@ScanType="EQ"]') 
	as seeks(seeknodes)
where seeknodes.exist('./RangeColumns/ColumnReference[1]/@Database') = 1
	and seeknodes.exist('./RangeExpressions/ScalarOperator/Identifier/ColumnReference[1]/@Database') = 1;
 
WITH XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/2004/07/showplan' as p1)
select sum(numberOfJoins) as [Number Of Joins],
	myValues.lookupTable + '(' + myValues.lookupColumn + ')' as lookupColumn,
	myValues.expressionTable + '(' + myValues.expressionColumn + ')' as expressionColumn
from #my_joins
cross apply plansnippet.nodes('./p1:Prefix/p1:RangeColumns/p1:ColumnReference[1]')
	as rangeColumns(rangeColumnNodes)
cross apply plansnippet.nodes('./p1:Prefix/p1:RangeExpressions/p1:ScalarOperator/p1:Identifier/p1:ColumnReference[1]')
	as rangeExpressions(rangeExpressionNodes)
cross apply (
	select
		rangeColumnNodes.value('@Database', 'sysname') as lookupDatabase, 
		rangeColumnNodes.value('@Schema', 'sysname') as lookupSchema,
		rangeColumnNodes.value('@Table', 'sysname') as lookupTable,
		rangeColumnNodes.value('@Column', 'sysname') as lookupColumn,
		rangeExpressionNodes.value('@Database', 'sysname') as expressionDatabase, 
		rangeExpressionNodes.value('@Schema', 'sysname') as expressionSchema,
		rangeExpressionNodes.value('@Table', 'sysname') as expressionTable,
		rangeExpressionNodes.value('@Column', 'sysname') as expressionColumn	
	) as myValues
where myValues.expressionTable != myValues.lookupTable
group by myValues.lookupTable, myValues.lookupColumn, myValues.expressionTable, myValues.expressionColumn
order by SUM(numberOfJoins) desc;
 
rollback;

Some caveats:

  • Parsing xml takes a lot of time and a lot of CPU (the subtree cost is huge and execution time is measured in seconds or minutes)
  • It’s only useful on a system that is used a lot (as opposed to a dev database).
  • It only reports statistics about queries that are found in cached plans. So the stats are only relevant since the last server restart
  • It only counts loop joins (not hash or merge joins)
  • If you want, you can adjust the query to include schema and database names in the results

I hope you find it useful. This query gives hints for further investigation into potential indexed views. It worked well for me and so I thought it was worth sharing. I ran this query against a server I know well and I was surprised at some of the results. Good luck.

4 Comments »

  1. This made me think of the first cafeteria scene in Mean Girls, for some reason. Cool query!

    Comment by Claire — October 27, 2011 @ 10:39 am

  2. Hahaha. That’s right. You can’t join the plastics table, they’re too exclusive. And you can’t join the mathletes table because that’s social suicide.
    You know what they say: If you can’t beat ’em, join ’em. And if you can’t join ’em, have your lunch in a bathroom stall.

    Comment by Michael J. Swart — October 27, 2011 @ 11:10 am

  3. I’ve never seen Mean Girls. But I like the way you think. (And I think I need to rent that movie.)

    Comment by Kendra Little — October 27, 2011 @ 2:46 pm

  4. […] Where Are Your Popular Joins? – Alluring us to the Dark Side this week, it’s Michael J. Swart (Blog|Twitter). […]

    Pingback by Something for the Weekend – SQL Server Links 28/10/11 — October 28, 2011 @ 6:05 am

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress