So you know a lot about your databases right? You’re familiar with their schemas and tables and the queries that run on them. Personally I use sys.dm_exec_query_stats to understand what the most popular queries are.
But I recently started wondering about popular table joins.
I was wondering: “What tables in my database are most commonly joined together?” I already have a pretty good idea based on the data model. But I wanted to find out if the popular queries are in sync with my understanding. Unfortunately there’s no system view called sys.dm_exec_join_stats. The whole reason that I was curious is that I wanted to find a set of common table joins whose queries might be improved with a indexed view.
So I wrote something that gives me a bit of an idea. It’s a query that looks at cached query plans and counts nested loop joins (multiplied by execution count).
USE tempdb; SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; BEGIN TRANSACTION ;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan') select cp.usecounts as numberOfJoins, seeknodes.query('.') as plansnippet into #my_joins from sys.dm_exec_cached_plans cp cross apply sys.dm_exec_query_plan(cp.plan_handle) as qp cross apply query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple/QueryPlan//SeekKeys/Prefix[@ScanType="EQ"]') as seeks(seeknodes) where seeknodes.exist('./RangeColumns/ColumnReference[1]/@Database') = 1 and seeknodes.exist('./RangeExpressions/ScalarOperator/Identifier/ColumnReference[1]/@Database') = 1; WITH XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/2004/07/showplan' as p1) select sum(numberOfJoins) as [Number Of Joins], myValues.lookupTable + '(' + myValues.lookupColumn + ')' as lookupColumn, myValues.expressionTable + '(' + myValues.expressionColumn + ')' as expressionColumn from #my_joins cross apply plansnippet.nodes('./p1:Prefix/p1:RangeColumns/p1:ColumnReference[1]') as rangeColumns(rangeColumnNodes) cross apply plansnippet.nodes('./p1:Prefix/p1:RangeExpressions/p1:ScalarOperator/p1:Identifier/p1:ColumnReference[1]') as rangeExpressions(rangeExpressionNodes) cross apply ( select rangeColumnNodes.value('@Database', 'sysname') as lookupDatabase, rangeColumnNodes.value('@Schema', 'sysname') as lookupSchema, rangeColumnNodes.value('@Table', 'sysname') as lookupTable, rangeColumnNodes.value('@Column', 'sysname') as lookupColumn, rangeExpressionNodes.value('@Database', 'sysname') as expressionDatabase, rangeExpressionNodes.value('@Schema', 'sysname') as expressionSchema, rangeExpressionNodes.value('@Table', 'sysname') as expressionTable, rangeExpressionNodes.value('@Column', 'sysname') as expressionColumn ) as myValues where myValues.expressionTable != myValues.lookupTable group by myValues.lookupTable, myValues.lookupColumn, myValues.expressionTable, myValues.expressionColumn order by SUM(numberOfJoins) desc; rollback; |
Some caveats:
- Parsing xml takes a lot of time and a lot of CPU (the subtree cost is huge and execution time is measured in seconds or minutes)
- It’s only useful on a system that is used a lot (as opposed to a dev database).
- It only reports statistics about queries that are found in cached plans. So the stats are only relevant since the last server restart
- It only counts loop joins (not hash or merge joins)
- If you want, you can adjust the query to include schema and database names in the results
I hope you find it useful. This query gives hints for further investigation into potential indexed views. It worked well for me and so I thought it was worth sharing. I ran this query against a server I know well and I was surprised at some of the results. Good luck.
This made me think of the first cafeteria scene in Mean Girls, for some reason. Cool query!
Comment by Claire — October 27, 2011 @ 10:39 am
Hahaha. That’s right. You can’t join the plastics table, they’re too exclusive. And you can’t join the mathletes table because that’s social suicide.
You know what they say: If you can’t beat ’em, join ’em. And if you can’t join ’em, have your lunch in a bathroom stall.
Comment by Michael J. Swart — October 27, 2011 @ 11:10 am
I’ve never seen Mean Girls. But I like the way you think. (And I think I need to rent that movie.)
Comment by Kendra Little — October 27, 2011 @ 2:46 pm
[…] Where Are Your Popular Joins? – Alluring us to the Dark Side this week, it’s Michael J. Swart (Blog|Twitter). […]
Pingback by Something for the Weekend – SQL Server Links 28/10/11 — October 28, 2011 @ 6:05 am