cxpacket | Michael J. Swart Michael J. Swart

December 8, 2011

CXPACKET? What’s That? And What’s Next?

Filed under: SQL Scripts,SQLServerPedia Syndication,Technical Articles — Tags: cxpacket, dmvs, parallelism — Michael J. Swart @ 8:54 pm

Takeaway: Is the wait type CXPACKET bothering you? It means parallelism. Here’s how to find the queries that are giving you this grief.

The Wait Type CXPACKET

Since 2005, SQL Server provides dynamic management views (DMVs). These views help monitor database health and they’re the first place I go to when troubleshoot problems. For example the view sys.dm_os_wait_stats gives an aggregated way to tell you where the waits are. Often you’ll come across the wait type CXPACKET. Microsoft explains:

“[This wait type] Occurs when trying to synchronize the query processor exchange iterator.”

Hmmm… I know what those words mean individually… Any way, if you keep searching, you’ll see others explain: “CXPACKET means parallelism”. We’re getting a bit closer now. If a query is run with multiple threads, and one thread completes before the others, then SQL Server will report the wait type CXPACKET for that thread.

It’s like several chefs are making soup. If one chef’s only job is to add the cilantro, there’s going to be some waiting involved. So is the extra help worth the trouble?

SQL Server thought it could use more cooks for a particular “recipe”… Maybe it’s time to make simpler recipes. (The answer is almost always yes, especially for OLTP systems). SQL Server doesn’t use parallelism often. It only comes up with a multi-threaded query plan when it needs to, i.e. when the plan is estimated to be costly enough. It’s usually those queries with the high estimated cost that need to be fixed, the CXPACKET wait type is a symptom of that.

Those High Cost Queries

So which queries are those? Again, I like to go to the DMVs. The one I like to look at is sys.dm_exec_query_stats except that it doesn’t have a flag called uses_parallelism. For the definitive way to find parallel queries, you have to scan and parse all the query plans that are cached using sys.dm_exec_query_plan.

But that takes for ever so I don’t recommend it. But recently I found out that we don’t really need it: Parallel queries are interesting, they’re one of the only kind of queries that can use more CPU time than actual elapsed time. So if that’s true, then we just need to examine these queries:

select * from sys.dm_exec_query_stats 
where total_elapsed_time < total_worker_time
	-- total_worker_time is measured to the microsecond, but is accurate to the millisecond so:
	and ( ( total_worker_time – total_elapsed_time ) / execution_count ) * 1.0 > 1000 -- avg difference is at least 1 ms

But it might not be true, for example, a parallel query might be suffer from blocking too long. Long enough that the elapsed time might be longer than the total amount of CPU time. That’s an interesting question: How good an indication of parallelism is the ratio between elapsed time and CPU time?

(total_elapsed_time < total_cpu_time) Vs. uses_parallelism

I’m lucky. I’ve got a large set of query data that can help me.

This chart needs a little explanation. Pick a dot in the chart above. It represents a set of queries whose ratio of cpu to elapsed time are the same. Look at the value on the x-axis to see what that common ratio is. Say that it’s 1.5, this means that that dot represents a set of queries whose elapsed time is exactly 50% greater than its cpu time. Now look at its value on the Y axis. You might see 2%. That’s means that 2% of those queries use parallelism.

I expected a slope, not a waterfall and this is what that means to me. Knowing that a query’s worker time is greater than its elapsed time is a great indication of parallelism. And knowing by how much doesn’t change that.

CXPACKET? Here’s What’s Next

This query lists the top 20 worst parallel queries cached in your database.

-- TOP 20 parallel queries (by CPU)
SELECT TOP (20)
    [Total CPU] = total_worker_time,
	[Total Elapsed Time] = total_elapsed_time,
	[Execution Count] = execution_count,
    [Average CPU in microseconds] = cast(total_worker_time / (execution_count + 0.0) as money),
    [DB Name] = DB_NAME(ST.dbid),
    [Object Name] = OBJECT_NAME(ST.objectid, ST.dbid),
    [Query Text] = (SELECT [processing-instruction(q)] = CASE 
            WHEN [sql_handle] IS NULL THEN ' '
            ELSE (SUBSTRING(ST.TEXT,(QS.statement_start_offset + 2) / 2,
                (CASE 
                        WHEN QS.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX),ST.text)) * 2
                        ELSE QS.statement_end_offset
                        END - QS.statement_start_offset) / 2))
            END
			FOR XML PATH(''), type),
    [Query Plan] = qp.query_plan
FROM sys.dm_exec_query_stats QS
CROSS APPLY sys.dm_exec_sql_text([sql_handle]) ST
CROSS APPLY sys.dm_exec_query_plan ([plan_handle]) QP
WHERE total_elapsed_time < total_worker_time
	AND ( ( total_worker_time - total_elapsed_time ) / execution_count ) * 1.0 > 1000 -- average difference is more than a millisecond
ORDER BY total_worker_time DESC

Some interesting things about this query:

I’ve written it for SQL Server Management Studio. In SSMS, you can click on the query and the query plan to explore these queries further.
I’ve added a filter to weed out queries whose average cpu time is more than a millisecond. I’m sure you won’t miss those stats. Microsoft reports CPU time in microseconds, but it’s only accurate to milliseconds. So you can get false-positives if you don’t ignore the blazing fast queries. That’s why I added that filter.

-- Comments (9)