Michael J. Swart

March 8, 2010

I/O Operations, Arguably the Only Performance Metric You Need

Measuring Performance

Measuring Performance

Mike Walsh is hosting T-SQL Tuesday this month. And the topic is Input and Output (I/O).

The thing about I/O is that it’s very much a hardware topic and so the category is very very suited for database administrators. The topic is maybe a little more suited to them than us database developers.

So it’s natural to ask: What is it (if anything) about I/O that db developers should know?

I/O is THE Metric For Measuring Performance

Well, here’s my thesis:  I/O is the principal metric to watch. Any other metric – such as duration, CPU cycles, network lag etc… – don’t matter nearly as much as I/O.

Here’s why:

  • (Solid State Drives excepted) Disk operations are one of the few things databases do that rely on moving parts. As such they are slooow (relatively speaking). Save the I/O, save the world.
  • In theory, OLTP databases should get by with minimal I/O. If particular queries are performing many reads or writes chances are that the query can be improved.
  • Is it really a memory issue? You’re worried about memory. Low page life expectancy is usually a symptom of too little memory. But if this value nosedives often, it could be in conjunction with a I/O-intensive query.
  • What about measuring duration? Duration should be treated like a symptom, not a cause. Arguably, excessive I/O is a symptom too, but it’s often one step closer to the cause than duration.
  • Ease off on the CPU-heavy stuff. I’m a big fan of letting the database server serve data and of letting the application do the thinking. I work in a very developer-centric environment and it’s great: Everyone is comfortable with letting the app do any heavy CPU work.
  • I’ve found it convenient to deal with only one ruler. I’ve used I/O as my only metric for a few years now and I’ve been extremely happy with the results. (Well, not quite my only metric, but my TOP 20 I/O queries and my TOP 20 CPU queries have about 18 queries in common).
  • I plan to re-evaluate things if/when Solid State Drives have their day in the sun. It’s not too far into the (exciting) future.

As it turns out, as I write this, the first I/O articles are coming in and it seems that Rob Farley seems to have written a very thorough article about I/O as it relates to the cost of a query. He points out that we shouldn’t ignore other metrics in favor of I/O alone. I may be lucky, but I’ve never seen a performance problem in production that did not have the symptom of high I/O.

Bonus Post Script: My Top 20 I/O Query

Others have created these queries in the past. You can find them everywhere. This one’s mine. It’s a tool that’s been on my belt for a number of years.

DECLARE @SystemIO FLOAT
SELECT @SystemIO = SUM(total_logical_reads + total_logical_writes)
FROM sys.dm_exec_query_stats;
 
SELECT TOP 20 [Row Number] = ROW_NUMBER() OVER (ORDER BY total_logical_reads + total_logical_writes DESC),
    [Query Text] = CASE
        WHEN [sql_handle] IS NULL THEN ' '
        ELSE (SUBSTRING(ST.TEXT,(QS.statement_start_offset + 2) / 2,
            (CASE
                WHEN QS.statement_end_offset = -1 THEN LEN(CONVERT(nvarchar(MAX),ST.text)) * 2
                ELSE QS.statement_end_offset
                END - QS.statement_start_offset) / 2))
        END,
    [Execution Count] = execution_count,
    [Total IO] = total_logical_reads + total_logical_writes,
    [Average IO] = (total_logical_reads + total_logical_writes) / (execution_count + 0.0),
    [System Percentage] = 100 * (total_logical_reads + total_logical_writes) / @SystemIO,
    [Object Name] = OBJECT_NAME(ST.objectid),
    [Total System IO] = @SystemIO,
    [SQL Handle] = [sql_handle]
FROM sys.dm_exec_query_stats QS
CROSS APPLY sys.dm_exec_sql_text ([sql_handle]) ST
WHERE total_logical_reads + total_logical_writes > 0
ORDER BY [Total IO] DESC

11 Comments »

  1. Very nice Michael,

    I have bookmarked your site. Love how you present things naturally.

    Kind Regards,
    Pinal

    Comment by Pinal Dave — March 9, 2010 @ 12:34 am

  2. Thanks Pinal, That’s especially flattering coming from you.

    Comment by Michael J. Swart — March 9, 2010 @ 7:39 am

  3. Good post, glad I stumbled upon the #tsql2sday lots of good stuff today.
    Thanks!

    Comment by Stefan — March 9, 2010 @ 8:31 am

  4. Nice post. Your sample query’s HTML deceives us though: it replaced > with > (second-last line).

    Cheers,
    Aaron

    Comment by Aaron Bertrand — March 9, 2010 @ 10:04 am

  5. Thanks, I’ll fix that soon. (replace > with >)

    Comment by Michael J. Swart — March 9, 2010 @ 10:59 am

  6. Michael,

    Nice thought looking from the hardware view of things. It will be interesting to see if this also applies to SSD.

    Comment by Dave Schutz — March 9, 2010 @ 11:59 am

  7. Nice post Michael.

    Comment by Jon Russell — March 9, 2010 @ 4:31 pm

  8. […] Michael Swart goes to the other side of the coin a bit. He talks about his own experiences and that tuning to IO is one of the best ways he has seen to improve query performance. I think if you take his advice, cross apply some of Kalen’s and Rob’s advice you’ll do just fine. I agree with Michael, in the majority of cases I come across, reducing the reads reduces the duration. Just be cautious like Kalen said and don’t go overboard like Rob said. […]

    Pingback by T-SQL Tuesday #4 - IO, IO It's Off To Disk We Go | SQL Server Blog - StraightPath Solutions — March 10, 2010 @ 11:41 pm

  9. Tricky point ( about laying off the CPU heavy stuff )

    I think that striking the balance between retrieving enough information and presenting it to the developer for manipulation, and what the database can retrieve and do the CPU work on… is the tricky part for me. Probably my two biggest defenses for this are…

    1) lists ( ie. paging )

    2) retrieving details for x expected entries ( ie. app wants id IN (1, 102, 201, 450001 ) )

    Both of which require some amount of heavy lifting by the server. ( at least in my implementations )

    Comment by Richard — March 25, 2010 @ 12:08 pm

  10. Good points Richard, and I think in that case striking the balance is the right approach.

    In general everyone should be aware of absolute rules of thumb. I deliberately try to write blog articles in black and white rather than shades of gray in the hopes that someone calls me on a particular point. By defending or (yikes) conceding points, I learn a topic deeper.

    Comment by Michael J. Swart — March 25, 2010 @ 12:20 pm

  11. […] I/O Operations, Arguably the Only Performance Metric You Need – It’s that Database Whisperer chap again, Michael J Swart discusses. […]

    Pingback by Something for the Weekend: SQL Server Links 12/03/10 | John Sansom - SQL Server DBA in the UK — April 17, 2010 @ 4:22 am

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress