Mike Walsh is hosting T-SQL Tuesday this month. And the topic is Input and Output (I/O).
The thing about I/O is that it’s very much a hardware topic and so the category is very very suited for database administrators. The topic is maybe a little more suited to them than us database developers.
So it’s natural to ask: What is it (if anything) about I/O that db developers should know?
I/O is THE Metric For Measuring Performance
Well, here’s my thesis: I/O is the principal metric to watch. Any other metric – such as duration, CPU cycles, network lag etc… – don’t matter nearly as much as I/O.
- (Solid State Drives excepted) Disk operations are one of the few things databases do that rely on moving parts. As such they are slooow (relatively speaking). Save the I/O, save the world.
- In theory, OLTP databases should get by with minimal I/O. If particular queries are performing many reads or writes chances are that the query can be improved.
- Is it really a memory issue? You’re worried about memory. Low page life expectancy is usually a symptom of too little memory. But if this value nosedives often, it could be in conjunction with a I/O-intensive query.
- What about measuring duration? Duration should be treated like a symptom, not a cause. Arguably, excessive I/O is a symptom too, but it’s often one step closer to the cause than duration.
- Ease off on the CPU-heavy stuff. I’m a big fan of letting the database server serve data and of letting the application do the thinking. I work in a very developer-centric environment and it’s great: Everyone is comfortable with letting the app do any heavy CPU work.
- I’ve found it convenient to deal with only one ruler. I’ve used I/O as my only metric for a few years now and I’ve been extremely happy with the results. (Well, not quite my only metric, but my TOP 20 I/O queries and my TOP 20 CPU queries have about 18 queries in common).
- I plan to re-evaluate things if/when Solid State Drives have their day in the sun. It’s not too far into the (exciting) future.
As it turns out, as I write this, the first I/O articles are coming in and it seems that Rob Farley seems to have written a very thorough article about I/O as it relates to the cost of a query. He points out that we shouldn’t ignore other metrics in favor of I/O alone. I may be lucky, but I’ve never seen a performance problem in production that did not have the symptom of high I/O.
Bonus Post Script: My Top 20 I/O Query
Others have created these queries in the past. You can find them everywhere. This one’s mine. It’s a tool that’s been on my belt for a number of years.
DECLARE @SystemIO FLOAT SELECT @SystemIO = SUM(total_logical_reads + total_logical_writes) FROM sys.dm_exec_query_stats; SELECT TOP 20 [Row Number] = ROW_NUMBER() OVER (ORDER BY total_logical_reads + total_logical_writes DESC), [Query Text] = CASE WHEN [sql_handle] IS NULL THEN ' ' ELSE (SUBSTRING(ST.TEXT,(QS.statement_start_offset + 2) / 2, (CASE WHEN QS.statement_end_offset = -1 THEN LEN(CONVERT(nvarchar(MAX),ST.text)) * 2 ELSE QS.statement_end_offset END - QS.statement_start_offset) / 2)) END, [Execution Count] = execution_count, [Total IO] = total_logical_reads + total_logical_writes, [Average IO] = (total_logical_reads + total_logical_writes) / (execution_count + 0.0), [System Percentage] = 100 * (total_logical_reads + total_logical_writes) / @SystemIO, [Object Name] = OBJECT_NAME(ST.objectid), [Total System IO] = @SystemIO, [SQL Handle] = [sql_handle] FROM sys.dm_exec_query_stats QS CROSS APPLY sys.dm_exec_sql_text ([sql_handle]) ST WHERE total_logical_reads + total_logical_writes > 0 ORDER BY [Total IO] DESC