Michael J. Swart

March 9, 2010

Off Topic: Magic 8-Ball vs. 20Q

Filed under: Tongue In Cheek — Michael J. Swart @ 1:00 am
Let them sort it out.

Let them sort it out.

Those looking for SQL Server tips and tricks, please move along. Thanks.

Mattel sells a toy called the magic 8-ball that can answer any yes or no question you ask of it. And 20Q.net also sells a toy that does an amazing job guessing what you’re thinking about provided only yes or no answers.

These two need to get together.

I present Round 1:

20Q: Would you use it daily?
M8B: My reply is no
20Q: Can you find it in a house?
M8B: Ask again later
20Q: (later) Can you find it in a house?
M8B: Yes
20Q: Does it have feet?
M8B: Cannot predict now
20Q: Is it colorful?
M8B: Most likely
20Q: Does it grow over time?
M8B: My reply is no
20Q: Does it have writing on it?
M8B: Ask again later
20Q: (later) Does it have writing on it?
M8B: My sources say no
20Q: Is it a member of the cat family?
M8B: Ask again later
20Q: (later) Is it a member of the cat family?
M8B: Don’t count on it
20Q: Can it be played?
M8B: Outlook good
20Q: Is it transparent?
M8B: Most likely
20Q: Do you carry it in your pocket?
M8B: My reply is no
20Q: Does it refract light?
M8B: Yes
20Q: Is some part of it made of glass?
M8B: It is decidedly so
20Q: Does it contain a liquid?
M8B: You may rely on it
20Q: Is it larger than a pound of butter?
M8B: Outlook not so good
20Q: Do you use it in public?
M8B: Without a doubt
20Q: Do you use it at night?
M8B: It is certain
20Q: Is it round?
M8B: As I see it, yes
20Q: Does it have a hard outer shell?
M8B: My sources say no
20Q: I am guessing that it is a martini glass?
M8B: You may rely on it

Martini glass

"You may rely on it"

So, A martini glass. That’s pretty classy.

(I also daydream about what it would be like if the Oracle at Delphi ever met the Sphinx at Thebes.)

March 8, 2010

I/O Operations, Arguably the Only Performance Metric You Need

Filed under: SQLServerPedia Syndication, Technical Articles — Tags: , , — Michael J. Swart @ 11:43 pm
Measuring Performance

Measuring Performance

Mike Walsh is hosting T-SQL Tuesday this month. And the topic is Input and Output (I/O).

The thing about I/O is that it’s very much a hardware topic and so the category is very very suited for database administrators. The topic is maybe a little more suited to them than us database developers.

So it’s natural to ask: What is it (if anything) about I/O that db developers should know?

I/O is THE Metric For Measuring Performance

Well, here’s my thesis:  I/O is the principal metric to watch. Any other metric – such as duration, CPU cycles, network lag etc… – don’t matter nearly as much as I/O.

Here’s why:

  • (Solid State Drives excepted) Disk operations are one of the few things databases do that rely on moving parts. As such they are slooow (relatively speaking). Save the I/O, save the world.
  • In theory, OLTP databases should get by with minimal I/O. If particular queries are performing many reads or writes chances are that the query can be improved.
  • Is it really a memory issue? You’re worried about memory. Low page life expectancy is usually a symptom of too little memory. But if this value nosedives often, it could be in conjunction with a I/O-intensive query.
  • What about measuring duration? Duration should be treated like a symptom, not a cause. Arguably, excessive I/O is a symptom too, but it’s often one step closer to the cause than duration.
  • Ease off on the CPU-heavy stuff. I’m a big fan of letting the database server serve data and of letting the application do the thinking. I work in a very developer-centric environment and it’s great: Everyone is comfortable with letting the app do any heavy CPU work.
  • I’ve found it convenient to deal with only one ruler. I’ve used I/O as my only metric for a few years now and I’ve been extremely happy with the results. (Well, not quite my only metric, but my TOP 20 I/O queries and my TOP 20 CPU queries have about 18 queries in common).
  • I plan to re-evaluate things if/when Solid State Drives have their day in the sun. It’s not too far into the (exciting) future.

As it turns out, as I write this, the first I/O articles are coming in and it seems that Rob Farley seems to have written a very thorough article about I/O as it relates to the cost of a query. He points out that we shouldn’t ignore other metrics in favor of I/O alone. I may be lucky, but I’ve never seen a performance problem in production that did not have the symptom of high I/O.

Bonus Post Script: My Top 20 I/O Query

Others have created these queries in the past. You can find them everywhere. This one’s mine. It’s a tool that’s been on my belt for a number of years.

DECLARE @SystemIO FLOAT
SELECT @SystemIO = SUM(total_logical_reads + total_logical_writes)
FROM sys.dm_exec_query_stats;
 
SELECT TOP 20 [ROW Number] = ROW_NUMBER() OVER (ORDER BY total_logical_reads + total_logical_writes DESC),
    [Query TEXT] = CASE
        WHEN [sql_handle] IS NULL THEN ' '
        ELSE (SUBSTRING(ST.TEXT,(QS.statement_start_offset + 2) / 2,
            (CASE
                WHEN QS.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX),ST.TEXT)) * 2
                ELSE QS.statement_end_offset
                END - QS.statement_start_offset) / 2))
        END,
    [Execution COUNT] = execution_count,
    [Total IO] = total_logical_reads + total_logical_writes,
    [Average IO] = (total_logical_reads + total_logical_writes) / (execution_count + 0.0),
    [System Percentage] = 100 * (total_logical_reads + total_logical_writes) / @SystemIO,
    [OBJECT Name] = OBJECT_NAME(ST.objectid),
    [Total System IO] = @SystemIO,
    [SQL Handle] = [sql_handle]
FROM sys.dm_exec_query_stats QS
CROSS APPLY sys.dm_exec_sql_text ([sql_handle]) ST
WHERE total_logical_reads + total_logical_writes > 0
ORDER BY [Total IO] DESC

March 3, 2010

Start learning about SQL Server

Filed under: SQLServerPedia Syndication — Michael J. Swart @ 9:58 am

Takeaway: I provide a list of free or near-free resources that novices can look to for an introduction to SQL Server.

I’ve often written articles about topics that are new or interesting to me. What I mean is that I’ve often avoided writing about the basics (e.g. what tables, row or columns are. What a SELECT statement is). A lot of this attitude is from feedback I got once. Someone told me once that most of my articles fell into one of two categories: The category knew that already or the category don’t care. So I’ve always tried to write articles that fall in the category relatively novel or the category relatively relevant*.

But there’s a huge need for 101 level information. I’m thinking about new developers. And I’m thinking about people just getting started in the I.T. industry. This post is for you guys.

But why re-invent the wheel? I’m just going to stand on the shoulders of giants and point you the best sites (in my opinion) for getting started. The best part is that at this level, this information will often be relevant for years. For example, compare the usefulness of a Notification Services article versus that of a SELECT statement article.

T-SQL 101 articles by SQL Server Magazine

Biggest advantage: Tailored to the greenest SQL Developer.

This series is a set of 10 articles written by Bill McEvoy in 2008.

By far the most comprehensive kick start I’ve come across. The only downside is that it’s not free. Personally, I use Google for my table of contents on this series. But the last article http://www.sqlmag.com/article/tsql3/t-sql-101-stored-procedures.aspx has a nice set of links to the previous lessons as well.

Pinal Dave’s site SQL Authority

Biggest Advantage: Breadth of topics.

Pinal Dave is a SQL Server MVP who is one of the most prolific bloggers I know. He never shies away from any topic (complicated or uncomplicated).

He’s organized his web site nicely, but I like to browse his intro to topics found here.

SQL University

Biggest Advantage: Comprehensive and good start for SQL admins.

Remember those giants I told you about? The ones whose shoulders I’m standing on? Jorge Segarra (aka SQL Chicken) is one of those guys. He came up with the idea of SQL Server University. A great home base for getting started as a DBA. There is little overlap between this info and the T-SQL 101 articles because of the target audience. T-SQL 101 is aimed at new SQL developers, while SQL University is aimed at new DBAs.

What do you think?

Am I missing any good beginner resources? Leave a comment below. I’ll consider updating this article to include it.

* relatively relevant I love the sound of that phrase. Must try to use it more.

February 17, 2010

More images from the Spatial Results tab

Filed under: SQLServerPedia Syndication, Tongue In Cheek — Michael J. Swart @ 1:00 am

Inspired by Itzik Ben Gan’s Geekiest Sig Challenge and Michael Cole’s virtual Christmas Card. I came up with a query that forces SQL Server’s spatial results tab to display something pretty. In recognition of Valentine’s Day, I chose a detail of Botticelli’s Birth of Venus.

The Query

The query isn’t too large, but it’s 120 KB, you can download it here:

http://michaeljswart.com/wp-content/uploads/2010/02/venus.sql

The Results

Not quite as pretty as the original

Not quite as pretty as the original

The Process

  • First reduce the color depth to a given color pallet (SQL Server’s icky pastels). I personally use a custom technique for this.
  • Use a program like Adobe Illustrator to turn into an svg file.
  • From there use your C# and VI savvy to convert into a SQL query.

Cheers guys

February 5, 2010

Removing Columns

Takeaway: Deleting columns does not reduce the width of a record (including new ones). Tables have to be rebuilt in order to reclaim space.

So I had the opportunity to talk to Michelle Ufford (SqlFool.com) last November after her PASS Summit Talk. We wondered what happens to the space taken by columns after they’ve been deleted. For example, you may be removing columns or altering columns in order to reduce a records’ width. The improved bytes/row (and rows/page) can provide a huge benefit to performance.

Michelle thought it was a great idea for a blog topic and so did I. But Michelle graciously agreed to let me be the one to write it up (I understand she’s busy this month). So here’s the main question:  What does happen to the space taken by columns that have been deleted? The naive idea that SQL Server magically cleans it up is wrong, for example:

First I create a table with really inefficient columns:

CREATE TABLE NAME (
	id INT IDENTITY PRIMARY KEY,
	FIRST CHAR(1000) NOT NULL,
	middle CHAR(1000) NOT NULL,
	LAST CHAR(1000) NOT NULL
);
GO
INSERT NAME (FIRST, middle, LAST) VALUES ('Michael', 'J', 'Swart');
INSERT NAME (FIRST, middle, LAST) VALUES ('Lester', 'B', 'Pearson');
INSERT NAME (FIRST, middle, LAST) VALUES ('Mack', 'D', 'Knife');
INSERT NAME (FIRST, middle, LAST) VALUES ('Homer', 'J', 'Simpson');

These four rows I put in take up two data pages like this:

pages graph

Remember, each page is 8 Kb long. So lets remove the middle column and right-size the others like this:

ALTER TABLE NAME DROP COLUMN Middle;
ALTER TABLE NAME ALTER COLUMN FIRST VARCHAR(20) NOT NULL;
ALTER TABLE NAME ALTER COLUMN LAST VARCHAR(20) NOT NULL;
 
-- And add a couple rows for good measure:
INSERT NAME (FIRST, LAST) VALUES ('Bartholomew', 'Simpson');
INSERT NAME (FIRST, LAST) VALUES ('Lisa', 'Simpson');

So things should look better right? Not quite! The old data pages haven’t changed at all and the new data looks just as bad:

pages graph

What’s Happening

The action of dropping columns is not an operation that affects existing data. It’s a meta-data operation only. You can actually see this at work using the slightly undocumented view sys.system_internals_partition_columns. The is_dropped field indicates that the columns haven’t disappeared, they’ve just been marked as dropped:

SELECT t.name AS tablename,
c.name AS columnname,
	ipc.*
FROM sys.system_internals_partition_columns ipc
join sys.partitions p
	ON ipc.partition_id = p.partition_id
join sys.tables t
	ON t.OBJECT_ID = p.OBJECT_ID
LEFT join sys.columns c
	ON c.OBJECT_ID = t.OBJECT_ID
	and c.column_id = ipc.partition_column_id
WHERE t.name = 'NAME'
ORDER BY c.column_id

So just how hard is it to get rid of columns?

Samson Removing Columns

Hair Makes Removing Columns Possible

How to Really Get Rid of Columns

It turns out that an index REBUILD is required (A REORGANIZE isn’t good enough). If you have have a maintenance plan it might eventually take care of this. Or you can do this explicitly:

ALTER INDEX ALL ON NAME REBUILD;

Things look much nicer now:

pages graph

The information in the diagram above is hard to see, but it works out to about 200 rows per page as we had hoped. The column is truly deleted now which you can see using this query (once again):

SELECT t.name AS tablename,
	c.name AS columnname,
	ipc.*
FROM sys.system_internals_partition_columns ipc
join sys.partitions p
	ON ipc.partition_id = p.partition_id
join sys.tables t
	ON t.OBJECT_ID = p.OBJECT_ID
LEFT join sys.columns c
	ON c.OBJECT_ID = t.OBJECT_ID
	and c.column_id = ipc.partition_column_id
WHERE t.name = 'NAME'
ORDER BY c.column_id

Other Blogs have treated this topic before e.g.:

  • Andras at Simple Talk last year.
  • Kalen Delaney whose name is almost synonymous with SQL Internals looked at this in 2006.

But none of those have pictures of Samson, do they?

February 4, 2010

Forbidden String

Filed under: SQLServerPedia Syndication, Technical Articles — Michael J. Swart @ 10:13 am

ATTENTION SQL BLOGGERS,

I’ve been blogging using the wordpress platform for about three months now and every now and then I’ve come across a certain quirkiness that I’ve finally gotten to the bottom of. I want to warn you about it and maybe save some head-scratching.

Articles with a particular string cannot be saved. When I try to use this particular string, the web server times out. At least in my environment and after checking with Brent Ozar, it’s not isolated to just me.

Here’s the string, I’m pasting it as an image for somewhat obvious reasons:

Some programmer doesn't want you using this string

Some programmer doesn't want you using this string

What’s really odd is that

  • VARCHAR ( works (note the space).
  • varchar( works.
  • CHAR( works.
  • and ARCHAR( works.

If this is a security “feature” it seems like a crummy one. Crummy because it hinders valid work (by us honest sql bloggers) and crummy times two because It’s not particularly effective at guarding against some perceived risk. Especially when the work-around of VARCHAR space bracket works just fine.

It seems like a known issue though: http://core.trac.wordpress.org/ticket/11486

Your mileage may vary. But try it out. See if you can save a draft of a blog post using the forbidden string. Leave a comment back here if you had trouble like I do.

February 2, 2010

Looking at Page Contents

Filed under: SQLServerPedia Syndication, Technical Articles — Tags: , — Michael J. Swart @ 10:06 am

If you want to dig deep into SQL Server internals, you’re eventually going to want to look at the physical page structure of a data page. As such I’ve bookmarked Anatomy of a Page by Paul Randal.

The syntax for all this stuff is hard for me to remember, so I’ve made a template for myself. And I’m sharing that here:

-- looking at page contents.
 
-- T3604 to output to console
DBCC traceon (3604)
 
-- select rows from the table of interest
-- (as well as their physical location)
SELECT TOP (10) t.*,
	pl.FILE_ID, pl.page_id, pl.slot_id
FROM <tablename,sysname,Production.Product> AS t
cross apply sys.fn_PhysLocCracker(t.%%physloc%%) AS pl;
 
DECLARE @dbname sysname;
SET @dbname = DB_NAME();
DBCC page(@dbname, --db_name
	1, --file_id
	136, --page_id
	1) --results style

Ctrl+Shift+M will let you provide the table name. I’ve also made this script into a code-snippet called page.

Note that the script makes use of sys.fn_PhysLocCracker which is SQL 2008 and later.

By the way, I highly recommend playing with this stuff. Not having to guess at the width of records is really helpful. I plan to write a post soon which shows how it helped an investigation I did.

January 27, 2010

Blogger tip: DIY In-Line Forms

Filed under: SQLServerPedia Syndication — Michael J. Swart @ 12:56 pm


Rick Klau of Google recently mentioned a relatively unused feature of Google Docs called Docs Forms. I wanted to mention it in case you haven’t come across it yet. Docs Forms is a feature of Google Docs that allows bloggers to embed forms into their blog articles or web pages. And that includes all blogging platforms (not just google’s blogspot). Basically, these forms can be embedded into any html that supports IFRAMES.

This easy-to-use form-creator opens a huge potential to bloggers that might not have used forms in the past (e.g. me). The list of applications includes surveys, polls, contest submissions, contact pages and so on.

For me, Blogs + Forms allows a higher level of interaction with readers than I had compared to this time last week. And that is an exciting prospect. If you care, you can see how I’m using a form right now on my “Mail Me” page.

Caveat:
There is one thing I worry about though. Blog polls are notorious for producing untrustworthy data. It’s the nature of a self-selected sample population.

For example, I bet Paul Randal’s survey takers, are a bit savvier than average so his blog survey results have to be taken with a grain of salt. In other words, I bet the who’s-Paul-Randal? crowd have a lot of people in common with the No-disaster-recover-plan crowd and they might be under-represented in such surveys.

Example
And what would this blog post be without an example. So without further ado, give the feature a spin:

January 18, 2010

Tic Tac Toe vs. T-SQL

Filed under: SQLServerPedia Syndication, Tongue In Cheek — Michael J. Swart @ 11:50 pm
It's learning

It's learning

Stephen Horne (aka BlueDog67) threw down the challenge last week to Create An Unbeatable Tic-Tac-Toe Game. He did that in this post.

So that’s what I did. Here’s my crack at it (which I also left as a comment):

The curious can download http://michaeljswart.com/wp-content/uploads/2010/01/tictactoe.zip

The zipped file contains two files:

  • One is a bcp output file that contains data. (It is used by the script)
  • The other is a sql script that creates the following tables and procedures:
    • TABLE tttBoard (spaceId, value)
    • TABLE tttMoves(id, turn, bestNextMove)
    • PROCEDURE s_CleanBoard()
    • PROCEDURE s_PrintBoard()
    • PROCEDURE s_MakeMove(@spaceId, @letter)
    • PROCEDURE s_ComputerMove

I decided to use a look up table (tttMoves) to find the best move. I thought it fitting to develop a solution that uses a set of data rather than an algorithm because of it is T-SQL after all. Here are three things I learned.

  • The size of tttMoves is extremely small. Roughly 4500. Think about it. If you take nine ordered spaces on a tic-tac-toe board and put every combination of X, O, or space, there’s still a manageable number of boards to deal with. I removed completed boards and boards showing games that were invalid. 4500 were left. Of these game boards, only about 400 were more complicated than a simple win or a block for their best move.
  • The whole process was a lot quicker than it took the eighties super-computer to learn tic tac toe in WarGames. (Who remembers that? To this day, joshua is a terrible password)
  • A T-SQL tip: This should have been obvious but remember that order of columns in a table matters when using BULK INSERT with a data file using native formatting.

January 17, 2010

An incentive, and a hint

Filed under: SQLServerPedia Syndication, Tongue In Cheek — Michael J. Swart @ 8:21 pm

So last week, I challenged the world to solve a puzzle. This was part of Adam Machanic’s T-SQL Tuesday. And while the entire world does not read my blog, Google Analytics tell me that there are roughly a hundred of you that do.

But even among those of you who read my blog, there were no bites. Not one. So I decided to not give the solution this week. Instead I offer a prize and a hint. First the hint for everyone (it’s a two parter):

  • First, look in the source for a clue
  • secondly, the puzzle is not about cryptography, but can be solved with one or two lines of T-SQL

And now for the prize: Coffee on me. This prize is for one person. I’ve got a $10 Tim Horton’s gift card ready to be sent to the first person to tell me the answer to the puzzle. (A Starbucks gift card if you’re international). So get to it!
Update(1/18/2010)… We have a winner. Check out the solution here.


Older Posts »

Powered by WordPress