Michael J. Swart

October 11, 2012

You Can “Do Science” Anywhere

Filed under: SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 12:00 pm

Consider the Following

Did you know you can do science anywhere. Science is about learning how Nature behaves. Well what if we replace Mother Nature with SQL Server? SQL Server’s not as pretty, but we can still learn things. It’s 2012! When we wonder about things – even databases – we don’t have to guess.

Consider the following...

The Question

To keep things simple, I’m going to follow the scientific method here and this is step one. The question. All good science starts with a question. The question itself doesn’t matter. I just want to talk about the process.

So I take the question from this DBA Stackexchange question: Why are NULLS sorted first?

Good question! The SQL Standard does not actually prescribe any behavior here (says Wikipedia). So the actual behavior is left up to the RDBMS vendor. The likely answer to the question is that NULLS are sorted that way because Microsoft had to pick something and they probably just made a choice based on what other RDBMS’s were doing or other implementation considerations. Or who knows, maybe they just flipped a coin.

But that leads me to this question: Are they always sorted first? If I reverse the order of a sort by using ORDER BY column DESC do Null values still show up first? If I look at the manual, Microsoft tells me “Null values are treated as the lowest possible values” so that ORDER BY column DESC will place Null values at the end.

But the way the stack overflow question is worded makes me wonder a little and I’d like to find out for sure. So to repeat:

“If I sort the results of a query using ORDER BY column DESC do Null values come last?”

Hypothesis / Prediction

I’m placing my bets with Microsoft’s documentation and so I answer YES.

I predict that any query written with “ORDER BY column DESC” will place the Null values at the end. If I’m wrong, then it would seem like null values always first is the behavior here (Note: NULLS FIRST is actually mentioned in the SQL Standards as an extension to the ORDER BY clause, but it’s not implemented by SQL Server. This also lets me believe I’m not crazy for wondering.)

Test

That’s simple enough:

select *
from (values (1),(2),(3),(null)) as v(value)
order by value desc

The prediction says that the fourth value in the results will be NULL. And tada! The prediction is correct.

Analysis

There’s nothing surprising here, but maybe it raises some new questions. Like:

  • What do other vendors do? (Turns out they do the same. This lends weight to Microsoft picking this behavior because it was common practice)
  • What’s up with this NULLS FIRST keyword?

There’s a publish part here. The default assumption (that Microsoft documentation is often correct) is still valid so there’s nothing surprising to publish. Probably not worth a blog post (despite what you’re reading).

Boiled Down

Normally the thought processes aren’t so structured, but as long as the question comes first and the guess is checked against experiment, it counts.

It seems like common sense. And maybe I could have picked a better example. Science helps satisfy our curiosity. But it also can save us from kidding ourselves. Without this kind of thinking we risk turning into superstitious pigeons. I once saw someone perform a task (release and renew his ip address) three times out of habit “just so it takes”. That way lies madness.

6 Comments »

  1. Hey you’ve still got some unfounded speculation in there! You’ve got “Microsoft behaviour because of common practice” quite often the reality is that it is “common practice because of Microsoft behaviour”.

    Comment by Dave Swart — October 11, 2012 @ 8:22 pm

  2. Just read up on a bit of history here. (Maybe I’ll do a history post next!)

    ORDER BY is a basic part of the SELECT statement, and so we can guess that the NULLS first/last/highest/lowest decision was made when the database itself was first developed. If that’s the case, here’s a small timeline:

    70′s IBM develops System R (And initially called it SEQUEL)
    1979: Oracle released to market (first commercially available implementation of SQL)
    Early 80′s: IBM continues developing databases and releases DB2.
    1984: Sybase founded, begins development on an RDBMS.
    1986: ANSI SQL standard published
    1987: Sybase SQL Server released for unix.
    Early 90s: Microsoft with rights to Sybase SQL Server’s code base develops markets and releases Microsoft SQL Server.
    1994: What would become MySQL starts development
    1995: PostgreSQL begins development

    The names DB2 and SQL (still pronounced sequel) both imply that something came before. I’ve always understood that it was in recognition of the non-relational databases that came before, like IMS.

    So ORDER BY behaviour was probably decided on by Sybase Engineers sometime in the mid-eighties.
    So in my analysis section, I don’t say so, but when I mentioned other vendors, I meant DB2 and Oracle which preceded Sybase and Microsoft. But yeah, it’s still speculation.

    Comment by Michael J. Swart — October 12, 2012 @ 9:15 am

  3. Anytime someone spells out “SEQUEL” I keep thinking of a cool movie title called “SEQUEL The Movie”, which follows the shenanigans of a DBA who uses cheesy lines like “I just cant pull(-off) that Trigger!” or “Does the new project server need blue jeans or business casual?” (taken from Brent Ozar’s DBA Slang post).

    Sorry if this is abit off-topic, just couldn’t help myself! :P

    Comment by Links — October 19, 2012 @ 7:14 am

  4. Hey Links, off topic diversions are quite accepted here (and even encouraged!)
    I would definitely watch a movie called “SEQUEL The Movie”. I might not even wait for the DVD.

    Comment by Michael J. Swart — October 19, 2012 @ 8:57 am

  5. [...] You Can “Do Science” Anywhere By Michael J Swart Science is about learning how Nature behaves. Well what if we replace Mother Nature with SQL Server? SQL Server’s not as pretty, but we can still learn things. [...]

    Pingback by Technology Post Round Up–5th Edition « Jonathan Rozenblit — October 27, 2012 @ 6:26 pm

  6. [...] You Can “Do Science” Anywhere By Michael J Swart Science is about learning how Nature behaves. Well what if we replace Mother Nature with SQL Server? SQL Server’s not as pretty, but we can still learn things. [...]

    Pingback by Technology Post Round Up–5th Edition | MSDN Blogs — October 28, 2012 @ 8:59 am

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress