Michael J. Swart

February 2, 2011

Ridiculously Unnormalized Database Schemas – Part Three

Filed under: SQLServerPedia Syndication,Technical Articles — Tags: , , , — Michael J. Swart @ 12:00 pm
Ridiculously Unnormalized Database Schemas

Today, I describe 3rd Normal Form and give a ridiculous example of a table that breaks the rule badly. I’ve seen 3NF defined this way:

“A relation (table) R, in second normal form is in third normal form if every non-prime attribute of R is non-transitively dependent on every candidate key of R.”

That’s Wikipedia’s definition of third normal form. And they do a decent job explaining what that means. They reference E. F. Codd’s paper Further Normalization of the Data Base Relational Model 1972. But I couldn’t find a reference online and I was too busy this week to dig out my alumni card and visit the Davis Centre Library.

The House That Jack Built

Before we get into the ridiculous example, I want to talk a bit about the English nursery rhyme The House That Jack Built.

It’s a cumulative poem that starts with the line This is the house that jack built. It repeats and expands (like the songs I Know an Old Lady Who Swallowed a Fly or The Twelve Days of Christmas)
By the end of the poem we have a beautiful run-on sentence where the subject of the poem has very very little to do with the house that Jack built. Here is the rhyme:

This is the house that Jack built.

This is the malt that lay in the house that Jack built.

This is the rat that ate the malt
that lay in the house that Jack built.

and so on until

This is the farmer sowing his corn
That kept the cock that crowed in the morn
That waked the priest all shaven and shorn
That married the man all tattered and torn
That kissed the maiden all forlorn
That milked the cow with the crumpled horn
That tossed the dog that worried the cat
That killed the rat that ate the malt
That lay in the house that Jack built.

The cow with the crumpled horn tosses the dog.

But the thing that makes this nursery rhyme interesting is the same thing that makes tables violate third normal form. If we think of each database row as a sentence and the subject as the primary key. This brings us to our ridiculous example.

The Ridiculous Example

Houses (hope you like scroll bars)

Builder

Contents

Content Consumer

Content Consumer Killer

Content Consumer Killer Worrier

Content Consumer Killer Worrier Tosser

Content Consumer Killer Worrier Tosser Milker

Content Consumer Killer Worrier Tosser Milker Spouse

Content Consumer Killer Worrier Tosser Milker Spouse Wedder

Content Consumer Killer Worrier Tosser Milker Spouse Wedder Waker

Content Consumer Killer Worrier Tosser Milker Spouse Wedder Waker Owner

Jack

Malt

Rat

Cat

Dog

Cow (crumpled horn)

Maiden (all forlorn)

Man (tattered and torn)

Priest (shaven and shorn)

Cock (crowed in the morn)

Farmer (sowing corn)

This table is nowhere near third normal form; it’s in a different time zone! This table is supposed to be about houses, not livestock and 16th century English people. So when you’re designing your tables remember: No run on sentences!

The Other Normal Forms

So that’s the series. I could continue with 4NF, 5NF and BCNF (and others!) but I won’t for a few reasons.

  • Most people who pay attention to 1NF, 2NF and 3NF almost always design databases in the other normal forms without trying.
  • The other normal forms are dull enough that it becomes a challenge to actually come up with a counter example that can seriously be called ridiculous
  • I’ve never heard of any problems encountered by anyone caused by not paying attention to the other normal forms. If you know of any, let me know!

Liked This Series?

Hey you, yeah you. The web surfer. Thanks for coming by. I’m glad you’re visiting this post. That’s what I wrote it for. If you liked this series, you may also like my series on Transaction Levels. Or maybe have a look at some of my favorite posts (or some of yours).

Now clicking the links, that’s the first step. Subscribing to my RSS feed – That’s a whole other thing. So here’s the the deal, if you put my blog into your RSS reader, I promise to write stuff worth belonging in it. Go on, subscribe.

17 Comments »

  1. I don’t understand why you say your example isn’t in 3NF. It seems like the stated candidate key is a determinant for the other attributes. I guess you are saying that some of the other (non-key) attributes might be determinants as well. An interesting assumption perhaps but it’s not really clear if that’s what you mean or why. There are certainly important problems solved by other NFs. Google for Chasm Trap / Fan Trap.

    Comment by David — February 2, 2011 @ 1:53 pm

  2. The feature of 3NF that I’m highlighting here is what it means for a column to be transitively dependent on the key (vs. non-transitively dependent) Google for “Transitive Dependency”

    Comment by Michael J. Swart — February 2, 2011 @ 2:07 pm

  3. Exactly what transitive dependencies are you refering to? For example: Builder->Contents->ContentConsumer? But is that likely to be true for *every* house? If the table is intended to support data for another house containing malt that gets eaten by a mouse instead of a rat then that dependency would not hold at all. So there would be no violation 3NF. This is why I am interested to see where you think the violation is. What are the allowable values for each of the columns?

    Comment by David — February 2, 2011 @ 2:24 pm

  4. Well, the dependency would still hold if we’re talking about a specific bag of malt (Malt bag #42 say). In that case the dependency does hold since another house would have malt bag #43.

    It’s a bit of a stretch and the example is deliberately crazy. But the intention was that each value depends directly on the value in the column to its left.

    It may be surprising, but this actually did cross my mind when writing the article. That all the articles in the nursery rhyme are definite.

    I’ll tell you one thing David, you sure do keep me honest. Thanks again for stopping by.

    Comment by Michael J. Swart — February 2, 2011 @ 2:34 pm

  5. Hmmm. If each malt bag were uniquely identifiable then Contents would also be a candidate key wouldn’t it? Therefore the dependency Contents->ContentConsumer is a *key* dependency and not merely a transitive dependency. So 3NF would not be violated.

    Comment by David — February 2, 2011 @ 2:50 pm

  6. I see what you mean now. Malt is a candidate key.

    But the other columns don’t have to be. Imagine for me a rodent that can eat at one or more houses (but only have one killer).

    Or dogs that worry one or more cats, but would only ever get tossed by one cow.

    Comment by Michael J. Swart — February 2, 2011 @ 3:24 pm

  7. Hey Mike, Awesome post & picture as always. The example is very explanatory:

    What I wondered throughout this series was: what are problems and difficulties arise from databases not in 3nf (or 2nf, or …)?

    For 3NF I imagine it’s something like: “Hey, I have a column for one man all tatterned and torn, but I have no place in this table for the three other dudes the maiden kissed.”

    So That’s what I got out of it: mission accomplished.

    Comment by David Swart — February 2, 2011 @ 7:25 pm

  8. … Oh I see, it’s more like, if there’s a new dog, you have to update over half of the row just for one little fact change.

    Comment by David Swart — February 2, 2011 @ 7:46 pm

  9. Yeah, the whole point of normalization is to avoid problems like the ones you mention. Here’s just one other example: If the house table above was the only place to store info in the database, then there would be no place to put married couples if the wife never milked a cow 🙂

    Comment by Michael J. Swart — February 2, 2011 @ 8:36 pm

  10. […] Ridiculously Unnormalized Database Schemas – Part Three […]

    Pingback by Something for the Weekend – SQL Server Links 04/02/11 | John Sansom - SQL Server DBA in the UK — February 4, 2011 @ 12:02 pm

  11. Maybe giving an example of how to convert your ridiculous example into proper 3rd normal form would help? Showing how you can maintain the original story without degradation in accuracy of the story which seems ridiculous in and of itself, but that story is obviously important to your “customer” who wants to retrieve the story when needed.

    Comment by Ken — February 6, 2011 @ 4:55 am

  12. Hi Ken,
    That’s a great question, I chose the example to highlight what 3rd normal form is not. So the point was to give an example so that in the future if you find yourself designing anything that “smells” like this example, it might set off a warning bell. So I chose the example for it’s badness, not for how easy it is to come up with a normalized example…

    But I’m going to give it a try 🙂

    So we have two options.

    If we decide we’re going to keep the columns as is, we break the table into a bunch of parts:

    Houses (Builder, Contents, ContentsConsumer)
    Consumers (Consumer, Killer)
    Killers (Killer, Worrier)
    … and so on.

    Or we could decide to pull back a bit and make things more generic if we decide that we’re going to store similar stories:
    StoryClauses (Story, Subject, TransitiveVerb, Object)
    Which is kind of nice (especially since it lets me point out that the verb is transitive)

    Yeah, I like that one best. In fact the song I know an old woman who swallowed a fly could be done a similar way.

    Comment by Michael J. Swart — February 6, 2011 @ 9:05 am

  13. […] cow tossing dog (strangely enough) You've come to the right place, random […]

    Pingback by My favourite search terms | Michael J. Swart — January 19, 2012 @ 8:24 am

  14. i don’t get the example about jack building a house.

    Comment by Mark Delgado — August 28, 2012 @ 3:15 am

  15. Sorry to hear that Mark. What part aren’t you getting? This blog post hopes to explain a bit about database normal forms. Maybe start with Wikipedia. http://en.wikipedia.org/wiki/Third_normal_form

    Comment by Michael J. Swart — August 28, 2012 @ 8:26 am

  16. Hey Michael,

    Just wanted to commend you on your informative and entertaining posts. I found them helpful!

    Comment by Newbie to DB Design — October 18, 2012 @ 12:45 pm

  17. […] Ridiculously Unnormalized Database Schemas – Part Three […]

    Pingback by Database Design: Third Normal Form – Piffany's Musings — August 31, 2018 @ 11:08 am

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress