Ridiculously Unnormalized Database Schemas
- Part Zero: Schemas that are not schemas
- Part One: Schemas that are not 1st normal form
- Part Two: Schemas that are not 2nd normal form
- Part Three: Schemas that are not 3rd normal form
Second Normal Form (2NF)
I’ve seen 2NF defined this way:
“A table is considered 2NF if it is 1NF and if its non-prime attributes depends on the whole of every candidate key not just part of it.
It means that for tables that have multicolumn primary keys (or could), the other columns depend on all the columns in the key, not just a subset. If you follow this rule you’ll see that a field in a record will contain data that is a fact only about the record it belongs to, not others. From another point of view, I’ve seen normalization defined as removing redundant data
Things to Remember
- For 2NF, you only need to look at the “non-prime attributes” or in other words the attributes that aren’t part of the primary key.
- Look at these columns and ask whether they depend on the whole primary key.
- Tables with a single-column primary keys are automatically in 2NF
- BUT it’s not fair to make identity columns your primary keys on every table and call your job done (The definition of 2NF closes this loophole by mentioning candidate keys).
The Example
Take a look at the following table. It tracks reviews from the talent search t.v. show “American Idol”. The primary key is defined on the columns (Date, Singer, Song, Reviewer). Look at each other column and ask whether it depends on the whole key, or a subset of the key.
American Idol Reviews
Date |
Singer |
Song |
Reviewer |
Age |
Show Order |
Show Theme |
Result |
Review |
Aug-21 |
Justin Guarini |
Before Your Love |
Paula Abdul |
23 |
1 |
Finale |
Runner-Up |
Beautiful |
Aug-21 |
Justin Guarini |
Before Your Love |
Randy Jackson |
23 |
1 |
Finale |
Runner-Up |
A little pitchy dawg |
Aug-21 |
Justin Guarini |
Before Your Love |
Simon Cowell |
23 |
1 |
Finale |
Runner-Up |
Rubbish |
Aug-21 |
Kelly Clarkson |
A Moment Like This |
Paula Abdul |
23 |
2 |
Finale |
Winner |
Beautiful |
Aug-21 |
Kelly Clarkson |
A Moment Like This |
Randy Jackson |
23 |
2 |
Finale |
Winner |
A little pitchy dawg |
Aug-21 |
Kelly Clarkson |
A Moment Like This |
Simon Cowell |
23 |
2 |
Finale |
Winner |
Rubbish |
Aug-21 |
Justin Guarini |
Get Here |
Paula Abdul |
20 |
3 |
Finale |
Runner-Up |
Beautiful |
Aug-21 |
Justin Guarini |
Get Here |
Randy Jackson |
20 |
3 |
Finale |
Runner-Up |
A little pitchy dawg |
Aug-21 |
Justin Guarini |
Get Here |
Simon Cowell |
20 |
3 |
Finale |
Runner-Up |
Rubbish |
Aug-21 |
Kelly Clarkson |
Respect |
Paula Abdul |
23 |
4 |
Finale |
Winner |
Beautiful |
Aug-21 |
Kelly Clarkson |
Respect |
Randy Jackson |
23 |
4 |
Finale |
Winner |
A little pitchy dawg |
Aug-21 |
Kelly Clarkson |
Respect |
Simon Cowell |
23 |
4 |
Finale |
Winner |
Rubbish |
Aug-21 |
Justin Guarini |
A Moment Like This |
Paula Abdul |
23 |
5 |
Finale |
Runner-Up |
Beautiful |
Aug-21 |
Justin Guarini |
A Moment Like This |
Randy Jackson |
23 |
5 |
Finale |
Runner-Up |
A little pitchy dawg |
Aug-21 |
Justin Guarini |
A Moment Like This |
Simon Cowell |
23 |
5 |
Finale |
Runner-Up |
Rubbish |
Aug-21 |
Kelly Clarkson |
Before Your Love |
Paula Abdul |
23 |
6 |
Finale |
Winner |
Beautiful |
Aug-21 |
Kelly Clarkson |
Before Your Love |
Randy Jackson |
23 |
6 |
Finale |
Winner |
A little pitchy dawg |
Aug-21 |
Kelly Clarkson |
Before Your Love |
Simon Cowell |
23 |
6 |
Finale |
Winner |
Rubbish |
You can see that with the exception of the column Review that all the columns in tables are not dependent on the whole key. We can pull these columns into separate tables:
Normalized:
ShowContestants
Singer |
Date |
Age |
Result |
Justin Guarini |
Aug-21 |
23 |
Runner-Up |
Kelly Clarkson |
Aug-21 |
20 |
Winner |
Performances
Date |
Singer |
Song |
Show Order |
Aug-21 |
Justin Guarini |
Before Your Love |
1 |
Aug-21 |
Kelly Clarkson |
A Moment Like This |
2 |
Aug-21 |
Justin Guarini |
Get Here |
3 |
Aug-21 |
Kelly Clarkson |
Respect |
4 |
Aug-21 |
Justin Guarini |
A Moment Like This |
5 |
Aug-21 |
Kelly Clarkson |
Before Your Love |
6 |
Shows
Date |
Show Theme |
Aug-21 |
Finale |
We still have a table to hold all the reviews by the judges. Defined as:
- Reviews(Date, Singer, Song, Reviewer, Review)
But it’s still a point of debate whether or not the reviews depend on the whole primary key or a subset of the key (especially based on the example).
By the way, if you laughed at the cartoon here, check out my other Data cartoons: here and here.
If you groaned at the Data cartoon, then I apologize, That will be the last one for a while.
Comment by Michael J. Swart — January 26, 2011 @ 12:07 pm
Noooooooooooo ……. more data more data more data !!!!!! π
Great series Michael …. as always ……
Comment by David Nelles — January 27, 2011 @ 8:43 am
“But itβs still a point of debate whether or not the reviews depend on the whole primary key or a subset of the key (especially based on the example).”
LOL… too true.
And keep the Data cartoons coming… The “Data(Bass)” one is classic.
Comment by Brad Schulz — January 27, 2011 @ 1:19 pm
2NF (in common with the other normal forms) is all about dependencies on the candidate keys, not just the primary key – something which you mention in passing but which you haven’t made explicit. An important point I think. “Non-prime” means not part of any candidate key rather than simply not part of the “primary” key. It is possible for a relation with single attribute keys to be in violation of 2NF but it’s just very unusual.
Comment by David — February 1, 2011 @ 4:20 am
Hi David,
Now we’re talking about style! I see that I did put the word “candidate” in italics if that helps π
But anyway, thanks for stopping by David, I hope you’ll subscribe to my rss and weigh in. Next week I’m doing 3NF!
Comment by Michael J. Swart — February 1, 2011 @ 8:28 am
Age does not seem dependent on Date… Shouldn’t it be separated as well together with it’s key Singer…
Comment by Jan Moons — January 8, 2012 @ 2:51 pm
Hi Jan, I thought about that while writing the article. And in this case it does matter a little bit. I think one of the contestants had a birthday during the taping of the season and so the age does depend on the date. I chose not to bring it up because it seemed distracting to the article.
But it’s a good question and it shows that you know this stuff.
Comment by Michael J Swart — January 8, 2012 @ 3:29 pm
[…] https://michaeljswart.com/2011/01/ridiculously-unnormalized-database-schemas-part-two/ […]
Pingback by Database Design: Second Normal Form – Piffany's Musings — August 30, 2018 @ 2:42 pm