Michael J. Swart

January 12, 2011

Ridiculously Unnormalized Database Schemas – Part Zero

Filed under: SQLServerPedia Syndication,Technical Articles — Tags: , , , — Michael J. Swart @ 12:00 pm
Ridiculously Unnormalized Database Schemas

I’m presenting a small series of articles that describes database normalization. But the topic is already covered quite well by Wikipedia’s Database Normalization article. And so in this series I’m going to explain database normalization by using examples. Actually by counter-example. The goal is to come up with counter-examples that break the rules so bad that you’ll never forget the rules for each normal form.

DB Normalization is a good thing

But I’m not going to explain why. At least not in depth.  That’s a topic for another day. Let me just say that SQL Server – and Oracle and DB2 and the rest – were designed to support Normalized Databases. Designing Normalized databases makes the best use of SQL Server and doing so avoids a lot of messy problems.

Before We Start…

I’m not going to jump into first normal form (1NF). That’s next article. All database schemas that are 1NF must be relational. So in this article I’m looking at information stores that aren’t relational.

I start by giving an example of an information container. It’s not a database schema, It’s not even a data model (strictly).

The Human Brain

The Human Brain

So as an information store, the brain is not super-reliable but it has it’s place. I chose this example because if you’re not in I.T., the brain still holds a ton of information!

  • To the theoretical physicist, it has a lot information by virtue of the fact that it’s a physical object.
  • The chemist and biologist would be fascinated by the information stored in the DNA of every cell.
  • And of course if connected to a living body, the brain has information for everyone else, in the usual sense (just ask!)

The Relational Database Model

Getting back to databases. Specifically ones that aren’t relational.

It’s surprising that DB normalization was defined at the exact same time as the Relational Database Model (Well, not exactly the same time. They were a chapter apart in Codd’s paper A Relational Model of Data for Large Shared Data Banks 1970)

So for a schema to even be considered for first normal form, it has to be relational. And while SQL Server is called an RDBMS, it turns out to be surprisingly easy to define tables that aren’t relational. Take this example:

-- not a relational table
CREATE TABLE Employee (
    EmployeeNumber INT NOT NULL,
    FirstName NVARCHAR (50) NOT NULL,
    LastName NVARCHAR (50) NOT NULL
)

There’s no primary key! If your tables are missing primary keys, then you’re automatically disqualified. No 1NF for you! (Incidentally the same goes for missing foreign keys.)

In the next few articles I leave the non-relational stuff behind. We’ll look at examples of relational tables (don’t worry, there’s still a lot of ridiculousness coming).

8 Comments »

  1. […] Ridiculously Unnormalized Database Schemas – Part Zero – Welcoming Michael J Swart back into the blogging fray, following on from his New Year resolution to hit that scary post button weekly. Good stuff I say! […]

    Pingback by SQL Server Blogs and news for the week 14/01/11 | John Sansom - SQL Server DBA in the UK — January 14, 2011 @ 5:33 am

  2. “the same goes for missing foreign keys” You are saying that a schema without foreign keys is not “relational”! That’s certainly not what relational meant to Ted Codd. To be relational just means that all data is represented as relations. Foreign key constraints on relations are a nice feature of course, but they don’t make a database relational!

    Comment by David — February 1, 2011 @ 4:23 am

  3. Hey David, thanks for commenting!

    On relational. I’m going by Codd’s 12 rules for that. (Rule number 10) Which mentions integrity independence. That constraints are defined and enforced outside the application.

    Foreign keys are the way SQL has implemented constraints to enforce that elements in a tuple belong to their domain (values in a table belong to their column-space).

    Did I misunderstand that?

    Comment by Michael J. Swart — February 1, 2011 @ 8:15 am

  4. As a SQL Server professional, the question I’m interested in is where do non-relational database systems (eg CouchDB) fit into the world? When are these preferable to relational systems?

    When I see people using non-relational systems to drive their systems effectively one has to wonder what the “real world” benefit of a relational system really is?

    I guess, as always, it depends what you are doing!

    Comment by jez — February 9, 2011 @ 7:11 am

  5. Jez – non-relational databases fit in to a number of different places.

    The first one that most people quote is scale. When you start hitting arbitrary size limits (based on hardware, load, and data), you have to start changing the way you store your data in an RDBMS. At this point you start denormalizing to improve performance on a single server. Non-relational solutions solve this problem by making it easier to scale out across multiple servers instead of buying a bigger server and a better SAN.

    Another reason to go non-relational is redundancy. Many non-relational databases have replication baked in, but that replication isn’t on a time delay. Instead the replication is immediate and happens when a write is initiated. You can configure your non-relational database to only respond that a write is successful when X number of copies have been written. Since data in these systems is distributed across multiple machines, you have a tremendous amount of availability built into your applications.

    People look a systems like CouchDB or MongoDB for the flexible data models they provide. When you give up a lot of the features of an RDBMS, you gain a lot of flexibility in how you model and store data.

    What’s the downside? RDBMSes provide a lot of things for “free” out of the box like indexes, constraints, and referential integrity. Most of the non-relational databases also don’t have support for strong data typing. This puts a lot of responsibility on application developers to write validations and other rules. With an RDBMS you are guaranteed that your data is always correct (according to the rules in the database). With a non-relational database you only know that your data was correct at the time it was entered (unless you’ve written something to continually validate your data).

    Comment by Jeremiah Peschka — February 9, 2011 @ 9:22 am

  6. Ladies and Gentleman, Jeremiah Peschka (NoSQLPedia Editor in Chief)

    Comment by Michael J. Swart — February 9, 2011 @ 9:24 am

  7. The benefits of the relational database model are the same as they always were. It is an extermely powerful representation of data based on logical principles and it works independently of storage considerations and without the need for navigational data stuctures. SQL Server is not relational. It is based on the SQL model which is not truly relational at all but is somewhat closer to relational than many other non-SQL alternatives.

    The reasons the “NOSQL” kind of alternatives are popular has little or nothing to do with the relational model and everything to do with the fact that the SQL DBMS market has been pretty slow to evolve and respond to change. In large part that’s because that market is dominated by just three big players who play safe with their products to protect their investment.

    Why are those NOSQL alternatives not relational? Certainly not because of any real limitations of the relational model. It’s more likely because it’s much easier and cheaper to build niche non-relational DBMSs than to build relational ones. So those innovators and newcomers who dominate that space can get their products to market quicker by building them that way.

    Comment by David — February 10, 2011 @ 8:16 pm

  8. “the SQL DBMS market has been pretty slow to evolve and respond to change”
    I totally second that.
    Cheers, David

    Comment by Michael J. Swart — February 11, 2011 @ 8:22 am

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress