Dr. Jim Gray developed many of the properties that transactions have in well-designed databases. Later, Haerder and Reuter took these properties and used them to coin the acronym ACID. At that time, they defined consistency this way:
A transaction reaching its normal end, thereby committing its results, preserves the consistency of the database. In other words, each successful transaction by definition commits only legal results.
Essentially, consistency means that database systems have to enforce business rules defined for their databases.
But it’s interesting. The word consistency (applied to database systems) aren’t always used the same way! For example, in Brewer’s CAP theorem the C, standing for consistency is defined as “All clients have the same view of the data.” (Really computer scientists?? That’s the word you decide to overload with different meanings?). So if you ever hear someone say eventually consistent. They’re using the consistency term from CAP, not the consistency term from ACID.
I guess “C” means something different for everyone.
Consistency in SQL Server
In my own words, consistency means that any defined checks and constraints are satisfied after any transaction:
- Columns only store values of a particular type (int columns store only ints, etc…)
- Primary keys and unique keys are unique
- Check constraints are satisfied
- Foreign key constraints are satisfied
Constraints Are Enforced One Record At A Time
Some things you might notice about these constraints. They can all be checked and validated by looking at a single row. Check constraints enforce rules based only on the columns of a single row. One exception is where these constraints might perform a singleton lookup in an index to look for the existence of a row (for enforcing foreign keys and primary keys).
Multi-line constraints are not supported directly because it would be impractical to efficiently enforce consistency. For example, it’s not possible to create a constraint on an EMPLOYEE table that would enforce the rule that the sum of employee salaries must not exceed a specific amount.
Consistency Enforced After Each Statement
Another interesting thing about SQL Server is that while ACID only requires the DBMS to enforce consistency after a compolete transaction, SQL Server will go further and enforce consistency after every single statement inside a transaction. It might be nice to insert rows into several tables in any order you wish. But if these rows reference each other with foreign keys, you still have to be careful about the order you do the inserting, transaction or no transaction.
When SQL Server finds inconsistencies. It handles it in one of a few ways.
- If a foreign key is defined properly, a change to one row can cascade to other other rows.
- If a value of a particular datatype is inserted into a column which is defined to hold a different datatype, SQL Server may sometimes implicitly convert the value to the target datatype.
- Most often, SQL Server gives up and throws an error, rolling back all effects of that statement.
Inconsistent Data Any Way
It also turns out that it’s very easy to work around these constraints! (Besides the all-too-common method of not defining constraints in the first place). Primary keys, Unique constraints and datatype validation are always enforced, no getting around them. But you can get around foreign keys and check constraints by
- using WITH NOCHECK when creating a foreign key or a check constraint. You’re basically saying, enforce any new or changing data, but don’t bother looking at any existing data. These constraints will then be marked as not trusted
- using the BULK INSERT statement (or other similar bulk operations) without CHECK_CONSTRAINTS. In this case foreign keys and check constraints are ignored and marked as not trusted.
I’m taking the following example not from I.T., but from the world of medical labs.
When processing medical tests (at least in my part of the world), there’s a whole set of rules that medical professionals have to follow. Doctors and their staff have to fill in a requisition properly. The specimen collection centre has do verify that information, take samples and pass everything on to a lab. The lab that performs the tests, ensures that everything is valid before performing the test and sending back results to the doctor.
Just like a database transaction, the hope is that everything goes smoothly. All patient information is entered properly. Patients and lab techs have followed all appropriate instructions.
Fixing Inconsistent Data: It sometimes happens that information is entered incorrectly or missing (like insurance info, or the date and time of the test). In these cases, often the lab might call back for corrections before continuing with the test. This is similar to the case when SQL Server recognizes that a statement will not leave the database in a consistent state. In some cases, SQL Server can try to do something about it. For example it can do an implicit conversion of a datatype, or it can cascade a delete/update.
Giving Up And Rolling Back: But sometimes a medical test can’t be saved. For example, sometimes a sample arrives clotted when it should have arrived unclotted (or vice versa). In these cases, meaningful results aren’t possible and the whole test has to be rejected to be performed again correctly. SQL Server will do this whenever it’s necessary to maintain consistent data. It will raise an error and the entire statement or transaction is undone (to be corrected and performed again).
Well, this counterexample comes from the world of cheesy Science Fiction. Normally we want our databases to store only consistent and legal data. Any illegal data should be rejected right away. What we don’t want is for our databases to get hung up on some crazy inconsistent data.
But if you’re Captain Kirk and you need to deal with a rogue computer or robot that’s acting up. What do you do? Simple, confuse it with inconsistent information! Those robots won’t know what hit them.
This bit of dialog comes straight from an episode of Star Trek called “I, Mudd” (I’m not even making this up, Google it!)
Kirk: Norman, Everything Harry tells you is a lie, remember that, everything Harry tells you is a lie.
Harry Mudd: Listen to this carefully Norman: I am lying.
[Norman the android starts beeping, his light starts flashing and his ears start smoking]
Norman: You say you are lying but if everything you say is a lie then you are telling the truth but you cannot tell the truth because everything you say is a lie but you lie, you tell the truth but you cannot for you lie … Illogical! Illogical!
[more of the same, more smoke and kaboom]
Honest-to-God smoke from the ears! It’s so classic it gets parodied a lot. (Here’s one of my favourites, a comic from Cyanide and Happiness).
I’m grateful that our databases don’t choke on inconsistent data. They just throw an error and tell clients “Here, you deal with it!”.