It seems like Microsoft has always provided example databases in order to illustrate database concepts.
Pubs: Back when SQL Server 2000 was new, Microsoft provided this sample database and maybe you remember it. Pubs was a really simple database that was designed to represent data that a fictional book publishing company might use. The database had a name (pubs) but I don’t think anyone ever gave the company a name. From here we’re told that it was “used to demonstrate many of the options available” for SQL Server.
Northwind also was available for SQL Server 2000. It was a database that was used for a fictitious company “Northwind Traders”. The company “imports and exports specialty foods from around the world.” It was better than pubs, but that wasn’t saying much. Pubs set a pretty low bar.
AdventureWorks: The Adventureworks database came along with SQL Server 2005 and we were introduced to the bicycle manufacturer Adventure Works Cycle. The database was pretty comprehensive and it touched on a lot more features of SQL Server. I’ve seen countless performance demos that query Sales.SalesOrderDetail. There’s even an Adventureworks Data Warehouse, an Adventureworks OLAP cube and even a light version: AdventureworksLT.
Contoso And lately there’s a new sample database Contoso. It’s another fictional retail company (Contoso Ltd. This time they’re selling electronics and appliances). This database is used for Business Intelligence demos. With Contoso, Microsoft has developed and made a data warehouse and an SSAS cube available.
Presumably, each of these databases is an example of Microsoft putting their best foot forward. Or at least it’s a database which enables them to show off what their database software can do.
So those databases are good for demos.
But if we’re looking for other Microsoft-designed databases, we don’t have to stop there. There’s other ones that we can look at. I’m talking about the databases that Microsoft ships along with their products. When looking at these, remember:
- There are some good examples, …
We can look at those databases and see how they’re constructed. If you see some interesting db designs, you can use that.
- … and there are some not so good examples …
These databases weren’t built as database role models. So there’s bound to be a lot of database design choices that are not ideal. I know that Microsoft has huge pressure to maintain backwards compatibility.
- … and that’s okay.
It’s actually a fun exercise to pick out the good designs from the bad. It’s like a smorgasbord, you take the lessons you like and leave the rest.
Some examples of what I’m talking about.
Install and configure an instance of SQL Server Reporting Services. Take a look at the ReportServer. It’s a pretty nice database schema. One thing that was interesting was the choice of identity column data types:
- Catalog seems to be one of the main tables here and they’re using uniqueidentifiers (GUIDs) for identity columns. Normally people scoff at that, but I think it makes sense in this case. Catalog is the table that tracks reports, datasources and folder structures. Even at it’s largest, it can’t be that big (I just checked my own installation and the clustered index fits on 1 extent). That means that all the worries about GUIDs for primary keys are mitigated and the choice of using a globally unique identifier might make sense here.
- ExecutionLogStorage Contrast that with ReportServer’s largest table, ExecutionLogStorage. Depending on how busy your Reporting Services installation is, this table can grow very large with one row for every report view. When I look at the choice for their primary key, I see that it’s clustered index on a single BIGINT column defined as it’s IDENTITY column. Well done.
Any Database’s Metadata
- I like the object oriented influence here:
- The relationship between sys.objects –> sys.tables and between sys.objects –> sys.procedures is the subtype relationship class –> subclass. And that shows in the schema, for example, there’s one sys.object row for every sys.table row.
- The relationship between sys.foreign_keys –> and sys.foreign_key_columns is the object composition relationship. And that also shows in the schema too. The “primary key” for the table foreign_key_columns is (constraint_object_id, constraint_column_id) and it contains the primary key of its owner.
- Another good example is the sys.index_columns table. Look at the schema diagram below. It’s a perfect instance of my favourite DB Design example
Any Sharepoint database
- [Sigh] These are mostly counter-examples here, I took a brief glance at the schema once and decided quickly that there are no good lessons. Can anyone tell me if that’s changed?