Michael J. Swart

May 28, 2013

Swart’s Ten Percent Rule

Filed under: SQLServerPedia Syndication — Michael J. Swart @ 12:00 pm

A picture of coffee captioned: "Success: ninety percent hard work, ten percent coffee"“Just because you can, doesn’t mean you should.” SQL Server has a number of limits and restrictions (many of the limits are described here). Most of the time these restrictions don’t feel restrictive at all. I can’t think of a single scenario where I’d want more than 128 characters to name a column. And I don’t think I’ll ever need to select more than 4096 columns at a time. I get the feeling that if you need over 10% of what Microsoft restricts you to, then you’re probably doing it wrong.

So I turn that thought into Swart’s Ten Percent Rule:

If you’re using over 10% of what SQL Server restricts you to, you’re doing it wrong.

Always? No not always. It depends of course on what restriction we’re talking about. Sometimes the rule applies more than other times. I’d like to evaluate a number of restrictions here and judge whether the rule applies or not. I’ll be assigning each 10% rule a “Risk that you’re doing it wrong”. The levels are Severe, High, Elevated, Guarded or Low.


SQL Statement Length

Max Size: 256 megabytes (Using the default network packet size.)
10% of restriction: 25 megabytes
Risk of doing it wrong: Severe

Let’s take a step back. Why do you want to execute a statement that’s 25 megabytes long? The only scenario I can think of is that you’re sending data to the database to be parsed and stored. There’s much better ways to send data to the database then this (See Erland Sommarskog’s definitive Arrays and Lists in SQL Server).

Or maybe you’re creating a one-off huge IN clause to pass to SQL. Better to join to a temp table or a table valued parameter.


Identifier Length

Max Size: 128 characters
10% of restriction: 12 characters.
Risk of doing it wrong: Low

Twelve characters is not a lot to work with when coming up with names for tables or columns. If you had a column or table name with more than 12 characters, I wouldn’t be at all concerned. For comparison, Adventurework’s average column is 11 characters long. Their largest column is “HomeAddressSpecialInstructions” at 30 characters. My friend, Karen Lopez, told me she once came across this beast of a column name: “RetailTransactionLineItemModifierEventReasonCode” at 48 characters. That’s getting extreme (despite intellisense), but 12 is fine.


Number of Columns in a Table

Max Size: 1024 Columns
10% of restriction: 102 columns
Risk of doing it wrong: High

Maybe you need a lot of columns because you’re treating a table like a spreadsheet. The worst offender I’ve ever seen had a whopping 96 columns. It was designed by someone who didn’t know any better. By the way, the most notable modeling smell coming from this table was the columns named [Int01] to [Int30]). In this case, the database designer was definitely doing it wrong.

Maybe you need a lot of columns to support growing functionality. As an example, say you have a table like

CREATE TABLE UserProfile
(
    UserId int not null primary key,
    PhotoFileId int null,
    PreferenceTags varchar(1000) not null,
    CanEditWidget bit not null,
    CanViewWidget bit not null,
    CanAddWidget bit not null,
    IsMonitored bit not null,
    ...,
    IsApprovedForFastTrack bit not null
)

Then the thing to ask yourself is why is this list of columns changing so much? In this example, we see a growing list of boolean attributes (which will probably continue to grow as the application evolves). In this case it makes sense to redesign your schema using:

CREATE TABLE UserProfile
(
    UserId int not null primary key,
    PhotoFileId int null,
    PreferenceTags varchar(1000) not null
)
 
CREATE TABLE UserPermissions
(
    UserId int not null 
        references UserProfile(UserId),
    PermissionId int not null
        references Permissions(PermissionId),
    IsGranted bit not null
)

But maybe you really do need this number of columns, you’re probably storing semi-structured data. By semistructured data, I always think of how one would store cascading style sheets or health records.

Semistructured data is really difficult to model. As an alternative to these difficult-to-manage columns, consider storing the values in an xml column (accepting that you can fetch, but probably no longer want to filter). If you need something more radical, maybe consider a schema-less json document database like couchdb (I can’t believe I said that).

SQL Server has something called wide tables which use sparse columns but you’re on your own here. It’s not something I could recommend.


Simultaneous Server Connections

Max Size: 32,767
10% of restriction: 3,277.
Risk of doing it wrong: Guarded

The number of connections your SQL Server instance can support can be found by this query

SELECT @@MAX_CONNECTIONS as [this many]

and the number of open connections on the server is basically this

SELECT count(1) as [this many] FROM sys.dm_exec_sessions

It’s not that unusual to have 1000 or 2000 connections open. It is rarer to see that many active queries. But it’s not unusual to see 1000s of connections, most of whom are sleeping. It’s possible you have a large number of services and web servers connecting to a variety of databases on your server. With ado.net connection pools, it’s common to have a bunch of idle connections.

The best advice I can give is to get a baseline. Watch the connections regularly in order to understand what is normal. Maybe start with

select 
    count(1) as [connections], 
    count(case [status] when 'sleeping' then null else 1 end) as [active connections],
    db_name([dbid]) as [database name],
    [hostname], 
    [loginame],
    [program_name]
from sys.sysprocesses
where [dbid] NOT IN (0)
group by [hostname], [loginame], [dbid], [program_name]
order by connections desc

Then go from there. Once you know what’s normal, you can then watch for changes to the status quo.


Database Size

Max Size: 512 Petabytes
10% of restriction: 51 Petabytes

51 Petabytes is unwieldy in so many ways I’m not going to list them here. I’m not saying such big data sizes are impossible with a centralized RDBMS like SQL Server. But I am saying that if you’re not sure about whether you can scale SQL Server to that size and you’re reading my blog looking for advice then…

Risk of doing it wrong: Severe


Other metrics

I think Swart’s Ten Percent Rule applies to a lot of things. Try to apply it the next time you come across a SQL Server restriction and let me know if you find anything interesting.

May 14, 2013

Myself in 2004

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication — Michael J. Swart @ 11:53 am

This is me in 2004

9 years ago

I’m in the center with the glasses

This picture was taken in an electronics factory in Zhongshan, China.

I’m not a big traveler and I never have been.  I thought the opportunity to see China was a once in a lifetime thing. In my case, I found myself making the trip three times courtesy of my employer. It wasn’t the adventure I thought it would be. Unlike most tourists to China, my memories don’t include much else other than hotels, ferries and factory locations like the one you see here. I remember counting the days until each trip was over.

When I look at this picture, I remember a lot of things. I remember the anti-static lab coats and I’m reminded that I used to have more hair than I do now. The machine in front of me is meant to test a motherboard for the upcoming XBox 360. The guy on the right was an engineer, a hardware guy. My job was to install and support the software that collected test results and store them in a SQL Server database.

In Over My Head?

I remember the other people in this photo. It was an amazing assembly of talent. The people here were smart. But not just smart, they were smart and competent. They were ambitious and passionate about what they do and it’s rare to see that. I felt a little overwhelmed and I felt a little out of place. Everyone seemed so confident about what needed to be done. I guess I was no different. I knew what I had to do, but the confidence I showed was half acting.

For the most part, I was lucky. I installed the software and waited for a problem to support. It’s a testament to the company I worked for that I was able to wait more than work. But it wasn’t all super-smooth sailing.

Eventually I was asked to troubleshoot a burning issue and I wasn’t used to the pressure; hopefully it didn’t show. I felt like I was thrown into the deep end in order to learn how to swim. The head guy asked me to resolve something (and by the way Michael, the factory can’t operate until you do). The issues I faced were new to me and the pressure was on. Here’s a small subset of the kind of things I was asked to tackle:

  • Replication configuration issues. <bleah!>
  • Log files filling up (because replication’s log reader wasn’t operating properly)<ouch>
  • Concurrency bottlenecks. <just the beginning of the rest of my life>

In my career up until then, I was always able to ask a more experienced colleague for help. Here was the first time where I was it. There were no other colleagues to call on. If I couldn’t crack this nut, it wasn’t going to get cracked. In this case it was me and books online vs. SQL Server 2000. I did eventually get through those days and it felt amazing to beat those issues. But during the crisis itself, there was some anxiety.

Out of the Deep End

I came away from those experience with a bit more confidence than I started with. It was the first time I thought, “I’m fine, I can handle this”. By the end of my trips to China, the confidence I was showing wasn’t acting any more. I wasn’t just solving such crises, I was avoiding them. And back in Canada, I was asked to be on-call overnight in order to field questions from others in China.

I don’t believe there exists a training course anywhere that is equivalent to the confidence gained by solving these FIX IT NOW crises.

And it didn’t stop there, these other things helped boost my confidence even further:

  • SQL Server released 2005 with features that made a DBAs life so much easier than when supporting 2000.
  • I left my software development job for a different job focusing on databases full time.
  • twitter and #sqlhelp happened. It made me feel like I had the world on call. I don’t use it as much as I used to, but it’s nice to now it’s still there if I need it.

The Guy in the Picture

I’ve showed this picture a couple times to others in the past year. Each time I was encouraged to blog about it. This is me doing that. I have a lot of stories I could tell about these trips. In fact, I wish now that I kept a journal. So dear reader, if sometime in the future, we find ourselves hanging out and have nothing to chat about, ask me about 2004 Michael. Until then…

Powered by WordPress