Michael J. Swart

February 5, 2010

Removing Columns

Takeaway: Deleting columns does not reduce the width of a record (including new ones). Tables have to be rebuilt in order to reclaim space.

So I had the opportunity to talk to Michelle Ufford (SqlFool.com) last November after her PASS Summit Talk. We wondered what happens to the space taken by columns after they’ve been deleted. For example, you may be removing columns or altering columns in order to reduce a records’ width. The improved bytes/row (and rows/page) can provide a huge benefit to performance.

Michelle thought it was a great idea for a blog topic and so did I. But Michelle graciously agreed to let me be the one to write it up (I understand she’s busy this month). So here’s the main question:  What does happen to the space taken by columns that have been deleted? The naive idea that SQL Server magically cleans it up is wrong, for example:

First I create a table with really inefficient columns:

CREATE TABLE NAME (
	id INT IDENTITY PRIMARY KEY,
	first CHAR(1000) NOT NULL,
	middle CHAR(1000) NOT NULL,
	last CHAR(1000) NOT NULL
);
GO
INSERT NAME (first, middle, last) VALUES ('Michael', 'J', 'Swart');
INSERT NAME (first, middle, last) VALUES ('Lester', 'B', 'Pearson');
INSERT NAME (first, middle, last) VALUES ('Mack', 'D', 'Knife');
INSERT NAME (first, middle, last) VALUES ('Homer', 'J', 'Simpson');

These four rows I put in take up two data pages like this:

pages graph

Remember, each page is 8 Kb long. So lets remove the middle column and right-size the others like this:

ALTER TABLE NAME DROP COLUMN Middle;
ALTER TABLE NAME ALTER COLUMN First varchar(20) NOT NULL;
ALTER TABLE NAME ALTER COLUMN Last varchar(20) NOT NULL;
 
-- And add a couple rows for good measure:
INSERT NAME (First, Last) VALUES ('Bartholomew', 'Simpson');
INSERT NAME (First, Last) VALUES ('Lisa', 'Simpson');

So things should look better right? Not quite! The old data pages haven’t changed at all and the new data looks just as bad:

pages graph

What’s Happening

The action of dropping columns is not an operation that affects existing data. It’s a meta-data operation only. You can actually see this at work using the slightly undocumented view sys.system_internals_partition_columns. The is_dropped field indicates that the columns haven’t disappeared, they’ve just been marked as dropped:

select t.name as tablename,
c.name as columnname,
	ipc.*
from sys.system_internals_partition_columns ipc
join sys.partitions p
	on ipc.partition_id = p.partition_id
join sys.tables t
	on t.object_id = p.object_id
left join sys.columns c
	on c.object_id = t.object_id
	and c.column_id = ipc.partition_column_id
where t.name = 'NAME'
order by c.column_id

So just how hard is it to get rid of columns?

Samson Removing Columns

Hair Makes Removing Columns Possible

How to Really Get Rid of Columns

It turns out that an index REBUILD is required (A REORGANIZE isn’t good enough). If you have have a maintenance plan it might eventually take care of this. Or you can do this explicitly:

ALTER INDEX ALL ON NAME REBUILD;

Things look much nicer now:

pages graph

The information in the diagram above is hard to see, but it works out to about 200 rows per page as we had hoped. The column is truly deleted now which you can see using this query (once again):

select t.name as tablename,
	c.name as columnname,
	ipc.*
from sys.system_internals_partition_columns ipc
join sys.partitions p
	on ipc.partition_id = p.partition_id
join sys.tables t
	on t.object_id = p.object_id
left join sys.columns c
	on c.object_id = t.object_id
	and c.column_id = ipc.partition_column_id
where t.name = 'NAME'
order by c.column_id

Other Blogs have treated this topic before e.g.:

  • Andras at Simple Talk last year.
  • Kalen Delaney whose name is almost synonymous with SQL Internals looked at this in 2006.

But none of those have pictures of Samson, do they?

Update March 12, 2010: I just read chapter 30 in SQL Server MVP Deep Dives. The chapter is called Reusing space in a table. In it Joe Webb talks about a very similar scenario and talks recommends an approach of using DBCC CLEANTABLE. This is good for reclaiming space that was once used by variable-length columns. But it doesn’t work for the example I use in this post which uses fixed-length columns. It’s an important distinction because large columns are more often variable length than fixed length.

12 Comments »

  1. […] Removing Columns Doesn’t Decrease Table Size – I read the title and I thought Michael Swart was pulling my leg.  The fact that I learned something by reading this makes me think I’m not ready for the MCM program.  I have moments like this whenever I read a bunch of blog posts in one sitting, and I’m glad Michael blogged this. […]

    Pingback by Things I Read This Week | Brent Ozar - Too Much Information — February 5, 2010 @ 8:30 am

  2. Great info, Michael. This is not something I had realized. Thanks.

    Comment by Mark Vaillancourt — February 5, 2010 @ 12:01 pm

  3. It would be interesting to discuss what happens with heaps, too.

    Comment by Mark Vaillancourt — February 5, 2010 @ 12:04 pm

  4. Mark, that’s an excellent question.

    A quick check tells me that it’s the same story for heaps as it is for indexed tables. Except for the part about rebuilding indexes:

    ALTER INDEX ALL ON NAME REBUILD;

    While this command doesn’t fail, it doesn’t have any affect at all (because there are no indexes to rebuild). With heaps, new rows are still huge. Dropped columns are still there. Even if the table has non-clustered indexes against heaps. The dropped columns are still there.

    The point is is that the index rebuild does the trick only when a table has a clustered index.

    To the problem of reclaiming space used by dropped columns in heaps, I don’t know what the solution is. One workaround is to create a clustered index for it. Even if it is temporary.

    Comment by Michael J. Swart — February 5, 2010 @ 1:27 pm

  5. Excellent stuff, Michael. Really informative.

    Comment by Brad Schulz — February 5, 2010 @ 1:41 pm

  6. This is why I love SQL Server, you really do learn something new every day.

    Thanks for sharing.

    Comment by John Sansom — February 6, 2010 @ 3:07 am

  7. Mark, Brad, John, thanks for the feedback! It makes blogging worthwhile.

    Comment by Michael J. Swart — February 6, 2010 @ 10:12 am

  8. Thanks for the article; it was new and vital information. The only thing I can’t agree with is the caption on your graphic. For it wasn’t the hair but the Spirit of God that gave Samson power to tear down those columns.
    See Judges 16:28-30:
    And Samson called unto the LORD, and said, O Lord God, remember me, I pray thee, and strengthen me, I pray thee, only this once, O God, that I may be at once avenged of the Philistines for my two eyes. And Samson took hold of the two middle pillars upon which the house stood, and on which it was borne up, of the one with his right hand, and of the other with his left. And Samson said, Let me die with the Philistines. And he bowed himself with all his might; and the house fell upon the lords, and upon all the people that were therein. So the dead which he slew at his death were more than they which he slew in his life.

    Comment by Benjamin Lotter — February 8, 2010 @ 9:59 am

  9. Excellent, thank you

    Comment by SimonS — December 13, 2010 @ 8:28 am

  10. Hi Mr. Michael,
    Its really great to see the example. I have diffrent problem os space it would be great if you put your thought on it.

    Recently i have added a BIT column in a table which having 547,173,777 rows and updated some of the rows with boolean value true.

    Before adding the column size the database size is 189 GB. and just after updating few millions ids (Bit column to true) the database size increased and the size now 243 GB.

    The increment in the size not relflecting the BIT column ‘s projection.

    Can you please give your thought on it ?

    Why after adding a bit column and updating some of the rows with boolean true database size increased so high ?

    Thanks & Best Regards
    Rajat Jaiswal

    Comment by Rajat Jaiswal — April 25, 2011 @ 9:11 am

  11. Hey Rajat, That’s a really good question,
    But without more information, I don’t think I could make a good guess. What I do when I have questions like that is post them at http://stackoverflow.com. That site provides really good really high quality help.
    Post your question there, and send me the link for the question. I’ll try to help there if I can.
    Michael

    Comment by Michael J. Swart — April 25, 2011 @ 4:44 pm

  12. […] Removing Columns – Now this is why I just love working with SQL Server, you really can learn something new each day. Did you know that removing a column from a table does not reclaim the space? Michael J Swart (Database Whisperer) reveals all. […]

    Pingback by Something for the Weekend: SQL Server Links for the week 05/02/10 — March 13, 2012 @ 6:16 am

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress