Takeaway: Deleting columns does not reduce the width of a record (including new ones). Tables have to be rebuilt in order to reclaim space.
So I had the opportunity to talk to Michelle Ufford (SqlFool.com) last November after her PASS Summit Talk. We wondered what happens to the space taken by columns after they’ve been deleted. For example, you may be removing columns or altering columns in order to reduce a records’ width. The improved bytes/row (and rows/page) can provide a huge benefit to performance.
Michelle thought it was a great idea for a blog topic and so did I. But Michelle graciously agreed to let me be the one to write it up (I understand she’s busy this month). So here’s the main question: What does happen to the space taken by columns that have been deleted? The naive idea that SQL Server magically cleans it up is wrong, for example:
First I create a table with really inefficient columns:
CREATETABLE NAME (
id INTIDENTITYPRIMARYKEY,
firstCHAR(1000)NOTNULL,
middle CHAR(1000)NOTNULL,
lastCHAR(1000)NOTNULL);
GO
CREATE TABLE NAME (
id INT IDENTITY PRIMARY KEY,
first CHAR(1000) NOT NULL,
middle CHAR(1000) NOT NULL,
last CHAR(1000) NOT NULL
);
GO
INSERT NAME (first, middle, last)VALUES('Michael', 'J', 'Swart');
INSERT NAME (first, middle, last)VALUES('Lester', 'B', 'Pearson');
INSERT NAME (first, middle, last)VALUES('Mack', 'D', 'Knife');
INSERT NAME (first, middle, last)VALUES('Homer', 'J', 'Simpson');
INSERT NAME (first, middle, last) VALUES ('Michael', 'J', 'Swart');
INSERT NAME (first, middle, last) VALUES ('Lester', 'B', 'Pearson');
INSERT NAME (first, middle, last) VALUES ('Mack', 'D', 'Knife');
INSERT NAME (first, middle, last) VALUES ('Homer', 'J', 'Simpson');
These four rows I put in take up two data pages like this:
Remember, each page is 8 Kb long. So lets remove the middle column and right-size the others like this:
ALTERTABLE NAME DROPCOLUMN Middle;
ALTERTABLE NAME ALTERCOLUMNFirstvarchar(20)NOTNULL;
ALTERTABLE NAME ALTERCOLUMNLastvarchar(20)NOTNULL;
-- And add a couple rows for good measure:INSERT NAME (First, Last)VALUES('Bartholomew', 'Simpson');
INSERT NAME (First, Last)VALUES('Lisa', 'Simpson');
ALTER TABLE NAME DROP COLUMN Middle;
ALTER TABLE NAME ALTER COLUMN First varchar(20) NOT NULL;
ALTER TABLE NAME ALTER COLUMN Last varchar(20) NOT NULL;
-- And add a couple rows for good measure:
INSERT NAME (First, Last) VALUES ('Bartholomew', 'Simpson');
INSERT NAME (First, Last) VALUES ('Lisa', 'Simpson');
So things should look better right? Not quite! The old data pages haven’t changed at all and the new data looks just as bad:
What’s Happening
The action of dropping columns is not an operation that affects existing data. It’s a meta-data operation only. You can actually see this at work using the slightly undocumented view sys.system_internals_partition_columns. The is_dropped field indicates that the columns haven’t disappeared, they’ve just been marked as dropped:
select t.nameas tablename,
c.nameas columnname,
ipc.*from sys.system_internals_partition_columns ipc
join sys.partitions p
on ipc.partition_id= p.partition_idjoin sys.tables t
on t.object_id= p.object_idleftjoin sys.columns c
on c.object_id= t.object_idand c.column_id= ipc.partition_column_idwhere t.name='NAME'orderby c.column_id
select t.name as tablename,
c.name as columnname,
ipc.*
from sys.system_internals_partition_columns ipc
join sys.partitions p
on ipc.partition_id = p.partition_id
join sys.tables t
on t.object_id = p.object_id
left join sys.columns c
on c.object_id = t.object_id
and c.column_id = ipc.partition_column_id
where t.name = 'NAME'
order by c.column_id
So just how hard is it to get rid of columns?
Hair Makes Removing Columns Possible
How to Really Get Rid of Columns
It turns out that an index REBUILD is required (A REORGANIZE isn’t good enough). If you have have a maintenance plan it might eventually take care of this. Or you can do this explicitly:
ALTERINDEXALLON NAME REBUILD;
ALTER INDEX ALL ON NAME REBUILD;
Things look much nicer now:
The information in the diagram above is hard to see, but it works out to about 200 rows per page as we had hoped. The column is truly deleted now which you can see using this query (once again):
select t.nameas tablename,
c.nameas columnname,
ipc.*from sys.system_internals_partition_columns ipc
join sys.partitions p
on ipc.partition_id= p.partition_idjoin sys.tables t
on t.object_id= p.object_idleftjoin sys.columns c
on c.object_id= t.object_idand c.column_id= ipc.partition_column_idwhere t.name='NAME'orderby c.column_id
select t.name as tablename,
c.name as columnname,
ipc.*
from sys.system_internals_partition_columns ipc
join sys.partitions p
on ipc.partition_id = p.partition_id
join sys.tables t
on t.object_id = p.object_id
left join sys.columns c
on c.object_id = t.object_id
and c.column_id = ipc.partition_column_id
where t.name = 'NAME'
order by c.column_id
Kalen Delaney whose name is almost synonymous with SQL Internals looked at this in 2006.
But none of those have pictures of Samson, do they?
Update March 12, 2010: I just read chapter 30 in SQL Server MVP Deep Dives. The chapter is called Reusing space in a table. In it Joe Webb talks about a very similar scenario and talks recommends an approach of using DBCC CLEANTABLE. This is good for reclaiming space that was once used by variable-length columns. But it doesn’t work for the example I use in this post which uses fixed-length columns. It’s an important distinction because large columns are more often variable length than fixed length.
I’ve been blogging using the wordpress platform for about three months now and every now and then I’ve come across a certain quirkiness that I’ve finally gotten to the bottom of. I want to warn you about it and maybe save some head-scratching.
Articles with a particular string cannot be saved. When I try to use this particular string, the web server times out. At least in my environment and after checking with Brent Ozar, it’s not isolated to just me.
Here’s the string, I’m pasting it as an image for somewhat obvious reasons:
Some programmer doesn't want you using this string
What’s really odd is that
VARCHAR (works (note the space).
varchar( works.
CHAR(works.
and ARCHAR( works.
If this is a security “feature” it seems like a crummy one. Crummy because it hinders valid work (by us honest sql bloggers) and crummy times two because It’s not particularly effective at guarding against some perceived risk. Especially when the work-around of VARCHAR space bracket works just fine.
Your mileage may vary. But try it out. See if you can save a draft of a blog post using the forbidden string. Leave a comment back here if you had trouble like I do.
If you want to dig deep into SQL Server internals, you’re eventually going to want to look at the physical page structure of a data page. As such I’ve bookmarked Anatomy of a Page by Paul Randal.
The syntax for all this stuff is hard for me to remember, so I’ve made a template for myself. And I’m sharing that here:
-- looking at page contents.-- T3604 to output to consoledbcc traceon (3604)-- select rows from the table of interest-- (as well as their physical location)selecttop(10) t.*,
pl.file_id, pl.page_id, pl.slot_idfrom<tablename,sysname,Production.Product>as t
cross apply sys.fn_PhysLocCracker(t.%%physloc%%)as pl;
declare @dbname sysname;
set @dbname =DB_NAME();
dbcc page(@dbname, --db_name1, --file_id136, --page_id1)--results style
-- looking at page contents.
-- T3604 to output to console
dbcc traceon (3604)
-- select rows from the table of interest
-- (as well as their physical location)
select top (10) t.*,
pl.file_id, pl.page_id, pl.slot_id
from <tablename,sysname,Production.Product> as t
cross apply sys.fn_PhysLocCracker(t.%%physloc%%) as pl;
declare @dbname sysname;
set @dbname = DB_NAME();
dbcc page(@dbname, --db_name
1, --file_id
136, --page_id
1) --results style
Ctrl+Shift+M will let you provide the table name. I’ve also made this script into a code-snippet called page.
Note that the script makes use of sys.fn_PhysLocCracker which is SQL 2008 and later.
By the way, I highly recommend playing with this stuff. Not having to guess at the width of records is really helpful. I plan to write a post soon which shows how it helped an investigation I did.