Michael J. Swart

June 19, 2020

Problem Solving by Cheating

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 10:34 am

Solving real-world problems is different than answering interview questions or twitter polls. The biggest difference is that real problems aren’t always fair. There’s not always a right answer.

Answer this multiple choice question:

Which of the following SQL statements is used to modify existing data in a table?
A) SELECT
B) INSERT
C) DELETE

Give it some thought. Which option would you pick? The correct answer is UPDATE but it wasn’t one of the options listed and that’s not fair. But neither is real life. Many real problems don’t have an easy answer and some real problems are impossible to solve. That can be discouraging.

A blue shell from MarioKart is about to attack a runner. Sometimes life just isn't fair.

Real Problems Allow For Creativity

But if your problems are unfair, then maybe you’re allowed to cheat too.

“None of the above” is always an option. Understand the goal so that you can stretch or ignore requirements.

Example – Changing an INT to a BIGINT

I have a table that logs enrollments into courses. It’s append only and looks something like this:

CREATE TABLE dbo.LOG_ENROLL (
    LogId INT IDENTITY NOT NULL,  -- This identity column is running out of space
    UserId INT NOT NULL,
    CourseId INT NOT NULL,
    RoleId INT NULL,
    EnrollmentType INT NOT NULL,
    LogDate DATETIME NOT NULL DEFAULT GETUTCDATE(),
 
    INDEX      IX_LOG_ENROLL_CourseId    CLUSTERED    ( CourseId, UserId ),
    CONSTRAINT PK_LOG_ENROLL PRIMARY KEY NONCLUSTERED ( LogId ),
    INDEX      IX_LOG_ENROLL_UserId      NONCLUSTERED ( UserId, CourseId ),
    INDEX      IX_LOG_ENROLL_LogDate     NONCLUSTERED ( LogDate, LogId )
);

The table has over 2 billion rows and it looks like it’s going to run out of space soon because the LogId column is defined as an INT. I need to change this table so that it’s a BIGINT. But changing an INT to a BIGINT is known as a “size of data” operation. This means SQL Server has to process every row to expand the LogId column from 4 to 8 bytes. But it gets trickier than that.

The biggest challenge is that the table has to remain “online” (available for queries and inserts).

Compression?
Gianluca Sartori (spaghettidba) had the idea of enlarging the columns with no downtime using compression. It’s promising, but I discovered that for this to work, all indexes need to be compressed not just the ones that contain the changed column. Also, any indexes which use the column need to be disabled for this to work.

Cheating
I gave up on solving this problem in general and constrained my focus to the specific problem I was facing. There’s always some context that lets us bend the rules. In my case, here’s what I did.

Ahead of time:

  • I removed extra rows. I discovered that many of the rows were extraneous and could be removed. After thinning out the table, the number of rows went from 2 billion down to 300 million.
  • I compressed two of the indexes online (IX_LOG_ENROLL_UserId and IX_LOG_ENROLL_CourseId) because I still want to use the compression trick.

But I’m not ready yet. I still can’t modify the column because the other two columns depend on the LogId column. If I tried, I get this error message:


Msg 5074, Level 16, State 1, Line 22
The index ‘IX_LOG_ENROLL_LogDate’ is dependent on column ‘LogId’.
Msg 5074, Level 16, State 1, Line 22
The object ‘PK_LOG_ENROLL’ is dependent on column ‘LogId’.
Msg 4922, Level 16, State 9, Line 22
ALTER TABLE ALTER COLUMN LogId failed because one or more objects access this column.

So I temporarily drop those indexes!

  • Drop the constraint PK_LOG_ENROLL and the index IX_LOG_ENROLL_LogDate
  • Do the switch! ALTER TABLE LOG_ENROLL ALTER COLUMN LogId BIGINT NOT NULL; This step takes no time!
  • Recreate the indexes online that were dropped.

Hang on, that last step is a size of data operation. Anyone who needs those indexes won’t be able to use them while they’re being built.
Exactly, and this is where I cheat. It turns out those indexes were used for infrequent reports and I was able to co-ordinate my index rebuild around the reporting schedule.

You can’t always make an operation online, but with effort and creativity, you can get close enough. I have found that every real problem allows for a great degree of creativity when you’re allowed to bend the rules or question requirements.

January 28, 2020

What Tables Are Being Written To The Most?

Filed under: Miscelleaneous SQL,SQL Scripts,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 10:38 am

You have excessive WRITELOG waits (or HADR_SYNC_COMMIT waits) and among other things, you want to understand where.

Microsoft’s advice Diagnosing Transaction Log Performance Issues and Limits of the Log Manager remains a great resource. They tell you to use perfmon to look at the log bytes flushed/sec counter (in the SQL Server:Databases object) to see which database is being written to so much.

After identifying a database you’re curious about, you may want to drill down further. I wrote about this problem earlier in Tackle WRITELOG Waits Using the Transaction Log and Extended Events. The query I wrote for that post combines results of an extended events session with the transaction log in order to identify which procedures are doing the most writing.

But it’s a tricky kind of script. It takes a while to run on busy systems. There’s a faster way to drill into writes if you switch your focus from which queries are writing so much to which tables are being written to so much. Both methods of drilling down can be helpful, but the table approach is faster and doesn’t require an extended event session and it might be enough to point you in the right direction.

Use This Query

use [specify your databasename here]
 
-- get the latest lsn for current DB
declare @xact_seqno binary(10);
declare @xact_seqno_string_begin varchar(50);
exec sp_replincrementlsn @xact_seqno OUTPUT;
set @xact_seqno_string_begin = '0x' + CONVERT(varchar(50), @xact_seqno, 2);
set @xact_seqno_string_begin = stuff(@xact_seqno_string_begin, 11, 0, ':')
set @xact_seqno_string_begin = stuff(@xact_seqno_string_begin, 20, 0, ':');
 
-- wait a few seconds
waitfor delay '00:00:10'
 
-- get the latest lsn for current DB
declare @xact_seqno_string_end varchar(50);
exec sp_replincrementlsn @xact_seqno OUTPUT;
set @xact_seqno_string_end = '0x' + CONVERT(varchar(50), @xact_seqno, 2);
set @xact_seqno_string_end = stuff(@xact_seqno_string_end, 11, 0, ':')
set @xact_seqno_string_end = stuff(@xact_seqno_string_end, 20, 0, ':');
 
WITH [Log] AS
(
  SELECT Category, 
         SUM([Log Record Length]) as [Log Bytes]
  FROM   fn_dblog(@xact_seqno_string_begin, @xact_seqno_string_end)
  CROSS  APPLY (SELECT ISNULL(AllocUnitName, Operation)) AS C(Category)
  GROUP  BY Category
)
SELECT   Category, 
         [Log Bytes],
         100.0 * [Log Bytes] / SUM([Log Bytes]) OVER () AS [%]
FROM     [Log]
ORDER BY [Log Bytes] DESC;

Results look something like this (Your mileage may vary).
A screenshot of the results

Notes

  • Notice that some space in the transaction log is not actually about writing to tables. I’ve grouped them into their own categories and kept them in the results. For example LOP_BEGIN_XACT records information about the beginning of transactions.
  • I’m using sp_replincrementlsn to find the current last lsn. I could have used log_min_lsn from sys.dm_db_log_stats but that dmv is only available in 2016 SP2 and later.
  • This method is a little more direct measurement of transaction log activity than a similar query that uses sys.dm_db_index_operational_stats

January 20, 2020

Watching SQL Server Stuff From Performance Monitor

Taking a small break from my blogging sabbatical to post one script that I’ve found myself writing from scratch too often.
My hope is that the next time I need this, I’ll look it up here.

The User Settable Counter

Use this to monitor something that’s not already exposed as a performance counter. Like the progress of a custom task or whatever. If you can write a quick query, you can expose it to a counter that can be plotted by Performance Monitor.

Here’s the script (adjust SomeMeasurement and SomeTable to whatever makes sense and adjust the delay interval if 1 second is too short:

declare @deltaMeasurement int = 0;
declare @totalMeasurement int = 0;
 
while (1=1)
begin
 
  select @deltaMeasurement = SomeMeasurement - @totalMeasurement
  from SomeTable;
 
  set @totalMeasurement += @deltaMeasurement;
 
  exec sp_user_counter1 @deltaMeasurement;
  waitfor delay '00:00:01'
end

Monitoring

Now you can monitor “User Counter 1” in the object “SQLServer:User Settable” which will look like this:
Example of monitoring a performance counter using Performance Monitor

Don’t forget to stop the running query when you’re done.

October 26, 2018

Uncovering Hidden Complexity

Filed under: Miscelleaneous SQL,SQL Scripts,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 12:15 pm

The other day, Erin Stellato asked a question on twitter about the value of nested SPs. Here’s how I weighed in:

Hidden complexity has given me many problems in the past. SQL Server really really likes things simple and so it’s nice to be able to uncover that complexity. Andy Yun has tackled this problem for nested views with his sp_helpexpandview.

Here’s what I came up with for nested anything. It helps unravel a tree of dependencies based on information found in sys.triggers and sys.dm_sql_referenced_entities. With it, you can see what’s involved when interacting with objects. Here’s what things look like for Sales.SalesOrderDetail in AdventureWorks2014. A lot of the resulting rows can be ignored, but there can be surprises in there too.

A lot in there

DECLARE @object_name SYSNAME = 'Sales.SalesOrderDetail';
 
WITH dependencies AS
(
    SELECT @object_name AS [object_name],
           CAST(
             QUOTENAME(OBJECT_SCHEMA_NAME(OBJECT_ID(@object_name))) + '.' + 
             QUOTENAME(OBJECT_NAME(OBJECT_ID(@object_name)))
             as sysname) as [escaped_name],
           [type_desc],
           object_id(@object_name) AS [object_id],
           1 AS is_updated,
           CAST('/' + CAST(object_id(@object_name) % 10000 as VARCHAR(30)) + '/' AS hierarchyid) as tree,
           0 as trigger_parent_id
      FROM sys.objects 
     WHERE object_id = object_id(@object_name)
 
    UNION ALL
 
    SELECT CAST(OBJECT_SCHEMA_NAME(o.[object_id]) + '.' + OBJECT_NAME(o.[object_id]) as sysname),
           CAST(QUOTENAME(OBJECT_SCHEMA_NAME(o.[object_id])) + '.' + QUOTENAME(OBJECT_NAME(o.[object_id])) as sysname),
           o.[type_desc],
           o.[object_id],
           CASE o.[type] when 'U' then re.is_updated else 1 end,
           CAST(d.tree.ToString() + CAST(o.[object_id] % 10000 as VARCHAR(30)) + '/' AS hierarchyid),
           0 as trigger_parent_id
      FROM dependencies d
     CROSS APPLY sys.dm_sql_referenced_entities(d.[escaped_name], default) re
      JOIN sys.objects o
           ON o.object_id = isnull(re.referenced_id, object_id(ISNULL(re.referenced_schema_name,'dbo') + '.' + re.referenced_entity_name))
     WHERE tree.GetLevel() < 10
       AND re.referenced_minor_id = 0
       AND o.[object_id] <> d.trigger_parent_id
       AND CAST(d.tree.ToString() as varchar(1000)) not like '%' + CAST(o.[object_id] % 10000 as varchar(1000)) + '%'
 
     UNION ALL
 
     SELECT CAST(OBJECT_SCHEMA_NAME(t.[object_id]) + '.' + OBJECT_NAME(t.[object_id]) as sysname),
            CAST(QUOTENAME(OBJECT_SCHEMA_NAME(t.[object_id])) + '.' + QUOTENAME(OBJECT_NAME(t.[object_id])) as sysname),
            'SQL_TRIGGER',
            t.[object_id],
            0 AS is_updated,
            CAST(d.tree.ToString() + CAST(t.object_id % 10000 as VARCHAR(30)) + '/' AS hierarchyid),
            t.parent_id as trigger_parent_id
       FROM dependencies d
       JOIN sys.triggers t
            ON d.[object_id] = t.parent_id
      WHERE d.is_updated = 1
        AND tree.GetLevel() < 10
        AND CAST(d.tree.ToString() as varchar(1000)) not like '%' + cast(t.[object_id] % 10000 as varchar(1000)) + '%'
)
SELECT replicate('—', tree.GetLevel() - 1) + ' ' + [object_name], 
       [type_desc] as [type],
       tree.ToString() as dependencies       
  FROM dependencies
 ORDER BY tree

June 15, 2018

ORDER BY newid() is an Unbiased Way To Randomize

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 9:47 am

Mike Bostock is a data-visualization specialist. And it really shows in his blog. Every article is really well designed (which makes sense… many of the articles are about design).

One of his articles, Visualizing Algorithms has some thoughts on shuffling at https://bost.ocks.org/mike/algorithms/#shuffling.

He says that sorting using a random comparator is a rotten way to shuffle things. Not only is it inefficient, but the resulting shuffle is really really biased. He goes on to visualize that bias (again, I really encourage you to go see his stuff).

Ordering by random reminded me of the common technique in SQL Server of ORDER BY newid(). So I wondered whether an obvious bias was present there. So I shuffled 100 items thousands of times and recreated the visualization of bias in a heat map (just like Mike did).

Here is the heatmap. If you can, try to identify any patterns.

Order By NewID Bias

Where:

    columns are the position before the shuffle,
    rows are the position after the shuffle,
    green is a positive bias and
    red is a negative bias.

I don’t think there is any bias here. The problem that introduces bias in Mike Bostock’s example is that his “random comparator” that he defined does not obey transitivity. His words. “A comparator must obey transitivity: if a > b and b > c, then a > c.”
But in SQL Server, because each row is assigned a newid(), ORDER BY newid() doesn’t have that flaw and so it doesn’t have that bias.

But Be Careful

Although the method is unbiased, ORDER BY newid() is still inefficient. It uses a sort which is an inefficient way of shuffling. There are alternative shuffle algorithms that are more efficient.
ORDER BY newid() is good for quick and dirty purposes. But if you value performance, shuffle in the app.

April 6, 2018

Are There Any System Generated Constraint Names Lurking In Your Database?

Names for constraints are optional meaning that if you don’t provide a name when it’s created or cannot afford one, one will be appointed to you by the system.
These system provided names are messy things and I don’t think I have to discourage you from using them. Kenneth Fisher has already done that in Constraint names, Say NO to the default.

But how do you know whether you have any?

Here’s How You Check

SELECT SCHEMA_NAME(schema_id) AS [schema name],
       OBJECT_NAME(object_id) AS [system generated object name],
       OBJECT_NAME(parent_object_id) AS [parent object name],
       type_desc AS [object type]
  FROM sys.objects
 WHERE OBJECT_NAME(object_id) LIKE 
         type + '\_\_' + LEFT(OBJECT_NAME(parent_object_id),8) + '\_\_%' ESCAPE '\'
       OR
       OBJECT_NAME(object_id) LIKE 
          REPLACE(sys.fn_varbintohexstr(CAST(object_id AS VARBINARY(MAX))), '0x', '%\_\_') ESCAPE '\'

This will find all your messy system-named constraints.
For example, a table defined like this:

create table MY_TABLE
(
  id INT IDENTITY PRIMARY KEY,
  id2 INT REFERENCES MY_TABLE(id) DEFAULT 0,
  UNIQUE(id),
  CHECK (id >= 0)
)

Will give results like this:

Happy hunting.

Update: April 9, 2018
We can get this info from the system views a little easier as Rob Volk pointed out. I’ve also included the parent object’s type.

SELECT OBJECT_SCHEMA_NAME(id) AS [schema name],
       OBJECT_NAME(constid) AS [system generated constraint name],
       (select type_desc from sys.objects where object_id = constid) as [constraint type],
       OBJECT_NAME(id) AS [parent object name],
       (select type_desc from sys.objects where object_id = id) as [parent object type]
  FROM sys.sysconstraints
 WHERE status & 0x20000 > 0
   AND OBJECT_NAME(id) NOT IN (N'__RefactorLog', N'sysdiagrams')
 ORDER BY [parent object type], [parent object name], [system generated constraint name];

March 26, 2018

T-SQL Options for Comparing “Distinctness”

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 8:40 am

I had the privilege of listening to Itzik Ben Gan talk about “distinctness” in a talk he gave at PASS Summit. Distinctness is a relationship or comparison between two variables, just like equals (=). But unlike equality, distinctness treats NULLs in a more intuitive way (NULL is not distinct from NULL).

There’s often confusion because equality in SQL is not like equality in mathematics. In particular equality in SQL doesn’t follow the reflexive property (∀A, A=A).

Clear right?

I explore different syntax options to test whether values are distinct or not. Each method has its pros and cons.

The Setup

Consider this table.

CREATE TABLE Tasks
(
    TaskId INT NOT NULL IDENTITY CONSTRAINT PK_TASKS PRIMARY KEY,
    AssignedTeamId INT NULL,
    AssignedSubTeamId INT NULL,
    TaskDetails NVARCHAR(2000) NOT NULL,
    INDEX IX_TASKS_ASSIGNEDTEAMID (AssignedTeamId)
);

When a task is unassigned, the AssignedTeamId and AssignedSubTeamId columns can both be null.
Our goal will be to select an arbitrary TaskId given parameters @TeamId, @SubTeamId. And when parameters @TeamId and @SubTeamId are both null, I want to return an unassigned task.

The Equality Join (doesn’t compare distinctness)

I just want to post this here as an example that doesn’t work.

DECLARE @TeamId bigint = NULL,
    @SubTeamId bigint = NULL;
 
-- this will never return any rows when the parameters are null:
SELECT TOP 1 TaskId
  FROM Tasks
 WHERE AssignedTeamId = @TeamId
   AND AssignedSubTeamId = @SubTeamId

PROS: The syntax looks nice and clean.
CONS: It doesn’t work for nulls.

The Expanded WHERE Clause

Well, let’s just write it all out then.

DECLARE @TeamId bigint = NULL,
    @SubTeamId bigint = NULL;
 
SELECT TOP 1 TaskId
  FROM Tasks
 WHERE ( (AssignedTeamId IS NULL AND @TeamId IS NULL) OR AssignedTeamId = @TeamId )
   AND ( (AssignedSubTeamId IS NULL AND @SubTeamId IS NULL) OR AssignedSubTeamId = @SubTeamId )

There’s no way that syntax is sarg-able. But it turns out that it is. SQL Server works hard and says “I see what you’re trying to do there, I’ll help you out”.
PROS: It works and it’s sarg-able.
CONS: That syntax is sooo awkward.

Using INTERSECT Syntax

This is a tip that I got straight from Itzik Ben Gan who says he got the idea from Paul White. The idea is that INTERSECT doesn’t use the idea of equality, but of distinctness for it’s comparison. We can use this to create slightly nicer syntax.

DECLARE @TeamId bigint = NULL,
    @SubTeamId bigint = NULL;
 
SELECT TOP 1 TaskId
FROM tasks
WHERE EXISTS (
    SELECT assignedTeamId, assignedSubTeamId
    INTERSECT
    SELECT @TeamId, @SubTeamId
)

The syntax is slightly less awkward, and it’s sarg-able. Or should be… But there’s a problem with this query (see if you can find it before reading further). Compare the two query plans. First the expanded where clause:

The Expanded where clause produces an efficient seek.


Here’s what the query with the INTERSECT syntax produces:

The INTERSECT syntax produces an inefficient scan

The secret to this mystery lies in that filter operator. There’s an implicit conversion there from int to bigint and that can cause a scan of the entire index. With the expanded syntax, SQL Server can handle the conversion gracefully. With the INTERSECT syntax it cannot. This was a really hard-earned lesson for us this week.

Change the parameters @TeamId and @SubTeamId to INT to match and the query becomes sarg-able again.

PROS: More elegant syntax and sarg-able (when you’re careful)
CONS: This syntax causes performance issues with mismatched types. Take extra-special care to make sure types match up.

IS NOT DISTINCT FROM

Check it:

DECLARE @TeamId bigint = NULL,
    @SubTeamId bigint = NULL;
 
SELECT TOP 1 TaskId
FROM tasks
WHERE assignedTeamId IS NOT DISTINCT FROM @TeamId
  AND assignedSubTeamId IS NOT DISTINCT FROM @SubTeamId

Talk about elegant! That’s what we wanted from the beginning. It’s part of ANSI’s SQL 1999 standard. Paul White tells us it’s implemented internally as part of the query processor, but it’s not part of T-SQL! There’s a connect item for it… err. Or whatever they’re calling it these days. Go read all the comments and then give it a vote. There are lots of examples of problems that this feature would solve.

PROS: Super-elegant!
CONS: Invalid syntax (vote to have it included).

January 17, 2018

SHA1 Collisions in SQL Server

Takeaway: It’s been frowned on for a while, but SHA1 is definitely broken for security purposes.

In October of 2010, Michael Coles created a contest on his blog called “Find a Hash Collision, Win $100“. The contest was part of a discussion at the time about whether the SHA1 hash was useful for detecting changes. For what it’s worth, I still think SHA1 is valuable as a consistency check if not for security.

At the time no SHA1 hash collisions were known, but in 2017, the news broke that some researchers finally generated a collision. So I looked up the research paper and downloaded the files. I used OPENROWSET to get the binary strings and I created my entry for Michael Coles’ contest:

--  Begin script
DECLARE @A varbinary(8000),
      @B varbinary(8000),
      @hA binary(20),
      @hB binary(20);
 
-- Replace the ? below with binary strings
 
SELECT @A = 0x255044462D312E330A25E2E3CFD30A0A0A312030206F626A0A3C3C2F57696474682032203020522F4865696768742033203020522F547970652034203020522F537562747970652035203020522F46696C7465722036203020522F436F6C6F7253706163652037203020522F4C656E6774682038203020522F42697473506572436F6D706F6E656E7420383E3E0A73747265616D0AFFD8FFFE00245348412D3120697320646561642121212121852FEC092339759C39B1A1C63C4C97E1FFFE017F46DC93A6B67E013B029AAA1DB2560B45CA67D688C7F84B8C4C791FE02B3DF614F86DB1690901C56B45C1530AFEDFB76038E972722FE7AD728F0E4904E046C230570FE9D41398ABE12EF5BC942BE33542A4802D98B5D70F2A332EC37FAC3514E74DDC0F2CC1A874CD0C78305A21566461309789606BD0BF3F98CDA8044629A10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000FFFE00FE0000000000000000FFE000104A46494600010101004800480000FFFE00134372656174656420776974682047494D50FFDB00430001010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFDB00430101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFC20011080008000803011100021101031101FFC40014000100000000000000000000000000000008FFC40014010100000000000000000000000000000009FFFE0006FFFE002FFFDA000C03010002100310000001539DC51FFFC4001510010100000000000000000000000000001626FFFE0006FFFE0033FFDA0008010100010502A953FFC4001F1100000309000000000000000000000000141517001316274563658695FFFE0006FFFE0041FFDA0008010301013F019A8AA56D533BB238739E612166B90E605BFFC4001E11000004070000000000000000000000000014151713274563658594FFFE0006FFFE0040FFDA0008010201013F01984E1555C155B66CDC3E04A21A444C40FFC4001E10000101090000000000000000000000001413001215164462648594FFFE0006FFFE0033FFDA0008010100063F02AD9A4DB175DCE6086D743B05BFFFC40014100100000000000000000000000000000000FFFE0006FFFE0012FFDA0008010100013F216001FFFE0006FFFE002BFFDA000C030100020003000000101FFFC40014110100000000000000000000000000000000FFFE0006FFFE0028FFDA0008010301013F106980FFC40014110100000000000000000000000000000000FFFE0006FFFE0028FFDA0008010201013F106BC7FFC40014100100000000000000000000000000000000FFFE0006FFFE0014FFDA0008010100013F10153FFFD9414E4745FFE000104A46494600010101004800480000FFFE00134372656174656420776974682047494D50FFDB00430001010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFDB00430101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFC20011080008000803011100021101031101FFC40014000100000000000000000000000000000009FFC4001501010100000000000000000000000000000607FFDA000C03010002100310000001524A5FFF00FFC40014100100000000000000000000000000000000FFDA00080101000105027FFFC40018110002030000000000000000000000000000F037A7B7FFDA0008010301013F01DFDBFD9FCFFFC40018110002030000000000000000000000000000F036A6B6FFDA0008010201013F01CA3546287FFFC40018100002030000000000000000000000000000F03767A7FFDA0008010100063F02A99C99898FFFC40014100100000000000000000000000000000000FFDA0008010100013F217FFFDA000C030100020003000000101FFFC40014110100000000000000000000000000000000FFDA0008010301013F100FFFC40014110100000000000000000000000000000000FFDA0008010201013F100FFFC40014100100000000000000000000000000000000FFDA0008010100013F100FFFD90A656E6473747265616D0A656E646F626A0A0A322030206F626A0A380A656E646F626A0A0A332030206F626A0A380A656E646F626A0A0A342030206F626A0A2F584F626A6563740A656E646F626A0A0A352030206F626A0A2F496D6167650A656E646F626A0A0A362030206F626A0A2F4443544465636F64650A656E646F626A0A0A372030206F626A0A2F4465766963655247420A656E646F626A0A0A382030206F626A0A313639330A656E646F626A0A0A392030206F626A0A3C3C0A20202F54797065202F436174616C6F670A20202F5061676573203130203020520A3E3E0A656E646F626A0A0A0A31302030206F626A0A3C3C0A20202F54797065202F50616765730A20202F436F756E7420310A20202F4B696473205B3131203020525D0A3E3E0A656E646F626A0A0A31312030206F626A0A3C3C0A20202F54797065202F506167650A20202F506172656E74203130203020520A20202F4D65646961426F78205B302030203820385D0A20202F43726F70426F78205B302030203820385D0A20202F436F6E74656E7473203132203020520A20202F5265736F75726365730A20203C3C0A202020202F584F626A656374203C3C2F496D302031203020523E3E0A20203E3E0A3E3E0A656E646F626A0A0A31322030206F626A0A3C3C2F4C656E6774682033303E3E0A73747265616D0A710A2020382030203020382030203020636D0A20202F496D3020446F0A510A656E6473747265616D0A656E646F626A0A0A0A0A787265660A30203133200A303030303030303030302036353533352066200A30303030303030303137203030303030206E200A30303030303031383631203030303030206E200A30303030303031383739203030303030206E200A30303030303031383937203030303030206E200A30303030303031393232203030303030206E200A30303030303031393435203030303030206E200A30303030303031393732203030303030206E200A30303030303031393939203030303030206E200A30303030303032303230203030303030206E200A30303030303032303736203030303030206E200A30303030303032313432203030303030206E200A30303030303032333039203030303030206E200A0A747261696C6572203C3C202F526F6F74203920302052202F53697A652031333E3E0A0A7374617274787265660A323339310A2525454F460A,
       @B = 0x255044462D312E330A25E2E3CFD30A0A0A312030206F626A0A3C3C2F57696474682032203020522F4865696768742033203020522F547970652034203020522F537562747970652035203020522F46696C7465722036203020522F436F6C6F7253706163652037203020522F4C656E6774682038203020522F42697473506572436F6D706F6E656E7420383E3E0A73747265616D0AFFD8FFFE00245348412D3120697320646561642121212121852FEC092339759C39B1A1C63C4C97E1FFFE017346DC9166B67E118F029AB621B2560FF9CA67CCA8C7F85BA84C79030C2B3DE218F86DB3A90901D5DF45C14F26FEDFB3DC38E96AC22FE7BD728F0E45BCE046D23C570FEB141398BB552EF5A0A82BE331FEA48037B8B5D71F0E332EDF93AC3500EB4DDC0DECC1A864790C782C76215660DD309791D06BD0AF3F98CDA4BC4629B10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000FFFE00FE0000000000000000FFE000104A46494600010101004800480000FFFE00134372656174656420776974682047494D50FFDB00430001010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFDB00430101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFC20011080008000803011100021101031101FFC40014000100000000000000000000000000000008FFC40014010100000000000000000000000000000009FFFE0006FFFE002FFFDA000C03010002100310000001539DC51FFFC4001510010100000000000000000000000000001626FFFE0006FFFE0033FFDA0008010100010502A953FFC4001F1100000309000000000000000000000000141517001316274563658695FFFE0006FFFE0041FFDA0008010301013F019A8AA56D533BB238739E612166B90E605BFFC4001E11000004070000000000000000000000000014151713274563658594FFFE0006FFFE0040FFDA0008010201013F01984E1555C155B66CDC3E04A21A444C40FFC4001E10000101090000000000000000000000001413001215164462648594FFFE0006FFFE0033FFDA0008010100063F02AD9A4DB175DCE6086D743B05BFFFC40014100100000000000000000000000000000000FFFE0006FFFE0012FFDA0008010100013F216001FFFE0006FFFE002BFFDA000C030100020003000000101FFFC40014110100000000000000000000000000000000FFFE0006FFFE0028FFDA0008010301013F106980FFC40014110100000000000000000000000000000000FFFE0006FFFE0028FFDA0008010201013F106BC7FFC40014100100000000000000000000000000000000FFFE0006FFFE0014FFDA0008010100013F10153FFFD9414E4745FFE000104A46494600010101004800480000FFFE00134372656174656420776974682047494D50FFDB00430001010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFDB00430101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFC20011080008000803011100021101031101FFC40014000100000000000000000000000000000009FFC4001501010100000000000000000000000000000607FFDA000C03010002100310000001524A5FFF00FFC40014100100000000000000000000000000000000FFDA00080101000105027FFFC40018110002030000000000000000000000000000F037A7B7FFDA0008010301013F01DFDBFD9FCFFFC40018110002030000000000000000000000000000F036A6B6FFDA0008010201013F01CA3546287FFFC40018100002030000000000000000000000000000F03767A7FFDA0008010100063F02A99C99898FFFC40014100100000000000000000000000000000000FFDA0008010100013F217FFFDA000C030100020003000000101FFFC40014110100000000000000000000000000000000FFDA0008010301013F100FFFC40014110100000000000000000000000000000000FFDA0008010201013F100FFFC40014100100000000000000000000000000000000FFDA0008010100013F100FFFD90A656E6473747265616D0A656E646F626A0A0A322030206F626A0A380A656E646F626A0A0A332030206F626A0A380A656E646F626A0A0A342030206F626A0A2F584F626A6563740A656E646F626A0A0A352030206F626A0A2F496D6167650A656E646F626A0A0A362030206F626A0A2F4443544465636F64650A656E646F626A0A0A372030206F626A0A2F4465766963655247420A656E646F626A0A0A382030206F626A0A313639330A656E646F626A0A0A392030206F626A0A3C3C0A20202F54797065202F436174616C6F670A20202F5061676573203130203020520A3E3E0A656E646F626A0A0A0A31302030206F626A0A3C3C0A20202F54797065202F50616765730A20202F436F756E7420310A20202F4B696473205B3131203020525D0A3E3E0A656E646F626A0A0A31312030206F626A0A3C3C0A20202F54797065202F506167650A20202F506172656E74203130203020520A20202F4D65646961426F78205B302030203820385D0A20202F43726F70426F78205B302030203820385D0A20202F436F6E74656E7473203132203020520A20202F5265736F75726365730A20203C3C0A202020202F584F626A656374203C3C2F496D302031203020523E3E0A20203E3E0A3E3E0A656E646F626A0A0A31322030206F626A0A3C3C2F4C656E6774682033303E3E0A73747265616D0A710A2020382030203020382030203020636D0A20202F496D3020446F0A510A656E6473747265616D0A656E646F626A0A0A0A0A787265660A30203133200A303030303030303030302036353533352066200A30303030303030303137203030303030206E200A30303030303031383631203030303030206E200A30303030303031383739203030303030206E200A30303030303031383937203030303030206E200A30303030303031393232203030303030206E200A30303030303031393435203030303030206E200A30303030303031393732203030303030206E200A30303030303031393939203030303030206E200A30303030303032303230203030303030206E200A30303030303032303736203030303030206E200A30303030303032313432203030303030206E200A30303030303032333039203030303030206E200A0A747261696C6572203C3C202F526F6F74203920302052202F53697A652031333E3E0A0A7374617274787265660A323339310A2525454F460A;
 
SELECT @hA = HASHBYTES('SHA1', @A),
      @hB = HASHBYTES('SHA1', @B);
 
SELECT CASE WHEN @A = @B
                  THEN '@A Equals @B'
                  ELSE '@A Is Not Equal To @B'
                  END AS AB_Equal,
            CASE WHEN @hA = @hB
                  THEN '@hA Equals @hB'
                  ELSE '@hA Is Not Equal To @hB'
                  END AS Hash_Equal;
-- End script

This gives me the output that wins the contest:

Unfortunately upon closer inspection, I see that the rules of the contest say that entries must be received prior to midnight U.S. Eastern Standard Time on October 31, 2010.

Rats, 7 years too late!

January 15, 2018

100 Percent Online Deployments: Stage and Switch

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 12:42 pm
100 Percent Online Deployments
How to deploy schema changes without scheduled downtime

In the first draft of this series, this post didn’t exist. I wanted to show a really simple example of a column switch and include it in the Blue-Green (Details) post. I planned for something simple. But I ran into some hiccups that I though were pretty instructive, so I turned it into the post you see here.

The Plan

For this demo, I wanted to use the WideWorldImporters database. In table Warehouse.ColdRoomTemperatures I wanted to change the column

ColdRoomSensorNumber INT NOT NULL,

into

ColdRoomSensorLabel NVARCHAR(100) NOT NULL,

because maybe we want to track sensors via some serial number or other code.

The Blue-Green plan would be simple:

The Trouble

But nothing is ever easy. Even SQL Server Data Tools (SSDT) gives up when I ask it to do this change with this error dialog:

Never Easy

There’s two things going on here (and one hidden thing):

  1. The first two messages point out that a procedure is referencing the column ColdRoomSensorNumber with schemabinding. The reason it’s using schemabinding is because it’s a natively compiled stored procedure. And that tells me that the table Warehouse.ColdRoomTemperatures is an In-Memory table. That’s not all. I noticed another wrinkle. The procedure takes a table-valued parameter whose table type contains a column called ColdRoomSensorLabel. We’re going to have to replace that too. Ugh. Part of me wanted to look for another example.
  2. The last message tells me that the table is a system versioned table. So there’s a corresponding archive table where history is maintained. That has to be dealt with too. Luckily Microsoft has a great article on Changing the Schema of a System-Versioned Temporal Table.
  3. One last thing to worry about is a index on ColdRoomSensorNumber. That should be replaced with an index on ColdRoomSensorLabel. SSDT didn’t warn me about that because apparently, it can deal with that pretty nicely.

So now my plan becomes:

Blue The original schema

Aqua After the pre-migration scripts are run

An extra step is required here to update the new column and keep the new and old columns in sync.

Green After the switch, we clean up the old objects and our schema change is finished:

Without further ado, here are the scripts:

Pre-Migration (Add Green Objects)

In the following scripts, I’ve omitted the IF EXISTS checks for clarity.

-- Add the four green objects
 
ALTER TABLE Warehouse.ColdRoomTemperatures
ADD ColdRoomSensorLabel NVARCHAR(100) NOT NULL 
    CONSTRAINT DF_Warehouse_ColdRoomTemperatures_ColdRoomSensorLabel DEFAULT '';
GO
 
ALTER TABLE Warehouse.ColdRoomTemperatures
ADD INDEX IX_Warehouse_ColdRoomTemperatures_ColdRoomSensorLabel (ColdRoomSensorLabel);
GO
 
CREATE TYPE Website.SensorDataList_v2 AS TABLE(
    SensorDataListID int IDENTITY(1,1) NOT NULL,
    ColdRoomSensorLabel VARCHAR(100) NULL,
    RecordedWhen datetime2(7) NULL,
    Temperature decimal(18, 2) NULL,
    PRIMARY KEY NONCLUSTERED (SensorDataListID)
)
GO
 
CREATE PROCEDURE Website.RecordColdRoomTemperatures_v2
    @SensorReadings Website.SensorDataList_v2 READONLY
AS
    --straight-forward definition left as exercise for reader
GO

Pre-Migration (Populate and Keep in Sync)

Normally, I would use triggers to keep the new and old column values in sync like this, but you can’t do that with In-Memory tables. So I altered the procedure Website.RecordColdRoomTemperatures to achieve something similar. The only alteration I made is to set the ColdRoomSensorLabel value in the INSERT statement:

ALTER PROCEDURE Website.RecordColdRoomTemperatures
@SensorReadings Website.SensorDataList READONLY
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH
(
    TRANSACTION ISOLATION LEVEL = SNAPSHOT,
    LANGUAGE = N'English'
)
    BEGIN TRY
 
        DECLARE @NumberOfReadings int = (SELECT MAX(SensorDataListID) FROM @SensorReadings);
        DECLARE @Counter int = (SELECT MIN(SensorDataListID) FROM @SensorReadings);
 
        DECLARE @ColdRoomSensorNumber int;
        DECLARE @RecordedWhen datetime2(7);
        DECLARE @Temperature decimal(18,2);
 
        -- note that we cannot use a merge here because multiple readings might exist for each sensor
 
        WHILE @Counter <= @NumberOfReadings
        BEGIN
            SELECT @ColdRoomSensorNumber = ColdRoomSensorNumber,
                   @RecordedWhen = RecordedWhen,
                   @Temperature = Temperature
            FROM @SensorReadings
            WHERE SensorDataListID = @Counter;
 
            UPDATE Warehouse.ColdRoomTemperatures
                SET RecordedWhen = @RecordedWhen,
                    Temperature = @Temperature
            WHERE ColdRoomSensorNumber = @ColdRoomSensorNumber;
 
            IF @@ROWCOUNT = 0
            BEGIN
                INSERT Warehouse.ColdRoomTemperatures
                    (ColdRoomSensorNumber, ColdRoomSensorLabel, RecordedWhen, Temperature)
                VALUES (@ColdRoomSensorNumber, 
                        'HQ-' + CAST(@ColdRoomSensorNumber AS NVARCHAR(50)), 
                        @RecordedWhen, 
                        @Temperature);
            END;
            SET @Counter += 1;
        END;
 
    END TRY
    BEGIN CATCH
        THROW 51000, N'Unable to apply the sensor data', 2;
 
        RETURN 1;
    END CATCH;
END;

That keeps the values in sync for new rows. But now it’s time to update the values for existing rows. In my example, I imagine that the initial label for the sensors are initially: “HQ-1”, “HQ-2”, etc…

UPDATE Warehouse.ColdRoomTemperatures
SET ColdRoomSensorLabel = 'HQ-' + CAST(ColdRoomSensorNumber as nvarchar(50));

Eagle-eyed readers will notice that I haven’t dealt with the history table here. If the history table is large use batching to update it. Or better yet, turn off system versioning and then turn it back on immediately using a new/empty history table (if feasible).

Post-Migration

After a successful switch, the green application is only calling Website.RecordColdRoomTemperatures_v2. It’s time now to clean up. Again, remember that order matters.

DROP PROCEDURE Website.RecordColdRoomTemperatures;
 
DROP TYPE Website.SensorDataList;
 
ALTER TABLE Warehouse.ColdRoomTemperatures
DROP INDEX IX_Warehouse_ColdRoomTemperatures_ColdRoomSensorNumber;
 
ALTER TABLE Warehouse.ColdRoomTemperatures
DROP COLUMN ColdRoomSensorNumber;

January 12, 2018

100 Percent Online Deployments: Keep Changes OLTP-Friendly

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 9:00 am
100 Percent Online Deployments
How to deploy schema changes without scheduled downtime

Using the Blue-Green deployment method, database changes are decoupled from applications changes. That leaves us with one last challenge to tackle. The schema changes have to be performed while the application is online. It’s true that you can’t always write an online script for every kind of schema change you want.

I got the moves like Jagger

The challenge of writing online schema changes is essentially a concurrency problem and the guiding principle I follow is: Do whatever you need to do, but avoid excessive blocking.

Locks Are Hot Potatoes

You can’t hold them for long. This applies to schema changes too. Logically if you don’t hold a lock long, you can’t block activity. One exception might be the SCH-M lock which can participate in blocking chains:

SCH-M locks

There are two main kinds of SQL queries. SELECT/INSERT/UPDATE/DELETE statements are examples of Data Manipulation Language (DML). CREATE/ALTER/DROP statements are examples of Data Definition Language (DDL).

With schema changes – DDL – we have the added complexity of the SCH-M lock. It’s a kind of lock you don’t see with DML statements. DML statements take and hold schema stability locks (SCH-S) on the tables they need. This can cause interesting blocking chains between the two types where new queries can’t start until the schema change succeeds:

Some suggestions:

  • Don’t rebuild indexes while changing schema
  • Rely on the OLTP workload which has many short queries. In an OLTP workload, the lead blocker shouldn’t be a lead blocker for long. Contrast that with an OLAP workload with long-running and overlapping queries. OLAP workloads can’t tolerate changing tables without delays or interruptions.
  • When using Enterprise Edition, use ONLINE=ON for indexes. It takes and holds a SCH-M lock only briefly.

Changes to Big Tables

Scripts that change schema are one-time scripts. If the size of the table is less than 50,000 rows, I write a simple script and then move on.

If the table is larger, look for metadata-only changes. For example, these changes are metadata-only changes:

If a table change is not a meta-data change, then it’s a size-of-data change. Then it’s time to get creative. Look for my other post in this series for an example of batching and an example of a column switcheroo.

Pragmatism Example

If you think “good enough” is neither, you may want to skip this section. There are some schema changes that are still very difficult or impossible to write online. With some creativity, we’ve always been able to mitigate these issues with shortcuts and I want to give an example which I think is pretty illustrative.

When a colleague asked for a rowversion column on a humongous table. We avoided that requirement by instead creating a datetime column called LastModifiedDate. Since 2012, new columns with constant default values are online. So we added the column with a constant default, and then changed the default value to something more dynamic:

alter table dbo.MYTABLE
add LastModifiedDate DATETIME NOT NULL 
    CONSTRAINT DF_TABLE_LastModifiedDate DEFAULT '20000101'
 
alter table dbo.MYTABLE
drop CONSTRAINT DF_TABLE_LastModifiedDate;
 
alter table dbo.MYTABLE
add CONSTRAINT DF_TABLE_LastModifiedDate 
      DEFAULT GETUTCDATE() for LastModifiedDate;

It’s a cool situation because it seems like the column has two defaults, one constant default for rows with missing values. And another definition to be used for new rows:

select pc.default_value, d.definition as [default definition]
from sys.system_internals_partitions p
join sys.system_internals_partition_columns pc 
	on p.partition_id = pc.partition_id
join sys.default_constraints d
	on d.parent_object_id = p.object_id
	and d.parent_column_id = pc.partition_column_id
where p.object_id = object_id('MYTABLE')
and pc.partition_column_id = 2
 
/* Gives 
 
default_value  default definition
-------------  ------------------
2000-01-01     (getutcdate())
 
*/

So be creative and pragmatic. Successful 100% online schema changes involve creativity and close collaboration between everyone involved.

Older Posts »

Powered by WordPress