Michael J. Swart

July 3, 2018

Shifting Gears in 2018

Filed under: SQLServerPedia Syndication — Michael J. Swart @ 9:00 am

I wanted you to know about some changes coming to this blog. I’m shifting the focus from SQL Server to other technologies. Specifically, I’m going to explore and begin writing more about modern software development including things that have been labeled devops or site reliability engineering.

Shifting Gears

I’ve been looking for a new challenge for a while and I have an opportunity to do that by following the direction set by my company a few years ago. My company is embracing the public cloud for its price, its flexibility and its promise of scalability. Which public cloud? As awesome as Azure is, we’re going all-in AWS.

For me, this means new lessons to learn and new things to write about.

My Audience

My target audience for the new topics include

  • People searching google who want to find the answers to the same questions I learned recently.
  • The developer who is super-familiar with the Microsoft Stack (aka yours truly) but who wants to branch out into a new stack.

I hope that still includes you.

Blogging as a Student

I have no problems blogging as a learner. Just look at Kenneth Fisher (@sqlstudent144) and Pinal Dave (@SqlAuthority). They both began their blogs from the point of view of a learner. That word “student” is even there in Kenneth’s handle. And Pinal’s site is about his “journey to authority”, another colorful expression for learning. And they’ve done it. They’ve both successfully gained a reputation as an authority in their field.

My Topics

I’ve often straddled the line between a Developer and a DBA. I know a little bit about what it takes to keep SQL Server happy and healthy. I look forward to expanding my “Site Reliability Engineering” skills into new areas.

So for the next few weeks, I’ll start by blogging about the tools I use and what it takes to get started on a simple project.

It’s About the Arrows
Software architecture is often over-simplified as drawing boxes and arrows describing things (the boxes) and how they’re organized or how they communicate with each other (the arrows).

One thing I’ve noticed is that programs used to be the hard part. The classes, the objects, the algorithms. Now it seems to me, that the arrows are the hard part. Deployment, security, automation and all that network stuff can’t be deferred to another team.

The Arrows Are The Hard Part

The Arrows Are The Hard Part

I may specialize in something in the future, but for now I have no shortage of topics. I’ve been tracking my google search history: Here’s what that looks like for the past two weeks:

  • youtube getting started terraform aws circleci
  • tf examples getting started
  • terraform tf examples getting started
  • terraform deploy to aws
  • specify descending in primary key
  • codepipeline
  • aws code deploy
  • dynamodb ttl attribute
  • lambda to dynamodb tutorial
  • javascript add 4 months
  • add days to time javascript
  • javascript get guid
  • Handler ‘handler’ missing on module ‘index’
  • TypeError: Date.now is not a constructor
  • Date.now is not a constructor
  • unix timestamp 1 day
  • dynamodb set ttl example js
  • dynamodb DocumentClient
  • specify region in document client
  • aws.config.update region
  • lodash
  • visual studio code
  • visual studio code marketplace tf
  • visual studio code marketplace tf terraform
  • terraform dynamodb attribute type
  • terraform lambda example
  • terraform output arn
  • create role terraform
  • iam_role_policy
  • best way to terraform a role
  • script out role for terraform
  • terraform dynamodb example
  • invoke terraform in aws
  • how to test terraform
  • terraform download
  • aws command line
  • how to create a role using terraform
  • terraform grant a role access
  • deploy a role with terraform
  • create role assume role
  • terraform role trusted entities
  • push a new repository to github
  • provider config ‘aws’: unknown variable referenced ‘aws_secret_key
  • terraform aws credentials
  • aws_profile environment variable
  • set AWS_PROFILE
  • specify aws_access_key terraform
  • executable bash script
  • executable bash script windows
  • provider.aws: no suitable version installed
  • no suitable version installed
  • run terraform in circleci
  • run syntax circleci
  • run step syntax circleci
  • specify circleci environement variables
  • set password environment variable circleci
  • terraform “ResourceInUseException: Table already exists: broken_links”
  • terraform “ResourceInUseException: Table already exists:”
  • image hashicorp terraform
  • terraform EntityAlreadyExists
  • terraform backend dynamodb
  • canonical userid s3
  • deploy a lambda function terraform
  • terraform lambda runtime
  • resource “aws_lambda_function”
  • terraform archive_file
  • resource depends on
  • resource depends_on terraform
  • DiffTransformer
  • DiffTransformer trace
  • terraform archive_file example
  • depends_on terraform module
  • path.module terraform
  • windows path vs linux path terraform path.module
  • circleci zip directory
  • zip a file in shell
  • circleci zip
  • zip a file in circleci
  • working_directory circleci
  • zip directory for lambda
  • how to zip a file circleci
  • circleci apt-get zip
  • terraform export environment variables
  • run a shell srcript in terraform
  • steps in circleci
  • circleci artifact directory
  • build-artifacts circleci
  • store_artifacts
  • store variable in circleci
  • create file in terraform
  • output_base64sha256
  • concatenate in terraform
  • Unexpected value for InstanceType
  • Unexpected value for InstanceType terraform
  • terraform apply force
  • use artifacts terraform
  • get artifacts terraform
  • get artifacts circleci
  • use circleci artifacts
  • terraform file contents
  • terraform environment variables
  • use environment variables in terraform
  • var.Circle_artifacts
  • using environment variables in terraform
  • TF_VAR_CIRCLE_ARTIFACTS
  • set variables when calling terraform
  • use environment variables in circleci
  • multiline circleci
  • wrap line circleci
  • terraform pass variable to module
  • echo in circleci
  • persist to workspace circleci
  • attach_workspace persist_to_workspace
  • persist_to_workspace
  • debugging circleci
  • git merge all changes into one commit
  • dynamodb materialized views
  • query dynamodb from js
  • query dynamodb from
  • aws_lambda_function filename
  • AWS Lambda Developer Guide
  • bash zip command not found
  • linux create zip file
  • upsert dynamodb
  • updateexpression example js
  • dynamodb docclient javascript update expression
  • use UpdateExpression to increment
  • The provided key element does not match the schema
  • dynamodb multiple key
  • javascript multiline string
  • javascript md5 hash
  • hash a string javascript
  • md5
  • simple hash string javascript
  • hash a string javascript
  • md5 bit length
  • Every entry in that list that doesn’t have an obvious answer is a blog post idea.

    Giving up SQL Server?

    No, not at all, I suspect that most of my day job will still be focused on SQL Server technologies. When I come across something super-interesting. No matter what, I’ll write about it.

    Networking

    I’m excited. If you find yourself at AWS: Reinvent this fall, then let me know. Maybe we can meet for coffee.

    June 15, 2018

    ORDER BY newid() is an Unbiased Way To Randomize

    Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 9:47 am

    Mike Bostock is a data-visualization specialist. And it really shows in his blog. Every article is really well designed (which makes sense… many of the articles are about design).

    One of his articles, Visualizing Algorithms has some thoughts on shuffling at https://bost.ocks.org/mike/algorithms/#shuffling.

    He says that sorting using a random comparator is a rotten way to shuffle things. Not only is it inefficient, but the resulting shuffle is really really biased. He goes on to visualize that bias (again, I really encourage you to go see his stuff).

    Ordering by random reminded me of the common technique in SQL Server of ORDER BY newid(). So I wondered whether an obvious bias was present there. So I shuffled 100 items thousands of times and recreated the visualization of bias in a heat map (just like Mike did).

    Here is the heatmap. If you can, try to identify any patterns.

    Order By NewID Bias

    Where:

      columns are the position before the shuffle,
      rows are the position after the shuffle,
      green is a positive bias and
      red is a negative bias.

    I don’t think there is any bias here. The problem that introduces bias in Mike Bostock’s example is that his “random comparator” that he defined does not obey transitivity. His words. “A comparator must obey transitivity: if a > b and b > c, then a > c.”
    But in SQL Server, because each row is assigned a newid(), ORDER BY newid() doesn’t have that flaw and so it doesn’t have that bias.

    But Be Careful

    Although the method is unbiased, ORDER BY newid() is still inefficient. It uses a sort which is an inefficient way of shuffling. There are alternative shuffle algorithms that are more efficient.
    ORDER BY newid() is good for quick and dirty purposes. But if you value performance, shuffle in the app.

    April 6, 2018

    Are There Any System Generated Constraint Names Lurking In Your Database?

    Names for constraints are optional meaning that if you don’t provide a name when it’s created or cannot afford one, one will be appointed to you by the system.
    These system provided names are messy things and I don’t think I have to discourage you from using them. Kenneth Fisher has already done that in Constraint names, Say NO to the default.

    But how do you know whether you have any?

    Here’s How You Check

    SELECT SCHEMA_NAME(schema_id) AS [schema name],
           OBJECT_NAME(object_id) AS [system generated object name],
           OBJECT_NAME(parent_object_id) AS [parent object name],
           type_desc AS [object type]
      FROM sys.objects
     WHERE OBJECT_NAME(object_id) LIKE 
             type + '\_\_' + LEFT(OBJECT_NAME(parent_object_id),8) + '\_\_%' ESCAPE '\'
           OR
           OBJECT_NAME(object_id) LIKE 
              REPLACE(sys.fn_varbintohexstr(CAST(object_id AS VARBINARY(MAX))), '0x', '%\_\_') ESCAPE '\'

    This will find all your messy system-named constraints.
    For example, a table defined like this:

    create table MY_TABLE
    (
      id INT IDENTITY PRIMARY KEY,
      id2 INT REFERENCES MY_TABLE(id) DEFAULT 0,
      UNIQUE(id),
      CHECK (id >= 0)
    )

    Will give results like this:

    Happy hunting.

    Update: April 9, 2018
    We can get this info from the system views a little easier as Rob Volk pointed out. I’ve also included the parent object’s type.

    SELECT OBJECT_SCHEMA_NAME(id) AS [schema name],
           OBJECT_NAME(constid) AS [system generated constraint name],
           (select type_desc from sys.objects where object_id = constid) as [constraint type],
           OBJECT_NAME(id) AS [parent object name],
           (select type_desc from sys.objects where object_id = id) as [parent object type]
      FROM sys.sysconstraints
     WHERE status & 0x20000 > 0
       AND OBJECT_NAME(id) NOT IN (N'__RefactorLog', N'sysdiagrams')
     ORDER BY [parent object type], [parent object name], [system generated constraint name];

    March 26, 2018

    T-SQL Options for Comparing “Distinctness”

    Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 8:40 am

    I had the privilege of listening to Itzik Ben Gan talk about “distinctness” in a talk he gave at PASS Summit. Distinctness is a relationship or comparison between two variables, just like equals (=). But unlike equality, distinctness treats NULLs in a more intuitive way (NULL is not distinct from NULL).

    There’s often confusion because equality in SQL is not like equality in mathematics. In particular equality in SQL doesn’t follow the reflexive property (∀A, A=A).

    Clear right?

    I explore different syntax options to test whether values are distinct or not. Each method has its pros and cons.

    The Setup

    Consider this table.

    CREATE TABLE Tasks
    (
        TaskId INT NOT NULL IDENTITY CONSTRAINT PK_TASKS PRIMARY KEY,
        AssignedTeamId INT NULL,
        AssignedSubTeamId INT NULL,
        TaskDetails NVARCHAR(2000) NOT NULL,
        INDEX IX_TASKS_ASSIGNEDTEAMID (AssignedTeamId)
    );

    When a task is unassigned, the AssignedTeamId and AssignedSubTeamId columns can both be null.
    Our goal will be to select an arbitrary TaskId given parameters @TeamId, @SubTeamId. And when parameters @TeamId and @SubTeamId are both null, I want to return an unassigned task.

    The Equality Join (doesn’t compare distinctness)

    I just want to post this here as an example that doesn’t work.

    DECLARE @TeamId bigint = NULL,
        @SubTeamId bigint = NULL;
     
    -- this will never return any rows when the parameters are null:
    SELECT TOP 1 TaskId
      FROM Tasks
     WHERE AssignedTeamId = @TeamId
       AND AssignedSubTeamId = @SubTeamId

    PROS: The syntax looks nice and clean.
    CONS: It doesn’t work for nulls.

    The Expanded WHERE Clause

    Well, let’s just write it all out then.

    DECLARE @TeamId bigint = NULL,
        @SubTeamId bigint = NULL;
     
    SELECT TOP 1 TaskId
      FROM Tasks
     WHERE ( (AssignedTeamId IS NULL AND @TeamId IS NULL) OR AssignedTeamId = @TeamId )
       AND ( (AssignedSubTeamId IS NULL AND @SubTeamId IS NULL) OR AssignedSubTeamId = @SubTeamId )

    There’s no way that syntax is sarg-able. But it turns out that it is. SQL Server works hard and says “I see what you’re trying to do there, I’ll help you out”.
    PROS: It works and it’s sarg-able.
    CONS: That syntax is sooo awkward.

    Using INTERSECT Syntax

    This is a tip that I got straight from Itzik Ben Gan who says he got the idea from Paul White. The idea is that INTERSECT doesn’t use the idea of equality, but of distinctness for it’s comparison. We can use this to create slightly nicer syntax.

    DECLARE @TeamId bigint = NULL,
        @SubTeamId bigint = NULL;
     
    SELECT TOP 1 TaskId
    FROM tasks
    WHERE EXISTS (
        SELECT assignedTeamId, assignedSubTeamId
        INTERSECT
        SELECT @TeamId, @SubTeamId
    )

    The syntax is slightly less awkward, and it’s sarg-able. Or should be… But there’s a problem with this query (see if you can find it before reading further). Compare the two query plans. First the expanded where clause:

    The Expanded where clause produces an efficient seek.


    Here’s what the query with the INTERSECT syntax produces:

    The INTERSECT syntax produces an inefficient scan

    The secret to this mystery lies in that filter operator. There’s an implicit conversion there from int to bigint and that can cause a scan of the entire index. With the expanded syntax, SQL Server can handle the conversion gracefully. With the INTERSECT syntax it cannot. This was a really hard-earned lesson for us this week.

    Change the parameters @TeamId and @SubTeamId to INT to match and the query becomes sarg-able again.

    PROS: More elegant syntax and sarg-able (when you’re careful)
    CONS: This syntax causes performance issues with mismatched types. Take extra-special care to make sure types match up.

    IS NOT DISTINCT FROM

    Check it:

    DECLARE @TeamId bigint = NULL,
        @SubTeamId bigint = NULL;
     
    SELECT TOP 1 TaskId
    FROM tasks
    WHERE assignedTeamId IS NOT DISTINCT FROM @TeamId
      AND assignedSubTeamId IS NOT DISTINCT FROM @SubTeamId

    Talk about elegant! That’s what we wanted from the beginning. It’s part of ANSI’s SQL 1999 standard. Paul White tells us it’s implemented internally as part of the query processor, but it’s not part of T-SQL! There’s a connect item for it… err. Or whatever they’re calling it these days. Go read all the comments and then give it a vote. There are lots of examples of problems that this feature would solve.

    PROS: Super-elegant!
    CONS: Invalid syntax (vote to have it included).

    March 15, 2018

    “Failed to initialize deadlock control, Key cannot be null” When Viewing Deadlock Graphs

    Filed under: SQLServerPedia Syndication — Michael J. Swart @ 10:02 am

    I recently got this error in Management Studio when trying to view a deadlock graph that was collected with an extended events session:

    Failed to initialize deadlock control.
        Key cannot be null.
        Parameter name: key

    I found this error in a session that included the xml_deadlock_report event. I chose to “View Target Data” and then clicked the deadlock tab:

    So I couldn’t view the deadlock graph, but the xml for the deadlock was still accessible. I copy-pasted the deadlock graph information into an xdl file, I opened it in management studio and I still got the same error.

    No luck in Management Studio, but SentryOne’s Plan Explorer had no trouble displaying the deadlock graph visually. In fact it kind of helped me figure out what’s going on:

    I noticed a number of things:

    • The resource list is full of “exchangeEvent” elements (which has to do with parallelism) and not the usual key locks I’m used to seeing.
    • The “processes” are all the same process, just different threads (The spids on all processes are the same but the ecids are different)
    • Most importantly, there is no victim here! And our application that issued the query didn’t receive an error from the database.

    My guess is that the deadlock monitor detects this kind of deadlock cycle, but knows how to resolve it without choosing a victim and then issues a deadlock report any way.

    The reason that the management studio can’t display the deadlock is because it assumes that there is at least one victim in a deadlock graph.

    The original deadlock graph starts like this:

    <deadlock>
     <victim-list />
     <process-list>
      <process id="process3ad41344e8" taskpriority="0" logused="10000" ...

    And when I edit it and pick an arbitrary victim like this:

    <deadlock>
     <victim-list>
    	<victimProcess id="process3ad41344e8" />
     </victim-list>
     <process-list>
      <process id="process3ad41344e8" taskpriority="0" logused="10000" ...

    then Management Studio is happy again:

    This Kind of Deadlock Is Ignorable

    Take this with a grain of salt, but personally, I ignore these kinds of deadlock because there is no victim and my applications are unaffected.
    I think there are two issues here.

    1. Management Studio can’t handle deadlock graphs with zero victims. But that’s just a UI problem.
    2. This “interquery parallelism” kind of deadlock is resolved internally somehow (yay!) but an xml_deadlock_report event is still raised (boo!)

    Update Turns out this is a bit of a known issue. There’s a request you can vote on to add extra info to the xml_deadlock_report that would allow us to filter these non-deadlocks out: Add columns has_victims and is_intra_query_deadlock to the event xml_deadlock_report

    March 12, 2018

    My Trip to the West Coast

    Filed under: SQLServerPedia Syndication — Michael J. Swart @ 11:22 am

    I was at the MVP Global Summit last week in Redmond, Washington. I was excited to go because I hadn’t been to the summit since 2014. Since I was travelling to the west coast anyway, I applied to speak at SQL Saturday Victoria the weekend before and I got accepted. Then, I expanded my trip to include a visit to the D2L office near Vancouver making it a week and a half tour of the West coast.

    The bonus for me is that I got to visit British Columbia. I’ve spent my whole life in Canada, but I’ve never been to British Columbia and I’m so glad I went. That is one good looking province.

    West Coast D2L

    I started my trip by visiting D2L first.

    Everyone single person there was awesome. D2L has a great work culture and the Vancouver office was no different. They treated me like family.

    On the first day, we did a Q&A on anything database related. There were some good questions and I get the feeling that some of the answers were surprising. For example, I don’t advocate for stored procedures over inline SQL.

    At the end of the week, I got to practice my session that I prepared for SQL Saturday Victoria in front of my colleagues. Now that was interesting. I wrote the talk for a general audience. It’s always useful to imagine an audience member when writing a talk. I imagined someone who wanted to learn more about databases and chose to give up their Saturday to do it. This imaginary audience member chose to come to a talk called “100 Percent Online Migrations” and presumably were interested in schema migrations. So when I gave the talk to my D2L colleagues, it was interesting to be able to get into specifics because I know the business challenges we’re all dealing with. The Vancouver office gave awesome feedback and they didn’t heckle me too much.

    Other sights included a pikachu car, a build-is-broken light (just like home!) and an impromptu tour of the best coffee roaster in Richmond, BC

    Before I left for Victoria, I got to spend a little time in downtown Vancouver

    Downtown Vancouver

    Even though I didn’t have a lot of time in downtown Vancouver, I did get to snap a couple photos before the sun went down. Mostly in and around Gastown.

    Gassy Jack, Steam Clock, Vancouver Street


    Another thing I noticed is that no one seems to smoke tobacco any more.

    Before I left, I did manage to find the alley that Bastian ran down when he was chased by bullies in the Neverending Story. But the alley hasn’t aged well. It didn’t look like luck-dragons frequented that place any more so I didn’t take a picture.

    Then it was time to head to SQL Saturday Victoria

    Ferry to Victoria

    I spent Friday on busses and ferries. Long trips in Ontario are dull. If you’ve seen one corn field, you’ve seen them all. But in British Columbia, the passing scenery is beautiful. Here’s a timelapse video of part of the ferry ride to Swartz Bay (not Swart’s bay).

    SQL Saturday Victoria

    SQL Saturday in Victoria was great. I was relaxed, and the talk (I thought) went really really well.

    Randolph tries not to heckle Michael


    I wasn’t super-thrilled with the turn-out. There were about ten people there (five were fellow speakers). And I have an uncomfortably long streak of having someone fall asleep during my talks. I know I’m not the most dynamic speaker but it’s starting to get on my nerves. That streak remains unbroken thanks in part to Erland’s jetlag.

    But it remained a special SQL Saturday for me. I discovered that the venue was Camosun College in Victoria. Not only is Camosun College a client of D2L, but they’ve got a beautiful campus. Fun fact, spring has already arrived on campus there:

    Spring!

    Thanks to Janice, Scott and everyone else who ran a very very successful event.

    On Sunday, Randolph, Mike and I waited in line for breakfast and walked around Victoria until the evening when we took a boat to meet Angela in Seattle, then on to Bellevue just in time for the MVP Global Summit in Redmond!

    The 2018 MVP Global Summit

    It’s been a while since I attended the MVP Summit. And this one was a good one. Everyone behaved for the most part. In past years, I remember a lot of people looking gift horses in the mouth. The feedback from MVPs to Microsoft sometimes took the form of angry complaints. Even if it was about something they didn’t know existed until that day. There was much less of that this year.

    Personally I got to give feedback about some SQL Server features, and I got to hear about new stuff coming up. I’m always very grateful for Microsoft for putting on this event. I returned home with a lot more blue stuff than I started with.

    What does this photobooth do?

    (Special thanks to Scott, Mike, Angela and Randolph for making this trip extra fun. You’re the best)

    January 17, 2018

    SHA1 Collisions in SQL Server

    Takeaway: It’s been frowned on for a while, but SHA1 is definitely broken for security purposes.

    In October of 2010, Michael Coles created a contest on his blog called “Find a Hash Collision, Win $100“. The contest was part of a discussion at the time about whether the SHA1 hash was useful for detecting changes. For what it’s worth, I still think SHA1 is valuable as a consistency check if not for security.

    At the time no SHA1 hash collisions were known, but in 2017, the news broke that some researchers finally generated a collision. So I looked up the research paper and downloaded the files. I used OPENROWSET to get the binary strings and I created my entry for Michael Coles’ contest:

    --  Begin script
    DECLARE @A varbinary(8000),
          @B varbinary(8000),
          @hA binary(20),
          @hB binary(20);
     
    -- Replace the ? below with binary strings
     
    SELECT @A = 0x255044462D312E330A25E2E3CFD30A0A0A312030206F626A0A3C3C2F57696474682032203020522F4865696768742033203020522F547970652034203020522F537562747970652035203020522F46696C7465722036203020522F436F6C6F7253706163652037203020522F4C656E6774682038203020522F42697473506572436F6D706F6E656E7420383E3E0A73747265616D0AFFD8FFFE00245348412D3120697320646561642121212121852FEC092339759C39B1A1C63C4C97E1FFFE017F46DC93A6B67E013B029AAA1DB2560B45CA67D688C7F84B8C4C791FE02B3DF614F86DB1690901C56B45C1530AFEDFB76038E972722FE7AD728F0E4904E046C230570FE9D41398ABE12EF5BC942BE33542A4802D98B5D70F2A332EC37FAC3514E74DDC0F2CC1A874CD0C78305A21566461309789606BD0BF3F98CDA8044629A10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000FFFE00FE0000000000000000FFE000104A46494600010101004800480000FFFE00134372656174656420776974682047494D50FFDB00430001010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFDB00430101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFC20011080008000803011100021101031101FFC40014000100000000000000000000000000000008FFC40014010100000000000000000000000000000009FFFE0006FFFE002FFFDA000C03010002100310000001539DC51FFFC4001510010100000000000000000000000000001626FFFE0006FFFE0033FFDA0008010100010502A953FFC4001F1100000309000000000000000000000000141517001316274563658695FFFE0006FFFE0041FFDA0008010301013F019A8AA56D533BB238739E612166B90E605BFFC4001E11000004070000000000000000000000000014151713274563658594FFFE0006FFFE0040FFDA0008010201013F01984E1555C155B66CDC3E04A21A444C40FFC4001E10000101090000000000000000000000001413001215164462648594FFFE0006FFFE0033FFDA0008010100063F02AD9A4DB175DCE6086D743B05BFFFC40014100100000000000000000000000000000000FFFE0006FFFE0012FFDA0008010100013F216001FFFE0006FFFE002BFFDA000C030100020003000000101FFFC40014110100000000000000000000000000000000FFFE0006FFFE0028FFDA0008010301013F106980FFC40014110100000000000000000000000000000000FFFE0006FFFE0028FFDA0008010201013F106BC7FFC40014100100000000000000000000000000000000FFFE0006FFFE0014FFDA0008010100013F10153FFFD9414E4745FFE000104A46494600010101004800480000FFFE00134372656174656420776974682047494D50FFDB00430001010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFDB00430101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFC20011080008000803011100021101031101FFC40014000100000000000000000000000000000009FFC4001501010100000000000000000000000000000607FFDA000C03010002100310000001524A5FFF00FFC40014100100000000000000000000000000000000FFDA00080101000105027FFFC40018110002030000000000000000000000000000F037A7B7FFDA0008010301013F01DFDBFD9FCFFFC40018110002030000000000000000000000000000F036A6B6FFDA0008010201013F01CA3546287FFFC40018100002030000000000000000000000000000F03767A7FFDA0008010100063F02A99C99898FFFC40014100100000000000000000000000000000000FFDA0008010100013F217FFFDA000C030100020003000000101FFFC40014110100000000000000000000000000000000FFDA0008010301013F100FFFC40014110100000000000000000000000000000000FFDA0008010201013F100FFFC40014100100000000000000000000000000000000FFDA0008010100013F100FFFD90A656E6473747265616D0A656E646F626A0A0A322030206F626A0A380A656E646F626A0A0A332030206F626A0A380A656E646F626A0A0A342030206F626A0A2F584F626A6563740A656E646F626A0A0A352030206F626A0A2F496D6167650A656E646F626A0A0A362030206F626A0A2F4443544465636F64650A656E646F626A0A0A372030206F626A0A2F4465766963655247420A656E646F626A0A0A382030206F626A0A313639330A656E646F626A0A0A392030206F626A0A3C3C0A20202F54797065202F436174616C6F670A20202F5061676573203130203020520A3E3E0A656E646F626A0A0A0A31302030206F626A0A3C3C0A20202F54797065202F50616765730A20202F436F756E7420310A20202F4B696473205B3131203020525D0A3E3E0A656E646F626A0A0A31312030206F626A0A3C3C0A20202F54797065202F506167650A20202F506172656E74203130203020520A20202F4D65646961426F78205B302030203820385D0A20202F43726F70426F78205B302030203820385D0A20202F436F6E74656E7473203132203020520A20202F5265736F75726365730A20203C3C0A202020202F584F626A656374203C3C2F496D302031203020523E3E0A20203E3E0A3E3E0A656E646F626A0A0A31322030206F626A0A3C3C2F4C656E6774682033303E3E0A73747265616D0A710A2020382030203020382030203020636D0A20202F496D3020446F0A510A656E6473747265616D0A656E646F626A0A0A0A0A787265660A30203133200A303030303030303030302036353533352066200A30303030303030303137203030303030206E200A30303030303031383631203030303030206E200A30303030303031383739203030303030206E200A30303030303031383937203030303030206E200A30303030303031393232203030303030206E200A30303030303031393435203030303030206E200A30303030303031393732203030303030206E200A30303030303031393939203030303030206E200A30303030303032303230203030303030206E200A30303030303032303736203030303030206E200A30303030303032313432203030303030206E200A30303030303032333039203030303030206E200A0A747261696C6572203C3C202F526F6F74203920302052202F53697A652031333E3E0A0A7374617274787265660A323339310A2525454F460A,
           @B = 0x255044462D312E330A25E2E3CFD30A0A0A312030206F626A0A3C3C2F57696474682032203020522F4865696768742033203020522F547970652034203020522F537562747970652035203020522F46696C7465722036203020522F436F6C6F7253706163652037203020522F4C656E6774682038203020522F42697473506572436F6D706F6E656E7420383E3E0A73747265616D0AFFD8FFFE00245348412D3120697320646561642121212121852FEC092339759C39B1A1C63C4C97E1FFFE017346DC9166B67E118F029AB621B2560FF9CA67CCA8C7F85BA84C79030C2B3DE218F86DB3A90901D5DF45C14F26FEDFB3DC38E96AC22FE7BD728F0E45BCE046D23C570FEB141398BB552EF5A0A82BE331FEA48037B8B5D71F0E332EDF93AC3500EB4DDC0DECC1A864790C782C76215660DD309791D06BD0AF3F98CDA4BC4629B10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000FFFE00FE0000000000000000FFE000104A46494600010101004800480000FFFE00134372656174656420776974682047494D50FFDB00430001010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFDB00430101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFC20011080008000803011100021101031101FFC40014000100000000000000000000000000000008FFC40014010100000000000000000000000000000009FFFE0006FFFE002FFFDA000C03010002100310000001539DC51FFFC4001510010100000000000000000000000000001626FFFE0006FFFE0033FFDA0008010100010502A953FFC4001F1100000309000000000000000000000000141517001316274563658695FFFE0006FFFE0041FFDA0008010301013F019A8AA56D533BB238739E612166B90E605BFFC4001E11000004070000000000000000000000000014151713274563658594FFFE0006FFFE0040FFDA0008010201013F01984E1555C155B66CDC3E04A21A444C40FFC4001E10000101090000000000000000000000001413001215164462648594FFFE0006FFFE0033FFDA0008010100063F02AD9A4DB175DCE6086D743B05BFFFC40014100100000000000000000000000000000000FFFE0006FFFE0012FFDA0008010100013F216001FFFE0006FFFE002BFFDA000C030100020003000000101FFFC40014110100000000000000000000000000000000FFFE0006FFFE0028FFDA0008010301013F106980FFC40014110100000000000000000000000000000000FFFE0006FFFE0028FFDA0008010201013F106BC7FFC40014100100000000000000000000000000000000FFFE0006FFFE0014FFDA0008010100013F10153FFFD9414E4745FFE000104A46494600010101004800480000FFFE00134372656174656420776974682047494D50FFDB00430001010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFDB00430101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101FFC20011080008000803011100021101031101FFC40014000100000000000000000000000000000009FFC4001501010100000000000000000000000000000607FFDA000C03010002100310000001524A5FFF00FFC40014100100000000000000000000000000000000FFDA00080101000105027FFFC40018110002030000000000000000000000000000F037A7B7FFDA0008010301013F01DFDBFD9FCFFFC40018110002030000000000000000000000000000F036A6B6FFDA0008010201013F01CA3546287FFFC40018100002030000000000000000000000000000F03767A7FFDA0008010100063F02A99C99898FFFC40014100100000000000000000000000000000000FFDA0008010100013F217FFFDA000C030100020003000000101FFFC40014110100000000000000000000000000000000FFDA0008010301013F100FFFC40014110100000000000000000000000000000000FFDA0008010201013F100FFFC40014100100000000000000000000000000000000FFDA0008010100013F100FFFD90A656E6473747265616D0A656E646F626A0A0A322030206F626A0A380A656E646F626A0A0A332030206F626A0A380A656E646F626A0A0A342030206F626A0A2F584F626A6563740A656E646F626A0A0A352030206F626A0A2F496D6167650A656E646F626A0A0A362030206F626A0A2F4443544465636F64650A656E646F626A0A0A372030206F626A0A2F4465766963655247420A656E646F626A0A0A382030206F626A0A313639330A656E646F626A0A0A392030206F626A0A3C3C0A20202F54797065202F436174616C6F670A20202F5061676573203130203020520A3E3E0A656E646F626A0A0A0A31302030206F626A0A3C3C0A20202F54797065202F50616765730A20202F436F756E7420310A20202F4B696473205B3131203020525D0A3E3E0A656E646F626A0A0A31312030206F626A0A3C3C0A20202F54797065202F506167650A20202F506172656E74203130203020520A20202F4D65646961426F78205B302030203820385D0A20202F43726F70426F78205B302030203820385D0A20202F436F6E74656E7473203132203020520A20202F5265736F75726365730A20203C3C0A202020202F584F626A656374203C3C2F496D302031203020523E3E0A20203E3E0A3E3E0A656E646F626A0A0A31322030206F626A0A3C3C2F4C656E6774682033303E3E0A73747265616D0A710A2020382030203020382030203020636D0A20202F496D3020446F0A510A656E6473747265616D0A656E646F626A0A0A0A0A787265660A30203133200A303030303030303030302036353533352066200A30303030303030303137203030303030206E200A30303030303031383631203030303030206E200A30303030303031383739203030303030206E200A30303030303031383937203030303030206E200A30303030303031393232203030303030206E200A30303030303031393435203030303030206E200A30303030303031393732203030303030206E200A30303030303031393939203030303030206E200A30303030303032303230203030303030206E200A30303030303032303736203030303030206E200A30303030303032313432203030303030206E200A30303030303032333039203030303030206E200A0A747261696C6572203C3C202F526F6F74203920302052202F53697A652031333E3E0A0A7374617274787265660A323339310A2525454F460A;
     
    SELECT @hA = HASHBYTES('SHA1', @A),
          @hB = HASHBYTES('SHA1', @B);
     
    SELECT CASE WHEN @A = @B
                      THEN '@A Equals @B'
                      ELSE '@A Is Not Equal To @B'
                      END AS AB_Equal,
                CASE WHEN @hA = @hB
                      THEN '@hA Equals @hB'
                      ELSE '@hA Is Not Equal To @hB'
                      END AS Hash_Equal;
    -- End script

    This gives me the output that wins the contest:

    Unfortunately upon closer inspection, I see that the rules of the contest say that entries must be received prior to midnight U.S. Eastern Standard Time on October 31, 2010.

    Rats, 7 years too late!

    January 15, 2018

    100 Percent Online Deployments: Stage and Switch

    Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 12:42 pm
    100 Percent Online Deployments
    How to deploy schema changes without scheduled downtime

    In the first draft of this series, this post didn’t exist. I wanted to show a really simple example of a column switch and include it in the Blue-Green (Details) post. I planned for something simple. But I ran into some hiccups that I though were pretty instructive, so I turned it into the post you see here.

    The Plan

    For this demo, I wanted to use the WideWorldImporters database. In table Warehouse.ColdRoomTemperatures I wanted to change the column

    ColdRoomSensorNumber INT NOT NULL,

    into

    ColdRoomSensorLabel NVARCHAR(100) NOT NULL,

    because maybe we want to track sensors via some serial number or other code.

    The Blue-Green plan would be simple:

    The Trouble

    But nothing is ever easy. Even SQL Server Data Tools (SSDT) gives up when I ask it to do this change with this error dialog:

    Never Easy

    There’s two things going on here (and one hidden thing):

    1. The first two messages point out that a procedure is referencing the column ColdRoomSensorNumber with schemabinding. The reason it’s using schemabinding is because it’s a natively compiled stored procedure. And that tells me that the table Warehouse.ColdRoomTemperatures is an In-Memory table. That’s not all. I noticed another wrinkle. The procedure takes a table-valued parameter whose table type contains a column called ColdRoomSensorLabel. We’re going to have to replace that too. Ugh. Part of me wanted to look for another example.
    2. The last message tells me that the table is a system versioned table. So there’s a corresponding archive table where history is maintained. That has to be dealt with too. Luckily Microsoft has a great article on Changing the Schema of a System-Versioned Temporal Table.
    3. One last thing to worry about is a index on ColdRoomSensorNumber. That should be replaced with an index on ColdRoomSensorLabel. SSDT didn’t warn me about that because apparently, it can deal with that pretty nicely.

    So now my plan becomes:

    Blue The original schema

    Aqua After the pre-migration scripts are run

    An extra step is required here to update the new column and keep the new and old columns in sync.

    Green After the switch, we clean up the old objects and our schema change is finished:

    Without further ado, here are the scripts:

    Pre-Migration (Add Green Objects)

    In the following scripts, I’ve omitted the IF EXISTS checks for clarity.

    -- Add the four green objects
     
    ALTER TABLE Warehouse.ColdRoomTemperatures
    ADD ColdRoomSensorLabel NVARCHAR(100) NOT NULL 
        CONSTRAINT DF_Warehouse_ColdRoomTemperatures_ColdRoomSensorLabel DEFAULT '';
    GO
     
    ALTER TABLE Warehouse.ColdRoomTemperatures
    ADD INDEX IX_Warehouse_ColdRoomTemperatures_ColdRoomSensorLabel (ColdRoomSensorLabel);
    GO
     
    CREATE TYPE Website.SensorDataList_v2 AS TABLE(
        SensorDataListID int IDENTITY(1,1) NOT NULL,
        ColdRoomSensorLabel VARCHAR(100) NULL,
        RecordedWhen datetime2(7) NULL,
        Temperature decimal(18, 2) NULL,
        PRIMARY KEY NONCLUSTERED (SensorDataListID)
    )
    GO
     
    CREATE PROCEDURE Website.RecordColdRoomTemperatures_v2
        @SensorReadings Website.SensorDataList_v2 READONLY
    AS
        --straight-forward definition left as exercise for reader
    GO

    Pre-Migration (Populate and Keep in Sync)

    Normally, I would use triggers to keep the new and old column values in sync like this, but you can’t do that with In-Memory tables. So I altered the procedure Website.RecordColdRoomTemperatures to achieve something similar. The only alteration I made is to set the ColdRoomSensorLabel value in the INSERT statement:

    ALTER PROCEDURE Website.RecordColdRoomTemperatures
    @SensorReadings Website.SensorDataList READONLY
    WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
    AS
    BEGIN ATOMIC WITH
    (
        TRANSACTION ISOLATION LEVEL = SNAPSHOT,
        LANGUAGE = N'English'
    )
        BEGIN TRY
     
            DECLARE @NumberOfReadings int = (SELECT MAX(SensorDataListID) FROM @SensorReadings);
            DECLARE @Counter int = (SELECT MIN(SensorDataListID) FROM @SensorReadings);
     
            DECLARE @ColdRoomSensorNumber int;
            DECLARE @RecordedWhen datetime2(7);
            DECLARE @Temperature decimal(18,2);
     
            -- note that we cannot use a merge here because multiple readings might exist for each sensor
     
            WHILE @Counter <= @NumberOfReadings
            BEGIN
                SELECT @ColdRoomSensorNumber = ColdRoomSensorNumber,
                       @RecordedWhen = RecordedWhen,
                       @Temperature = Temperature
                FROM @SensorReadings
                WHERE SensorDataListID = @Counter;
     
                UPDATE Warehouse.ColdRoomTemperatures
                    SET RecordedWhen = @RecordedWhen,
                        Temperature = @Temperature
                WHERE ColdRoomSensorNumber = @ColdRoomSensorNumber;
     
                IF @@ROWCOUNT = 0
                BEGIN
                    INSERT Warehouse.ColdRoomTemperatures
                        (ColdRoomSensorNumber, ColdRoomSensorLabel, RecordedWhen, Temperature)
                    VALUES (@ColdRoomSensorNumber, 
                            'HQ-' + CAST(@ColdRoomSensorNumber AS NVARCHAR(50)), 
                            @RecordedWhen, 
                            @Temperature);
                END;
                SET @Counter += 1;
            END;
     
        END TRY
        BEGIN CATCH
            THROW 51000, N'Unable to apply the sensor data', 2;
     
            RETURN 1;
        END CATCH;
    END;

    That keeps the values in sync for new rows. But now it’s time to update the values for existing rows. In my example, I imagine that the initial label for the sensors are initially: “HQ-1”, “HQ-2”, etc…

    UPDATE Warehouse.ColdRoomTemperatures
    SET ColdRoomSensorLabel = 'HQ-' + CAST(ColdRoomSensorNumber as nvarchar(50));

    Eagle-eyed readers will notice that I haven’t dealt with the history table here. If the history table is large use batching to update it. Or better yet, turn off system versioning and then turn it back on immediately using a new/empty history table (if feasible).

    Post-Migration

    After a successful switch, the green application is only calling Website.RecordColdRoomTemperatures_v2. It’s time now to clean up. Again, remember that order matters.

    DROP PROCEDURE Website.RecordColdRoomTemperatures;
     
    DROP TYPE Website.SensorDataList;
     
    ALTER TABLE Warehouse.ColdRoomTemperatures
    DROP INDEX IX_Warehouse_ColdRoomTemperatures_ColdRoomSensorNumber;
     
    ALTER TABLE Warehouse.ColdRoomTemperatures
    DROP COLUMN ColdRoomSensorNumber;

    January 12, 2018

    100 Percent Online Deployments: Keep Changes OLTP-Friendly

    Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 9:00 am
    100 Percent Online Deployments
    How to deploy schema changes without scheduled downtime

    Using the Blue-Green deployment method, database changes are decoupled from applications changes. That leaves us with one last challenge to tackle. The schema changes have to be performed while the application is online. It’s true that you can’t always write an online script for every kind of schema change you want.

    I got the moves like Jagger

    The challenge of writing online schema changes is essentially a concurrency problem and the guiding principle I follow is: Do whatever you need to do, but avoid excessive blocking.

    Locks Are Hot Potatoes

    You can’t hold them for long. This applies to schema changes too. Logically if you don’t hold a lock long, you can’t block activity. One exception might be the SCH-M lock which can participate in blocking chains:

    SCH-M locks

    There are two main kinds of SQL queries. SELECT/INSERT/UPDATE/DELETE statements are examples of Data Manipulation Language (DML). CREATE/ALTER/DROP statements are examples of Data Definition Language (DDL).

    With schema changes – DDL – we have the added complexity of the SCH-M lock. It’s a kind of lock you don’t see with DML statements. DML statements take and hold schema stability locks (SCH-S) on the tables they need. This can cause interesting blocking chains between the two types where new queries can’t start until the schema change succeeds:

    Some suggestions:

    • Don’t rebuild indexes while changing schema
    • Rely on the OLTP workload which has many short queries. In an OLTP workload, the lead blocker shouldn’t be a lead blocker for long. Contrast that with an OLAP workload with long-running and overlapping queries. OLAP workloads can’t tolerate changing tables without delays or interruptions.
    • When using Enterprise Edition, use ONLINE=ON for indexes. It takes and holds a SCH-M lock only briefly.

    Changes to Big Tables

    Scripts that change schema are one-time scripts. If the size of the table is less than 50,000 rows, I write a simple script and then move on.

    If the table is larger, look for metadata-only changes. For example, these changes are metadata-only changes:

    If a table change is not a meta-data change, then it’s a size-of-data change. Then it’s time to get creative. Look for my other post in this series for an example of batching and an example of a column switcheroo.

    Pragmatism Example

    If you think “good enough” is neither, you may want to skip this section. There are some schema changes that are still very difficult or impossible to write online. With some creativity, we’ve always been able to mitigate these issues with shortcuts and I want to give an example which I think is pretty illustrative.

    When a colleague asked for a rowversion column on a humongous table. We avoided that requirement by instead creating a datetime column called LastModifiedDate. Since 2012, new columns with constant default values are online. So we added the column with a constant default, and then changed the default value to something more dynamic:

    alter table dbo.MYTABLE
    add LastModifiedDate DATETIME NOT NULL 
        CONSTRAINT DF_TABLE_LastModifiedDate DEFAULT '20000101'
     
    alter table dbo.MYTABLE
    drop CONSTRAINT DF_TABLE_LastModifiedDate;
     
    alter table dbo.MYTABLE
    add CONSTRAINT DF_TABLE_LastModifiedDate 
          DEFAULT GETUTCDATE() for LastModifiedDate;

    It’s a cool situation because it seems like the column has two defaults, one constant default for rows with missing values. And another definition to be used for new rows:

    select pc.default_value, d.definition as [default definition]
    from sys.system_internals_partitions p
    join sys.system_internals_partition_columns pc 
    	on p.partition_id = pc.partition_id
    join sys.default_constraints d
    	on d.parent_object_id = p.object_id
    	and d.parent_column_id = pc.partition_column_id
    where p.object_id = object_id('MYTABLE')
    and pc.partition_column_id = 2
     
    /* Gives 
     
    default_value  default definition
    -------------  ------------------
    2000-01-01     (getutcdate())
     
    */

    So be creative and pragmatic. Successful 100% online schema changes involve creativity and close collaboration between everyone involved.

    January 10, 2018

    100 Percent Online Deployments: Blue Green Details

    Filed under: SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 9:54 am
    100 Percent Online Deployments
    How to deploy schema changes without scheduled downtime

    So now for the nitty gritty. In my last post, Blue-Green Deployment, I talked about replacing old blue things with new green things as an alternative to altering things. But Blue-Green doesn’t work with databases, so I introduced the Blue-Aqua-Green method. This helps keep databases and other services online 24/7.

    The Aqua Database

    What does the Aqua database look like? It’s a smaller version of Blue-Green, but only for those database objects that are being modified. Borrowing some icons from Management Studio’s Object Explorer, here’s what one Blue-Aqua-Green migration might look like:

    Start with a database in the original blue state:

    After the pre-migration scripts, the database is in the aqua state, the new green objects have been created and are ready for traffic from the green application servers. Any type of database object can use the Blue-Green method. Even objects as granular as indexes or columns.

    Finally when the load has switched over to the green servers and they’re nice and stable, run the post-migration steps to get to the green state.

    Blue-Green for Database Objects

    How is the Blue-Green method applied to each kind of database object? With care. Each kind of object has its own subtle differences.

    PROCEDURES:
    Procedures are very easy to Blue-Green. Brand new procedures are added during the pre-migration phase. Obsolete procedures are dropped during the post-migration phase.

    If the procedure is changing but is logically the same, then it can be altered during the pre-migration phase. This is common when the only change to a procedure is a performance improvement.

    But if the procedure is changing in other ways. For instance, when a new parameter is added, or dropped, or the resultset is changing. Then use the Blue-Green method to replace it: During the pre-migration phase, create a new version of the procedure. It must be named differently and the green version of the application has to be updated to call the new procedure. The original blue version of the procedure is deleted during the post-migration phase. It’s not always elegant calling a procedure something like s_USERS_Create_v2 but it works.

    VIEWS:
    Views are treated the same as procedures with the exception of indexed views.
    That SCHEMA_BINDING keyword is a real thorn in the side of Blue-Green and online migrations in general. If you’re going to use indexed views, remember that you can’t change the underlying tables as easily.

    Creating an index on a view is difficult because (ONLINE=ON) can’t be used. If you want to get fancy go look at How to Create Indexed Views Online.

    INDEXES:
    The creation of other indexes are nice and easy if you have Enterprise Edition because you can use the (ONLINE=ON) keyword. But if you’re on Standard Edition, you’re a bit stuck. In SQL Server 2016 SP1, Microsoft included a whole bunch of Enterprise features into Standard, but ONLINE index builds didn’t make the cut.

    If necessary, the Blue-Green process works for indexes that need to be altered too. The blue index and the green index will exist at the same time during the aqua phase, but that’s usually acceptable.

    CONSTRAINTS:
    Creating constraints like CHECKS and FOREIGN KEYS can be tricky because they require size-of-data scans. This can block activity for the duration of the scan.

    My preferred approach is to use the WITH NOCHECK syntax. The constraint is created and enabled, but existing data is not looked at. The constraint will be enforced for any future rows that get updated or inserted.

    That seems kind of weird at first. The constraint is marked internally as not trusted. For peace of mind, you could always run a query on the existing data.

    TABLES:
    The creation of tables doesn’t present any problems, it’s done in the pre-migration phase. Dropping tables is done in the post-migration phase.

    What about altering tables? Does the Blue-Green method work? Replacing a table while online is hard because it involves co-ordinating changes during the aqua phase. One technique is to create a temporary table, populate it, keep it in sync and cut over to it during the switch. It sounds difficult. It requires time, space, triggers and an eye for detail. Some years ago, I implemented this strategy on a really complicated table and blogged about it if you want to see what that looks like.

    If this seems daunting, take heart. A lot of this work can be avoided by going more granular: When possible, Blue-Green columns instead.

    COLUMNS:
    New columns are created during the pre-migration phase. If the table is large, then the new columns should be nullable or have a default value. Old columns are removed during the post-migration phase.

    Altering columns is sometimes easy. Most of the time altering columns is quick like when it only involves a metadata change.

    But sometimes it’s not easy. When altering columns on a large table, it may be necessary to use the Blue-Green technique to replace a column. Then you have to use triggers and co-ordinate the changes with the application, but the process is much easier than doing it for a whole table. Test well and make sure each step is “OLTP-Friendly”. I will give an example of the Blue-Green method for a tricky column in the post “Stage and Switch”.

    Computed persisted columns can be challenging. When creating persisted computed columns on large tables, they can lock the table for too long. Sometimes indexed views fill the same need.

    DATA:
    Technically, data changes are not schema changes but migration scripts often require data changes to so it’s important to keep those online too. See my next post “Keep Changes OLTP-Friendly”

    Automation

    Easy things should be easy and hard things should be possible and this applies to writing migration scripts. Steve Jones asked me on twitter about “some more complex idempotent code”. He would like to see an example of a migration script that is re-runnable when making schema changes. I have the benefit of some helper migration scripts procedures we wrote at work. So a migration script that I write might look something like this:

    declare @Columns migration.indexColumnSet;
     
    INSERT @Columns (column_name, is_descending_key, is_included_column)
    VALUES ('UserId', 0, 0)
     
    exec migration.s_INDEX_AlterOrCreateNonClustered_Online
    	@ObjectName = 'SOME_TABLE',
    	@IndexName = 'IX_SOME_TABLE_UserId',
    	@IndexColumns = @Columns;

    We’ve got these helper scripts for most standard changes. Unfortunately, I can’t share the definition of s_INDEX_AlterOrCreateNonClustered_Online because it’s not open source. But if you know of any products or open source scripts that do the same job, let me know. I’d be happy to link to them here.

    Where To Next?

    So that’s Blue-Green, or more accurately Blue-Aqua-Green. Decoupling database changes from application changes allows instant cut-overs. In the next post Keep Changes OLTP-Friendly I talk about what migration scripts are safe to run concurrently with busy OLTP traffic.

    Older Posts »

    Powered by WordPress