Michael J. Swart

April 23, 2014

Removing Comments from SQL

Filed under: Miscelleaneous SQL,SQL Scripts,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 10:20 am

Check out the following deliberately crazy SQL Script:

create table [/*] /* 
  -- huh? */
(
    "--
     --" integer identity, -- /*
    [*/] varchar(20) /* -- */
         default '*/ /* -- */' /* /* /* */ */ */
); 
go

It’s not surprising that my blog’s syntax colorer has trouble with this statement. But SQL Server will run this statement without complaining. Management Studio doesn’t even show any red squiggly lines anywhere. The same statement without comments looks like this:

create table [/*] 
(
    "--
     --" integer identity, 
    [*/] varchar(20) 
         default '*/ /* -- */' 
); 
go

I want a program to remove comments from any valid SQL and I want it to handle even this crazy example. I describe a handy method that lets me do that.

Using C#

  • In your C# project, find and add a reference to Microsoft.SqlServer.TransactSql.ScriptDom. It’s available with SQL Server 2012’s Feature Pack (search for “ScriptDom” and download).
  • Add using Microsoft.SqlServer.Management.TransactSql.ScriptDom; to your “usings”.
  • Then add this method to your class:
    public string StripCommentsFromSQL( string SQL ) {
     
        TSql110Parser parser = new TSql110Parser( true );
        IList<ParseError> errors;
        var fragments = parser.Parse( new System.IO.StringReader( SQL ), out errors );
     
        // clear comments
        string result = string.Join ( 
          string.Empty,
          fragments.ScriptTokenStream
              .Where( x => x.TokenType != TSqlTokenType.MultilineComment )
              .Where( x => x.TokenType != TSqlTokenType.SingleLineComment )
              .Select( x => x.Text ) );
     
        return result;
     
    }

… and profit! This method works as well as I hoped, even on the given SQL example.

Why I Prefer This Method

A number of reasons. By using Microsoft’s own parser, I don’t have to worry about comments in strings, or strings in comments which are problems with most T-SQL-only solutions. I also don’t have to worry about nested multiline comments which can be a problem with regex solutions.

Did you know that there’s another sql parsing library by Microsoft? It’s found at Microsoft.SqlServer.Management.SqlParser.Parser. This was the old way of doing things and it’s not supported very well. I believe this library is mostly intended for use by features like Management Studio’s Intellisense. The ScriptDom library is better supported and it’s easier to code with.

Let Me Know If You Found This Useful

Add comments below. Be warned though, if you’re a spammer, I will quickly remove your comments. I’ve had practice.

7 Comments »

  1. I have to agree the Dom Parser is pretty damn awesome, there really is so much in it.
    Im sure you are aware but that is how i drive TSQLSmells http://dataidol.com/davebally/ssdt-tsql-smells-add-in/

    Comment by Dave Ballantyne — April 23, 2014 @ 11:16 am

  2. Yes, I did come across a couple of your posts while googling for this stuff.
    Your stuff is impressive and it’s not surprising that you make use of the Dom Parser. It’s still the best parser.

    It makes sense though, by using this parser, there’s no chance of misinterpreting syntax.
    Thanks for the link Dave

    Comment by Michael J. Swart — April 23, 2014 @ 11:36 am

  3. […] Removing Comments from SQL, Michael J. Swart […]

    Pingback by Remove SQL Comments - The Daily Six Pack: April 24, 2014 — April 24, 2014 @ 3:21 am

  4. I have played around with parsing sql for years including trying antlr and parsing it myself, it really isn’t easy!

    The dom parser is the nuts!

    I have always wondered why the actual sql parser isn’t exposed anywhere.

    Eggy

    Comment by sqlegg — May 2, 2014 @ 2:02 pm

  5. I’ve had a look at that parser a few times and it seems pretty interesting, but (like a lot of stuff) it really needs a comprehensive set of samples on how to use it. I know some C# but there’s an expectation for Microsoft to throw libraries and people and that others will innately understand how they work… I don’t have that skill 🙂

    Comment by Cody — May 2, 2014 @ 11:19 pm

  6. I guess I achieved the same results using regular expressions.
    Please check here: http://drizin.com.br/Blog/post/removing-comments-from-sql-scripts
    The SMO Parser is brutally slower if all you need is removing comments.

    Comment by Ricardo Drizin — November 29, 2015 @ 2:54 pm

  7. […] See Removing Comments From SQL […]

    Pingback by Regular Expression To Match All Comments In A T-SQL Script - Programming Questions And Solutions Blog — February 19, 2023 @ 6:07 am

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress