Michael J. Swart

June 25, 2008

Splitting and Joining strings in SQL Server

Filed under: Technical Articles — Michael J. Swart @ 7:00 pm

Wouldn’t it be nice to do something like this:

--split
SELECT t.id, split.value
FROM someTable t
CROSS APPLY dbo.Split(t.csv, ',') as split

or this:

--join
SELECT ManagerID, JOIN(EmployeeID + ' ')
FROM HumanResources.Employee
GROUP BY ManagerID

Well the first example can be implemented as a table-valued function and it’s possible if you’re careful. Here’s my implementation of Split:

CREATE FUNCTION [dbo].[Split]
(
	@SearchString NVARCHAR (max),
	@Separator NVARCHAR (5)
)
 
RETURNS @strtable TABLE (strval NVARCHAR (max))
AS
 
BEGIN
	DECLARE @SeparatorLength INT
	SET @SeparatorLength = DATALENGTH(@Separator) / 2
 
	IF (@SeparatorLength = 0)
		RETURN
 
	WHILE 1=1
	BEGIN
		IF (CHARINDEX(@Separator,@SearchString) = 0)
		BEGIN
			INSERT INTO @strtable
			VALUES (@SearchString);
			BREAK;
		END
 
		INSERT INTO @strtable
		VALUES ( LEFT(@SearchString, CHARINDEX(@Separator,@SearchString) - 1))
 
		SET @SearchString = RIGHT(@SearchString, DATALENGTH(@SearchString)/2 - CHARINDEX(@Separator,@SearchString) - @SeparatorLength + 1)
	END
	RETURN
END

For the second example, the join, I don’t think there there’s a simple way, But there are a couple options. There’s an example of creating the CONCATENATE aggregate with CLR code here. But it’s pretty clunky to enable clr and integrate an assembly. But here’s another alternative if you’re willing to resort to a kludge is to use select a column called data() and use the for xml path syntax:

SELECT
	M.ManagerID,
	(
		SELECT CAST(E.EmployeeID AS VARCHAR(10))
			+ ',' AS [data()]
		FROM HumanResources.Employee E
		WHERE E.ManagerID = M.ManagerID
		FOR XML PATH('')
	) AS EmployeeID
FROM
	(
		SELECT DISTINCT ManagerID
		FROM HumanResources.Employee
	) AS M

Your mileage may vary. I give no guarantees about the performance. It breaks a rule of thumb of mine where I don’t like to include subqueries in the SELECT clause of SELECT Statements.

Update! Jul 4, 2008: Shown up by a pro
In SQL Server Magazine’s “Performance Tuning and Optimization” forum,Andrew J Kelly has a much nicer split function than I do. For one thing it’s set based:
(Caveat: because of the recursion limit of 100, this function doesn’t work on lists with more than 100 elements)

CREATE FUNCTION [dbo].[fn_split_inline_cte]
    (@list NVARCHAR (MAX),
    @delimiter NCHAR(1) = ',')
RETURNS TABLE
AS
 
RETURN
WITH cte_list([BeginChar], [EndChar]) AS (
    SELECT [BeginChar] = CONVERT(BIGINT, 1), [EndChar] = CHARINDEX(@delimiter, @list + @delimiter)
    UNION ALL
    SELECT [BeginChar] = [EndChar] + 1, [EndChar] = CHARINDEX(@delimiter, @list + @delimiter, [EndChar] + 1)
    FROM cte_list
    WHERE [EndChar] > 0
)
SELECT LTRIM(RTRIM(SUBSTRING(@list, [BeginChar],
    CASE WHEN [EndChar] > 0 THEN [EndChar] - [BeginChar] ELSE 0 END))) AS [ParsedValue]
FROM cte_list
WHERE [EndChar] > 0 ;

No Comments »

No comments yet.

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress