A couple of weeks ago I posed a T-SQL problem here and later some possible solutions here. You should probably read those posts before you read this. In a nutshell I wanted to concatenate values from multiple rows into a comma-delimited list. In the second post a I promised a performance comparison of the two main solutions that were put forward and that's what this post is for.
Here's the first solution:
WITH rownumcte1 AS ( SELECT ROW_NUMBER() OVER (PARTITION BY id ORDER BY id,name) AS rownum,id,name FROM t1 )
,rownumcte2 AS
(
SELECT rownum,id,name
FROM rownumcte1
WHERE rownum = 1
UNION ALL
SELECT r1.rownum,r1.id,r2.NAME + ',' + r1.name
FROM rownumcte1 r1
INNER JOIN rownumcte2 r2
ON r1.id = r2.id
AND r1.rownum = r2.rownum + 1
)
SELECT r2.id,r2.NAME
FROM (
SELECT id,MAX(rownum)maxrownum
FROM rownumcte1
GROUP BY id
) r1
INNER JOIN rownumcte2 r2
ON r1.id = r2.id
AND r1.maxrownum = r2.rownum
and here's the second:
SELECT id, STUFF( ( SELECT ','+ name FROM t1 a WHERE b.id = a.id FOR XML PATH('')),1 ,1, '')
FROM t1 b GROUP BY id
In order to test them out I needed some test data which I produced using the following script:
USE tempdb
GO
if exists (select * from sys.tables where name = 't1')
drop table t1
go
create table t1 (id INT, NAME VARCHAR(MAX))
go
declare @m int, @n int
set @m = 100
set @n = 100
declare @outerloopcounter int, @innerloopcounter int
set @outerloopcounter = 1
while @outerloopcounter <= @n
begin
set @innerloopcounter = 1
while @innerloopcounter <= @m
begin
insert t1
values (@innerloopcounter, CHAR(ROUND(97 + (RAND() * (25)),0)) )
set @innerloopcounter = @innerloopcounter + 1
end
set @outerloopcounter = @outerloopcounter + 1
end
The script gives us a 10000 row dataset with 100 distinct values in the [ID] column and a random character in the [Name] column.
I was going to do a detailed analysis of the two solutions with multiple tests and all that sort of good stuff but on the first execution I realised there was no point. Solution 1 took 14m50s whereas solution 2 took less than a second. That's pretty conclusive on which one is quicker. I then wanted to know why.
If you break the query down it becomes fairly obvious. The second CTE produces 10000 rows; that's 100 rows for each of the 100 values in [ID]. The screenshot below shows what the second CTE returns: 
You get the idea. Now take a look at what comes after the CTEs. The first CTE (10000) is joined with the second (10000 rows) and that join produces 100000000 rows which is then grouped. The GROUP BY implicitly causes a sort operation and taking a look at the execution plan tells me that the sort of those 100000000 rows costs 96% of our total query time.
The second solution that uses the FOR XML PATH('') construct results in a NESTED LOOP (i.e. inner join) operation over 100 rows and no SORT operations. When we break it down like this it is clear why the second solution is so much quicker.
I doubt you're still reading at this point because analysing queries isn't exactly a fun task. If you are though I urge you to take the code from above and try it out for yourself. Play with the values of @m and @n and see how the execution time differs for each solution. I have learnt alot out of this little episode and you may do too.
-Jamie

You get the idea. Now take a look at what comes after the CTEs. The first CTE (10000) is joined with the second (10000 rows) and that join produces 100000000 rows which is then grouped. The GROUP BY implicitly causes a sort operation and taking a look at the execution plan tells me that the sort of those 100000000 rows costs 96% of our total query time.
The second solution that uses the FOR XML PATH('') construct results in a NESTED LOOP (i.e. inner join) operation over 100 rows and no SORT operations. When we break it down like this it is clear why the second solution is so much quicker.
I doubt you're still reading at this point because analysing queries isn't exactly a fun task. If you are though I urge you to take the code from above and try it out for yourself. Play with the values of @m and @n and see how the execution time differs for each solution. I have learnt alot out of this little episode and you may do too.
-Jamie
P.S. I have no idea why the last part of this blog post appears twice. For some reason our obscenely pathetic blog engine puts it there twice - I've spent ages trying to correct it and have given up, exasperated.