Replacing While with MERGE
I had to re-write a stored procedure. Well, I had to merge the logic of a couple of stored procedures into one actually. Our team swtiched to SQL Server 2008 R2 a couple of months back.
Now something about what the 2 stored procs do; Both of them are used to send out notifications to the user. The first stored proc gets the user details. The second one, is called per user to get the notification details. For each user, there can be more than one notification. So say, we have around 500 users, and each has 3 or more notifications, the current logic was to loop over all the 500 users and combine all their notifications to send out a single email to each one of them.
When I combined the procedures, I had to create another WHILE loop to loop over the users. This was problematic as I already had a WHILE loop in the second stored proc, which would mean that I'll be having nested WHILE loop in the new one. What it was doing was it was first getting the users and then looping over them to update some other details from various tables. Still, I went ahead with it. When I ran the stored proc., it took about 25-30 mins. for 4500 test users to get around 168000 notification records. The second WHILE loop looked something like this:
DECLARE @Id INT
DECLARE @value1 NVARCHAR(100)
DECLARE @value2 INT
DECLARE @value3 INT
DECLARE @contentId INT
SELECT @Id = MIN(Id) FROM #TemptableToUpdate
WHILE(@Id IS NOT NULL)
BEGIN
SELECT @contentId = ContentId FROM #TemptableToUpdate
WHERE Id = @Id
SELECT @value1 = content_name FROM table1 t1 INNER JOIN
table2 t2 ON t1.PK_ParentId = t2.FK_ParentId
WHERE t1.ContentId = @contentId
SELECT @value2 = columnValue2 FROM table3
WHERE ParentContentId = @contentId
UPDATE #TemptableToUpdate
SET firstColumn = @value1,
secondColumn = @value2
WHERE ContentId = @contentId
SELECT @Id = MIN(Id) FROM #TemptableToUpdate
WHERE Id > @Id
END
I thought that even for a background service this was too much of time. I started looking for effective ways of updating the table, and that's when I found MERGE statement. One of the main advantages of MERGE is that you can do all the activites (INSERT, UPDATE and DELETE) on a table in one go. As far as performance is concerned, this is a big plus. I replaced the current WHILE loop with MERGE statement and bang! There was a significant performance improvement. The changed version looks something like this:
MERGE #TemptableToUpdate AS target
USING (
SELECT a.UserId, a.ContentId, b.content_name, b.columnValue2
FROM #TemptableToUpdate a LEFT OUTER JOIN
(
SELECT t1.content_name, t2.ParentContentId, t1.ContentId
FROM table1 t1 INNER JOIN table2 t2
ON t1.PK_ParentId = t2.FK_ParentId
) AS b
ON a.ContentId = b.ContentId
LEFT OUTER JOIN table3 c ON a.ContentId = c.ContentId
) AS source (UserId, ContentId, content_name, columnValue2)
ON (target.ContentId = source.ContentId AND target.UserId = source.UserId)
WHEN MATCHED THEN
UPDATE SET target.firstColumn = ISNULL(source.content_name,''),
target.secondColumn = ISNULL(source.columnValue2,'');
But the improvement was not there immediately. While I had taken care of the inner WHILE loop, my outer loop was still there. So, to take care of that, instead of getting the data on a user basis, I entered all the users in a temporary table and then did a similar MERGE on the other tables to get the data in one go. This way, I got rid of the loops completely. I finally got it to execute in around 8-10 mins. for the same set of users :) .
Related Readings:

blog comments powered by Disqus