I had to re-write a stored procedure. Well, I had to merge the logic of a couple of stored procedures into one actually. Our team swtiched to SQL Server 2008 R2 a couple of months back.

Now something about what the 2 stored procs do; Both of them are used to send out notifications to the user. The first stored proc gets the user details. The second one, is called per user to get the notification details. For each user, there can be more than one notification. So say, we have around 500 users, and each has 3 or more notifications, the current logic was to loop over all the 500 users and combine all their notifications to send out a single email to each one of them.

When I combined the procedures, I had to create another WHILE loop to loop over the users. This was problematic as I already had a WHILE loop in the second stored proc, which would mean that I'll be having nested WHILE loop in the new one. What it was doing was it was first getting the users and then looping over them to update some other details from various tables. Still, I went ahead with it. When I ran the stored proc., it took about 25-30 mins. for 4500 test users to get around 168000 notification records. The second WHILE loop looked something like this:

 DECLARE @Id INT
    DECLARE @value1 NVARCHAR(100)
    DECLARE @value2 INT
    DECLARE @value3 INT
    DECLARE @contentId INT

    SELECT @Id = MIN(Id) FROM #TemptableToUpdate
    WHILE(@Id IS NOT NULL)
    BEGIN
      SELECT @contentId = ContentId FROM #TemptableToUpdate
      WHERE Id = @Id
 
      SELECT @value1 = content_name FROM table1 t1 INNER JOIN
      table2 t2 ON t1.PK_ParentId = t2.FK_ParentId
      WHERE t1.ContentId = @contentId
     
      SELECT @value2 = columnValue2 FROM table3
      WHERE ParentContentId = @contentId
 
     UPDATE #TemptableToUpdate
     SET firstColumn = @value1,
     secondColumn = @value2
     WHERE ContentId = @contentId
 
     SELECT @Id = MIN(Id) FROM #TemptableToUpdate
     WHERE Id > @Id
     END

I thought that even for a background service this was too much of time. I started looking for effective ways of updating the table, and that's when I found MERGE statement. One of the main advantages of MERGE is that you can do all the activites (INSERT, UPDATE and DELETE) on a table in one go. As far as performance is concerned, this is a big plus. I replaced the current WHILE loop with MERGE statement and bang! There was a significant performance improvement. The changed version looks something like this:

MERGE #TemptableToUpdate AS target
USING (
	SELECT a.UserId, a.ContentId, b.content_name, b.columnValue2
	FROM #TemptableToUpdate a LEFT OUTER JOIN
	(
		SELECT t1.content_name, t2.ParentContentId, t1.ContentId
		FROM table1 t1 INNER JOIN table2 t2
		ON t1.PK_ParentId = t2.FK_ParentId
	) AS b
	ON a.ContentId = b.ContentId
	LEFT OUTER JOIN table3 c ON a.ContentId = c.ContentId
 ) AS source (UserId, ContentId, content_name, columnValue2)
ON (target.ContentId = source.ContentId AND target.UserId = source.UserId)
WHEN MATCHED THEN
UPDATE SET target.firstColumn = ISNULL(source.content_name,''),
	target.secondColumn = ISNULL(source.columnValue2,'');

But the improvement was not there immediately. While I had taken care of the inner WHILE loop, my outer loop was still there. So, to take care of that, instead of getting the data on a user basis, I entered all the users in a temporary table and then did a similar MERGE on the other tables to get the data in one go. This way, I got rid of the loops completely. I finally got it to execute in around 8-10 mins. for the same set of users :) .

Related Readings: