Thursday, May 6, 2010

Running Total in TSQL

There are many ways to do running total on SQL Server. Let's create a test table:

CREATE TABLE [dbo].[t](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Amt] [float] NULL
) ON [PRIMARY]
go

If you have SQL Server 2005 or earlier you could do a running total using a subquery or join:

select t.id, t.amt
, (select Sum(Amt) from dbo.t i where i.id <= t.id) RunTotal
from dbo.t t
order by 1

select t.id, t.amt, sum(a.amt) RunTotal
from dbo.t t
inner join dbo.t a
on a.id <= t.id
group by t.id, t.amt
order by 1

These look "natural" and seem to be using sets. Unfortunately, it is not really like that. If youre result has 4 rows, SQL Server will have to do a sum of 1 row for first, 2 rows for second, 3 rows for 3rd and 4 rows for 4th - in total 10 rows. For resultset with 10,000 rows it will have to total: (10,000^2+10,000)/2 = 50,005,000 (yes - 50 million rows). Even a solution with cursor is better then this.

If everyting is in a table, you could use a strange variation of Update statement to do thi faster.
Let's first add a column to the table:

ALTER TABLE dbo.t

ADD RT float NULL
GO

Now you can use a special form of Update statement that writes both into the table and in a varaiable:

DECLARE @rt float
SET @rt = 0


UPDATE t
SET @rt = RT = @rt + amt
WHERE id = id


select * from t

Apparently, this goes through the table just once and is much lighter on SQL Server. Unfortunately, it is based on non-documented behavior of SQL Server and behind a scene depends on row by row operation.

Alternatively, you can implement cursor a based solution (either in TSQL or in SQLCLR).

Unfortunately, it is not possible to do this using Over clause. SQL Server does not support (not even SQL Server 2008 R2) use of Over clause with Order By when using an aggregation function. We should ask Microsoft to implement this or this.

No comments:

Post a Comment