Friday, 11 May 2012

T-SQL Analytic Functions SUM

Let's try this blog thing out :-)

I often get asked by friends and colleagues to help them out with some really troublesome SQL problems that they run into, or for better ways of writing their SQL code. I have what I consider to be the best way, I know that at any given time there's probably 11 or more ways of solving the same business problems with SQL code.

I Love getting other people's feedback so please feel free to criticize my code at will so that I can also learn.

"My brain has frozen up & I can’t solve a simple issue … hope you can quickly help. See attached.
 It’s a lot more complex than the example, but for some reason I’m sitting here & just not getting the simplest example right. "

So, the test data is as follows.











Set A
DimA DimB DimC Value
A001 B001 C001 10
A002 B002 C002 15
A003 B003 C003 20
Set B
DimA DimC DimD Value
A001 C001 D001 100
A001 C001 D002 100
A003 C003 D003 50

-- and the desired outcome is:
DimA DimB DimC DimD Value Comments
A001 B001 C001 D0015 (10/200*100)
A001 B001 C001 D002 5   (10/200*100)
A002 B002 C002 Null 15  As is In A
A003 B003 C003 D003 20  As is in A

Additional Information that was provided was:

Join on DimA and DimC
All Combinations must exist
Value must total Set A's values, with Set B's Ratios
If Set B doesn't match, return Set A (Left Outer), with Set A Value

First steps to troubleshoot.
I created 2 table variables and replicated the result sets.

DECLARE @SetA TABLE (DimA VARCHAR(10), DimB VARCHAR(10), DimC VARCHAR(10), Value INT)
INSERT INTO @SetA VALUES ('A001',   'B001',     'C001',     10)
INSERT INTO @SetA VALUES ('A002',   'B002',     'C002',     15)
INSERT INTO @SetA VALUES ('A003',   'B003',     'C003',     20)

DECLARE @SetB TABLE (DimA VARCHAR(10), DimC VARCHAR(10), DimD VARCHAR(10), Value INT)
INSERT INTO @SetB VALUES ('A001',   'C001',     'D001',     100)
INSERT INTO @SetB VALUES ('A001',   'C001',     'D002',     100)
INSERT INTO @SetB VALUES ('A003',   'C003',     'D003',     50)

SELECT * FROM @SetA
SELECT * FROM @SetB

I then checked the result sets on using an inner join and a left join, to see which one closely matches the required result sets. (Small overhead for additional testing which might provide another solution to the problem.)

SELECT * FROM @SetA A INNER JOIN @SetB B ON A.DimA = B.DimA AND A.DimC = B.DimC
SELECT * FROM @SetA A LEFT JOIN @SetB B ON A.DimA = B.DimA AND A.DimC = B.DimC

What I then needed to do was add up all the values for each grouping, and then divide those values into the totals for those groups. (This is where the analytic function comes in handy.)

SELECT A.DimA, A.DimB, A.DimC, B.DimD, A.Value AS ValueA, B.Value AS ValueB, SUM(B.Value) OVER (PARTITION BY A.DimA, A.DimB, A.DimC) AS TotalPart
FROM @SetA A LEFT JOIN @SetB B ON A.DimA = B.DimA AND A.DimC = B.DimC

Additional logic needed to be added for "Value must total Set A's values, with Set B's Ratios".

SELECT *, CASE WHEN TotalPart IS NULL THEN ValueA ELSE (CAST(ValueA AS DECIMAL(18,2)) / CAST(TotalPart AS DECIMAL(18,2)) ) * CAST(ValueB AS DECIMAL(18,2)) END Results FROM (
SELECT A.DimA, A.DimB, A.DimC, B.DimD, A.Value AS ValueA, B.Value AS ValueB, SUM(B.Value) OVER (PARTITION BY A.DimA, A.DimB, A.DimC) AS TotalPart
FROM @SetA A LEFT JOIN @SetB B ON A.DimA = B.DimA AND A.DimC = B.DimC
) TB

That will give you the correct result set that's required based on the logic.

I personally feel that for a first try its not too bad, but I'll try to improve and make more sense as I go along :-)

If you have any questions or scenarios that you want me to try out, and post, then please let me know... looking forward to hearing from you guys.


Cheers for now.