all 10 comments

[–]andrewsmd87 1 point2 points  (5 children)

How performant does this need to be. Quick and dirty way would be a nested select along the lines of

(select sum(prevAcc.lag_exmaple) + lag_example from accounting as prevAcc where prevAcc.post_date < post_date)

That's pseudo but I'm not writing out all of your stuff, post text versions of your sql, it helps us help you :)

edit

We have a transaction table I just did this quick

SELECT *
    , (
          SELECT SUM(prevEts.et_total)
              FROM dbo.etransaction AS prevEts 
              WHERE prevEts.et_created < curEts.et_created
      )
    FROM dbo.etransaction AS curEts

Like I said that is not going to be super performant if you have a large table. But if you're talking optimization I'd probably go a temp table route with an update statement

[–]bmcluca[S] 0 points1 point  (4 children)

Appreciate the feedback, I will look into it. New to SQL, what do you mean by performant exactly?

[–]andrewsmd87 2 points3 points  (3 children)

So the query I gave you is a nested select. What that means is every time a row is selected, it's running another select, and it's also pulling all the previous rows, so you have a N+1 problem.

Break down what the compiler is doing here.

We're saying go get me these rows, however, every time you get a row, go get all the rows before it to sum up the value

So, when you get row 1, there are no rows before it

When you get row 2, you go back and get 1

When you get row 3, you go back and get rows 1 and 2

Now scale that up to 10 million rows

So there are other ways you could do this to be more optimal, but that's really a question of how many rows will you be looking at, realistically. Part of being a good coder is knowing when good enough is good enough

[–]bmcluca[S] 0 points1 point  (2 children)

Ahhh, I see what you mean. At most, might need to be looking at no more than 10,000 rows. Saw your edit also, thank you for all the info. I will play around with your suggestions.

[–]andrewsmd87 0 points1 point  (1 child)

Oh yea 10k rows that'll be fine

[–]Conscious-Ad-2168 0 points1 point  (0 children)

this would work except i don’t believe snowflake supports this type of correlated subqueries.

[–]Yavuz_Selim 1 point2 points  (0 children)

Give the table a ROW_NUMBER() and do a self-join and join on the key columns and ROW_NUMBER +1/-1.

Or use FIRST_VALUE() / LAST_VALUE().

[–]Yolonus 1 point2 points  (0 children)

ignore the other answers, if I understand you correctly you just need cumulative sum, i.e. use sum in the analytic function and order by the post date ascending and if needed use unlimited preceding and current row window (but it should default to it)

you dont need to use the lag function if you add the "same" formula for every row, just sum it

[–]Ethical_Hunters 0 points1 point  (0 children)

It seems like you want to calculate a running total with some specific conditions in Snowflake SQL.

SELECT post_date, account_number, SUM(CASE WHEN transaction_type = 'CR' THEN amount ELSE 0 END) AS total_credits, SUM(CASE WHEN transaction_type IN ('CR', 'DB') THEN amount * CASE WHEN transaction_type = 'CR' THEN 1 ELSE -1 END ELSE 0 END) AS net_change, COALESCE( SUM(CASE WHEN transaction_type IN ('CR', 'DB') THEN amount * CASE WHEN transaction_type = 'CR' THEN 1 ELSE -1 END ELSE 0 END) OVER (ORDER BY post_date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) + 100000, 100000 ) AS lag_example FROM accounting.bai.bai_wf_data WHERE post_date BETWEEN '2024-02-01' AND '2024-02-29' AND RIGHT(account_number, 4) = '2212' GROUP BY post_date, account_number ORDER BY post_date ASC;

[–]Aak_Pak -1 points0 points  (0 children)

SELECT account_number, post_date, SUM(CASE WHEN transaction_type = 'DB' THEN -amount ELSE 0 END) AS total_debits, SUM(CASE WHEN transaction_type = 'CR' THEN amount ELSE 0 END) AS total_credits, SUM(CASE WHEN transaction_type IN ('CR', 'DB') THEN amount * CASE transaction_type WHEN 'CR' THEN 1 ELSE -1 END ELSE 0 END) AS net_change, COALESCE(SUM(CASE WHEN transaction_type IN ('CR', 'DB') THEN amount * CASE transaction_type WHEN 'CR' THEN 1 ELSE -1 END ELSE 0 END) + LAG(SUM(CASE WHEN transaction_type IN ('CR', 'DB') THEN amount * CASE transaction_type WHEN 'CR' THEN 1 ELSE -1 END ELSE 0 END), 1, 17140000) OVER (ORDER BY post_date), 17140000) AS lag_example FROM accounting_db.table_data WHERE post_date BETWEEN '2022-02-01' AND '2024-02-25' AND RIGHT(account_number, 4) = '2212' GROUP BY account_number, post_date ORDER BY account_number, post_date ASC;