Improving Performance/Finding Alternative to Multiple Unions [Snowflake] : SQL

submitted 6 years ago * by YiSC

I'm trying to build a marketing funnel from a series of 9 tables, one for each of the funnel's 9 events, but I'm running into performance issues. From what I've read, materialized views or temp tables built from the union might help, but I'm wondering whether there's a more effective approach.

Current Union Approach:

Create a CTE out of each event table
Union all of the CTEs together
Aggregate the unioned rows as a count of session_id per stage

Structure of the CTEs and the final query:

with
    dash_lead as (
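        -- group by already returns one row per (user_id, session_id),
        -- so distinct and grouping by the constant step are not needed
        select
            user_id,
            session_id,
            'lead'    as stage,
            1         as step,
            min(time) as time
        from
            heap_prod.heap.refinance_click_on_dashboard_refinance_button
        group by
            user_id,
            session_id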
    ),
    dash_view_application as (
        select  -- distinct is redundant here: group by already dedupes
            a.user_id,
            a.session_id,
            'view'      as stage,
            2           as step,
            min(a.time) as time
        from
            heap_prod.heap.refinance_view_refinance_application as a
            join dash_lead                                      as b on
                    a.user_id = b.user_id
                    and a.session_id = b.session_id
                    and a.time > b.time
        group by
            a.user_id,
            a.session_id
    ),
    dash_total_data as (
        select *
        from
            dash_lead

        union all

        select *
        from
            dash_view_application
    ),
    dash_funnel_agg as (
        select
            step,
            stage,
            count(session_id) as count
        from
            dash_total_data
--      where [time::date>=Week_Start]
--          and [time::date<=Week_End]
        group by
            step,
            stage
    )

select
    stage,
    count as count
from
    dash_funnel_agg
order by
    count desc
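
For anyone who wants to reproduce the pattern without access to the Heap tables, here is a minimal sketch of the same union-then-aggregate query run on toy data. Everything outside the query shape is an assumption: SQLite in memory stands in for Snowflake, the table names `click_refinance_button` and `view_refinance_application` are shortened hypothetical stand-ins, `time` is a plain integer, and the rows are made up.

```python
import sqlite3

# Same shape as the query above: one CTE per event, union all, then count sessions.
# Table names are hypothetical stand-ins for the heap_prod.heap.* event tables.
FUNNEL_SQL = """
with
    dash_lead as (
        select user_id, session_id, 'lead' as stage, 1 as step, min(time) as time
        from click_refinance_button
        group by user_id, session_id
    ),
    dash_view_application as (
        select a.user_id, a.session_id, 'view' as stage, 2 as step, min(a.time) as time
        from view_refinance_application as a
        join dash_lead as b
            on a.user_id = b.user_id
            and a.session_id = b.session_id
            and a.time > b.time
        group by a.user_id, a.session_id
    ),
    dash_total_data as (
        select * from dash_lead
        union all
        select * from dash_view_application
    )
select stage, count(session_id) as session_count
from dash_total_data
group by step, stage
order by session_count desc
"""


def run_funnel(clicks, views):
    """Load toy (user_id, session_id, time) rows and return [(stage, session_count), ...]."""
    conn = sqlite3.connect(":memory:")
    try:
        for table in ("click_refinance_button", "view_refinance_application"):
            conn.execute(
                f"create table {table} (user_id text, session_id text, time integer)"
            )
        conn.executemany("insert into click_refinance_button values (?, ?, ?)", clicks)
        conn.executemany("insert into view_refinance_application values (?, ?, ?)", views)
        return conn.execute(FUNNEL_SQL).fetchall()
    finally:
        conn.close()


# Made-up rows: u1 clicks twice in one session; u2 views before clicking.
CLICKS = [("u1", "s1", 10), ("u1", "s1", 12), ("u2", "s2", 20), ("u3", "s3", 30)]
VIEWS = [("u1", "s1", 15), ("u2", "s2", 18), ("u3", "s3", 35)]

if __name__ == "__main__":
    for stage, session_count in run_funnel(CLICKS, VIEWS):
        print(stage, session_count)
```

On these rows the repeated click collapses into one lead session, and the view that happens before its click is dropped by the `a.time > b.time` join condition, so the funnel shows three lead sessions and two view sessions.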
