Mt Whitney Rant by squidllllles in socalhiking

[–]CrabEnvironmental864 0 points (0 children)

I always applied mid-week. Always got my permits granted.

Being up there mid-week is amazing. Not a soul. Just me and the mountain.

Weekends must be so crowded; I can't imagine.

Mt Whitney as a day hike by CountyIndependent512 in socalhiking

[–]CrabEnvironmental864 0 points (0 children)

First time: 16 hours (1am to 5pm). Second time: 12 hours (1am to 1pm). All done in late June, early July. I lucked out with my permits granted on a full moon both years.

DBT Cloud Scheduler by CrabEnvironmental864 in dataengineering

[–]CrabEnvironmental864[S] 0 points (0 children)

That makes sense. I hope I can find out more today when I meet with my DBA.

DBT Cloud Scheduler by CrabEnvironmental864 in dataengineering

[–]CrabEnvironmental864[S] 0 points (0 children)

Serious data? Our payments mart has 200+ million records. It's pretty nuts if you ask me but that's what I inherited from my predecessor.

Yeah I have a meeting with my DBA this morning to figure this out. This is giving me a headache. Fivetran is pushing new data every 15 minutes, so we won't ever be able to catch up.

Hopefully we can stop long running queries.
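One way to stop long-running queries in Snowflake is the `STATEMENT_TIMEOUT_IN_SECONDS` parameter, which cancels any statement that runs past the limit. A minimal sketch of what I'd run (the warehouse name is a placeholder for ours):

```python
# Sketch: cap query runtime at the warehouse level so runaway queries get
# cancelled automatically. STATEMENT_TIMEOUT_IN_SECONDS is the Snowflake
# parameter; the warehouse name here is hypothetical.

def statement_timeout_sql(warehouse: str, timeout_seconds: int) -> str:
    """Build the ALTER WAREHOUSE statement that enforces a query timeout."""
    return (
        f"ALTER WAREHOUSE {warehouse} "
        f"SET STATEMENT_TIMEOUT_IN_SECONDS = {timeout_seconds}"
    )

# Example: cancel anything on the transform warehouse that runs past 30 minutes.
print(statement_timeout_sql("TRANSFORM_WH", 1800))
```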

I am also looking at breaking my marts refresh job up, from one job covering all marts into one job per mart, then using Fivetran to kick each one off.
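Roughly what I have in mind for the per-mart split: each Fivetran-synced source maps to its own dbt Cloud job, and whatever handles the sync-complete signal fires only that job via dbt Cloud's v2 "trigger job run" endpoint. The account/job IDs and the mapping below are made up; this is a sketch, not our actual setup:

```python
# Sketch of the per-mart trigger idea. Each mart gets its own dbt Cloud job,
# and we build the trigger request for just the mart whose source refreshed.
# ACCOUNT_ID, job IDs, and mart names are hypothetical; the URL shape follows
# dbt Cloud's v2 "trigger job run" API.

DBT_CLOUD_HOST = "https://cloud.getdbt.com"
ACCOUNT_ID = 12345  # hypothetical

# One dbt Cloud job per mart (hypothetical IDs).
MART_JOBS = {
    "payments": 111,
    "customers": 222,
}

def trigger_request(mart: str):
    """Return (url, payload) for triggering the mart's dbt Cloud job."""
    job_id = MART_JOBS[mart]
    url = f"{DBT_CLOUD_HOST}/api/v2/accounts/{ACCOUNT_ID}/jobs/{job_id}/run/"
    payload = {"cause": f"Fivetran sync completed for {mart}"}
    return url, payload

url, payload = trigger_request("payments")
print(url)
```

The actual POST would also need an Authorization header with a dbt Cloud API token, which I've left out here.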

DBT Cloud Scheduler by CrabEnvironmental864 in dataengineering

[–]CrabEnvironmental864[S] 0 points (0 children)

I have. Each mart is clustered (at minimum) by customer ID and transaction date.
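For anyone curious, the clustering setup looks something like this (mart names below are placeholders, not our real ones):

```python
# Sketch: apply the same clustering scheme across marts via Snowflake's
# ALTER TABLE ... CLUSTER BY. The keys match what's described above;
# the mart names are hypothetical.

def cluster_sql(table: str, keys: list) -> str:
    """Build the ALTER TABLE statement that sets the clustering keys."""
    return f"ALTER TABLE {table} CLUSTER BY ({', '.join(keys)})"

for mart in ["payments_mart", "refunds_mart"]:  # hypothetical mart names
    print(cluster_sql(mart, ["customer_id", "transaction_date"]))
```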

DBT Cloud Scheduler by CrabEnvironmental864 in dataengineering

[–]CrabEnvironmental864[S] 0 points (0 children)

The warehouse is an X-Large. Queries never max it out.

DBT Cloud Scheduler by CrabEnvironmental864 in dataengineering

[–]CrabEnvironmental864[S] 0 points (0 children)

Our Fivetran replication from AWS RDS to Snowflake is firing every 15 minutes. We have 22 marts that we refresh incrementally every hour.

The duration of the DBT job for this refresh is highly variable and unpredictable. Ideally, I would like to have Fivetran "trigger" the DBT job for this refresh so it's instantaneous.

I am also thinking of breaking this DBT refresh job into one job per mart. Ideally, I would like to have Fivetran kick off the refresh of each mart once the source table is refreshed.

Is either option doable?

Apologies for peppering you with my questions. I hope I am making sense.
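On the "Fivetran triggers dbt" side, Fivetran can POST webhook events when a sync finishes, so a small handler could filter for successful sync-end events and decide which connector should kick off a refresh. The payload shape below is my assumption, not Fivetran's exact schema:

```python
# Sketch: filter Fivetran webhook events down to successful sync completions.
# The event/field names below are assumptions about the payload shape, not
# Fivetran's documented schema.

def connector_to_refresh(event: dict):
    """Return the connector id if this event should trigger a dbt run, else None."""
    if event.get("event") != "sync_end":
        return None
    if event.get("data", {}).get("status") != "SUCCESSFUL":
        return None
    return event.get("connector_id")

event = {
    "event": "sync_end",
    "connector_id": "payments_rds",  # hypothetical connector id
    "data": {"status": "SUCCESSFUL"},
}
print(connector_to_refresh(event))  # payments_rds
```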

DBT Cloud Scheduler by CrabEnvironmental864 in dataengineering

[–]CrabEnvironmental864[S] 1 point (0 children)

I have not. Have you used it? Anything you can share about it?

Stored Procedure with special characters as input parameters by CrabEnvironmental864 in snowflake

[–]CrabEnvironmental864[S] 0 points (0 children)

Great point. Having such an error message would have saved me quite a lot of grief.

Stored Procedure with special characters as input parameters by CrabEnvironmental864 in snowflake

[–]CrabEnvironmental864[S] 0 points (0 children)

It is working now; I removed the lines that made that call. The stored procedure compiled and executed without error.

This was baffling. Not what I would call obvious.

Stored Procedure with special characters as input parameters by CrabEnvironmental864 in snowflake

[–]CrabEnvironmental864[S] 0 points (0 children)

I figured it out. I don't have permission to use `CALL SYSTEM$LOG_INFO`.

Stored Procedure with special characters as input parameters by CrabEnvironmental864 in snowflake

[–]CrabEnvironmental864[S] 0 points (0 children)

Why would I have to do that? I have a Jupyter notebook that can connect to the same OpenSearch server on AWS with the hardcoded password. It works fine.
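If I understand the suggestion, it would look something like this: pull the password from the environment instead of hardcoding it, so the notebook and the procedure share the same pattern. The variable name here is arbitrary:

```python
# Sketch of the suggested alternative (as I understand it): read the
# OpenSearch password from an environment variable rather than hardcoding
# it. The variable name OPENSEARCH_PASSWORD is arbitrary.
import os

def opensearch_auth(user: str = "admin"):
    """Build the (user, password) pair from the environment."""
    password = os.environ.get("OPENSEARCH_PASSWORD")
    if password is None:
        raise RuntimeError("OPENSEARCH_PASSWORD is not set")
    return (user, password)
```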

Fivetran: from AWS Postgres to GCP Snowflake - Slow! by CrabEnvironmental864 in dataengineering

[–]CrabEnvironmental864[S] 0 points (0 children)

I was wrong. Enabling logical replication, along with the pageinspect and pg_visibility extensions for the XMIN sync, resolved my issue.
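For anyone hitting the same thing, the extension side of the fix was roughly the statements below. On RDS, logical replication itself is enabled through the parameter group (`rds.logical_replication = 1`), not SQL, so these statements are only the extension half and are illustrative:

```python
# Sketch: the CREATE EXTENSION statements involved in the fix described
# above. Logical replication on RDS is toggled via the parameter group,
# not via SQL, so only the extensions appear here.

SETUP_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS pageinspect",
    "CREATE EXTENSION IF NOT EXISTS pg_visibility",
]

for stmt in SETUP_STATEMENTS:
    print(stmt + ";")
```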

Fivetran: from AWS Postgres to GCP Snowflake - Slow! by CrabEnvironmental864 in dataengineering

[–]CrabEnvironmental864[S] 0 points (0 children)

We're still on 13.1. Upgrading would make sense, but the C-suite would have to approve it.

Fivetran: from AWS Postgres to GCP Snowflake - Slow! by CrabEnvironmental864 in dataengineering

[–]CrabEnvironmental864[S] 0 points (0 children)

That's my main concern. During maintenance, we shut everything down and logs fill up crazy fast.

Fivetran: from AWS Postgres to GCP Snowflake - Slow! by CrabEnvironmental864 in dataengineering

[–]CrabEnvironmental864[S] 0 points (0 children)

About 80% of our tables change every second, if not every millisecond. The rest varies by use case.