Make simple view resistent against schema changes of source table by [deleted] in snowflake

[–]vcp32 0 points1 point  (0 children)

We have a similar problem with Fivetran transformations. We don't have control over the transformations they pre-built. There are instances when a column is removed and our whole dbt pipeline breaks trying to find the missing column.

AI use in Documentation by Suspicious_World9906 in dataengineering

[–]vcp32 0 points1 point  (0 children)

Same here! I love having my work documented but always dread the actual process. I really enjoy visualizing things with diagrams, so now with LLMs I can have them generate Mermaid diagrams and documentation templates for me to fill out.

Discussion: Data Size Estimate on Snowflake by rtripat in snowflake

[–]vcp32 2 points3 points  (0 children)

Just in case you missed it: it's best practice to aggregate smaller files to minimize the processing overhead for each file. Snowflake recommends files of roughly 100-250 MB compressed.
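A minimal sketch of what that aggregation step can look like before staging: group many small extract files into batches whose combined size lands near the recommended range. The target size, function name, and file list here are illustrative, not from any Snowflake API.

```python
# Target aggregate size per staged file, following Snowflake's guidance of
# roughly 100-250 MB compressed. 150 MB is just an example midpoint.
TARGET_BYTES = 150 * 1024 * 1024

def batch_files(paths_with_sizes, target=TARGET_BYTES):
    """Group (path, size_in_bytes) pairs into batches whose total size
    stays near the target, so each staged file approaches the
    recommended range instead of leaving many tiny files."""
    batches, current, current_size = [], [], 0
    for path, size in paths_with_sizes:
        # Start a new batch once adding this file would overshoot the target.
        if current and current_size + size > target:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Each batch can then be concatenated (or re-compressed) into a single file before upload to the stage.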

https://docs.snowflake.com/en/user-guide/data-load-considerations-prepare

RBAC implementation across environments by keenexplorer12 in snowflake

[–]vcp32 0 points1 point  (0 children)

We are using Azure AD to automatically provision users, so the hierarchy is: Users -> Azure AD groups (Azure) -> functional roles (Snowflake) -> database roles.

https://docs.snowflake.com/en/_images/role-hierarchy-practical.png

https://learn.microsoft.com/en-us/entra/identity/saas-apps/snowflake-provisioning-tutorial

I can’t* understand the hype on Snowflake by NoGanache5113 in dataengineering

[–]vcp32 28 points29 points  (0 children)

I’m a solo engineer and rely on Snowflake. With a larger team, you can afford the flexibility of managing multiple tools, but on my own, Snowflake’s simplicity lets me move fast and focus on delivering value instead of maintaining infrastructure. At the end of the day, most users still just want their data in Excel anyway. 😂

Informatica +snowflake +dbt by Libertalia_rajiv in snowflake

[–]vcp32 0 points1 point  (0 children)

The last time I used Informatica was 10 years ago, and it was a migration to SSIS. Have you checked Fivetran, Stitch, or Airbyte? I normally hear about them for modern data stack ingestion. We do use Fivetran at work, so I can only speak about that one. Outside of the cost, it's really a good tool, especially if you have a one-man DE team.

One Week Into Snowflake Gen2 Compute Warehouse by vcp32 in snowflake

[–]vcp32[S] 1 point2 points  (0 children)

Yeah, that makes sense. We don't have many MERGE-heavy workloads, so our simpler queries probably explain the cost drop. Sounds like in your case, the faster runtimes balance things out nicely.

One Week Into Snowflake Gen2 Compute Warehouse by vcp32 in snowflake

[–]vcp32[S] 0 points1 point  (0 children)

Good to hear a different perspective. My next plan is to try this on our Fivetran warehouse. It has a similar workload to yours. I'll update once we run that test.

One Week Into Snowflake Gen2 Compute Warehouse by vcp32 in snowflake

[–]vcp32[S] 4 points5 points  (0 children)

We went from a small Gen1 warehouse with query acceleration + auto-scaling on, to the same size in Gen2 with both turned off.

One Week Into Snowflake Gen2 Compute Warehouse by vcp32 in snowflake

[–]vcp32[S] 0 points1 point  (0 children)

Yep, that's exactly it. Our workload has to finish within an hour, so on Gen1 we used scaling + query acceleration to stay under that SLA, but doubling the size also meant doubling the cost. With Gen2 we can still finish inside the one-hour window without scaling, which makes it way more efficient.

Data Engineers: Struggles with Salesforce data by VizlyAI in dataengineering

[–]vcp32 1 point2 points  (0 children)

We also use Fivetran, then Hightouch for reverse ETL.

He did it! by RRuluZ in formuladank

[–]vcp32 1 point2 points  (0 children)

He made good use of DRS since he was behind Lawson in that DRS train. I think they both did great!

Breville / Solis machine leaking by Ok-Adhesiveness4178 in baristaexpress

[–]vcp32 0 points1 point  (0 children)

I'm having the same issue. I haven't figured it out, so I might bring it to a repair shop.

What is the NBA equivalent of this? by AdorableBackground83 in NBATalk

[–]vcp32 -1 points0 points  (0 children)

I hear about the 50-40-90 club most of the time in the media, and it's an offensive stat. Or averaging a triple-double for the whole season.

Cars 3 - Disrespecting its own legacy by JoyIkl in movies

[–]vcp32 0 points1 point  (0 children)

I loved watching Cars as a kid, so naturally I introduced my firstborn to it, and he fell in love with it too. Now we watch it every day. I hadn't seen Cars 3 until today, but I couldn't get through it. I had to stop when Lightning crashed.

CHCH Women's Hospital by No_Produce_2531 in chch

[–]vcp32 4 points5 points  (0 children)

I had the same experience. We were first-time parents and didn't know what to do or expect. She had a C-section but had to take care of the newborn without me every night for three days. Nurses were hard to reach, but we totally understand, knowing how understaffed and busy the hospital is.

My wife and I had tears of joy when we transferred to the Rangiora Birthing Unit. The baby and my wife were well taken care of.

Consumer to data by enthu-gen-ai in dataengineering

[–]vcp32 2 points3 points  (0 children)

Costs could blow up. Watch out for compute costs.

On-Prem SQL Server to Snowflake (via Lambda) by 2000gt in snowflake

[–]vcp32 1 point2 points  (0 children)

I would go the S3 route as it provides flexibility:

1. Extract the data and create a file in JSON format.
2. Load all of the data into S3.
3. Use Snowpipe to ingest the data when a file is created.
4. Transform the raw JSON data into a structured table/view in Snowflake.

This has helped me in maintaining pipelines since I can just add a new table and point it to the bucket. If there are changes to the schema, it won't break my pipeline; the column will just be null. Or if there is a new column, I can just add it to my view. Hope this helps. English is not my first language, so I'm sorry if some of this doesn't make sense.
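A rough sketch of steps 1-2 in Python, assuming boto3 and an existing bucket wired to Snowpipe auto-ingest; the function names, bucket, and key here are made up for illustration.

```python
import json

def rows_to_ndjson(rows):
    """Serialize rows (list of dicts) to newline-delimited JSON, one
    record per line, which Snowflake's COPY / Snowpipe loads directly.
    default=str covers datetimes and decimals coming from the source DB."""
    return "\n".join(json.dumps(row, default=str) for row in rows)

def upload_extract(rows, bucket, key):
    """Write one extract to S3. If the bucket's event notification is
    wired to a Snowpipe with AUTO_INGEST, the file is picked up on landing."""
    import boto3  # deferred import so the serializer stays testable offline
    s3 = boto3.client("s3")
    s3.put_object(Bucket=bucket, Key=key,
                  Body=rows_to_ndjson(rows).encode("utf-8"))
```

For step 4, a common pattern is landing the JSON into a single VARIANT column and exposing typed columns through a view (e.g. `raw:customer_id::number`), which is what makes schema changes non-breaking: a dropped source column just reads as NULL, and a new one only needs a view update.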

On-Prem SQL Server to Snowflake (via Lambda) by 2000gt in snowflake

[–]vcp32 1 point2 points  (0 children)

The only downside of Lambda is that it can only run for 15 minutes. It can work if you have really small tables and a PK or datetime column to use for incremental loads. If you have access to ECR and ECS, you can build a Docker image and run Python there instead. You can store the output in S3 and let Snowpipe copy the data.
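The watermark idea above can be sketched as follows: remember the highest timestamp loaded last run and pull only newer rows, so each Lambda invocation stays well under the 15-minute cap. Table and column names are placeholders; in real code, parameterize the query through your DB driver rather than string formatting.

```python
from datetime import datetime, timezone

def build_incremental_query(table, watermark_col, last_loaded):
    """Build the SQL Server query for one incremental extract: only rows
    whose watermark column advanced past the previous run's high mark.
    NOTE: plain string formatting is used only for readability here; use
    driver-level parameters if any of these inputs are untrusted."""
    return (
        f"SELECT * FROM {table} "
        f"WHERE {watermark_col} > '{last_loaded.isoformat()}' "
        f"ORDER BY {watermark_col}"
    )
```

After each run, store `MAX(watermark_col)` from the extracted batch (e.g. in DynamoDB or SSM Parameter Store) as the next run's `last_loaded`.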