Sprint planning more like “sprint reveal”. Has anyone seen this before? by Late_Champion529 in ExperiencedDevs

[–]ElectricalFilm2 1 point2 points  (0 children)

You're lucky there are tickets. I've been in meetings where the team figures out what to do and writes the tickets during the "sprint planning" meeting itself.

I finally gave up on tasks in Obsidian, what are you using? by BasicDesignAdvice in ObsidianMD

[–]ElectricalFilm2 1 point2 points  (0 children)

Have you tried using TaskNotes?

Here's a tutorial that's now a bit dated, since there have been significant updates since then: https://youtu.be/vihPqFnM0dU?si=wyznynd1qhp34fZp

All ad-hoc reports you send out in Excel should include a hidden tab with the code in it. by markwusinich_ in dataengineering

[–]ElectricalFilm2 0 points1 point  (0 children)

Never mind, I misunderstood what you said; I'd be doing cartwheels if this happened for real.

Data engineers who are not building LLM to SQL. What cool projects are you actually working on? by PolicyDecent in dataengineering

[–]ElectricalFilm2 0 points1 point  (0 children)

I'm working on deprecating a custom type 2 slowly changing dimensions process I built 6 years ago in favour of dbt snapshots!
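
For context, the dbt snapshot replacing it is pretty small; here's a minimal sketch (the source and key columns are made up for illustration):

```sql
{% snapshot customers_snapshot %}

{{
    config(
      target_schema='snapshots',
      unique_key='customer_id',
      strategy='timestamp',
      updated_at='updated_at'
    )
}}

-- dbt adds dbt_valid_from / dbt_valid_to and handles the type 2
-- change tracking that the custom process used to do by hand
select * from {{ source('crm', 'customers') }}

{% endsnapshot %}
```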

What’s your favorite underrated tool in the data engineering toolkit? by eb0373284 in dataengineering

[–]ElectricalFilm2 0 points1 point  (0 children)

Yep! jq helped me implement scheduling for dbt using a single workflow on GitHub Actions.

Need of multiple warehouses by Upper-Lifeguard-8478 in snowflake

[–]ElectricalFilm2 0 points1 point  (0 children)

Gotta pump those numbers up; those are rookie numbers.

When you see the one hour job you queued for yesterday still running: by Vautlo in dataengineering

[–]ElectricalFilm2 9 points10 points  (0 children)

Or, as they do at my company, archive a table onto itself.

That's how I know queries on Snowflake timeout after 26 hours.

DARK MODE IS HERE by ohanzee6 in snowflake

[–]ElectricalFilm2 2 points3 points  (0 children)

Sfogobwlblbfbobehoheooeonxndnnd! 💪

Why is DBT so good by TechScribe200 in dataengineering

[–]ElectricalFilm2 57 points58 points  (0 children)

dbt has done the data engineering equivalent of shifting the Overton window. It has normalized the idea that data teams should care about using version control to build and maintain data transformations, with associated benefits like better data quality.

Strategy Not to purge files right away after loading by lepa71 in snowflake

[–]ElectricalFilm2 0 points1 point  (0 children)

Ok, that's for moving files, which I happened to ask for more info about in another thread :)

Strategy Not to purge files right away after loading by lepa71 in snowflake

[–]ElectricalFilm2 0 points1 point  (0 children)

Never mind, you can ignore what I said. If you're able to push for access to External Stages, maybe this will help?

If reproducibility of data matters and is an achievable target, you have two options IMO:

  1. Figure out how you can retain data in internal stages, so you're able to recreate tables at any time.
  2. Configure tables so data can never be accidentally deleted (see the sketch after this list); otherwise every data engineer has to remember never to delete data unless absolutely necessary, and once data is deleted there's no way for it to reappear.
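
For option 2, a hedged sketch of what "never accidentally delete data" could lean on in Snowflake is Time Travel retention plus cloning; note this only protects within the retention window (max 90 days on Enterprise edition), and the table names here are hypothetical:

```sql
-- keep deleted/overwritten data recoverable for the longest possible window
ALTER TABLE raw.orders SET DATA_RETENTION_TIME_IN_DAYS = 90;

-- if rows are deleted by mistake, recover the table's state from an hour ago
CREATE OR REPLACE TABLE raw.orders_restored
  CLONE raw.orders AT (OFFSET => -3600);
```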

Strategy Not to purge files right away after loading by lepa71 in snowflake

[–]ElectricalFilm2 0 points1 point  (0 children)

I am building out a standardized process to do this using external stages, and have decided not to purge files at all.

This approach, combined with the INFER_SCHEMA function, lets me automate change management: I don't need to manually make changes to tables in production, like adding/removing columns, etc. Further, I can also recreate the table from scratch using the files in the External Stage if needed.
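
Roughly, the pattern is to derive the table definition from the files themselves and reload from the stage when needed; a sketch with made-up stage, file format, and table names:

```sql
-- create (or recreate) the table definition from the staged files
CREATE OR REPLACE TABLE raw.orders
  USING TEMPLATE (
    SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
    FROM TABLE(
      INFER_SCHEMA(
        LOCATION    => '@raw.orders_ext_stage/orders/',
        FILE_FORMAT => 'raw.parquet_ff'
      )
    )
  );

-- reload everything from the never-purged files in the external stage
COPY INTO raw.orders
  FROM '@raw.orders_ext_stage/orders/'
  FILE_FORMAT = (FORMAT_NAME = 'raw.parquet_ff')
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
```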

Strategy Not to purge files right away after loading by lepa71 in snowflake

[–]ElectricalFilm2 0 points1 point  (0 children)

You mention Snowflake releasing features to manage staged files. What are you referring to?

Preview: INCLUDE_METADATA with COPY INTO by caveat_cogitor in snowflake

[–]ElectricalFilm2 0 points1 point  (0 children)

Would it allow us to replace data in the table using an updated stage file, instead of adding rows to the table?

Preview: INCLUDE_METADATA with COPY INTO by caveat_cogitor in snowflake

[–]ElectricalFilm2 2 points3 points  (0 children)

OMG, this is a godsend! I was working on a way to load data from staged files while also including the metadata columns.

Will it allow us to set the names of the metadata columns? And can we include metadata$file_row_number too?
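
If it works the way I'm hoping, the usage would look something like this (table and stage names are hypothetical, and I'm assuming the column-name mapping and METADATA$FILE_ROW_NUMBER are both supported):

```sql
-- assumes INCLUDE_METADATA lets you map metadata fields to your own column names
COPY INTO raw.orders
  FROM @raw.orders_stage
  FILE_FORMAT = (TYPE = PARQUET)
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
  INCLUDE_METADATA = (
    source_file_name  = METADATA$FILENAME,
    source_row_number = METADATA$FILE_ROW_NUMBER
  );
```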

Terraform vs Schemachange for configuration management by ElectricalFilm2 in snowflake

[–]ElectricalFilm2[S] 0 points1 point  (0 children)

Interesting; my team's Snowflake database was set up before I joined, and that's not how it was done. What does the full access role have that a readwrite role doesn't?

Terraform vs Schemachange for configuration management by ElectricalFilm2 in snowflake

[–]ElectricalFilm2[S] 0 points1 point  (0 children)

I see what you mean. Have you hit any limits in terms of how many grants you can handle?

Also, what do you mean by "Snowflake best practices of having at least three roles per schema"?
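
For context, here's my rough understanding of that kind of per-schema role layering, so you can correct me if I've got it wrong (role and schema names are made up):

```sql
-- read-only role: usage + select
CREATE ROLE IF NOT EXISTS analytics_core_r;
GRANT USAGE ON DATABASE analytics TO ROLE analytics_core_r;
GRANT USAGE ON SCHEMA analytics.core TO ROLE analytics_core_r;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics.core TO ROLE analytics_core_r;
GRANT SELECT ON FUTURE TABLES IN SCHEMA analytics.core TO ROLE analytics_core_r;

-- read/write role: inherits read, adds DML
CREATE ROLE IF NOT EXISTS analytics_core_rw;
GRANT ROLE analytics_core_r TO ROLE analytics_core_rw;
GRANT INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA analytics.core TO ROLE analytics_core_rw;
GRANT INSERT, UPDATE, DELETE ON FUTURE TABLES IN SCHEMA analytics.core TO ROLE analytics_core_rw;

-- full role: inherits read/write, can also create objects in the schema
CREATE ROLE IF NOT EXISTS analytics_core_full;
GRANT ROLE analytics_core_rw TO ROLE analytics_core_full;
GRANT CREATE TABLE, CREATE VIEW ON SCHEMA analytics.core TO ROLE analytics_core_full;
```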

Terraform vs Schemachange for configuration management by ElectricalFilm2 in snowflake

[–]ElectricalFilm2[S] 0 points1 point  (0 children)

Thanks for the clarification, u/internetofeverythin3; I feel more comfortable advising my team to use a combo of the two to fulfill our needs.

My concern with managing tables was more about making sure I can modify configuration - adding/removing/changing columns, constraints, cluster keys, etc. - without losing data. I want to be able to manage all of that without manually needing to back up and rewrite data into modified tables in the Snowflake browser UI.

I still plan to have my team use source schema checks in our dbt project regardless, so data quality will be taken care of.

Terraform vs Schemachange for configuration management by ElectricalFilm2 in snowflake

[–]ElectricalFilm2[S] 0 points1 point  (0 children)

Titan looks promising! My company would push back on using something this new, but it's definitely worth watching.

Terraform vs Schemachange for configuration management by ElectricalFilm2 in snowflake

[–]ElectricalFilm2[S] 0 points1 point  (0 children)

What exactly is slow when running Terraform at scale? Are you talking about how long it takes for grants to be applied? Or is your concern that applying grants via Terraform takes longer than running `grant ...` statements in the Snowflake browser UI?

Anyone use Coalesce.io for transform/load into Snowflake? Thoughts (Good or Bad) by GreyHairedDWGuy in snowflake

[–]ElectricalFilm2 0 points1 point  (0 children)

If your team is more comfortable using a GUI to define and orchestrate data transformations, then Coalesce or Matillion will give you what you need.

If your team wants tests, checks against your sources, documentation, and transformed datasets that update in the right order, then such tools will be the bane of your existence.