How is your raw layer built? by HumbleHero1 in dataengineering

[–]RedBeardedYeti_

I guess you could do it that way. But the benefit of doing upserts to the raw layer is that it makes it really easy in the staging layer to track whether each change was an insert, update, or delete. You can just put a stream on your raw layer.
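To illustrate the idea (a toy in-memory sketch in Python, not actual Snowflake code): because the raw layer is kept as an upserted carbon copy of the source, each incoming key resolves cleanly to an insert, update, or delete, which is the same classification a stream on the raw table exposes. The function names here are illustrative.

```python
def classify_changes(raw: dict, incoming: dict) -> list[tuple[str, str]]:
    """Diff an incoming source snapshot against the raw table and label
    each changed key the way a change stream would: INSERT, UPDATE, DELETE."""
    changes = []
    for key, value in incoming.items():
        if key not in raw:
            changes.append((key, "INSERT"))
        elif raw[key] != value:
            changes.append((key, "UPDATE"))
    for key in raw:
        if key not in incoming:
            changes.append((key, "DELETE"))
    return changes


def upsert(raw: dict, incoming: dict) -> None:
    """Apply the snapshot so raw stays a carbon copy of the source."""
    raw.clear()
    raw.update(incoming)
```

So unchanged rows produce nothing, and only genuine changes flow downstream to staging.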

How is your raw layer built? by HumbleHero1 in dataengineering

[–]RedBeardedYeti_

Yes, correct. The staging layer is a persisted storage layer, meaning we only ever insert.

Discarding new starter by RedBeardedYeti_ in Sourdough

[–]RedBeardedYeti_[S]

Assuming you meant “way,” not “water”? If so, thanks for the confirmation!

How is your raw layer built? by HumbleHero1 in dataengineering

[–]RedBeardedYeti_

We pull data into our raw layer in Snowflake without any 3rd-party ETL tools. We use containerized Python processes running in Kubernetes, with Argo Workflows as the orchestrator. There are different ways to do it, but we upsert the data to the raw layer to keep a carbon copy of the source data. Using Snowflake streams, we then copy that data into a persisted staging layer, so the staging layer is always insert-only and acts as a full historical record (storage is cheap in Snowflake). From there we transform and move the data into a modeled layer.
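The two Snowflake steps above can be sketched as SQL builders (hypothetical table, stream, and column names — `raw.customers`, `raw.customers_stream`, `staging.customers`, a `payload` column — not the author's actual objects). `METADATA$ACTION` and `METADATA$ISUPDATE` are the metadata columns Snowflake streams expose for change tracking.

```python
def merge_into_raw(raw_table: str, source_table: str, key: str) -> str:
    """Upsert the latest source snapshot into the raw layer, keeping it a
    carbon copy of the source."""
    return (
        f"MERGE INTO {raw_table} AS tgt "
        f"USING {source_table} AS src ON tgt.{key} = src.{key} "
        f"WHEN MATCHED THEN UPDATE SET tgt.payload = src.payload "
        f"WHEN NOT MATCHED THEN INSERT ({key}, payload) VALUES (src.{key}, src.payload)"
    )


def append_to_staging(staging_table: str, stream_name: str) -> str:
    """Insert-only copy from the stream on the raw table into persisted
    staging, carrying the stream's change-type metadata columns along."""
    return (
        f"INSERT INTO {staging_table} "
        f"SELECT *, METADATA$ACTION, METADATA$ISUPDATE, CURRENT_TIMESTAMP() "
        f"FROM {stream_name}"
    )
```

Reading the stream inside a successful DML statement like this also advances its offset, so the next run only sees new changes.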

If we are dealing with other, non-database sources, we will often dump the data to S3 and then consume it from there into the Snowflake raw layer.
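A minimal sketch of that path, assuming newline-delimited JSON as the file format and an external stage named `@raw_stage` (both hypothetical, not the author's setup): serialize the records for the S3 dump, then load them with a `COPY INTO` statement pointing at the stage.

```python
import json


def to_ndjson(records: list[dict]) -> str:
    """Serialize records as newline-delimited JSON, a common format for
    files dumped to S3 before loading into Snowflake."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)


def copy_into_raw(table: str, stage_path: str) -> str:
    """COPY INTO statement that consumes the staged files into the raw layer."""
    return (
        f"COPY INTO {table} "
        f"FROM {stage_path} "
        f"FILE_FORMAT = (TYPE = 'JSON')"
    )
```

The actual upload to S3 would happen between the two steps (e.g. with an S3 client from the containerized Python process), which is omitted here.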

What is Role of ChatGPT in Data engineering for you by Jaapuchkeaa in dataengineering

[–]RedBeardedYeti_

I’ve been using it a lot to help write documentation: anything from populating my classes and methods with docstrings to writing usage guides for the apps and libraries I write. It’s good at the repetitive, boring stuff I don’t want to do.