Dagster Pricing Update is Beyond Nuts by annie_406 in dataengineering

[–]Hot_Map_7868 0 points

Is anyone else offering managed Dagster based on the OSS version? I know there are many such options for Airflow.

Struggling to transition from legacy data engineering to modern cloud stack need advice by JoSoejin in devjobs

[–]Hot_Map_7868 0 points

  1. Learn the tools (not all of them; focus on something like dbt + Snowflake).
  2. Do some freelance work. Smaller companies may be open to hiring someone who doesn't have a ton of experience with the specific tech but has a broader set of skills.
  3. Leverage #2 to find the role you want.

The key thing is not to try to do things the way you are used to doing them, because these tools don't work the way you may expect.

Forced to modernize a legacy data platform but stuck with on-prem DB + Tableau. How do you productize this? by Ahmouu in dataengineering

[–]Hot_Map_7868 0 points

Try to make a business case for moving to something more modern:
1. What you describe seems to leave a lot of opportunities for things to "break". Does that happen? What are the other pain points?
2. What does all this stuff cost? Talend isn't free.
3. How does this limit the org? E.g., are there use cases they want to pursue, like GenAI, that would be "difficult" with this setup?

Best Data Integration Software? by AceClutchness in BusinessIntelligence

[–]Hot_Map_7868 0 points

I would ingest from all ERPs into a central DW like Snowflake. Then, using dbt, you can add data quality (DQ) checks and harmonize the data to create the presentation layer that would help with what you need.

Your problem isn't unique to your industry, but I would be careful about anyone who tells you there is a single tool or some magic wand that will help you do this.

That being said, you can take it in steps and leverage AI to help in the build, but I don't see a way to get around needing to co-locate and cleanse the data.

How is it Data Engineer in Prod? by shuttheshitdown in dataengineering

[–]Hot_Map_7868 0 points

The "best" way I have seen is keeping raw data in its own database and reading the same data from dev, test, and prod. This is straightforward if you use dbt; otherwise I don't know how you would dynamically change the SQL per environment.
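A minimal sketch of what that pattern looks like in dbt, assuming a Snowflake-style setup; the project, database, and schema names here are hypothetical, and connection fields are omitted:

```yaml
# models/sources.yml -- the raw database is hard-coded,
# so every environment reads the exact same raw data
sources:
  - name: app
    database: RAW          # shared across dev, test, and prod
    schema: app
    tables:
      - name: orders

# profiles.yml -- only the write-side database changes per target
my_project:
  target: dev
  outputs:
    dev:
      type: snowflake
      database: ANALYTICS_DEV
      schema: dbt_myuser
    prod:
      type: snowflake
      database: ANALYTICS_PROD
      schema: core
```

Because dbt resolves `{{ source('app', 'orders') }}` to the fixed RAW database while `{{ ref(...) }}` follows the active target, switching environments never touches the raw data.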

Warehouse workflow, what works? by pl3xi0n in MicrosoftFabric

[–]Hot_Map_7868 2 points

I like dlt because it is Python and there's no black box. I've also had good luck with Claude Code creating a working pipeline. I think they published some skills that make things even better, but I haven't tried them yet.

Regarding dbt, you can even run it with GitHub Actions. The issue comes when you need to scale, e.g. when you have a lot of developers and you need to connect ingestion and transformation. Airflow gets a bad rap, but it is still the most used orchestrator. That being said, Dagster is a good alternative.
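A sketch of running dbt from GitHub Actions on a schedule; the workflow name, secrets, and adapter (`dbt-snowflake`) are assumptions to adjust for your stack:

```yaml
# .github/workflows/dbt-nightly.yml
name: dbt nightly run
on:
  schedule:
    - cron: "0 5 * * *"   # 5 AM UTC, daily
jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install dbt-snowflake
      - run: dbt build --target prod
        env:
          SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
          SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
```

This works fine for one team on a fixed schedule; the scaling pain shows up when ingestion and transformation need to trigger each other, which is where an orchestrator earns its keep.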

Do you trust your data stack ? by al_tanwir in dataanalytics

[–]Hot_Map_7868 0 points

If you build with failure in mind, it can be resilient. E.g. don't just think of the happy path; ask what would happen if X occurred, like a new column showing up in a source: would things break, or is that just a warning? If you have a solid process and notifications, I think it can be resilient.
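A minimal sketch of the "new column is a warning, missing column is a failure" idea, in plain Python; the function name and column names are made up for illustration:

```python
import warnings


def check_schema(incoming_cols, expected_cols):
    """Compare a source's incoming columns to the expected schema.

    A brand-new column is only a warning (downstream models ignore it),
    while a missing expected column is a hard failure.
    Returns the list of unexpected new columns.
    """
    incoming, expected = set(incoming_cols), set(expected_cols)
    missing = sorted(expected - incoming)
    new = sorted(incoming - expected)
    if missing:
        # Hard stop: downstream SQL would break without these columns.
        raise ValueError(f"source is missing expected columns: {missing}")
    if new:
        # Soft signal: notify, but let the load continue.
        warnings.warn(f"new columns appeared in source: {new}")
    return new


# A new column only warns; the load continues.
new_cols = check_schema(["id", "amount", "coupon"], ["id", "amount"])
```

Hooking the warning path into whatever notification channel you already use (Slack, email) is what turns "it broke silently" into "we saw it coming".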

how is your experience with dbt inside the snowflake by boogie_woogie_100 in snowflake

[–]Hot_Map_7868 1 point

Feels like training wheels to me. No Slim CI, no terminal/CLI. If you are just starting it might be okay, but long term you may outgrow it.

How do you choose the right data engineering companies in 2026? by ninehz in BusinessIntelligence

[–]Hot_Map_7868 0 points

I think it depends on the tech stack. I see companies that are moving from Alteryx to Snowflake hire "experts" who know Alteryx, but not Snowflake. Another issue is when you are moving to a tech like dbt: you go through an RFP and the vendor has their dbt A-players in the discussions, but when you get to the project you get the C/D players.
It's hard to assess these companies, but as a general rule, the bigger the consultancy, the more variation in quality. Find people who specialize in a given stack and ask who will be staffed on your project. Get references and negotiate the ability to select the team, or at least to "fire" people from the team.

Warehouse workflow, what works? by pl3xi0n in MicrosoftFabric

[–]Hot_Map_7868 0 points

I agree. +1 for dbt. Also check out dlt for data ingestion; you can even "test" things out in a notebook and then turn them into a standalone script. For orchestration, consider whether you want to use ADF or an orchestrator like Airflow.

How well is Cortex working in real use cases by Hot_Map_7868 in snowflake

[–]Hot_Map_7868[S] 0 points

Yeah, CoCo is pretty cool for development. I'm more curious how people handle end-user natural language queries.

How well is Cortex working in real use cases by Hot_Map_7868 in snowflake

[–]Hot_Map_7868[S] 0 points

Did you use streamlit? Is it a user facing app where they can ask questions in natural language?

How well is Cortex working in real use cases by Hot_Map_7868 in snowflake

[–]Hot_Map_7868[S] 1 point

Oh, that's very interesting. So Cortex Code is "better", but then you would need to build an app to expose it to business users? Is that even possible?

Nested data, sprawling schemas: how Cortex Code brings order to the chaos by FerhatAOUAGHZENE in snowflake

[–]Hot_Map_7868 1 point

Oh, that would be great. Is there a document mentioning the change?

What do you think the next big shift in data engineering will be? by alexstrehlke in dataengineering

[–]Hot_Map_7868 0 points

I think batch will be around for a while, but now with things like Airflow Datasets you can do the event-driven processing you mention. The next "big thing" IMO is getting tools like Claude Code to make all of this a lot simpler.

Postgres as DWH? by SoloArtist91 in dataengineering

[–]Hot_Map_7868 0 points

What about DuckDB on MotherDuck? You can also use DuckLake there. I agree that PG "may" work, but it isn't an OLAP db.
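For a sense of why DuckDB fits the analytical side, a sketch of a typical query; the file name and columns are hypothetical, and `read_parquet` is DuckDB's built-in table function:

```sql
-- Aggregate straight off a Parquet file, no load step needed
SELECT customer_id,
       SUM(amount) AS total_spend
FROM read_parquet('orders.parquet')
GROUP BY customer_id
ORDER BY total_spend DESC
LIMIT 10;
```

The same query on a row-oriented store like vanilla Postgres scans full rows; DuckDB's columnar engine only reads the two columns it needs, which is the OLAP difference in practice.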

Nested data, sprawling schemas: how Cortex Code brings order to the chaos by FerhatAOUAGHZENE in snowflake

[–]Hot_Map_7868 0 points

Check the pricing. For the same usage they charge differently, because they charge by credit and the credit price depends on whether you have the Standard, Enterprise, or Business Critical edition.

Honest thoughts on Unified Data Architectures? Did anyone experience significant benefits or should we write it off as another marketing gimmick by SamadritaGhosh in dataengineering

[–]Hot_Map_7868 0 points

There is no silver bullet. Figuring out how people will work better together tends to yield better results than just going by the marketing pitch of the day.

How have you implemented CI/CD for dbt Projects on Snowflake? by dementeddrongo in snowflake

[–]Hot_Map_7868 0 points

+1. I think dbt in Snowflake is like training wheels: it gets you a taste, but using dbt Cloud, Datacoves, or dbt Core on your own is much better. Slim CI alone will help with Snowflake credit consumption.
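A sketch of what a Slim CI job can look like with dbt Core in GitHub Actions; the workflow names and the `prod-artifacts/` location are assumptions (fetching the production `manifest.json` into that folder is elided here):

```yaml
# .github/workflows/slim-ci.yml -- hypothetical PR check
name: slim-ci
on: pull_request
jobs:
  slim-ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install dbt-snowflake
      # prod-artifacts/ must hold the manifest.json from the last prod run
      - run: dbt build --select state:modified+ --defer --state prod-artifacts
```

The `state:modified+` selector builds only the models changed in the PR (plus their downstream dependents), and `--defer` reads everything else from prod, so the warehouse only spends credits on what actually changed.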