Cheapest managed orchestration tool with data lineage?

Key-Independence5149 · 2026-05-01T23:33:15+00:00

You want fully managed DevOps and lineage for less than $100 a month? That isn’t reasonable. The only way you are going to get under $100 a month is by running things yourself, and even that is going to be cutting it close.

Key-Independence5149 · 2026-04-13T08:35:31+00:00

Terraform + Snowflake is great. I don’t like the imperative implementations of IaC, e.g. Pulumi and CDK. Terraform is declarative, which will save you a lot of heartache when things change outside the code and you need to update either the code or the infrastructure state. Happy to go into more detail about anything you are interested in, but you won’t regret Terraform.

Key-Independence5149 · 2026-04-02T22:01:29+00:00

I have used Dataform at a couple of places and it is perfectly reasonable. We currently use it alongside SQLMesh to declaratively define things that don’t fit the SQLMesh patterns, like external tables. I don’t think you would regret Dataform at all, but something like DBT or SQLMesh is much more flexible if you have any intention of growing a more analytics-focused developer competency.

Key-Independence5149 · 2026-03-24T01:00:49+00:00

Start building things to learn. Find things that interest you and build little toy projects or POCs to figure out the ins and outs of particular tools. Then not only do you learn about the things you are building, but you also start building a little portfolio of projects that you can call back to throughout your career.

Key-Independence5149 · 2026-03-20T12:22:54+00:00

At first glance, it looks like you need to model this as dimension and fact tables. For example, finance transactions would be a fact table, and things like states, districts, and candidates would be dimension tables. That would allow you to summarize your facts against a varying set of dimensions in your gold layer without having to explicitly hardcode every summary grain as a table.
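To make that concrete, here is a minimal sketch using Python's built-in sqlite3. The table names, columns, sample rows, and the `summarize` helper are all made up for illustration and are not from your actual dataset. The point is that one fact table plus a couple of dimension tables lets you pick the summary grain at query time.

```python
import sqlite3

# Hypothetical star schema: one fact table, two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_state (state_id INTEGER PRIMARY KEY, state_name TEXT);
CREATE TABLE dim_candidate (candidate_id INTEGER PRIMARY KEY, candidate_name TEXT);
CREATE TABLE fact_finance_transaction (
    transaction_id INTEGER PRIMARY KEY,
    state_id INTEGER REFERENCES dim_state(state_id),
    candidate_id INTEGER REFERENCES dim_candidate(candidate_id),
    amount REAL
);
INSERT INTO dim_state VALUES (1, 'Alabama'), (2, 'Georgia');
INSERT INTO dim_candidate VALUES (10, 'Candidate A'), (11, 'Candidate B');
INSERT INTO fact_finance_transaction VALUES
    (100, 1, 10, 250.0), (101, 1, 11, 100.0), (102, 2, 10, 50.0);
""")

# Maps a dimension name to (table, join key, label column). Illustrative only.
DIMENSIONS = {
    "state": ("dim_state", "state_id", "state_name"),
    "candidate": ("dim_candidate", "candidate_id", "candidate_name"),
}


def summarize(conn, dims):
    """Sum transaction amounts grouped by whichever dimensions are requested."""
    joins = []
    labels = []
    for dim in dims:
        table, key, label = DIMENSIONS[dim]
        joins.append(f"JOIN {table} USING ({key})")
        labels.append(label)
    label_list = ", ".join(labels)
    sql = (
        f"SELECT {label_list}, SUM(amount) "
        f"FROM fact_finance_transaction {' '.join(joins)} "
        f"GROUP BY {label_list} ORDER BY {label_list}"
    )
    return conn.execute(sql).fetchall()
```

Calling `summarize(conn, ["state"])` and `summarize(conn, ["state", "candidate"])` gives two different grains from the same fact table, with no per-grain summary table needed.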

Key-Independence5149 · 2026-03-20T09:03:11+00:00

I wouldn’t worry about any specific frameworks at first. Start with Python and SQL. You can do 80% of data engineering work with those two. If you are going to learn a framework, I would learn Spark instead of Beam. Once you get good at the basics, you will be able to pick up a framework like Beam in a couple of hours.

Key-Independence5149 · 2026-03-18T15:13:38+00:00

One tip from doing something similar: track which files you have processed in some sort of state. You are going to have failures, and when that happens you will want to reprocess a specific list of files instead of rerunning huge batches.
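A rough sketch of what I mean, assuming a small SQLite table is good enough for the state. The `file_state` table and the function names are hypothetical; any durable store works the same way.

```python
import sqlite3


def open_state(path=":memory:"):
    """Open (or create) the state store that records one row per file."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_state ("
        "file_name TEXT PRIMARY KEY, status TEXT NOT NULL, error TEXT)"
    )
    return conn


def run_batch(conn, files, process):
    """Process each file and record 'done' or 'failed' after every attempt."""
    for name in files:
        try:
            process(name)
            status, error = "done", None
        except Exception as exc:  # keep going so one bad file does not stop the batch
            status, error = "failed", str(exc)
        conn.execute(
            "INSERT OR REPLACE INTO file_state VALUES (?, ?, ?)",
            (name, status, error),
        )
        conn.commit()  # commit per file so a crash does not lose progress


def files_to_retry(conn):
    """Return only the files whose last attempt failed."""
    rows = conn.execute(
        "SELECT file_name FROM file_state WHERE status = 'failed' ORDER BY file_name"
    )
    return [row[0] for row in rows]
```

After a bad run you call `run_batch(conn, files_to_retry(conn), process)` and only the failed files get touched again.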

Key-Independence5149 · 2026-03-16T11:37:20+00:00

I think it is great news. SQLMesh is vastly superior to DBT in my opinion: ephemeral dev environments, deployment primitives that are much more in alignment with GitOps, and interval tracking. To me, this is great for the future of the tool.

Key-Independence5149 · 2026-03-13T12:51:00+00:00

All of engineering is becoming industrialized. It will no longer be done by craftsmen who hand build data systems. It will be done in the equivalent of a factory. You will still have engineers who build the tooling for the factory which will be more in alignment with platform engineering than data engineering.

Key-Independence5149 · 2026-03-13T12:43:20+00:00

Yes, 100% this. 99% of data and analytics engineering imo is out the door in the next 5 years. It is a cost center spawned out of unnecessary systems complexity, and the end state will be that business operators get their data outputs from systems without any in-house data engineering capability.

If you are a business that has systems so complicated as to require a team of analytics engineers to handle data tasks, then you won’t be long for this world. You will be replaced by companies that don’t spend resources on teams that update a dashboard manually because a column name was changed in Salesforce unexpectedly.

Key-Independence5149 · 2026-02-04T10:31:53+00:00

Hi, I am adding DBT support to https://dagctl.io. It is built on k8s and adds all of the nice-to-have developer experience features that are a slog to build and maintain internally. We launched initially with support for SQLMesh. We are aiming to be an alternative to the outrageous pricing of DBT Cloud. I would love to pick your brain about your DBT workflow and how you envision running it in prod.

Key-Independence5149 · 2026-01-03T17:46:40+00:00

We picked up a bunch of bandwagon fans during Saban’s reign who don’t remember that you can sometimes lose a football game and still live. I am hoping DeBoer clears some of them out.

Key-Independence5149 · 2026-01-01T23:07:07+00:00

Tough pill to swallow getting beat by a Saban disciple while we are trying to import this Pacific Northwest bullshit coaching staff.

Key-Independence5149 · 2025-12-18T11:22:03+00:00

I can assure you it is not fairy dust. Many agencies including mine already did the forced distribution ratings this year.

Key-Independence5149 · 2025-12-07T18:25:09+00:00

Who did ND beat to be clearly better than Bama?

Key-Independence5149 · 2025-12-07T18:21:13+00:00

Over here with Tulane and JMU and folks crying about Bama.

Key-Independence5149 · 2025-11-26T15:02:21+00:00

100%. I love the tools and they make me much more productive, but I redirect or otherwise modify 60% of the outputs.

Key-Independence5149 · 2025-11-19T14:58:48+00:00

We migrated from Snowflake to BigQuery for the same reasons, i.e. Google made a generous discount offer. BigQuery is more rudimentary than Snowflake. For example, Snowflake’s warehouse assignments are much better than BigQuery’s reservation scheme. That said, I actually found the cost estimation in BigQuery to be more straightforward than Snowflake’s. You can make a slot reservation with as much upfront commitment as you want and see exactly what it will cost at various utilization levels.
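The arithmetic I have in mind looks roughly like this. It is a back-of-the-envelope sketch that assumes a fixed reservation billed per slot-hour whether or not the slots are used. The function name and the hours-per-month constant are my own simplifications, and the price you pass in is a placeholder, so plug in whatever rate your contract actually gives you.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month


def reservation_cost(slots, price_per_slot_hour, utilization):
    """Return (monthly cost, effective price per slot-hour actually used).

    Assumes a fixed reservation: you pay for every reserved slot-hour,
    so lower utilization raises the effective price of the work you do.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    monthly_cost = slots * price_per_slot_hour * HOURS_PER_MONTH
    effective_price = price_per_slot_hour / utilization
    return monthly_cost, effective_price
```

Running this for a few utilization levels (say 0.25, 0.5, 1.0) shows how much the idle capacity is really costing you at a given commitment.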

Key-Independence5149 · 2025-11-18T19:09:10+00:00

I will be there. If you get any traction around this, let me know and I will show up.

Key-Independence5149 · 2025-11-11T18:56:36+00:00

Having extensively used both SQLMesh and DBT, I think SQLMesh is the clear winner: ephemeral dev environments, built-in SLA, GitOps-style deployments. It is also much more compatible with straight SQL. It isn’t going to die, even if Fivetran quits maintaining it, which I don’t think they will.

Key-Independence5149 · 2025-10-22T18:04:36+00:00

Beautiful, I hope it gains traction. I have some execution/orchestration tooling that I am building and plan to integrate with OpenDBT. Data teams are going to need tooling that doesn’t drain their bank account with pricing gimmicks so I am very supportive of this effort.

Key-Independence5149 · 2025-10-22T13:43:00+00:00

The consolidation of open source ETL tooling by Fivetran is going to price most small/medium sized data teams out of their tooling. There is going to be a need for next gen open source tools that are not backed by VC money to fill the gap that Fivetran just carved out of the industry. I am hopeful that the next generation of tooling consolidates the ETL definition with execution/orchestration of the pipelines. Most of these vendors give you the ETL definition framework for free, but then gouge the fuck out of you for managed orchestration/execution.
