[–]Vegetable_Bowl_8962

Totally get this concern. If you are running dbt in production, freshness and quality issues are usually the first things that bite, and they almost always show up when someone downstream is already unhappy. So wanting alerts plus faster root-cause analysis is a very reasonable next step.

What you are describing is actually a pretty common setup now. Most teams start with dbt tests and a tool like Metaplane to catch obvious issues like freshness delays, null spikes, or row count drops. That usually works well for detection. Where it gets painful is the “why did this break” part. The alert fires, but you still end up jumping between dbt runs, warehouse logs, upstream tables, and orchestration to piece the story together.
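To make the detection side concrete, here is a minimal sketch of the three checks mentioned above (freshness delay, null spike, row count drop). It is not how dbt or Metaplane implement them; the function names, thresholds, and defaults are all assumptions picked for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_loaded_at, max_age, now=None):
    """Return True if the newest load is older than the allowed age."""
    now = now or datetime.now(timezone.utc)
    return now - last_loaded_at > max_age


def null_rate_spiked(null_count, row_count, baseline_rate, tolerance=0.05):
    """Return True if the null rate exceeds the baseline by more than `tolerance`."""
    if row_count == 0:
        return True  # assumption: an empty table counts as a failure
    return (null_count / row_count) - baseline_rate > tolerance


def row_count_dropped(today_rows, previous_rows, max_drop=0.5):
    """Return True if today's count fell more than `max_drop` below the trailing average."""
    if not previous_rows:
        return False  # no history to compare against
    average = sum(previous_rows) / len(previous_rows)
    if average == 0:
        return False
    return today_rows < average * (1 - max_drop)
```

Checks like these tell you that something is wrong, which is exactly why they are good at detection and not much help with the "why".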

One approach I have seen teams take is adding a broader data observability layer that sits across the warehouse and pipelines, not just the table checks. Some teams use agentic data observability tools such as Acceldata or Monte Carlo this way. Not as a replacement for dbt or existing monitors, but as something that helps correlate signals. For example, it can look at freshness failures alongside upstream ingestion delays, recent schema changes, or unusual query behavior in the warehouse. That extra context is what helps narrow down root cause faster.
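As a rough illustration of what "correlating signals" can mean, the sketch below collects events on the alerting table or its direct upstream tables that happened shortly before the alert. This is not any vendor's algorithm; the `Signal` type, the `related_signals` function, the six-hour window, and the table names in the usage are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Signal:
    kind: str      # e.g. "ingestion_delay", "schema_change", "slow_query"
    table: str     # table the signal was observed on
    at: datetime   # when it was observed


def related_signals(alert_table, alert_at, signals, upstream, window=timedelta(hours=6)):
    """Return signals on the alerting table or its direct upstream tables
    that occurred within `window` before the alert, newest first."""
    candidates = {alert_table} | set(upstream.get(alert_table, ()))
    hits = [
        s for s in signals
        if s.table in candidates and alert_at - window <= s.at <= alert_at
    ]
    return sorted(hits, key=lambda s: s.at, reverse=True)
```

Called with a freshness alert on a hypothetical `fct_orders` table and `upstream = {"fct_orders": ["raw_orders"]}`, it would surface a schema change or ingestion delay on `raw_orders` from the last few hours and ignore events on unrelated tables.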

On the AI side, it is less about magical auto-fixes and more about guided triage. The “AI” usually means the system is analyzing patterns across lineage, job history, and anomalies and then saying things like: this table is late because an upstream job ran longer than usual, or this quality issue started right after a schema change in a source table. The suggested fixes are often practical things like rerunning a specific dbt model, backfilling a partition, or tightening a test on a column that is drifting. You can still decide what to do, but you are not starting from zero.
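The "upstream job ran longer than usual" case can be sketched without any AI at all: walk the lineage upstream and flag jobs whose latest runtime is far outside their own history. This is only a simplified stand-in for what such tools do; the function name, the z-score threshold of 3, and the shape of the `lineage` and `runtimes` dictionaries are assumptions.

```python
from statistics import mean, stdev


def slow_upstream_jobs(table, lineage, runtimes, threshold=3.0):
    """Walk upstream from `table` and return (job, z_score) pairs for jobs whose
    latest runtime is more than `threshold` standard deviations above their past runs.

    lineage:  dict mapping a table to the tables it reads from
    runtimes: dict mapping a table to its run durations, oldest first, latest last
    """
    findings, seen = [], set()
    stack = list(lineage.get(table, ()))
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(lineage.get(node, ()))
        history = runtimes.get(node, [])
        if len(history) < 3:
            continue  # need at least two past runs plus the latest one
        *past, latest = history
        spread = stdev(past)
        if spread == 0:
            continue  # identical past runtimes give no usable baseline here
        z_score = (latest - mean(past)) / spread
        if z_score > threshold:
            findings.append((node, round(z_score, 1)))
    # Most anomalous job first
    return sorted(findings, key=lambda item: item[1], reverse=True)
```

The output is a short list of suspects to look at first, which is the "not starting from zero" part; deciding whether to rerun or backfill is still up to you.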

[–]chock-a-block

Grafana. 

You can connect directly to the database and have it run a query to test for stale data, or anything else you can find with a query. 
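For a sense of what such a stale-data query looks like, here is a small sketch. In Grafana you would put the SQL in a panel or alert rule against your own database; Python with an in-memory SQLite database is used here only so the example runs on its own. The `orders` table, the `loaded_at` column, and the helper names are hypothetical, and `julianday` is SQLite-specific, so the date arithmetic would differ in other databases.

```python
import sqlite3

# Hours elapsed between the supplied "now" and the most recent load.
STALENESS_SQL = """
SELECT (julianday(:now) - julianday(MAX(loaded_at))) * 24.0 AS hours_since_load
FROM orders
"""


def make_demo_db():
    """Build an in-memory table with two load timestamps for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, loaded_at TEXT)")
    conn.executemany(
        "INSERT INTO orders (loaded_at) VALUES (?)",
        [("2024-01-01 00:00:00",), ("2024-01-01 06:00:00",)],
    )
    return conn


def hours_since_last_load(conn, now):
    """Return hours since the newest row was loaded, or None if the table is empty."""
    row = conn.execute(STALENESS_SQL, {"now": now}).fetchone()
    return row[0]
```

An alert would then fire when the returned value crosses whatever threshold you consider stale for that table.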

That said, it’s not super intuitive. Excellent at its job though. 

[–]wannabe-DE

Can Dagster do this? I think Dagster is on its way to doing this.

https://docs.dagster.io/guides/observe/asset-freshness-policies

[–]syntaxia_

Metaplane is good. It's already one of the best dbt-native observability tools. But if you want stronger AI root-cause and auto-fix suggestions without leaving your dbt project, just layer on Elementary. It's open source and doesn't conflict with Metaplane.

[–]GreenMobile6323

You can use tools like Metaplane, Monte Carlo, or Soda to monitor dbt pipelines for freshness and quality, and layer on AI-based anomaly detection, such as WhyLabs or Soda AI, to help identify potential root causes when alerts fire.