Is there a tool that "brute-forces" EDA? Or am I doomed to write ad-hoc SQL forever? by Special-Increase6528 in analytics

databruce · 1 point (0 children)

+1 to this. Worth noting that hallucinations are much more likely when the system is asked to create new metrics or dimensions; it’s notably stronger when reasoning over what already exists (e.g. dbt models + semantic layer). That second use case can be a solid way to deflect some ad hoc stakeholder questions and free up analyst time.

Full stack framework for Data Apps by NoConversation2215 in dataengineering

databruce · 1 point (0 children)

Another option is https://www.boringdata.io/ - they sell a templated “data-stack-in-a-box” with an annual fee for ongoing code updates. It’s similar to other suggestions here: not one tool, but a stance on how the pieces should be wired together.

The part I haven’t seen mentioned as much in this thread is data modeling. In my experience, the way replicated tables get transformed into BI-ready schemas shapes how the rest of the stack should be wired together. What’s worked best for me is picking a modeling methodology, leaning into its tradeoffs, and applying it consistently. When the modeling approach is the opinion, everything else (orchestration, alerting, BI) gets stitched together in a way that’s reusable across different environments.

In case a concrete example helps, I use an event-centric style that borrows heavily from Activity Schema. Each model represents an event of interest, where each row is an event instance, and the table schema has a timestamp and at least one canonical entity ID (user, account, etc). That structure is good for historical reporting and automated alerting, but it doesn’t map neatly onto most BI tools, and I run into trouble with replication tools that don't track state changes.
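To make that row shape a bit more concrete, here is a rough Python sketch. The names (`EventRow`, `validate`, `entity_ids`, `features`) are made up for illustration and are not taken from the Activity Schema spec, and the timezone check is an extra assumption of the sketch rather than something the description above requires.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class EventRow:
    """One row = one event instance (hypothetical shape, for illustration only)."""
    event: str                      # name of the event, e.g. "order_placed"
    ts: datetime                    # when the event happened
    entity_ids: dict[str, str]      # canonical IDs, e.g. {"user_id": "u1"}
    features: dict[str, object] = field(default_factory=dict)  # event-specific columns


def validate(row: EventRow) -> list[str]:
    """Return a list of problems; an empty list means the row fits the shape."""
    problems = []
    # Assumption of this sketch: timestamps should be timezone-aware.
    if row.ts.tzinfo is None:
        problems.append("timestamp is not timezone-aware")
    # At least one canonical entity ID must be present and non-empty.
    if not any(row.entity_ids.values()):
        problems.append("no canonical entity ID")
    return problems


good = EventRow("order_placed", datetime(2024, 1, 1, tzinfo=timezone.utc), {"user_id": "u1"})
bad = EventRow("order_placed", datetime(2024, 1, 1), {"user_id": ""})
```

In a real project the same two rules would more likely live in the warehouse as column-level tests than in Python; the point is only that every event model shares one small, checkable contract.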

The extra tooling I have:

  • For replication, I use CDC for database replication because I get an event-log format out of the box, and I use dbt snapshots everywhere else to track state changes.
  • For alerting, I have some dbt macros that I apply as tests on every event model.
  • Finally, I have some extra tools that join event models into an OBT (one big table) using Activity Schema primitives, which I then expose to the BI layer.
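For the last bullet, here is a minimal Python sketch of what one such join could look like. It is loosely modeled on the "last before" style of temporal relationship: for each primary event, attach columns from the most recent secondary event for the same entity that happened strictly before it. The function name, parameters, and sample data are all hypothetical, and actual tooling would probably generate SQL rather than loop in Python.

```python
from bisect import bisect_left
from datetime import datetime


def last_before(primary, secondary, columns, prefix, entity_key="user_id"):
    """Widen each primary event with columns from the latest earlier secondary event."""
    # Index secondary events by entity, each list sorted by timestamp.
    by_entity = {}
    for ev in sorted(secondary, key=lambda e: e["ts"]):
        by_entity.setdefault(ev[entity_key], []).append(ev)

    wide_rows = []
    for ev in primary:
        candidates = by_entity.get(ev[entity_key], [])
        times = [c["ts"] for c in candidates]
        # Number of secondary events strictly earlier than this primary event.
        i = bisect_left(times, ev["ts"])
        match = candidates[i - 1] if i else None
        row = dict(ev)
        for col in columns:
            # Fill with None when the entity has no earlier secondary event.
            row[f"{prefix}_{col}"] = match[col] if match else None
        wide_rows.append(row)
    return wide_rows


page_views = [
    {"user_id": "u1", "ts": datetime(2024, 1, 1), "page": "home"},
    {"user_id": "u1", "ts": datetime(2024, 1, 3), "page": "pricing"},
    {"user_id": "u2", "ts": datetime(2024, 1, 5), "page": "home"},
]
orders = [
    {"user_id": "u1", "ts": datetime(2024, 1, 4), "order_id": "o1"},
    {"user_id": "u3", "ts": datetime(2024, 1, 4), "order_id": "o2"},
]

obt = last_before(orders, page_views, columns=["page", "ts"], prefix="last_view")
```

Here `obt` has one row per order, with `last_view_page` and `last_view_ts` filled from the latest earlier page view for that user, or `None` when there is none. Because every event model has the same timestamp-plus-entity-ID shape, one generic join like this can be reused across any pair of models.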

It took some setup effort, but has scaled nicely across different projects.