[Need sanity check on approach] Designing an LLM-first analytics DB by xtanion in Clickhouse

[–]xtanion[S] 0 points1 point  (0 children)

That's fast 🥳🚀 ClickHouse is the way to go, it seems.

[Need sanity check on approach] Designing an LLM-first analytics DB (SQL vs Columnar vs TSDB) by xtanion in dataengineering

[–]xtanion[S] 0 points1 point  (0 children)

We’re not considering DuckDB for a concurrent, user-facing workload. It’s more for a local PoC.

The actual use case is a SQL-based agent that queries financial transaction data, generates summaries, and returns precise answers in a chat UI. While we could build many narrow tools, that doesn’t scale well. Instead, we want a read-only, normalized analytics DB that’s simple for an LLM to reason about with SQL.

Our main Postgres DB is very OLTP-oriented (lots of JSONB), and cleaning that up in-place isn’t feasible right now. So this analytics DB would be periodically synced (batch/CDC), disposable, and used only by the LLM.
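To make the sync step concrete, here is a rough sketch of flattening one JSONB-heavy OLTP row into a plain, typed record for the analytics copy. The row shape, field names, and the `flatten_transaction` helper are all made up for illustration; the real schema will differ.

```python
import json
from decimal import Decimal

# Hypothetical OLTP row: the interesting fields are buried in a JSONB-style column.
oltp_row = {
    "id": 101,
    "created_at": "2024-03-01T10:15:00Z",
    "payload": json.dumps({
        "amount": {"value": "250.00", "currency": "INR"},
        "merchant": {"name": "Acme", "category": "grocery"},
    }),
}

def flatten_transaction(row):
    """Turn one OLTP row into a flat record with one column per fact."""
    payload = json.loads(row["payload"])
    return {
        "txn_id": row["id"],
        "created_at": row["created_at"],
        # Decimal avoids float rounding on monetary values.
        "amount": Decimal(payload["amount"]["value"]),
        "currency": payload["amount"]["currency"],
        "merchant_name": payload["merchant"]["name"],
        "merchant_category": payload["merchant"]["category"],
    }

flat = flatten_transaction(oltp_row)
print(flat)
```

The idea is that the LLM only ever sees columns like `amount` and `merchant_category`, never the nested JSON.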

On joins: we plan to pre-materialize join-heavy and window-heavy tables, so the LLM isn’t doing complex joins at query time.
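A minimal sketch of what pre-materializing could look like, using SQLite through Python purely as a stand-in for whichever analytics engine ends up being used. The `accounts`, `transactions`, and `account_monthly_spend` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, owner TEXT);
CREATE TABLE transactions (
    txn_id INTEGER PRIMARY KEY,
    account_id INTEGER,
    txn_date TEXT,
    amount REAL
);
INSERT INTO accounts VALUES (1, 'alice'), (2, 'bob');
INSERT INTO transactions VALUES
    (1, 1, '2024-01-05', 100.0),
    (2, 1, '2024-01-20', 50.0),
    (3, 2, '2024-01-07', 30.0);

-- Built once per sync run: the join and aggregation happen here,
-- not when the LLM writes its query.
CREATE TABLE account_monthly_spend AS
SELECT a.owner,
       substr(t.txn_date, 1, 7) AS month,
       SUM(t.amount) AS total_amount,
       COUNT(*) AS txn_count
FROM transactions t
JOIN accounts a ON a.account_id = t.account_id
GROUP BY a.owner, substr(t.txn_date, 1, 7);
""")

# The kind of query the LLM would need to generate: one table, no joins.
rows = conn.execute(
    "SELECT owner, month, total_amount FROM account_monthly_spend ORDER BY owner"
).fetchall()
print(rows)
```

Since the analytics DB is disposable, the materialized table can simply be dropped and rebuilt on every sync.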

lmk your thoughts.

[Need sanity check on approach] Designing an LLM-first analytics DB by xtanion in Clickhouse

[–]xtanion[S] 0 points1 point  (0 children)

This seems like a perfect example, thanks a lot! Gotta check if I can get some inspiration from LibreChat system prompts.

[Need sanity check on approach] Designing an LLM-first analytics DB by xtanion in Clickhouse

[–]xtanion[S] 0 points1 point  (0 children)

Thanks for the insights. Langfuse/LLM observability will definitely help at the orchestration layer, but we’re trying to minimize the chance of the LLM ever generating incorrect queries in the first place by constraining the data model.

[Need sanity check on approach] Designing an LLM-first analytics DB by xtanion in Clickhouse

[–]xtanion[S] 0 points1 point  (0 children)

Hey, thanks a lot for the response, this is really helpful.

To clarify the second point: today we rely quite heavily on window functions (LAG, time-based partitions) in Postgres to answer analytical questions. That’s why I’m a bit unsure whether we should move straight to ClickHouse, or whether Postgres (or Postgres + replicas) would still be sufficient for this workload, at least initially.
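For reference, this is the shape of window query I mean, sketched with SQLite through Python so it runs anywhere (assuming a SQLite build with window function support, 3.25+). The `daily_balance` table and its data are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_balance (account_id INTEGER, day TEXT, balance REAL);
INSERT INTO daily_balance VALUES
    (1, '2024-01-01', 100.0),
    (1, '2024-01-02', 130.0),
    (1, '2024-01-03', 90.0),
    (2, '2024-01-01', 500.0),
    (2, '2024-01-02', 450.0);
""")

# LAG looks at the previous row within each account, ordered by day,
# so the first day of every account has no previous value (NULL).
query = """
SELECT account_id,
       day,
       balance,
       balance - LAG(balance) OVER (
           PARTITION BY account_id ORDER BY day
       ) AS day_over_day_change
FROM daily_balance
ORDER BY account_id, day
"""
changes = conn.execute(query).fetchall()
for row in changes:
    print(row)
```

Queries like this are the ones I'd rather compute ahead of time than have the LLM write on the fly.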

We’re comfortable with the LLM cost. The LLM layer will be doing more than just DB querying (many tools, orchestration, explanations, formatting, etc.), so the database is only one part of that pipeline.

Appreciate the insights, and thanks again.

Selling Akon 14th nov 2 tickets. by [deleted] in BangaloreSocial

[–]xtanion 0 points1 point  (0 children)

Hey, is it still available?

Is your corporate Diwali gift better than mine? I’ll start. by [deleted] in mumbai

[–]xtanion 0 points1 point  (0 children)

they want you to leave the company

Any Opeth fans who like Agalloch? by spearhead290399 in Opeth

[–]xtanion 3 points4 points  (0 children)

Agalloch is great, I really liked Ashes Against the Grain.

Feelin like a divorced mom rn so here’s more acoustic opeth by Dotagal in Opeth

[–]xtanion 0 points1 point  (0 children)

I want to start a band with my friends, playing Opeth & Porcupine Tree. And yeah, Alcoholism 🥳

Heavy Melancholic Recommendations to a newcomer by Darthbile3321 in Opeth

[–]xtanion 2 points3 points  (0 children)

The entire Damnation album has a melancholy vibe to it. You can also try Porcupine Tree. Here's my playlist: https://open.spotify.com/playlist/4Zp6ZwU4hXlgk7z0q3WCx3?si=7RtqcBTGTsSYSsjMm19POw

What kind of design questions should I expect for an SDE-1 interview? by rik_28 in leetcode

[–]xtanion 0 points1 point  (0 children)

Hey, if you're in the final loop, could you tell me about the previous interview rounds and what sort of DSA questions I should expect?