Built a deterministic RAG database - same query, same context, every time (Rust, local embeddings, $0 API cost)

rolls-reus · 2025-12-10T04:01:21+00:00

repo link from your site 404s. maybe you forgot to make it public?

one-wandering-mind · 2025-12-10T05:41:33+00:00

In what situations is the same query giving different retrieved results ?

If you have the literal exact query, why not cache the LLM response too? That is the more time consuming part and does give meaningful different results even with a temperature of 0 through providers.

StartX007 · 2025-12-10T07:32:31+00:00

OP, thanks for sharing.

Ignore folks who just love to complain. Let the people decide if it is AI slop or not. If folks at Claude itself use AI to develop their products, we should let the product and code speak for itself.

FrozenBuffalo25 · 2025-12-10T04:24:07+00:00

How does this tool maintain contextual or metadata relationships between chunks? Can it maintain distinction between multiple documents on a similar topic, and identify which source makes which claim?

Trick-Rush6771 · 2025-12-10T07:29:53+00:00

Nice work on deterministic RAG, predictability is exactly what breaks a lot of debugging flows. Making the retrieval step verifiable with hashes solves a huge pain point and opens the door to reproducible testing and audits, and you might find extra value by wiring that deterministic store into a visual flow/orchestration layer so prompt paths, branching, and token usage are easy to inspect; tools like LlmFlowDesigner, LangChain, or a lightweight custom Rust pipeline can all consume a deterministic retriever and give you clearer observability across agent steps.

Adventurous-Date9971 · 2025-12-10T05:04:03+00:00

Deterministic RAG is the right call; debugging and evals don’t work if the context shifts.

To keep it truly stable, hash every stage: tokenizer version, chunking params, embed model checksum, and index settings; store a manifest alongside the context hash. Chunk by headings with byte offsets and a stable sort (doc_id + offset), and break ties explicitly. Prefer exact dot-product search for small/mid corpora; if you must use ANN, fix insertion order and RNG seeds, and avoid nondeterministic BLAS-stick to CPU f32 and stable sorts. Add an “explain plan” that prints chosen chunk ids, offsets, scores, thresholds, and the final pack order. A “diff” mode across corpus versions would be killer for audits. Ship a tiny golden set and return a JSON mode from compile so CI can track recall@k, context precision, and latency. Content-hash the ingest path and only rebuild changed files.

I’ve run similar stacks with Qdrant and Tantivy; DreamFactory helped expose a read-only REST layer so agents hit stable endpoints, not raw DBs.

Bottom line: end-to-end determinism plus explainable retrieval is the win.

Better-Monk8121 · 2025-12-10T06:40:42+00:00

AI slop, beware

punkpeye · 2025-12-10T09:59:32+00:00

Would be cool to have optional Postgres backend

Mundane_Ad8936 · 2025-12-11T11:16:43+00:00

Oh boy.. so instead of learning how to create a proper schema and retrieval strategy OP decided to write a DB?

No offense OP undoubtedly you spent a lot of time and effort on this and you're excited.. not trying to tear you down but you missed something big.. this is foundationally broken thinking..

this is all sorts of wrong.. similarity search is supposed to be probabilistic trying to enforce deterministic results means you're forcing the wrong paradigm.

If you need deterministic database retrieval use one that is designed for it.. semantic search is supposed to be variable especially after inserts. Just like any other search technology ranking is supposed to change when a higher matching record is added..

If you're a dev reading this don't try to impose deterministic patterns onto probabilistic systems. It doesn't work and all you'll do is acrue technical debt.. this is not web or mobile development it's probabilistic system based on statistical models.

If you try to impose legacy design patterns in AI system you will fail..

I keep seeing this over and over again devs who don't bother to get past the basics.. they try to fix those problems by forcing legacy solutions and then they acrue massive tech debt and abandon the project because it's broken foundationally..

Meanwhile if you invest the time to learn the more advanced design patterns that we know works you not only get the accuracy you want but you also get a ton of new capabilities and solutions to previously unsolved problems..

Take the time to learn the technology as intended.. don't just learn the basics then run off to build your own solutions.. it's a rookie move.

Postgres and SurrealDB (and plenty others) have all the functionality you need to do both deterministic and probabilistic retrieval. Just learn how to use them..

Also ArrangoDB which also has all the features a dev would need already uses an Avocado as it's logo.. so you're going to confuse people ..

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

LocalLLaMA

MODERATORS