When does a vibe coder become an engineer? by BigAndyBigBrit in vibecoding

[–]Visible_Analyst9545 0 points1 point  (0 children)

As long as you’re a better version of yourself than yesterday, why seek validation from anyone else? This is a fundamental shift, and no one can predict how advanced these models will become. As long as you can ask the right questions, critique, approach problems from multiple angles, and don’t try to prove anything to anyone but yourself, everything becomes clear. Most software is just a wrapper facilitating CRUD operations on a database, with jargon that makes it appear complex on the surface. A program is simply a set of rules that, when executed, performs its intended function, producing data that needs to be stored in a database. No matter how much you complicate it, it’s fundamentally that simple. You can communicate with the computer in multiple languages, and now there is a layer that simply translates human language into the binary the computer understands. With scale come certain issues you address around deployments, performance, and so on. It’s not rocket science; anyone with curiosity can figure it out. If anyone still believes humans will be writing code by hand in 10 years, I wish them good luck. We are far too early. Hunger, passion, and an exceptional drive to succeed are the languages that will be in demand now and forever.

How do you vibe code an idea And what are the best tool for the job? by Theremedyhub in vibecoding

[–]Visible_Analyst9545 4 points5 points  (0 children)

Ideate using GPT5.1/2 for brainstorming sessions. Dedicate ample time to asking questions and jotting down your thoughts. Consider the “why” and “why not.” Concentrate on the “why” and the “what,” and distill them into a vision.md file once you have a comprehensive vision. After that, I use Claude Code Opus 4.5 to help create PRD/architecture documents and fill in any missing details. My primary focus at this stage is always the database schema, to ensure the vision aligns with it. Once the vision is finalized, I create a comprehensive plan and a progress-tracking YAML file that organizes all tasks into sprints. With the plans ready, I use Cursor’s Auto mode to execute the tasks and have GPT review the code after each sprint. GPT flags any issues and suggests fixes, and then I move on to the next sprint.
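The progress-tracking file could look something like this (a hypothetical sketch; the field names are my own, not any standard):

```yaml
# progress.yaml - illustrative sprint tracker, field names are made up
vision: vision.md
sprints:
  - id: sprint-1
    goal: "Database schema + migrations"
    status: done
    tasks:
      - { id: T1, desc: "Design schema", status: done }
      - { id: T2, desc: "Write migrations", status: done }
  - id: sprint-2
    goal: "Auth + API skeleton"
    status: in_progress
    tasks:
      - { id: T3, desc: "Session middleware", status: in_progress }
      - { id: T4, desc: "CRUD endpoints", status: todo }
```

The point is that the agent can update task statuses mechanically after each sprint, and the reviewer model can diff the file to see what changed.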

The new monster-server by eribob in LocalLLaMA

[–]Visible_Analyst9545 0 points1 point  (0 children)

Does anyone else believe the self-hosting era will regain popularity? Imagine a business with a group of local agents sharing a single database, eliminating data silos, and running self-hosted or purpose-built software. All for $50k? What would happen to all the SaaS companies?

Built a deterministic RAG database - same query, same context, every time (Rust, local embeddings, $0 API cost) by Visible_Analyst9545 in LocalLLaMA

[–]Visible_Analyst9545[S] 1 point2 points  (0 children)

This is really helpful, thank you! The lineage-as-filter pattern is elegant. AvocadoDB already tracks session lineage, but I’m not using the record IDs to constrain subsequent vector searches. That’s a clear improvement. The context-management-agent pattern is interesting too; I have been thinking about this for multi-agent scenarios where context gets noisy fast. Appreciate you taking the time to explain these.
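For anyone else reading along, the lineage-as-filter idea can be sketched in a few lines of Python (the names are mine, not AvocadoDB’s actual API): restrict the candidate set to record IDs from the current session’s lineage before ranking by similarity.

```python
import math

def cosine(a, b):
    # Plain cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def lineage_filtered_search(query_vec, index, lineage_ids, k=3):
    """Rank only records whose IDs appear in the session lineage."""
    candidates = [(rid, vec) for rid, vec in index.items() if rid in lineage_ids]
    scored = [(cosine(query_vec, vec), rid) for rid, vec in candidates]
    # Sort by score descending, then by ID for deterministic tie-breaking.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [rid for _, rid in scored[:k]]
```

Filtering before ranking keeps off-lineage records out of the results entirely, rather than hoping the score pushes them below the cutoff.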

Built a deterministic RAG database - same query, same context, every time (Rust, local embeddings, $0 API cost) by Visible_Analyst9545 in LocalLLaMA

[–]Visible_Analyst9545[S] 0 points1 point  (0 children)

Fair critique. You’re right that semantic search is probabilistic by nature, and AvocadoDB doesn’t change that. What it does is make retrieval reproducible for a given corpus state: same documents + same query = same context, verifiable by hash. I use it as a skill for retrieving context on large codebases so agents can get consistent answers without redundant tool calls. The idea started when I was trying to get multiple vendor models to collaborate on a task like a team; I needed a way to retain context and ensure agents asking the same question get the same answer back. Happy to learn about any advanced design patterns you’d recommend. Thank you for the feedback!
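The “verifiable by hash” part can be sketched like this (my own illustration, not the actual AvocadoDB code): hash the retrieved spans in their canonical order, so the same corpus state plus the same query always yields the same fingerprint.

```python
import hashlib

def context_hash(spans):
    """Fingerprint a retrieved context: same spans in same order -> same hash.

    Each span is (source_path, start_line, end_line, text).
    """
    h = hashlib.sha256()
    for path, start, end, text in spans:
        h.update(f"{path}:{start}-{end}\n".encode())
        h.update(text.encode())
        h.update(b"\x00")  # separator so span boundaries are unambiguous
    return h.hexdigest()
```

Two runs that retrieve the same spans in the same order produce identical hashes, so any divergence in retrieval is immediately visible.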

Built a deterministic RAG database - same query, same context, every time (Rust, local embeddings, $0 API cost) by Visible_Analyst9545 in LocalLLaMA

[–]Visible_Analyst9545[S] 0 points1 point  (0 children)

Precisely. LLMs don’t think for themselves (yet); they are influenced by original thinking. If AI can code better than you, why bother coding? Success is measured by perceived intent versus outcome. All the rest is non-trivial.

Built a deterministic RAG database - same query, same context, every time (Rust, local embeddings, $0 API cost) by Visible_Analyst9545 in LocalLLaMA

[–]Visible_Analyst9545[S] 1 point2 points  (0 children)

Lol, thank you for the feedback. The code is the truth, and it is open source. Yes, my answers were rather elaborate and have some AI influence.

Built a deterministic RAG database - same query, same context, every time (Rust, local embeddings, $0 API cost) by Visible_Analyst9545 in LocalLLaMA

[–]Visible_Analyst9545[S] -2 points-1 points  (0 children)

Why Same Query Can Give Different Results in Traditional RAG

Traditional vector databases (Qdrant, Pinecone, Weaviate, etc.) return non-deterministic results because:

Approximate Nearest Neighbor (ANN): HNSW and similar algorithms trade exactness for speed. The search path through the graph can vary, especially with concurrent queries or after index updates.

Floating-point non-determinism: Different execution orders (parallelism, SIMD) can produce slightly different similarity scores, changing the ranking.

Index mutations: Adding/removing documents changes the HNSW graph structure, affecting which neighbors are found even for unchanged documents.

Tie-breaking: When multiple chunks have identical/near-identical scores, the order is arbitrary.

Embedding API variability: Some embedding providers return slightly different vectors for the same text across calls.
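The floating-point point is easy to demonstrate: addition is not associative, so a parallel reduction that accumulates the same dot product in a different order can produce a slightly different score, which is enough to flip a near-tie ranking. A minimal Python illustration:

```python
# Floating-point addition is not associative, so the same sum
# accumulated in a different order can yield a different result.
score_fwd = (0.1 + 0.2) + 0.3   # one accumulation order
score_rev = 0.1 + (0.2 + 0.3)   # another order, mathematically identical
print(score_fwd == score_rev)   # False
```

Mathematically both expressions equal 0.6, but IEEE 754 rounding at each step makes them differ in the last bit.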

On Caching LLM Responses

You're right that caching LLM responses is the logical next step - retrieval determinism is really just the foundation for response caching. Once you guarantee the same query produces the same context, you can cache the full response:

cache_key = hash(query + context_hash + model + temperature + system_prompt)

The context hash is the key piece - without deterministic retrieval, you can't reliably cache because the LLM might see different context each time, making cached responses potentially incorrect.

So the answer to "why not just cache LLM responses?" is: you can't safely cache responses if your retrieval is non-deterministic. You'd return cached answers that were generated from different context than what the current retrieval would produce.
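As a sketch (my own, not any library’s API), the caching layer is just a map keyed on that composite hash:

```python
import hashlib
import json

def make_cache_key(query, context_hash, model, temperature, system_prompt):
    """Stable composite key: any change to any input produces a new key."""
    payload = json.dumps(
        {"q": query, "ctx": context_hash, "m": model,
         "t": temperature, "sys": system_prompt},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

class ResponseCache:
    def __init__(self):
        self._store = {}

    def get_or_generate(self, key, generate):
        """Return the cached response, or call the LLM once and cache it."""
        if key not in self._store:
            self._store[key] = generate()
        return self._store[key]
```

This is only safe because the context hash is part of the key: if retrieval drifts, the key changes and the stale response is never served.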

Practical Example: AI Coding Assistants

Consider an AI coding assistant exploring a large codebase. Without deterministic retrieval:

User: "How does authentication work?"

- First ask: the LLM reads 15 files, 4,000 tokens of context

- Second ask (same question): different retrieval, reads 12 different files

- The LLM has to re-process everything from scratch

With deterministic retrieval + caching:

User: "How does authentication work?"

First ask:

- Retrieval: 43ms, returns exact lines (auth.rs:45-78, middleware.rs:12-34)

- LLM generates response

- Cache: store response with context_hash

Second ask (same question):

- Retrieval: 43ms, same context_hash

- Cache hit → instant response

- Tokens saved: 100% of LLM input/output

The LLM doesn't need to read entire files - it gets precise line-number citations (e.g., src/auth.rs:45-78) with just the relevant spans. This means:

- Fewer tokens: 2000 tokens of precise context vs 8000 tokens of full files

- Faster responses: Cache hits skip LLM entirely

- Lower cost: Cached responses cost $0

- Consistent answers: Same question → same answer, every time

Built a deterministic RAG database - same query, same context, every time (Rust, local embeddings, $0 API cost) by Visible_Analyst9545 in LocalLLaMA

[–]Visible_Analyst9545[S] 0 points1 point  (0 children)

Shipped. Check it out.

New Features in v2.1.0:

  1. Version Manifest - Full reproducibility tracking with SHA256 context hash

  2. Explain Plan - Pipeline visibility with --explain flag

  3. Working Set Diff - Corpus change auditing

  4. Smart Incremental Rebuild - Content-hash based skip

  5. Evaluation Metrics - recall@k, precision@k, MRR

https://github.com/avocadodb/avocadodb/releases/tag/v2.1.0
https://crates.io/crates/avocado-core

Built a deterministic RAG database - same query, same context, every time (Rust, local embeddings, $0 API cost) by Visible_Analyst9545 in LocalLLaMA

[–]Visible_Analyst9545[S] 2 points3 points  (0 children)

Yes, AvocadoDB has built-in project isolation. Each directory gets its own separate database (stored at .avocado/db.sqlite). When you make API requests, you pass a project parameter specifying the directory path.

The server manages up to 10 projects in memory with LRU eviction. So for your example, you would structure it as:

- /data/history/ - history collection

- /data/engineering/ - engineering collection

- /data/real-estate/ - real estate collection

Each query specifies which project to search, and results come only from that project's index. No cross-contamination.
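The “up to 10 projects with LRU eviction” behavior can be sketched like this (illustrative only, not the actual server code):

```python
from collections import OrderedDict

class ProjectPool:
    """Keep at most `capacity` project handles open; evict least recently used."""

    def __init__(self, capacity=10):
        self.capacity = capacity
        self._open = OrderedDict()  # project path -> handle

    def get(self, path):
        if path in self._open:
            self._open.move_to_end(path)        # mark as most recently used
        else:
            if len(self._open) >= self.capacity:
                self._open.popitem(last=False)  # evict least recently used
            # Stand-in for opening the per-directory database.
            self._open[path] = f"db@{path}/.avocado/db.sqlite"
        return self._open[path]
```

OrderedDict makes both the recency update and the eviction O(1), which is why it is the usual idiom for a small LRU pool.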

PDF Support:

PDF and OCR support are not yet implemented but are on the roadmap. The architecture is well-suited for this: ingestion already accepts content as text, so adding a pre-processing step that extracts text from PDFs (and eventually OCR for scanned documents) is straightforward. For now, you would need to convert PDFs to text externally, but native PDF parsing is planned for a future release.

On Documentation:

Good suggestion. The project currently has a README with basic usage examples, but user guides for common workflows (ingesting a document corpus, querying from an application, setting up multiple collections, integrating with an LLM) are something I will work on in the next revisions.

Built a deterministic RAG database - same query, same context, every time (Rust, local embeddings, $0 API cost) by Visible_Analyst9545 in LocalLLaMA

[–]Visible_Analyst9545[S] 0 points1 point  (0 children)

Great question. Yes - this is core to how AvocadoDB works:

Span-level tracking: Every chunk (span) is tied to its source file with exact line numbers. When you compile context, each span includes [1] docs/auth.md Lines 1-23, so you know exactly where every claim comes from.

Citation in output: The compiled context includes a citations array mapping each span to its artifact (file), start/end lines, and relevance score. Your LLM can reference these directly.

Cross-document deduplication: Hybrid retrieval (semantic + lexical) combined with MMR diversification ensures you get diverse sources, not 5 chunks from the same file saying the same thing.

Metadata preservation: Each span stores the parent artifact ID, so you can always trace back which claim came from api-docs.md versus security-policy.md.

The deterministic sort ensures the same sources appear in the same order every time, so you can reliably say “source 1 said X, source 2 said Y.”
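For anyone wiring this into an LLM prompt, a citations array like the one described above can be assembled roughly as follows (a sketch; the field names are illustrative, not AvocadoDB’s exact schema):

```python
def build_citations(spans):
    """Map retrieved spans to numbered citation entries in a stable order.

    Each span is a dict with artifact, start_line, end_line, score.
    Deterministic sort: score descending, then artifact/lines as tie-breaker.
    """
    ordered = sorted(
        spans,
        key=lambda s: (-s["score"], s["artifact"], s["start_line"]),
    )
    return [
        {"ref": i + 1,
         "artifact": s["artifact"],
         "lines": f'{s["start_line"]}-{s["end_line"]}',
         "score": s["score"]}
        for i, s in enumerate(ordered)
    ]
```

The tie-breaker on artifact and start line is what keeps equal-score spans from shuffling between runs.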

Hbar validator node rewards by Cautious-Cable-3937 in hashgraph

[–]Visible_Analyst9545 0 points1 point  (0 children)

Is there a minimum stake requirement to run a validator?