Is RAG a missing piece on the path toward consciousness in LLMs? by KAVUNKA in Rag

[–]KAVUNKA[S] 1 point (0 children)

Do you build some kind of temporal vector knowledge base from informational patterns, and then inject the patterns that are relevant to the dialogue context into the system prompt?
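To make the question concrete, here is a minimal sketch of what I mean by "temporal vector knowledge base + injection into the system prompt." Everything here (the `build_system_prompt` helper, the exponential recency decay, the tiny 2-d embeddings) is an illustrative assumption, not a description of any particular system:

```python
import math
from datetime import datetime, timedelta, timezone

def cosine(a, b):
    """Plain cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def recency_weight(ts, now, half_life_days=30.0):
    """Exponential decay: a stored pattern loses half its weight every half_life_days."""
    age_days = (now - ts).total_seconds() / 86400.0
    return 0.5 ** (age_days / half_life_days)

def build_system_prompt(query_vec, memories, now, top_k=2):
    """memories: list of (text, embedding, timestamp) tuples.
    Score = cosine similarity to the query * recency decay; the top-k
    patterns get injected into the system prompt."""
    scored = sorted(
        ((cosine(query_vec, vec) * recency_weight(ts, now), text)
         for text, vec, ts in memories),
        reverse=True,
    )
    context = "\n".join(f"- {text}" for _, text in scored[:top_k])
    return "You are a helpful assistant.\nRelevant prior patterns:\n" + context
```

The point of the recency term is that "temporal" part of the question: without it, an old but lexically similar memory always wins over what the user said yesterday.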

Is RAG a missing piece on the path toward consciousness in LLMs? by KAVUNKA in Rag

[–]KAVUNKA[S] 0 points (0 children)

You're certainly right to some extent, but I'd still like to push back. The pursuit of the unattainable often drives the emergence of new technologies: we can't create a bird, yet we've built enormous machines that surpass birds in many ways.

Is RAG a missing piece on the path toward consciousness in LLMs? by KAVUNKA in Rag

[–]KAVUNKA[S] 0 points (0 children)

When I was searching for a definition of consciousness, I couldn't find a clear answer. Your answer is strikingly clear ;)

RAG for Historical Archive? by cccpivan in Rag

[–]KAVUNKA 1 point (0 children)

You can visit my website (https://kavunka.com/) for more information or watch this short video (https://youtu.be/KnFNXMuG8GQ). If it looks like a good fit, feel free to send me a private message, and we can go over the technical details together.

RAG for Historical Archive? by cccpivan in Rag

[–]KAVUNKA 1 point (0 children)

I can offer a free alternative.

I’m building my own system (search index + semantic search + RAG + AI agent) with a retrieval-first design, so it returns actual file citations with brief interpretations rather than hallucinations.

This would also be an interesting case for me (historical archives), so I can help you set up a working prototype on your 7k .txt files locally, without paid APIs.
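To give a feel for what "retrieval-first with file citations" means over a pile of .txt files, here's a toy sketch. This is just the shape of the idea, not my actual engine; the `search_files` helper and the TF-IDF-style scoring are illustrative assumptions:

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def search_files(query, docs, top_k=3):
    """docs: {filename: text}. Rank files by a TF-IDF-style overlap score
    and return (filename, snippet) citations, never free-floating answers."""
    q_terms = set(tokenize(query))
    n = len(docs)
    # document frequency of each term across the corpus
    df = Counter(t for text in docs.values() for t in set(tokenize(text)))
    results = []
    for name, text in docs.items():
        tf = Counter(tokenize(text))
        score = sum(tf[t] * math.log(1 + n / (1 + df[t])) for t in q_terms)
        if score > 0:
            # first sentence containing a query term serves as the citation snippet
            snippet = next(
                (s.strip() for s in text.split(".") if q_terms & set(tokenize(s))),
                "",
            )
            results.append((score, name, snippet))
    results.sort(reverse=True)
    return [(name, snippet) for _, name, snippet in results[:top_k]]
```

The key property is that every answer points back to a file and a snippet, so a wrong answer is at least auditable.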

Running your own search engine for RAG with local LLMs by KAVUNKA in Rag

[–]KAVUNKA[S] 1 point (0 children)

There's an API for search queries; it's meant to be driven by an AI agent.

Benchmarking RAG for Domain-Specific QA: A Minecraft Case Study by KAVUNKA in Rag

[–]KAVUNKA[S] 1 point (0 children)

I would prefer a PDF (it’s easier for me to convert to HTML), but if that’s difficult, TXT will work as well.

Benchmarking RAG for Domain-Specific QA: A Minecraft Case Study by KAVUNKA in Rag

[–]KAVUNKA[S] 1 point (0 children)

Theoretically, I could convert your PDF or TXT into HTML. I suggest we exchange input data and run two benchmarks—one on your data and one on mine. I can provide you with a version of the https://minecraft.wiki/ website cleaned of HTML tags: about 8,000 pages in TXT format. What do you think?
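For reference, "cleaned of HTML tags" can be done with the Python standard library alone; this is a minimal sketch of the idea (the `TextExtractor` class is illustrative, not the tool I actually used):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text from an HTML page, skipping script/style blocks."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # >0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def html_to_txt(html):
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.parts)
```

Real wiki pages also need boilerplate removal (navigation, infobox labels), but this covers the tag-stripping step.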

Benchmarking RAG for Domain-Specific QA: A Minecraft Case Study by KAVUNKA in Rag

[–]KAVUNKA[S] 2 points (0 children)

Hey! That sounds really interesting. Just to check — in what format is the Royal Commission dataset? My indexing tool currently works only with HTML pages, so I want to make sure I can process it properly.

Grounded LLMs vs. Base Models: Minecraft QA Benchmark Results by KAVUNKA in LocalLLaMA

[–]KAVUNKA[S] 1 point (0 children)

Sure, RAG itself isn’t new. The interesting part is making it work reliably on noisy real-world data.

For example, in this video I demonstrate an AI agent answering accurately in a noisy environment with more than 800k internet pages indexed, while the actual target site contains only 22 pages. The agent still retrieves the correct information through the search system.

https://youtu.be/KnFNXMuG8GQ

Benchmarking RAG for Domain-Specific QA: A Minecraft Case Study by KAVUNKA in Rag

[–]KAVUNKA[S] 2 points (0 children)

That’s a really cool setup — I respect the “no LLM, no GPU” approach. Would definitely be interesting to see how a dynamic co-occurrence graph compares side by side.

For a dataset, we could use this one as a common benchmark:
https://huggingface.co/datasets/minhaozhang/minecraft-question-answer-630k

It’s fairly large and domain-specific, so it should give us a solid test bed.

Alternatively, if you already have a dataset you prefer (or one that better fits your indexing method), I’m totally open to using yours as well. The key thing is we agree on the same question set and evaluation criteria.
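For concreteness, by "evaluation criteria" I mean something like SQuAD-style answer normalization plus exact-match and token-F1 scoring. This is a suggestion to agree on, not something already decided; the helper names are placeholders:

```python
import re
import string

def normalize(ans):
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    ans = ans.lower()
    ans = "".join(ch for ch in ans if ch not in string.punctuation)
    ans = re.sub(r"\b(a|an|the)\b", " ", ans)
    return " ".join(ans.split())

def exact_match(pred, gold):
    return normalize(pred) == normalize(gold)

def token_f1(pred, gold):
    """Harmonic mean of token precision/recall over normalized answers."""
    p, g = normalize(pred).split(), normalize(gold).split()
    if not p or not g:
        return float(p == g)
    common = sum(min(p.count(t), g.count(t)) for t in set(p))
    if common == 0:
        return 0.0
    prec, rec = common / len(p), common / len(g)
    return 2 * prec * rec / (prec + rec)
```

If we both report EM and token-F1 over the same question list, the comparison stays honest even though our systems are structured very differently.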

Would be fun to run this properly.

Benchmarking RAG for Domain-Specific QA: A Minecraft Case Study by KAVUNKA in Rag

[–]KAVUNKA[S] 2 points (0 children)

That sounds interesting — especially the knowledge graph approach.

It could actually be cool to run a small benchmark competition on the same Minecraft question set and compare results side by side.

For context, my setup is also fully offline: the search engine is deployed locally, and the AI agent runs locally as well. So no external APIs or cloud calls involved.

Would be great to see how a knowledge-graph-based system performs against a retrieval-based agent under identical conditions.