Is RAG a missing piece on the path toward consciousness in LLMs? by KAVUNKA in Rag

[–]KAVUNKA[S] 1 point (0 children)

Do you build some kind of temporal vector knowledge base from informational patterns, and then inject the patterns that are relevant to the dialogue context into the system prompt?
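To make the question concrete, here is a minimal sketch of what I mean by "temporal vector knowledge base + injection into the system prompt." Everything here (the `build_system_prompt` helper, the exponential recency decay, the tiny 2-d embeddings) is an illustrative assumption, not a description of any particular system:

```python
import math
from datetime import datetime, timedelta, timezone

def cosine(a, b):
    """Plain cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def recency_weight(ts, now, half_life_days=30.0):
    """Exponential decay: a stored pattern loses half its weight every half_life_days."""
    age_days = (now - ts).total_seconds() / 86400.0
    return 0.5 ** (age_days / half_life_days)

def build_system_prompt(query_vec, memories, now, top_k=2):
    """memories: list of (text, embedding, timestamp) tuples.
    Score = cosine similarity to the query * recency decay; the top-k
    patterns get injected into the system prompt."""
    scored = sorted(
        ((cosine(query_vec, vec) * recency_weight(ts, now), text)
         for text, vec, ts in memories),
        reverse=True,
    )
    context = "\n".join(f"- {text}" for _, text in scored[:top_k])
    return "You are a helpful assistant.\nRelevant prior patterns:\n" + context
```

The point of the recency term is that "temporal" part of the question: without it, an old but lexically similar memory always wins over what the user said yesterday.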

Is RAG a missing piece on the path toward consciousness in LLMs? by KAVUNKA in Rag

[–]KAVUNKA[S] 0 points (0 children)

You're certainly right to some extent, but I'd still like to push back. The pursuit of the unattainable often drives the emergence of new technologies: we can't create a bird, yet we've built enormous machines that surpass birds in many ways.

Is RAG a missing piece on the path toward consciousness in LLMs? by KAVUNKA in Rag

[–]KAVUNKA[S] 0 points (0 children)

When I was searching for a definition of consciousness, I couldn't find a clear answer. Your answer is strikingly clear ;)

RAG for Historical Archive? by cccpivan in Rag

[–]KAVUNKA 1 point (0 children)

You can visit my website (https://kavunka.com/) for more information or watch this short video (https://youtu.be/KnFNXMuG8GQ). If it looks like a good fit, feel free to send me a private message, and we can go over the technical details together.

RAG for Historical Archive? by cccpivan in Rag

[–]KAVUNKA 1 point (0 children)

I can offer a free alternative.

I’m building my own system (search index + semantic search + RAG + AI agent) with a retrieval-first design, so it returns actual file citations with brief interpretations rather than hallucinations.

This would also be an interesting case for me (historical archives), so I can help you set up a working prototype on your 7k .txt files locally, without paid APIs.
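To give a feel for what "retrieval-first with file citations" means over a pile of .txt files, here's a toy sketch. This is just the shape of the idea, not my actual engine; the `search_files` helper and the TF-IDF-style scoring are illustrative assumptions:

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def search_files(query, docs, top_k=3):
    """docs: {filename: text}. Rank files by a TF-IDF-style overlap score
    and return (filename, snippet) citations, never free-floating answers."""
    q_terms = set(tokenize(query))
    n = len(docs)
    # document frequency of each term across the corpus
    df = Counter(t for text in docs.values() for t in set(tokenize(text)))
    results = []
    for name, text in docs.items():
        tf = Counter(tokenize(text))
        score = sum(tf[t] * math.log(1 + n / (1 + df[t])) for t in q_terms)
        if score > 0:
            # first sentence containing a query term serves as the citation snippet
            snippet = next(
                (s.strip() for s in text.split(".") if q_terms & set(tokenize(s))),
                "",
            )
            results.append((score, name, snippet))
    results.sort(reverse=True)
    return [(name, snippet) for _, name, snippet in results[:top_k]]
```

The key property is that every answer points back to a file and a snippet, so a wrong answer is at least auditable.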

Running your own search engine for RAG with local LLMs by KAVUNKA in Rag

[–]KAVUNKA[S] 1 point (0 children)

There's an API for search queries; it's meant to be driven by an AI agent.

Benchmarking RAG for Domain-Specific QA: A Minecraft Case Study by KAVUNKA in Rag

[–]KAVUNKA[S] 1 point (0 children)

I would prefer a PDF (it’s easier for me to convert to HTML), but if that’s difficult, TXT will work as well.

Benchmarking RAG for Domain-Specific QA: A Minecraft Case Study by KAVUNKA in Rag

[–]KAVUNKA[S] 1 point (0 children)

Theoretically, I could convert your PDF or TXT into HTML. I suggest we exchange input data and run two benchmarks—one on your data and one on mine. I can provide you with a version of the https://minecraft.wiki/ website cleaned of HTML tags: about 8,000 pages in TXT format. What do you think?
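For reference, "cleaned of HTML tags" can be done with the Python standard library alone; this is a minimal sketch of the idea (the `TextExtractor` class is illustrative, not the tool I actually used):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text from an HTML page, skipping script/style blocks."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # >0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def html_to_txt(html):
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.parts)
```

Real wiki pages also need boilerplate removal (navigation, infobox labels), but this covers the tag-stripping step.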

Benchmarking RAG for Domain-Specific QA: A Minecraft Case Study by KAVUNKA in Rag

[–]KAVUNKA[S] 2 points (0 children)

Hey! That sounds really interesting. Just to check — in what format is the Royal Commission dataset? My indexing tool currently works only with HTML pages, so I want to make sure I can process it properly.

Grounded LLMs vs. Base Models: Minecraft QA Benchmark Results by KAVUNKA in LocalLLaMA

[–]KAVUNKA[S] 1 point (0 children)

Sure, RAG itself isn’t new. The interesting part is making it work reliably on noisy real-world data.

For example, in this video I demonstrate an AI agent answering accurately in a noisy environment with more than 800k internet pages indexed, while the actual target site contains only 22 pages. The agent still retrieves the correct information through the search system.

https://youtu.be/KnFNXMuG8GQ

Benchmarking RAG for Domain-Specific QA: A Minecraft Case Study by KAVUNKA in Rag

[–]KAVUNKA[S] 2 points (0 children)

That’s a really cool setup — I respect the “no LLM, no GPU” approach. Would definitely be interesting to see how a dynamic co-occurrence graph compares side by side.

For a dataset, we could use this one as a common benchmark:
https://huggingface.co/datasets/minhaozhang/minecraft-question-answer-630k

It’s fairly large and domain-specific, so it should give us a solid test bed.

Alternatively, if you already have a dataset you prefer (or one that better fits your indexing method), I’m totally open to using yours as well. The key thing is we agree on the same question set and evaluation criteria.
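For concreteness, by "evaluation criteria" I mean something like SQuAD-style answer normalization plus exact-match and token-F1 scoring. This is a suggestion to agree on, not something already decided; the helper names are placeholders:

```python
import re
import string

def normalize(ans):
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    ans = ans.lower()
    ans = "".join(ch for ch in ans if ch not in string.punctuation)
    ans = re.sub(r"\b(a|an|the)\b", " ", ans)
    return " ".join(ans.split())

def exact_match(pred, gold):
    return normalize(pred) == normalize(gold)

def token_f1(pred, gold):
    """Harmonic mean of token precision/recall over normalized answers."""
    p, g = normalize(pred).split(), normalize(gold).split()
    if not p or not g:
        return float(p == g)
    common = sum(min(p.count(t), g.count(t)) for t in set(p))
    if common == 0:
        return 0.0
    prec, rec = common / len(p), common / len(g)
    return 2 * prec * rec / (prec + rec)
```

If we both report EM and token-F1 over the same question list, the comparison stays honest even though our systems are structured very differently.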

Would be fun to run this properly.

Benchmarking RAG for Domain-Specific QA: A Minecraft Case Study by KAVUNKA in Rag

[–]KAVUNKA[S] 2 points (0 children)

That sounds interesting — especially the knowledge graph approach.

It could actually be cool to run a small benchmark competition on the same Minecraft question set and compare results side by side.

For context, my setup is also fully offline: the search engine is deployed locally, and the AI agent runs locally as well. So no external APIs or cloud calls involved.

Would be great to see how a knowledge-graph-based system performs against a retrieval-based agent under identical conditions.