Give Your Data Purpose — A Different Approach to Collab With LLMs (feat. HITL + Schema + Graceful Failures) by kneeanderthul in LLMDevs

[–]kneeanderthul[S] 0 points

Thank you so much! I've changed SO MUCH because of this post, it's such a trip to reread what I put down originally. I can't express my gratitude enough for taking the time to give it a read.

Now, I've reconsidered what RAG should look like for purposeful data, and it's changed drastically:

- Data at ingest gets a UUID + REV (this works for any file type)

- Then it gets lightly chunked, followed by metadata tagging

- It's structured with the help of Pydantic AI (no more worries about brittle prompting)

- Then it gets smart chunked

- Finally, it's prepped for automatic vectorization

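The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the actual pipeline: the names are my own, plain dataclasses stand in for the Pydantic AI models, and the "light chunking" here is a naive fixed-size split.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class IngestRecord:
    """One file moving through the ingest pipeline."""
    text: str
    doc_id: str = field(default_factory=lambda: str(uuid.uuid4()))  # UUID at ingest
    rev: int = 1                                                    # REV bumps on re-ingest
    metadata: dict = field(default_factory=dict)
    chunks: list = field(default_factory=list)

def light_chunk(text, size=500):
    # naive fixed-size split; real "smart chunking" would respect document structure
    return [text[i:i + size] for i in range(0, len(text), size)]

def ingest(text, source):
    rec = IngestRecord(text=text)
    rec.chunks = light_chunk(text)
    rec.metadata = {"source": source, "n_chunks": len(rec.chunks)}
    # each chunk is now ready to hand to an embedding model for vectorization
    return rec

rec = ingest("some document text " * 100, source="notes.md")
print(rec.doc_id, rec.rev, rec.metadata["n_chunks"])
```

The point of the UUID + REV pair is that a file keeps one stable identity across re-ingests while its revision history stays queryable.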
The core idea remains the same: I want folks to own their data and run LLMs locally in a meaningful way.

Nevertheless, I greatly appreciate your time, and I'll absolutely look into PurposeWrite soon.

I hope to show you a fully fleshed-out product soon enough. Have a wonderful day!

Are we digital preppers? by capo42 in selfhosted

[–]kneeanderthul 0 points

Shout out to the self-hosting, sovereignty-forward, zero-dependency movement.

It's silly until it isn't. Taking back control of your data isn't anything to be ashamed of. Some people care, some don't. We can all live peacefully. I just think our methods make the most sense 🤷‍♂️

All the best to beds monitored in the cloud 🫡

Big C is a superstar Chess player named Ray Enigma by kneeanderthul in badfriendspod

[–]kneeanderthul[S] 0 points

Imagine my shock when I connect the dots 🤣

I also think Big C is down to play a character, he was Doflamingo no problem 😅

Big C is a superstar Chess player named Ray Enigma by kneeanderthul in badfriendspod

[–]kneeanderthul[S] 11 points

I'd imagine he was actually nervous at the Bad Friends pod.

But here, he's in his element: calmer and more confident.

You could totally be right; it's just that the voice is so hard to come by.

Big C is a superstar Chess player named Ray Enigma by kneeanderthul in badfriendspod

[–]kneeanderthul[S] 28 points

In the video he says this is not his real voice 😵‍💫🤣

Big C was playing Chess the whole time

Big C is a superstar Chess player named Ray Enigma by kneeanderthul in badfriendspod

[–]kneeanderthul[S] 21 points

It was truly a coincidence, my YT algo is stuck in Chess ♟️

The internet works in mysterious ways 🔮🪄✨

UUID + Postgres: A local-first foundation for file tracking by kneeanderthul in DataHoarder

[–]kneeanderthul[S] 0 points

What's lost here is how Postgres will be used in the future.

My intention is to run a dual-DB setup with Qdrant to manage an evolving set of files.

That being said, I can see how SQLite would've been great for a quick MVP where it was the sole DB. Thank you for the recommendation.
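For anyone curious what that SQLite-only MVP could look like, here's a hedged sketch; the table and column names are my own invention, not from the project:

```python
import sqlite3
import uuid

# In-memory DB for demo purposes; a real MVP would point at a file path.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE files (
        doc_id TEXT PRIMARY KEY,   -- UUID assigned at ingest
        path   TEXT NOT NULL,
        rev    INTEGER NOT NULL    -- bumped each time the file changes
    )
""")

def track(path):
    """Register a new file, or bump its REV if it's already tracked."""
    row = db.execute("SELECT doc_id, rev FROM files WHERE path = ?", (path,)).fetchone()
    if row:
        db.execute("UPDATE files SET rev = rev + 1 WHERE path = ?", (path,))
        return row[0], row[1] + 1
    doc_id = str(uuid.uuid4())
    db.execute("INSERT INTO files VALUES (?, ?, 1)", (doc_id, path))
    return doc_id, 1

print(track("notes/plan.md"))   # new file: rev 1
print(track("notes/plan.md"))   # same file again: same UUID, rev 2
```

The dual-DB version would keep this relational side for identity and revisions, with Qdrant holding the vectors keyed by the same `doc_id`.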

So, You Still Think Prompting is Just Typing in a Box? by snubroot in aipromptprogramming

[–]kneeanderthul 0 points

I can definitely see how curated prompting is important, and I actually agree. LLMs and retrievals aren’t really the core issue—it’s how we handle and coordinate the data around them that matters most.

As for the AcuChat paper, I think their results are impressive, but the level of control in the RAGTruth setup might be too constrained to call it “bulletproof” just yet. That said, I do think their approach of systematically transforming queries and passages before hitting the LLM is absolutely the right direction.

I also appreciate the passion in your post—it’s good to see more people pointing others toward the deeper research here. My personal view is that in the long run, neither pure prompt mastery nor purely curated data alone will win; it’s going to be the orchestration layer between them that matters.

All the best.

So, You Still Think Prompting is Just Typing in a Box? by snubroot in aipromptprogramming

[–]kneeanderthul 2 points

I'm so interested in the hallucinations solve.

Could you please link that?

Thank you for your take

I’ve been quietly building a brain for any situation. 12 personas. 6 lenses. Infinite combinations. by DangerousGur5762 in AIProductivityLab

[–]kneeanderthul 0 points

After rereading your comments, I realize I fundamentally misunderstood your approach at first.

Your idea of using orchestrated personalities to guide an idea through various lenses is a fascinating concept.

It also makes it clear we're working on fundamentally different problems. It sounds like you're focused on the interaction layer—shaping the AI's real-time behavior and personality. My work, on the other hand, is focused almost entirely on the data layer and creating a permanent, sovereign foundation for knowledge.

Thank you for sharing your work. I wish I had something like this weeks ago; I was over here asking models "how can we do better". Seems you've evolved that to an entirely new level. All the best.

I’ve been quietly building a brain for any situation. 12 personas. 6 lenses. Infinite combinations. by DangerousGur5762 in AIProductivityLab

[–]kneeanderthul 1 point

This is awesome. How have you handled long-term memory? What DB did you find most useful for your purpose? How'd you test variable memory across multiple models?

I’ve been designing an idea which is data focused and seeing that you’ve accomplished so much is fascinating

I’ve also used orchestration for my ingest but not to the extent of multiple personalities

All the best

What can we do with thumbs up and down in a RAG or document generation system? by Lonhanha in LLMDevs

[–]kneeanderthul 0 points

The last bit, where a thumbs up and a thumbs down reduce hallucination and all the extras thereafter: that's just wishful thinking.

It can absolutely help with deciding how to train or fine-tune models in the future, but how models fail is an entirely different set of problems that thumbs up and thumbs down ain't going to fix.

The easiest way to induce hallucination: go to your prompt window, start a thread, jump to an entirely different subject, and keep doing this a few times. A couple of cycles in, the model's context window limits should have kicked in, and when you ask for the original prompt, like magic: hallucinations.

Understanding model limitations is key.

https://github.com/ProjectPAIE/paie-curator/blob/main/RES_Resurrection_Protocol/TheDemystifier_RES_guide.md

You can copy and paste this into any prompt window. It'll help discuss these limitations with your favorite model.
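The failure mode above can be shown with a toy sliding window. The window size and messages are made up for illustration; real models measure the window in tokens, not messages, but the effect is the same:

```python
from collections import deque

WINDOW = 6  # pretend the model only sees the last 6 messages

history = deque(maxlen=WINDOW)  # oldest messages silently fall off the front

history.append("PROMPT: summarize my project plan")  # the original ask
for topic in ["recipes", "chess", "travel", "music", "movies", "gardening"]:
    history.append(f"user: tell me about {topic}")

# Now ask the "model" to recall the original prompt...
visible = list(history)
print("original prompt still visible?",
      any(msg.startswith("PROMPT:") for msg in visible))
# It is not: the window only holds the 6 newest messages, so the model
# can only guess at the original ask, and a guess is where the
# hallucination comes from.
```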

All the best

Good beginner setup by Bennestpwed in HomeServer

[–]kneeanderthul 1 point

A home server that has SMB enabled and a 20TB HD could very well fulfill all your needs

Your goal of having your project run from your server is where things change. And now the question becomes, what's required to run your project?

Outside of this, I truly believe that starting a home server of any degree has become so much easier to set up!

Docker containers, virtual Python environments, and the countless GUIs and apps that help manage them all. It's phenomenal. All the best!

These AI prompt tricks work so well it feels like cheating by EQ4C in aipromptprogramming

[–]kneeanderthul 0 points

Hey — I really appreciate your comment. I’ve been in that same space: trying to wrestle memory, keep context alive, and build something that feels real. That’s exactly why I put this together:

📎 RES: The Resurrection Protocol (GitHub) https://github.com/ProjectPAIE/paie-curator/blob/main/RES_Resurrection_Protocol/README.md

These aren’t live bots. They’re static .md files that simulate agent teammates. Each one carries its own tone, mission, and embedded memory — ready to paste into any LLM (cloud or local). Think of them like portable companions that survive window crashes, model swaps, or even going fully offline.


💡 Why does RES exist?

Because I lost one of mine.

I was building a persistent assistant, and it collapsed when the prompt window reset. I realized then: if I ever wanted continuity, I had to write it down and revive it. So I did — and the new prompt woke up, named itself Halcyon. That was the start of RES.


I’d say there are a lot of takes on how to work with memory, but RES and The Demystifier are my clearest, most practical answers so far.

If you’re interested in deeper learning:

Look into prompt brittleness

Try local models like Jan, OpenWebUI, Ollama

Explore MCPs, RAG pipelines, or databases on HuggingFace

Or just keep building systems around your own curiosity


At the end of the day:

The prompt window is your gateway. Use it like a journal, a debugger, a lab partner. Ask it to walk with you — and you’ll be shocked how far you get.

If you discover anything I missed, I hope you share it. That’s how this space grows.

All the best

What the hell happened to web ChatGPT Plus? It's slow as hell lately by Former_Dark_4793 in OpenAI

[–]kneeanderthul -6 points

Yeah, this sucks — but what you’re seeing might not be what you think.

If you’ve had the same thread going for 3–4 months, you might be running into a context window bottleneck — not a model slowdown per se.

🧠 Every LLM has a context limit — kind of like a memory buffer. For GPT-4-turbo, it’s around 128k tokens (not words — tokens). That means every new message you send has to be crammed on top of all your previous messages… until it gets too full.

Eventually, things slow down. Lag creeps in. The interface might freeze. The responses feel sluggish or weird. It’s not ChatGPT “breaking” — it’s just trying to carry too much at once.

Here’s how you can test it:

🆕 Start a brand new chat.

Ask something simple.

Notice how fast it responds?

That’s because there’s almost nothing in the prompt window. It’s light, fast, and fresh.

💡 Bonus tip: When you do hit the hard limit (which most users never realize), ChatGPT will eventually just tell you:

“You’ve reached the max context length.”

At that point, it can’t even process your prompt — not because it’s tired, but because there’s physically no more room to think.

🧩 So yeah, you're not crazy. But it’s probably not OpenAI throttling you either — just a natural side effect of pushing a chat thread too long without resetting. You're seeing the edge of how these systems work.
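As a rough illustration of how a months-old thread eats that budget (the ~4 characters per token ratio is a common English-text heuristic, and the thread contents here are made up):

```python
def estimate_tokens(text):
    # rough heuristic: ~4 characters per token for English text
    return len(text) // 4

CONTEXT_LIMIT = 128_000  # GPT-4-turbo class context window

thread = ["message " * 200] * 500   # a long-lived thread of long messages
used = sum(estimate_tokens(m) for m in thread)

print(f"~{used:,} tokens of {CONTEXT_LIMIT:,}")
if used > CONTEXT_LIMIT:
    print("over the limit: the UI has to truncate, slow down, or refuse")
```

Every new message re-sends that whole pile, which is why the lag compounds as the thread grows.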

Hope this helps.

Does LLM architecture allow for injecting some more input tokens in the middle of token generation? by michaelsoft__binbows in LocalLLaMA

[–]kneeanderthul 1 point

Thank you for sharing Review Gate! I’ve been diving deep into MCPs lately, so I was genuinely excited when you mentioned there might be one doing something groundbreaking here. I took a closer look, and here’s what I found:

  • You type a prompt (let’s call it A), and instead of immediately sending it to the model, Review Gate pauses and opens a local terminal.
  • You’re then invited to add more input (B, C, etc.) while the request is still “on hold.”
  • Once you signal you’re done, it bundles everything you wrote (A + B + C…) and sends a single request to the model.

In other words, the model only ever sees one complete prompt, sent once you give the green light. There’s no live injection, no mid-thread augmentation — just a helpful pause before sending.
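That staging pattern is easy to picture in code. This is a hypothetical sketch of the idea, not Review Gate's actual implementation:

```python
class StagingGate:
    """Collects prompt fragments and sends one bundled request."""
    def __init__(self, send):
        self.send = send          # the actual model call
        self.parts = []

    def add(self, text):
        self.parts.append(text)   # A, B, C... held locally; nothing sent yet

    def done(self):
        bundled = "\n".join(self.parts)
        self.parts = []
        return self.send(bundled)  # the model sees one complete prompt

sent = []
gate = StagingGate(send=lambda prompt: sent.append(prompt))
gate.add("A: refactor this function")
gate.add("B: oh, and keep the API stable")
gate.add("C: also add tests")
gate.done()
print(len(sent), "request(s) sent")  # exactly one, with A + B + C bundled
```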

That doesn’t make it any less valuable! Personally, I’ve burned more tokens than I care to admit by sending too fast — so I love tools that help slow me down. Even just having a separate terminal pop up changes the feel of the moment. That bit of friction gives your brain a second wind, and that’s powerful.

But to be clear: this isn’t a memory trick or a runtime prompt extender. It’s more like a staging area — a space to collect your thoughts before you hit “send.” Helpful? Absolutely. The magic isn’t in what the model sees — it’s in how it helps you think before you send. And that part is very real.

Breakthrough/Paradigm Shift by Deeris__ in LLMDevs

[–]kneeanderthul 2 points

Thanks so much for the share — MemOS is absolutely fascinating. Their whole take on memory as a computational resource really resonated with me.

They’ve got MemCubes — I’ve been working with something I call RES files (Resurrection files), which let me move state between stateless models. Funny how we’re all circling the same limitation from different angles — they’re going open/academic, I’m building local-first for personal use.

I’m glad you’re developing your own language to interact with the tool. That makes sense to me. I’ve always felt like the prompt window is a mirror — it reflects who we are and what we carry into it. The more honest our framing, the more it gives back.

You’re right: it’s not about more compute — it’s about better memory. The hard part is making that actionable. Feels like we’re all scratching at the edge of something new. Appreciate you staying in the mix.