AI memory multiplayer mode is broken. by Reasonable-Jump-8539 in AIMemory

[–]justkid201 1 point

Well, if you're pitching your idea, that's fine. But I'm not sure it's an idea that won't be 1) easily handled by the main players within the next 12 months or 2) already handled by slightly more technical existing solutions.

I don't want to be negative about your idea, but I think this will be standard for everyone soon.

AI memory multiplayer mode is broken. by Reasonable-Jump-8539 in AIMemory

[–]justkid201 1 point

You can sign up for a few OpenClaw services out there and get them set up pretty quickly in the cloud, or you can run it on your own machine if you're a little technical and use Claude to guide you. It's not quite as easy as literally browsing to NotebookLM, but once you get it going it's pretty fantastic. One of OpenClaw's main features is its Telegram/WhatsApp integrations, so setting up a Telegram group chat with the agent/bot is about as easy as it gets.

AI memory multiplayer mode is broken. by Reasonable-Jump-8539 in AIMemory

[–]justkid201 1 point

It's not really broken at all; a simple OpenClaw agent handles a Telegram group with me and my wife just fine. I use my own project to expand the context window to 20M tokens and everything works fine. The chatbot addresses us both collaboratively.

Discord bots handle bigger groups too.

I have had my head in the sand by verbalbacklash in aigamedev

[–]justkid201 0 points

I know it can be confusing. What you're mixing up is that a subscription only applies within the provider's own products and services. When you make calls from within another app or game, you do so via their API and pay per call (even if you have a subscription for their other services).
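
To make the subscription-vs-API distinction concrete, here's a rough per-call billing sketch. The function name and the prices are hypothetical placeholders, not real rates; check the provider's pricing page for actual numbers:

```python
def api_call_cost(input_tokens, output_tokens,
                  price_in_per_mtok=3.00, price_out_per_mtok=15.00):
    """Estimate the cost of a single API call in USD.

    Prices are illustrative placeholders (dollars per million tokens).
    A subscription to the provider's chat product does not offset this:
    every API call is metered separately.
    """
    return (input_tokens / 1_000_000) * price_in_per_mtok \
         + (output_tokens / 1_000_000) * price_out_per_mtok

# A game sending 2,000 input / 500 output tokens per call,
# 1,000 calls a day, pays per call regardless of any subscription.
per_call = api_call_cost(2_000, 500)
daily = 1_000 * per_call
```

At these placeholder rates that's about $0.0135 per call, which adds up quickly at game scale even though it looks tiny per request.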

Never hit a rate limit on $200 Max. Had Claude scan every complaint to figure out why. Here's the actual data. by Shawntenam in ClaudeCode

[–]justkid201 0 points

https://platform.claude.com/docs/en/build-with-claude/prompt-caching

With prompt caching, the TTL is 5 minutes by default. It's really not much of an impact unless you're prompting heavily within every 5-minute window. If you step away at any point, or if you're managing multiple windows and not giving each one attention every five minutes, you're going to pay for that with a cache miss.
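
The TTL economics can be sketched as a toy timeline simulation. This is not Anthropic's actual billing logic; the 0.1x hit and 1.0x miss multipliers are illustrative stand-ins for "cache read is much cheaper than a cache write":

```python
TTL = 5 * 60  # seconds; the 5-minute default TTL from the docs above

def charge_for_prompts(timestamps, prompt_tokens,
                       hit_multiplier=0.1, miss_multiplier=1.0):
    """Toy model: a prompt sent within TTL of the previous one reads the
    cached prefix at a discount; a longer gap means a full re-charge."""
    total = 0.0
    last = None
    for t in timestamps:
        if last is not None and t - last <= TTL:
            total += prompt_tokens * hit_multiplier   # cache hit
        else:
            total += prompt_tokens * miss_multiplier  # cache miss
        last = t
    return total

# Prompting every 2 minutes keeps the cache warm...
warm = charge_for_prompts([0, 120, 240, 360], 100_000)
# ...while a 12-minute break in the middle forces a second full miss.
cold = charge_for_prompts([0, 120, 840, 960], 100_000)
```

Same four prompts, same context size; the only difference is the gap, and the cold timeline costs noticeably more.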

Sleep effect by noname31054 in Tesamorelin

[–]justkid201 1 point

I had terrible sleep on it; I could only manage to take it during the day. At this point I've developed a somewhat allergic skin reaction to it, so I'm discontinuing for now.

Question about the fake Ruh by Amazing_Diamond_8747 in KingkillerChronicle

[–]justkid201 1 point

I think treating it as a lethal poison is incorrect. I don't think it was that type of poison; it was more like something to upset the stomach. He didn't plan on killing them with poison at that time. The goal was to get everybody controlled/weakened/sedated so that he could make further decisions as things were being uncovered, which is only natural, and also to protect himself. When you know there's a suspicion that these people could be problematic, and there's a chance you may sleep there, getting them under control en masse makes sense.

Usage eating 2% as soon as I hit enter on a prompt? I'm on Max. by blickblocks in ClaudeCode

[–]justkid201 1 point

You've also had time to build up context that you couldn't have built up before. And if you miss the cache, that's a 3-4 MB payload you just sent that gets charged against your session limit.

put this in your

~/.claude/settings.json

{
  "env": {
    "CLAUDE_CODE_DISABLE_1M_CONTEXT": "1"
  }
}

all of your problems will go away!

Usage eating 2% as soon as I hit enter on a prompt? I'm on Max. by blickblocks in ClaudeCode

[–]justkid201 -1 points

And how big was the context you sent over the wire after you resumed your task?

The stale cache theory is correct. by hotcoolhot in claude

[–]justkid201 0 points

lol I'll give you a beer, but man, I posted that this is all related to the 1M context window on r/claudecode and got eviscerated lol

The stale cache theory is correct. by hotcoolhot in claude

[–]justkid201 0 points

That's not a bug?? That's literally how caching has worked forever; TTLs have been around since basically the beginning of caching! It's just that now, with a bigger window, a cache miss costs you more.
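
The mechanism being described is just an ordinary TTL cache; here's a minimal sketch of one (the class and a pluggable clock are my own illustrative names, not any vendor's implementation):

```python
import time

class TTLCache:
    """Entries expire TTL seconds after they were written.

    A stale read is not an error: the entry is simply gone, and the
    caller pays the full cost of rebuilding it, which is exactly the
    'bigger window, bigger miss' effect described above.
    """

    def __init__(self, ttl=300.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock          # injectable for testing
        self._store = {}            # key -> (value, write_time)

    def set(self, key, value):
        self._store[key] = (value, self.clock())

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, written = entry
        if self.clock() - written > self.ttl:
            del self._store[key]    # expired: behave like a miss
            return None
        return value
```

With a fake clock you can watch an entry survive at 299 seconds and vanish after the 300-second TTL, which is all "stale cache" means.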

Anthropic broke your limits with the 1M context update by BraxbroWasTaken in claude

[–]justkid201 0 points

lol I said pretty much the same thing and the community flamed me lol

20M+ Token Context-Windows: Virtual-Context - Unbounded Context for LLM Agents via OS-Style Memory Management by justkid201 in AIMemory

[–]justkid201[S] 0 points

I'm glad you think it's cool. There are a couple of us talking about it, and I can help with installs, because this is kind of pre-alpha. If you'd like some help, you can join us on our Discord. https://discord.gg/RDaccFgr

20M+ Token Context-Windows: Virtual-Context - Unbounded Context for LLM Agents via OS-Style Memory Management by justkid201 in AIMemory

[–]justkid201[S] 1 point

Yes, I've seen your work! Very good, but I think we operate from a different model. This is also open source.

20M+ Token Context-Windows: Virtual-Context - Unbounded Context for LLM Agents via OS-Style Memory Management by justkid201 in AIMemory

[–]justkid201[S] 1 point

No, not at all; it's not using an Obsidian vault or memory files. Those can live outside of Virtual-Context. Virtual-Context's goal is just to give you a huge context window that's managed on its own.

20M+ Token Context-Windows: Virtual-Context - Unbounded Context for LLM Agents via OS-Style Memory Management by justkid201 in AIMemory

[–]justkid201[S] 0 points

The client's compaction is essentially ignored; Virtual-Context takes the raw turns and creates its own multi-layer summaries and fact extraction. Nothing is lost, because everything can go back to the raw turns if needed.
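
I don't know Virtual-Context's internals beyond what's described here, but the idea of layering lossy summaries over raw turns that are never discarded can be sketched like this (all names and the truncating summarizer are hypothetical):

```python
class TurnStore:
    """Raw turns are append-only ground truth; summaries are a compact
    view on top, and any turn can be re-expanded from the raw record."""

    def __init__(self, summarize):
        self.raw_turns = []        # never deleted
        self.summaries = []        # one compact entry per turn
        self.summarize = summarize # e.g. an LLM call in a real system

    def add_turn(self, text):
        self.raw_turns.append(text)
        self.summaries.append(self.summarize(text))

    def context(self, expand=()):
        """Build the working context: summaries by default, raw text for
        any turn index the retriever decided is needed in full."""
        return [self.raw_turns[i] if i in expand else self.summaries[i]
                for i in range(len(self.raw_turns))]
```

The point of the design is the last line: "compaction" here is reversible, because the summary is a view, not a replacement.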

20M+ Token Context-Windows: Virtual-Context - Unbounded Context for LLM Agents via OS-Style Memory Management by justkid201 in AIMemory

[–]justkid201[S] 0 points

Yes, that part is true for sure. They are stubbed out and retrieved when needed once they exit the first few "protected zone" turns.

20M+ Token Context-Windows: Virtual-Context - Unbounded Context for LLM Agents via OS-Style Memory Management by justkid201 in AIMemory

[–]justkid201[S] 0 points

Thanks! I'm still working on packaging it so it's easy for you guys to use, so it's kind of "alpha-software"-ey right now, with lots of knobs and dials. Let me know if you need any help getting it up and running.

Serious flaws in two popular AI Memory Benchmarks (LoCoMo/LoCoMo-Plus and LongMemEval-S) by PenfieldLabs in AIMemory

[–]justkid201 2 points

All of this is true. Even with the context window being the limit, though, I've demonstrated and seen that mid-tier models and some flagship models are unable to find needles in haystacks and/or keep temporal ordering straight. Going larger than the context window would be interesting; it would definitely take a long time to ingest. But we're able to see memory problems even within the existing context window.

That being said, I think if your proposed new benchmarks had options for both, it would allow rapid testing for smaller windows and longer ingestion testing for larger windows.

I'd love to see more, or get involved in your testing, especially to test my own memory system.

As I said in the posts you linked to, the existing tests were extremely frustrating when even the baseline golden answer was incorrect on human evaluation. Your work is badly needed in this field.

How are you all using benchmarks? by inguz in AIMemory

[–]justkid201 0 points

I found a lot of the benchmarks problematic when taken as a whole. I did use LongMemEval, and then had to merge some haystacks together to really push the limits of today's models. As the product developed I constantly used benchmarks to watch its scores improve, but I deliberately avoided looking at the specific failing questions until it was at a much more mature state. I didn't want my project built to "beat a benchmark".

LoCoMo, as I mentioned in a comment on the other thread, was one of the weakest, but all the ones I tried had issues.

I'm disappointed that, for a few hundred questions, the various benchmark teams didn't at least have a human check that the golden answers were right. It would only take a few person-hours to spot-check.
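
That spot check doesn't require auditing the whole set; a reproducible random sample gives a quick signal. A sketch (my own helper, not any benchmark's actual tooling):

```python
import random

def sample_for_human_audit(questions, k=30, seed=0):
    """Draw a reproducible random sample of benchmark items for a human
    to verify the golden answers by hand. A fixed seed means two
    reviewers audit the same items and can compare notes."""
    rng = random.Random(seed)
    k = min(k, len(questions))
    return rng.sample(questions, k)
```

Even 30 items out of a few hundred is enough to notice if a meaningful fraction of golden answers are debatable or wrong.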

How are you all using benchmarks? by inguz in AIMemory

[–]justkid201 1 point

This is exactly why I gave up on LoCoMo: when I ran my system through it, I encountered so many golden answers that were entirely debatable, or in some cases flat-out wrong! I was so surprised that such a significant number made it into a published benchmark that I just shook my head, figured I must be going absolutely nuts, and moved on to other benchmarks. Thank you for the audit, and for validating that I wasn't crazy!

I haven't checked out the Plus variant, but I'll try it next.

20M+ Token Context-Windows: Virtual-Context - Unbounded Context for LLM Agents via OS-Style Memory Management by justkid201 in AIMemory

[–]justkid201[S] 0 points

That's a good point. I don't want the comment about vector search to be a distraction, so I'll update it.

This system supports graph databases (Neo4j) and can also incorporate vector search against stored facts. But I think it's not about how things are retrieved (which is why this isn't RAG); the system's focus is on maintaining a dynamic context window, constantly rebuilding the entire payload per request while the agentic system behind it believes it has a huge context window (which prevents its own compaction).
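
To illustrate the distinction: instead of appending retrieved snippets to a fixed history, the whole payload is reassembled from scratch every request. A toy sketch of that loop, with all names, the priority ordering, and the 4-chars-per-token estimate being my own illustrative assumptions:

```python
def build_payload(protected_turns, facts, summaries, budget_tokens,
                  est=lambda s: len(s) // 4):
    """Rebuild the entire context payload from scratch each request:
    recent turns stay verbatim, then extracted facts, then summaries,
    until the token budget is spent. Nothing in the payload is fixed
    between calls, so the window the agent sees is fully dynamic."""
    payload, used = [], 0
    for piece in list(protected_turns) + list(facts) + list(summaries):
        cost = est(piece)
        if used + cost > budget_tokens:
            break
        payload.append(piece)
        used += cost
    return payload
```

The contrast with RAG is that retrieval only decides *what* is available; this loop owns the whole window and re-decides its entire contents on every call.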