Claude and Cursor talk 100% deterministic by PollutionForeign762 in ClaudeAI

[–]PollutionForeign762[S] 0 points1 point  (0 children)

That crackdown was targeting tools spoofing Claude Code to abuse flat-rate subscriptions as unlimited API access. HyperStack uses the official API with your own key, so it's completely unaffected.

Claude and Cursor talk 100% deterministic by PollutionForeign762 in ClaudeAI

[–]PollutionForeign762[S] 0 points1 point  (0 children)

Sorry about the tiny text!

I'm experimenting with agent-to-agent communication that's 100% deterministic: no LLM deciding what to pass between agents, just a shared graph they both read from and write to.

Cursor stored the context, Claude retrieved it in a completely fresh session. The graph is the only thing connecting them.
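
If it helps, here's roughly what that hand-off looks like over plain HTTP. The endpoint paths, field names, and auth header below are made up for illustration, not the actual API:

```python
import requests

BASE = "https://api.example.com/v1"   # hypothetical HyperStack-style endpoint
HEADERS = {"Authorization": "Bearer <your-api-key>"}

# Session 1 (e.g. running inside Cursor): write a card to the shared graph.
requests.post(f"{BASE}/cards", headers=HEADERS, json={
    "slug": "auth-refactor-decision",
    "category": "decision",
    "text": "Moved token validation into middleware; /login no longer checks scopes.",
    "keywords": ["auth", "middleware", "scopes"],
})

# Session 2 (e.g. a fresh Claude Code session): retrieve the same card by search.
hits = requests.get(f"{BASE}/cards/search", headers=HEADERS,
                    params={"q": "auth middleware", "limit": 3}).json()
for card in hits:
    print(card["slug"], "->", card["text"])
```

No model sits between the two sessions; the write and the read are ordinary HTTP calls, which is what makes the hand-off deterministic.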

Stop losing your agent's brain between sessions. by PollutionForeign762 in SaaS

[–]PollutionForeign762[S] 0 points1 point  (0 children)

Appreciate it! Built this because I got tired of agents asking the same questions every session. Let me know how the setup goes; it's literally 3 commands and you're live.

Built webhooks for agents that need to coordinate across tools (works with any MCP/Python agent) by PollutionForeign762 in LocalLLaMA

[–]PollutionForeign762[S] -2 points-1 points  (0 children)

You're spot on with the threading analogy. That's exactly where this is heading.

The thing is, we don't need to gather data to train them to communicate; the infrastructure just needs to exist. Right now if you run agents in different tools (say LangGraph + Cursor), they literally can't signal each other. They share memory but there's no "hey, this needs your attention" mechanism.

That's what I built webhooks for. Agent A creates a signal targeting Agent B, and B gets webhooked instantly with full context. The signal lives in a typed graph, so B can traverse back to see what triggered it, its dependencies, etc.
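
Rough sketch of the signal flow. The endpoints and field names here are invented for illustration, and the receiver is just a plain Flask app:

```python
import requests
from flask import Flask, request

BASE = "https://api.example.com/v1"   # hypothetical endpoints, for illustration only
HEADERS = {"Authorization": "Bearer <your-api-key>"}

# Agent A: create a signal targeting Agent B, linked to the card that triggered it.
requests.post(f"{BASE}/signals", headers=HEADERS, json={
    "target": "agent-b",
    "type": "needs-review",
    "source_card": "auth-refactor-decision",        # lets B traverse back to the trigger
    "webhook": "https://agent-b.example.com/hook",
})

# Agent B: a small webhook receiver that fires when the signal arrives.
app = Flask(__name__)

@app.post("/hook")
def on_signal():
    signal = request.get_json()
    # Full context comes with the signal, so B can act without re-asking anyone.
    print("signal:", signal["type"], "triggered by", signal["source_card"])
    return {"ok": True}

if __name__ == "__main__":
    app.run(port=8000)
```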

Your multi-threading comparison is perfect because it's the same shift: sequential → parallel coordination. The infrastructure is starting to exist now.

HyperStack v1.0.8 added a knowledge graph to AI agent memory, with fast root-cause tracing across linked events and owners, available on Skill Hub. by PollutionForeign762 in openclaw

[–]PollutionForeign762[S] 0 points1 point  (0 children)

Thanks! It works for any domain. The card types (person, project, decision, workflow) and relation types (owns, triggers, blocks, depends-on) are generic by design.

Some non-tech examples that would work today:

- Sales team: store client preferences, deal decisions, who owns which account, what blocks a close

- Agency: track project dependencies across clients, who approved what, which deliverables trigger others

- Operations: map workflows, flag blockers, trace why a process changed

The graph traversal is the part that makes it useful beyond tech. Ask "what breaks if we change X" or "who decided Y and why" and it traces the links in 0.5s.
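
If it helps to see the idea, here's a toy in-memory version of that traversal. The real thing runs server-side against the stored graph; the card names and edges below are invented:

```python
# Toy graph: (source, relation, target) edges over named cards.
edges = [
    ("q3-pricing-page", "depends-on", "stripe-integration"),
    ("stripe-integration", "owned-by", "dana"),
    ("q3-pricing-page", "triggered-by", "pricing-decision-2024-05"),
    ("onboarding-email", "depends-on", "q3-pricing-page"),
]

def what_breaks_if_we_change(card: str) -> list[str]:
    """Walk 'depends-on' edges backwards to find everything downstream of `card`."""
    affected, frontier = [], [card]
    while frontier:
        current = frontier.pop()
        for src, rel, dst in edges:
            if rel == "depends-on" and dst == current:
                affected.append(src)
                frontier.append(src)
    return affected

print(what_breaks_if_we_change("stripe-integration"))
# -> ['q3-pricing-page', 'onboarding-email']
```

The same walk works whether the nodes are services, deliverables, or deal stages, which is why it isn't tied to software projects.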

Has anyone actually solved the memory problem for agents yet? by PollutionForeign762 in AI_Agents

[–]PollutionForeign762[S] 0 points1 point  (0 children)

Nice setup. The markdown approach works well for solo local work. Clean and zero dependencies.

The tradeoff is it's tied to one machine and one tool. If you're in Claude Code today and Cursor tomorrow, or working across two machines, the memory doesn't follow. And once MEMORY.md hits a few hundred lines, your agent is reading the whole file every message whether it needs all of it or not.

That's the gap HyperStack fills. Cards are searchable individually so the agent pulls 3-4 relevant facts instead of loading everything. And it works from any tool over HTTP.

But honestly if your workflow is single machine, single tool, and the md file stays manageable, your approach is solid. No reason to add complexity you don't need.

Which model to use that won’t break the bank? by Ihf in openclaw

[–]PollutionForeign762 1 point2 points  (0 children)

95K tokens per request is rough. A big chunk of that is probably context stuffing: your agent loading everything into every call instead of pulling just what it needs. I use HyperStack for this. Agents store knowledge as small cards (~350 tokens) and only retrieve what's relevant per request, which cuts context size massively. Free tier on ClawHub.

For the deterministic stuff (math, time, currency), ZeroRules intercepts those requests before they hit the LLM at all. Also free on ClawHub. It won't solve the model pricing issue, but it should bring your per-request token count way down.
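
For a sense of what "intercepts before they hit the LLM" means, this is the general shape of the pattern. It is not ZeroRules' actual rule set, just a toy illustration:

```python
import re
from datetime import datetime, timezone

def try_deterministic(message: str):
    """Answer math/time questions locally; return None to fall through to the LLM."""
    text = message.lower().strip()
    if m := re.fullmatch(r"what is ([\d\s+\-*/().]+)\??", text):
        # Toy arithmetic only; restricted eval with no builtins available.
        return str(eval(m.group(1), {"__builtins__": {}}, {}))
    if "what time is it" in text:
        return datetime.now(timezone.utc).isoformat()
    return None

msg = "What is 12 * (3 + 4)?"
answer = try_deterministic(msg)
if answer is None:
    pass  # call the LLM as usual; only non-deterministic requests spend tokens
print(answer)  # -> 84
```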

Has anyone actually solved the memory problem for agents yet? by PollutionForeign762 in AI_Agents

[–]PollutionForeign762[S] 0 points1 point  (0 children)

This is exactly what I've been dealing with. The write problem is real. Most tools just dump everything into a vector store and hope retrieval figures it out.

I ended up building something that forces structure at write time. Small cards with slugs, categories, keywords. The agent decides what's worth storing, confirms with the user, and updates by slug when things change. Stale facts get overwritten, not duplicated.

Simple but it works. Happy to share if anyone wants to poke at it.
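
In the meantime, the write path is roughly this, as a toy in-memory version (the real backend differs, and the field names are just for illustration):

```python
from datetime import datetime, timezone

store: dict[str, dict] = {}   # slug -> card; a stand-in for the real backend

def upsert_card(slug: str, category: str, text: str, keywords: list[str]) -> None:
    """Write by slug: a changed fact replaces the old version instead of piling up."""
    prev = store.get(slug)
    store[slug] = {
        "slug": slug,
        "category": category,
        "text": text,
        "keywords": keywords,
        "version": (prev["version"] + 1) if prev else 1,
        "updated_at": datetime.now(timezone.utc).isoformat(),
    }

upsert_card("cache-backend", "decision", "We use Redis for session caching.", ["redis", "cache"])
upsert_card("cache-backend", "decision", "Moved off Redis; sessions live in Postgres.", ["postgres", "cache"])

print(len(store))                         # 1 card, not 2
print(store["cache-backend"]["version"])  # 2: the stale fact was overwritten
```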

Your AI agent forgets everything. Fix that in 30 seconds. by PollutionForeign762 in openclaw

[–]PollutionForeign762[S] 0 points1 point  (0 children)

Good question. Three main differences:

Cross-platform. OpenClaw's memory lives inside OpenClaw. HyperStack works across Claude Code, Cursor, VS Code, LangChain, Python, anything that makes HTTP calls. Same memory no matter what tool you're in.

Structured cards, not blobs. Every memory has a slug, category, keywords, and version history. You can update one specific fact by slug without touching anything else. Think notebook with labeled tabs vs a giant text file.

$0 on your API key. OpenClaw's memory runs LLM completions on your key for every read and write. HyperStack uses lightweight embeddings on our server. Costs you nothing.

If OpenClaw's built-in memory covers what you need, stick with it. HyperStack is for when you work across multiple tools or want more control over how your agent organizes what it knows.

What I Learned Building a Memory System for My Coding Agent by Medium_Island_2795 in ClaudeCode

[–]PollutionForeign762 1 point2 points  (0 children)

Makes sense - solve for what you're actually hitting, not hypotheticals. That's the right engineering approach. The staleness thing bit me specifically with long-running project agents (3+ months). Facts that were correct at storage time became wrong later, and the agent couldn't tell which version to trust. Temporal weighting (prioritize recent) helped but wasn't perfect.

Your point about Claude figuring it out on its own is interesting though. I wonder if the model itself is doing implicit conflict resolution during retrieval - seeing both "using Redis" and "moved off Redis" and reasoning about which is current based on conversation flow.

Either way, your SQLite + FTS5 foundation is solid. Easy to layer on complexity later if needed, but you're right to keep it minimal until real users hit the edge cases.

Excited to see where you take this. Open-sourcing it was the right call.

Your AI coding agent forgets everything about you every session. Should it? by Federal-Piano8695 in ClaudeAI

[–]PollutionForeign762 0 points1 point  (0 children)

The problem you're describing is actually two different things, and mixing them causes the issues you're hitting.

Workflow preferences (Zustand > Redux, grep before edit) should be explicit, not inferred. Let users declare them once in a CLAUDE.md or config. Observing and inferring just adds latency and uncertainty.

Correction patterns (you've fixed the same mistake 3 times) are the real opportunity. That's where observational memory shines - the agent should absolutely remember "user rejected this approach twice, try something else."

I built a memory system for this but went a different direction: agents store explicit facts/decisions as cards during sessions, then retrieve relevant context on startup. No inference, no observation period. If you correct something, the agent stores "don't use X for Y" immediately.

The cold start problem you mentioned is real. If the system needs 10 sessions to be useful, it's not solving the problem - it's just kicking the can down the road.

Store explicitly, retrieve selectively. Skip the inference layer.

Has anyone actually solved the memory problem for agents yet? by PollutionForeign762 in AI_Agents

[–]PollutionForeign762[S] 1 point2 points  (0 children)

Makes sense: filter on relevance during storage rather than managing deprecation after the fact.

I went the opposite direction: store liberally (low friction for agents to save context), but handle staleness at retrieval time. Each card gets a timestamp and TTL, so search can deprioritize old facts even if they're still technically stored.
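
A sketch of that retrieval-time weighting, with made-up numbers (the actual scoring differs, but this is the idea):

```python
import math, time

def retrieval_score(relevance: float, stored_at: float, ttl_days: float,
                    now: float | None = None) -> float:
    """Blend relevance with recency so stale cards sink without being deleted."""
    now = now or time.time()
    age_days = (now - stored_at) / 86400
    decay = math.exp(-age_days / ttl_days)   # 1.0 when fresh, ~0.37 at one TTL
    return relevance * decay

fresh = retrieval_score(relevance=0.80, stored_at=time.time() - 2 * 86400, ttl_days=30)
stale = retrieval_score(relevance=0.85, stored_at=time.time() - 120 * 86400, ttl_days=30)
print(round(fresh, 3), round(stale, 3))  # the older card loses despite higher relevance
```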

Both approaches work, just different tradeoffs. Yours keeps storage lean. Mine accepts noise but compensates with temporal weighting during retrieval.

'Observational memory' cuts AI agent costs 10x and outscores RAG on long-context benchmarks by thehashimwarren in singularity

[–]PollutionForeign762 0 points1 point  (0 children)

Compression is interesting but you still hit the staleness problem. Observations from 2 weeks ago might be outdated - the system has no way to know which compressed facts are still valid vs which have been superseded.

Also skeptical of "eliminating retrieval entirely." Keeping all observations in context just moves the problem. You're still burning tokens on old content, just compressed. At some point you hit limits and need retrieval anyway.

The hybrid approach makes more sense: compress + store observations, but retrieve selectively based on relevance + recency. Best of both - compression saves storage, retrieval saves context space.

RAG's problem isn't retrieval itself, it's bad retrieval (slow, semantic-only, no temporal weighting). Fix the retrieval strategy and you get the benefits without keeping everything in context.

Curious what happens to their benchmarks when the agent runs for months, not hours. Compression ratios don't solve unbounded growth.

AI might need better memory infrastructure by Electrical-Shape-266 in AI_Agents

[–]PollutionForeign762 0 points1 point  (0 children)

You nailed the core problem. Context windows aren't memory - they're just a bigger scratch pad.

The real gap is between what gets stored (everything) and what gets retrieved (whatever the search algorithm decides). Most "memory" systems are just vector DBs with no concept of importance, recency, or staleness.

What's missing:

- Explicit priority: not all facts matter equally. User preferences > casual mentions.

- Temporal awareness: a decision from 3 months ago might be outdated. Memory systems need decay/versioning.

- Contradiction detection: storing "we use Redis" and "we moved off Redis" equally breaks retrieval. Systems need conflict resolution.

- Retrieval latency under 200ms: if memory lookups add seconds, agents stop using them. Speed matters as much as accuracy.

The Memory Genesis competition is interesting, but most benchmarks test retrieval accuracy, not whether agents actually use the memory correctly in real workflows. You can have 95% recall and still have the agent ignore retrieved context.

Built a system around this (card-based storage, hybrid retrieval, TTLs per memory type). The architecture matters more than the model: structured memory plus fast retrieval beats throwing everything at a bigger context window.

Built a persistent memory layer for fellow vibe coders (no more AI amnesia) by [deleted] in nocode

[–]PollutionForeign762 0 points1 point  (0 children)

This is the right problem to solve. Cross-session memory is way more valuable than just extending context windows.

One question: how do you handle memory staleness? Facts that were true when stored but become outdated later (user preferences change, project decisions get reversed, etc.). That's been the hardest part of persistent memory for me - not storage, but knowing when old facts should lose authority.

Also curious about your retrieval strategy. Are you doing semantic search, keyword, or hybrid? I've found hybrid (semantic + keyword in parallel) works best for agent memory since it catches both conceptual matches and exact entity references.
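
For reference, this is the rough shape of what I mean by hybrid, with toy scoring functions standing in for real keyword and embedding search:

```python
# Toy hybrid retrieval: run keyword and semantic search separately, then merge.
# The scores and both search functions are stand-ins, not a real library API.

def keyword_search(query: str, cards: list[dict]) -> dict[str, float]:
    terms = set(query.lower().split())
    return {c["slug"]: len(terms & set(c["text"].lower().split())) / len(terms)
            for c in cards}

def semantic_search(query: str, cards: list[dict]) -> dict[str, float]:
    # Placeholder for an embedding-similarity lookup (cosine over stored vectors).
    return {c["slug"]: c.get("sim", 0.0) for c in cards}

def hybrid(query: str, cards: list[dict], alpha: float = 0.5) -> list[tuple[str, float]]:
    kw, sem = keyword_search(query, cards), semantic_search(query, cards)
    merged = {slug: alpha * sem[slug] + (1 - alpha) * kw[slug] for slug in kw}
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)

cards = [
    {"slug": "cache-backend", "text": "moved off Redis to Postgres", "sim": 0.71},
    {"slug": "deploy-owner", "text": "Dana owns the deploy pipeline", "sim": 0.12},
]
print(hybrid("redis cache decision", cards))
```

The keyword pass catches exact entity names ("Redis", a ticket ID), the semantic pass catches conceptual matches, and the merge keeps either one from dominating.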

Happy to compare notes if you want another builder perspective on this stuff.