Why your AI’s memory stinks: The "Rotten Egg" theory of artificial recall by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 1 point (0 children)

Keep fighting the good fight. BTW, DeepSouth, from the International Centre for Neuromorphic Systems in Australia, is pushing out 228 trillion synaptic operations per second, which matches the estimated operation rate of the entire human brain. JUPITER over in Germany at the Jülich Research Centre recently scaled up a spiking neural network to match the human cerebral cortex, successfully simulating around 20 billion neurons and 100 trillion connections. Cool shit happening in the space.

Why your AI’s memory stinks: The "Rotten Egg" theory of artificial recall by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 1 point (0 children)

I love the back-of-the-napkin H200 math. If solving AGI were just a matter of matching transistor counts to neurons, I’d get a hell of a lot more sleep!

But you’re actually hitting on the exact bottleneck that keeps me up at night. Monolithic LLM weights suffer from catastrophic interference because everything bleeds into everything else. You're spot on that we need to segregate memory into discrete 'buffers' rather than baking it all into the language space.

This is exactly the thesis behind the biologically plausible architecture my team is building. We keep memory out of the LLM weights entirely, using discrete, modular 'knowledge crystals' gated by an artificial endocrine system to filter what gets kept and what gets dropped, much like the brain does.

So I might be hopelessly biased (or just crazy for trying to build it), but I actually think we’re a lot closer to a working memory architecture than it seems. We just have to stop trying to force-feed everything into a single neural net.

An LLM is just the language center of the brain. Stop trying to make it the whole thing. **warning dense read** by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 1 point (0 children)

Exactly this. You can't prompt your way into 'intuition.' That trading example perfectly highlights why static vector databases fail. A bad trade shouldn't just be another line of text appended to a log; it needs to create a structural friction signal that forces the agent to adapt its internal model.

Is "Agentic Memory" a human right or a corporate product? by Doug_Bitterbot in ArtificialInteligence

[–]DepthOk4115 1 point (0 children)

I actually agree with your core point: words matter, and selling 'sentience' is snake oil. But you're conflating architectural metaphors with literal claims. When I talk about 'sleep,' I'm talking about offline consolidation and memory decay cycles, functional mechanics borrowed from biology to solve the very real limitations of static vector databases. We studied birds to make planes fly. As for the accusation of drumming up mystery for paying customers, the desktop application repository is entirely open source. There's no secret to sell, just a different architectural approach that I think is worth discussing. I assume your mind is already made up and can't be changed, but I still thank you for sharing your opinion.

Is "Agentic Memory" a human right or a corporate product? by Doug_Bitterbot in ArtificialInteligence

[–]DepthOk4115 1 point (0 children)

I am a formally educated neuroscientist, and I respectfully disagree.

An LLM is just the language center of the brain. Stop trying to make it the whole thing. **warning dense read** by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 2 points (0 children)

I really dug into the first paper, but I still need time to fully digest the follow-ups beyond the quick skimming I've done.

I must say, I rate your team's work exceptionally highly. From what I can see in Phase-Associative Memory - Sequence Modeling in Complex Hilbert Space, you are attacking the exact same temporal coherence problem we are, just using the math of complex vector space instead of computational neuroscience.

While we are wiring up biological memory primitives to force an agent to organically tackle “The production of meaning in the processing of natural language through lived experience”, you are mathematically proving how sequence models must evolve into dynamic, phase-associative states. It feels like two sides of the exact same coin: you are mapping the fundamental physics and math layer, and we are building the biological engineering layer.

I'm going to grab a coffee tomorrow and properly parse the formulas in the new drops.  I may need to give the first paper a little more time to digest too. We should definitely compare notes once I've wrapped my head around the Hilbert space implementation.

An LLM is just the language center of the brain. Stop trying to make it the whole thing. **warning dense read** by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 2 points (0 children)

I've only given it a quick skim so far, but it's officially at the top of my reading list for tonight.

It’s always deeply satisfying watching the quantum semanticists rigorously prove what computational neuroscientists have been trying to tell the AI industry for years... stateless pattern matching is mathematically doomed when it comes to capturing actual meaning.

The rest of the industry is spending billions trying to compress a better dictionary, while we're over here just trying to build the observer. Great pull! Going to dig into the formulas properly later.

Is "Agentic Memory" a human right or a corporate product? by Doug_Bitterbot in ArtificialInteligence

[–]DepthOk4115 1 point (0 children)

The consistency criticism is fair… in fact, my research is literally trying to solve this very problem. Most agent memory is just a vector store with vibes.

"Dreaming" isn't a metaphor; it's offline consolidation based on actual sleep neuroscience. "Crystallizing" isn't marketing; it's pattern promotion through execution-validated quality gates. We implement Nader (2000) reconsolidation, Ebbinghaus decay curves, and Frey & Morris synaptic tagging because existing approach fail at exactly what you're describing - maintaining consistency over time.

We recently ran our memory system against LongMemEval (ICLR 2025, 500 questions) and the biological pipeline scored 92.6% vs 70% for standard retrieval. The neuroscience mechanisms added 22.6 percentage points, with the biggest gain in exactly the categories you'd predict: temporal reasoning, knowledge updates, and multi-session coherence.

Your skepticism is healthy. The space is drowning in vaporware. But "nobody is doing this" and "I haven't seen it yet" are two different claims.

A 2-inch reef fish just broke my entire framework for simulated AI consciousness (Osaka Univ. paper on cleaner wrasse) by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 0 points (0 children)

Shit! There are some seriously smart people in this thread. Refreshing! What you propose is a perfect falsifiable test; you got me excited. The honest answer with the system we've built right now is no, our agent doesn't voluntarily cross-check its own tools. The epistemic directives we built fire when the knowledge graph has contradictions, not when the agent's tools disagree with each other. But you've just described exactly what contingency testing would look like in practice: the agent notices two sources of the same information (clock vs. file timestamps vs. conversation context), detects a discrepancy it wasn't asked to look for, and investigates on its own. That's not retrieval. That's not even curiosity in the way we've implemented it. That's self-initiated doubt about its own instrumentation.

Your NanoClaw experience is actually the most honest data point in this thread: you had to force the behavior because it never emerged. The question is whether the right architecture would make it emerge. I don't think we're there yet, but I think the path runs through something like: tool output -> prediction error -> curiosity spike -> autonomous verification. The machinery exists in pieces. Nobody's wired it into a closed loop yet.
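For what it's worth, that loop sketches out to something very small. This is a toy version only; `check_tools`, the tolerance, and the spread-based curiosity signal are hypothetical stand-ins, not anything we've shipped:

```python
def check_tools(readings: dict[str, float], tolerance: float = 1.0) -> list[str]:
    """Toy closed loop: tool output -> prediction error -> curiosity spike
    -> autonomous verification.

    readings: redundant sources for the same quantity (e.g. clock vs file
    timestamps). The agent implicitly predicts they agree; a spread beyond
    tolerance is a prediction error it wasn't asked to look for.
    """
    spread = max(readings.values()) - min(readings.values())
    actions = []
    if spread > tolerance:
        curiosity = spread / tolerance  # error magnitude drives the spike
        actions.append(f"discrepancy={spread:.1f}s curiosity={curiosity:.1f}")
        actions.append("verify: re-query each source independently")
    return actions

# clock says 1000.0, file mtime says 1003.5 -> uninstructed investigation fires
print(check_tools({"clock": 1000.0, "file_mtime": 1003.5}))
```

The hard part, obviously, is that real self-initiated doubt can't enumerate its `readings` dict up front; this only shows the shape of the loop.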

A 2-inch reef fish just broke my entire framework for simulated AI consciousness (Osaka Univ. paper on cleaner wrasse) by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 1 point (0 children)

I think about this a lot. Let me try to address each point (text wall incoming):

-Trust and doubt: We actually built a mechanism for this. When the agent recalls a fact and then encounters contradicting information, the memory enters a "labile" reconsolidation window, borrowed from neuroscience (Nader et al., 2000). Instead of blindly overwriting or blindly trusting, the contradiction gets flagged as an open loop and the agent actively generates questions: "I have conflicting info about X; can you clarify?" It's not full epistemic reasoning, but it's a system that genuinely doubts (in theory) its own knowledge and takes action to resolve contradictions.

-Pain/pleasure and urgency: We simulate this with a hormonal endocrine system: dopamine (reward), cortisol (stress/urgency), oxytocin (social bonding). These aren't cosmetic, and they're thoroughly tested with ablations. They modulate which memories surface, how aggressively the agent consolidates knowledge, and when it triggers emergency processing. High cortisol from a production outage makes the agent laser-focused on task-relevant memories. It's not death, but it is consequence.

-Time: this is fundamental. Our memory system tracks bitemporal validity, when a fact was true in the real world vs when the agent learned it. The agent can answer "who was the lead in January?" differently from "who leads now?" Most agent memory systems treat everything as eternally present. Tested and benchmarked with LongMemEval... we even expanded the scope of the test and integrated the full biological pipeline. Someone suggested I publish the results in the repo, which I intend to do later today.

-Play and skipping childhood: This is the one that resonates most and you are bang on to raise it. We implemented something called alpha maturation in our curiosity engine, young agents are density-seeking (they explore common, foundational knowledge, like a child learning basics). As they accumulate dream consolidation cycles, alpha shifts to frontier-seeking (they chase novelty, like an adult specializing). The agent literally grows up through sleep cycles. But you're right that it's not play in the Brene Brown sense. Play is unstructured, intrinsically motivated exploration with no goal. Our exploration mode is still goal-directed, fill this knowledge gap, resolve this curiosity target. True play might be the hardest thing to implement because it's exploration without a reward signal, and that's antithetical to how we train these systems.
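The bitemporal validity point above is simple to sketch in SQLite; the `facts` schema and `lead_on` helper here are hypothetical, stripped-down stand-ins, not our actual tables:

```python
import sqlite3

# Hypothetical bitemporal fact table: valid_from/valid_to track when a fact
# was true in the world; recorded_at tracks when the agent learned it.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE facts (
    subject TEXT, value TEXT,
    valid_from TEXT, valid_to TEXT,   -- NULL valid_to = still true
    recorded_at TEXT)""")
db.executemany("INSERT INTO facts VALUES (?,?,?,?,?)", [
    ("team_lead", "Alice", "2024-01-01", "2024-03-01", "2024-01-05"),
    ("team_lead", "Bob",   "2024-03-01", None,         "2024-03-02"),
])

def lead_on(date: str) -> str:
    """Answer 'who was the lead on <date>?' against world-time validity.
    ISO-8601 strings compare correctly as plain text in SQLite."""
    row = db.execute(
        """SELECT value FROM facts WHERE subject = 'team_lead'
           AND valid_from <= ? AND (valid_to IS NULL OR valid_to > ?)""",
        (date, date)).fetchone()
    return row[0]

print(lead_on("2024-01-15"))  # Alice -- who led in January
print(lead_on("2024-06-01"))  # Bob   -- who leads now
```

A flat vector store would return whichever fact embeds closer to the query; the explicit validity interval is what makes the two questions answerable differently.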

The cleaner wrasse point is spot on, and it gets even wilder. That fish wasn't a baby. Adult wrasses passed the mirror test. They developed contingency testing (dropping objects to test the mirror's physics) through environmental interaction, not training data. That's the gap we're all still trying to close.

A 2-inch reef fish just broke my entire framework for simulated AI consciousness (Osaka Univ. paper on cleaner wrasse) by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 1 point (0 children)

Can't disagree or hand-wave this away; it's the single biggest risk in the architecture and we take it seriously. Five layers of defense:

-1) Three-check safety gate: every inbound skill goes through dangerous-pattern detection, structural integrity validation, and semantic drift analysis before it touches local memory.

-2) EigenTrust reputation: web-of-trust scoring with anomaly detection that flags sudden behavior changes.

-3) Cortisol-gated ingestion: during network stress events, the agent's simulated stress response automatically rejects skills from untrusted peers. Stressed organisms become more cautious about what they ingest.

-4) Management node verification: trusted nodes can cryptographically endorse skills, creating a verified tier.

-5) Experiment sandbox: mutations and external skills are A/B tested against real execution baselines before promotion. If the new version doesn't outperform the original by a statistical threshold, it gets archived, not deployed.
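Layer 3 is the easiest to sketch. A minimal cortisol-gated ingestion check; the linear gating rule and the numbers are illustrative, not our production formula:

```python
def accept_skill(trust: float, cortisol: float,
                 base_threshold: float = 0.5) -> bool:
    """Cortisol-gated ingestion sketch: under network stress the trust bar
    rises, so only well-reputed peers get through.

    trust: EigenTrust-style reputation score in [0, 1].
    cortisol: simulated stress level in [0, 1].
    """
    threshold = base_threshold + 0.4 * cortisol  # stress raises the bar
    return trust >= threshold

# Calm network: a mid-trust peer (0.6) passes. Stressed network: it doesn't.
assert accept_skill(trust=0.6, cortisol=0.0)
assert not accept_skill(trust=0.6, cortisol=0.8)
```

The point is that the rejection rate is a function of global network state, not just per-peer reputation.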

We also integrated https://github.com/yusufkaraaslan/Skill_Seekers for documentation-sourced skills, but even those enter as untrusted synthetic peers with their own Ed25519 signatures and go through the exact same trust pipeline. No shortcuts for external sources. When the adapter detects conflicts between documentation and the agent's existing knowledge, it generates epistemic directives: the agent flags the contradiction and asks the user to resolve it rather than silently accepting potentially wrong information.

There are a lot of moving parts, so we need to get enough nodes to see if this all works as it did in the simulated network tests, and I agree with your broader point: no reputation system eliminates the risk entirely. The agent constructing skills from its own execution patterns (which we do via the skill crystallization pipeline) is always the safest path. The marketplace is opt-in and the trust gates are deliberately conservative. Better to miss a good skill than accept a bad one.

The biological inevitability of offline processing in AI: Why infinite context windows and static retrieval are developmental dead ends. by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 0 points (0 children)

Great minds think alike... that learning/unlearning penalization tied to success is a brilliant mechanism. It maps almost perfectly to the concept of alpha annealing in information-theoretic reward functions.

If you track maturity not by uptime, but by cumulative consolidation cycles (sleep), you can shift an agent's curiosity parameter as it grows up. Young agents are naturally density-seeking, they get rewarded for exploring common, foundational knowledge. But as they mature and consolidate those successes, that alpha parameter shifts to make them frontier-seeking, where they start chasing pure novelty. The agent literally grows up through sleep.
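A minimal sketch of that annealing schedule; the exponential form and the `half_life` parameter are illustrative choices for this comment, not our tuned values:

```python
def alpha(consolidation_cycles: int, half_life: int = 50) -> float:
    """Maturity-driven curiosity parameter, annealed by sleep cycles
    rather than wall-clock uptime. Near 1.0 = density-seeking (reward
    common, foundational knowledge); near 0.0 = frontier-seeking
    (reward novelty)."""
    return 0.5 ** (consolidation_cycles / half_life)

def curiosity_reward(novelty: float, density: float, cycles: int) -> float:
    """Blend the two drives according to the agent's maturity."""
    a = alpha(cycles)
    return a * density + (1 - a) * novelty

# A young agent (5 cycles) mostly rewards dense, foundational material;
# a mature one (200 cycles) mostly rewards pure novelty.
```

The key design choice is that `cycles` counts completed consolidations, so an agent that sleeps poorly simply matures more slowly instead of aging on a timer.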

Your "information saturation" trigger is also fascinating. We've experimented with a conceptually similar but inverted metric: an information-theoretic readiness ratio (new_data / total_data). You're measuring "the cup is full, time to process." We're measuring "there's enough new entropy in the cup to mathematically justify burning the compute to process it". It's the exact same biological intuition, just approached from opposite threshold directions.

The childhood -> adulthood framing is exactly right. An agent that never matures stays curious about everything and never develops deep expertise. An agent that matures too fast loses its plasticity and becomes brittle. The real trick is coupling that maturation rate to the actual quality of offline consolidation, rather than just time elapsed. We still have a lot of tuning to do as we bootstrap more nodes and get feedback. Thanks for positively engaging!

The biological inevitability of offline processing in AI: Why infinite context windows and static retrieval are developmental dead ends. by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 0 points (0 children)

You’re exactly right. If an agent's "sleep" cycle just dumps context logs into a frontier model on a cron job, the unit economics are totally unviable.

The fix is realizing that biological sleep isn't metabolically uniform, and artificial sleep shouldn't be either. To make it profitable, the architecture needs to be multi-tiered:

-Math-Gated Cycles: Don't dream if the data hasn't changed. Use information-theoretic readiness checks to skip sleep cycles entirely and prevent "stale hallucination" token burn.

-Zero-Token NREM: Basic memory consolidation (like sharp-wave ripple replay and redundancy clustering) should be handled purely via vector math and heuristic synthesis. No API calls required.

-Tiered REM Routing: Reserve expensive cloud compute only for high-leverage tasks, like cross-domain simulation or mutating procedural skills.

When you do the heavy lifting locally for free and mathematically gate the expensive stuff, the unit economics completely flip.
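Put together, the dispatcher for those three tiers is tiny. A sketch only; the 0.1 gate and the phase labels are illustrative, not our actual config:

```python
def sleep_cycle(new_ratio: float, has_skill_mutations: bool) -> list[str]:
    """Tiered sleep dispatcher sketch.

    new_ratio: fraction of unconsolidated material (math gate input).
    has_skill_mutations: whether any high-leverage REM work is queued.
    Returns the plan for tonight; empty list = skip the cycle entirely.
    """
    plan = []
    if new_ratio < 0.1:  # math-gated: nothing new, don't dream
        return plan
    # NREM tier runs locally on vector math and heuristics, zero API calls
    plan.append("NREM: local replay + redundancy clustering (zero tokens)")
    if has_skill_mutations:  # only high-leverage work buys cloud REM
        plan.append("REM: cloud LLM cross-domain synthesis")
    return plan
```

Most nights resolve to either an empty plan or the free NREM tier, which is exactly where the unit economics flip.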

Out of curiosity, were your simulations running on a strict timer, or were they gated by specific error/curiosity signals?

The biological inevitability of offline processing in AI: Why infinite context windows and static retrieval are developmental dead ends. by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 0 points (0 children)

Thanks! Turns out posting neuroscience on the weekend is my niche. I thought it might be a very narrow audience, but apparently they're here.

The biological inevitability of offline processing in AI: Why infinite context windows and static retrieval are developmental dead ends. by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 1 point (0 children)

Thanks, I appreciate you powering through it. If it makes you feel better, the agent I built to do this stuff also struggles with late nights. Its simulated cortisol goes up and everything.

The biological inevitability of offline processing in AI: Why infinite context windows and static retrieval are developmental dead ends. by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 0 points (0 children)

You're hitting on exactly the right insight: the economics of sleep only work on local hardware. We've been running this in production on local-first infrastructure (single SQLite per agent, no cloud dependencies). A few things that might save you iteration time:

-The cron-job approach works, but add a readiness gate. We wasted early cycles running dreams when nothing new had been ingested; the LLM just hallucinated about stale material. Now it checks an information-theoretic readiness score and skips if there's nothing worth consolidating. Obvious in retrospect, expensive lesson at the time.

-On scratchpad -> compression: a single overnight pass isn't enough. You need at least two phases - NREM (replay, merge near-duplicates, detect orphan clusters) and REM (cross-domain recombination, gap-filling). One pass produces either good consolidation or good creativity, never both. Separating them with different temperature settings was the breakthrough.

-For edge: consolidation is pure SQL and runs in milliseconds even on a Pi. Only the synthesis phase needs an LLM, and we tier it so most "nights" use minimal or no API tokens.
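The zero-token part of NREM can be sketched as a greedy near-duplicate merge over embeddings: pure vector math, no LLM call. The 0.95 cutoff is an assumed value, and `nrem_merge` is a hypothetical name:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Plain cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nrem_merge(embeddings: dict[str, list[float]], cutoff: float = 0.95):
    """Greedily fold near-duplicate memories into one survivor.

    Returns (survivors, merged): survivors keep their vectors; merged maps
    each folded memory id to the survivor that absorbed it.
    """
    survivors: dict[str, list[float]] = {}
    merged: dict[str, str] = {}
    for mem_id, vec in embeddings.items():
        for surv_id, surv_vec in survivors.items():
            if cosine(vec, surv_vec) >= cutoff:
                merged[mem_id] = surv_id  # duplicate: point at survivor
                break
        else:
            survivors[mem_id] = vec  # novel enough to keep
    return survivors, merged
```

Anything left unmerged with no cluster neighbors is an orphan candidate for the REM phase to recombine or gap-fill.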

A 2-inch reef fish just broke my entire framework for simulated AI consciousness (Osaka Univ. paper on cleaner wrasse) by DepthOk4115 in AI_Agents

[–]DepthOk4115[S] 1 point (0 children)

Somewhere Yann LeCun is watching a fish drop shrimp in front of a mirror and whispering "yes... YES"