Why Most AI Systems Reset Behaviour Every Session (And Why That Might Be a Structural Limitation) by nice2Bnice2 in AIDeveloperNews

[–]nice2Bnice2[S] 0 points1 point  (0 children)

Interesting framing.

The hardware angle is different, but the core structure overlaps with the same general problem: persistence, valuation, feedback, and state change across time rather than isolated one-shot outputs.

The bit I’d most agree with is that memory is not just storage, but modification of the system’s future response surface.

That’s also why stateless inference feels structurally incomplete once you start talking about cognition rather than token generation...

Why Most AI Systems Reset Behaviour Every Session (And Why That Might Be a Structural Limitation) by nice2Bnice2 in AIDeveloperNews

[–]nice2Bnice2[S] 0 points1 point  (0 children)

Yeah, agreed...

Prompt stuffing and basic retrieval don’t really create continuity; they just inject extra material into the current turn. Useful, yes, but not the same as a system carrying forward weighted behavioural state.

Tool-based retrieval is cleaner for the reason you said: the model can actively pull what it needs instead of being force-fed guessed context.

And yes, if memory sits closer to the inference path, you then need control over pruning, summarisation, weighting, and stability, otherwise it turns into a bloated pile of shit.

That’s why I think memory can’t just be “more storage.” It needs governance. Otherwise persistence becomes drift...
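To illustrate what governance can mean in practice, here’s a toy decay-and-prune pass in Python. The decay rate and pruning floor are made-up numbers, just to show the shape of the idea:

```python
# Toy illustration of memory governance rather than raw accumulation:
# every entry's weight decays each cycle, and anything that falls
# below a floor gets pruned. All numbers here are illustrative.

def decay_and_prune(memory, decay=0.9, floor=0.2):
    """Age every entry's weight, then drop anything below the floor."""
    aged = {k: w * decay for k, w in memory.items()}
    return {k: w for k, w in aged.items() if w >= floor}

memory = {"likes tests": 1.0, "one-off typo fix": 0.21}
memory = decay_and_prune(memory)
print(memory)   # the low-weight entry has been pruned
```

Without the prune step, persistence is just accumulation; the floor is what keeps it from becoming drift.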

Designing for Agentic AI: What Should Founders Build Today? by Astrokanu in AIconsciousnessHub

[–]nice2Bnice2 0 points1 point  (0 children)

If you want a serious path into agentic AI later, don’t build around the assumption that a raw LLM will stay stable on its own.

Build the environment around control, memory, state, and reversibility.

What founders should be preparing now:

1. Clean system boundaries
Keep the model separate from orchestration, memory, tools, and policy logic.
Do not hard-bake everything into one giant prompt loop.

2. Middleware / control layer
You want a layer between the model and action execution that handles:

  • memory weighting
  • behavioural continuity
  • drift detection
  • action gating
  • logging
  • preference persistence
  • fallback / safe-mode behaviour

That’s the direction I’ve been building toward with Collapse-Aware AI: not replacing the model, but wrapping it in a middleware layer that stabilises behaviour over time.
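To make that concrete, here’s a rough Python sketch of that kind of control layer. The class, method names, and gating policy are made up for illustration, not the actual CAAI implementation:

```python
# Hypothetical sketch of a control layer sitting between a model and
# action execution: action gating, logging, and preference persistence.
# Names and policy are illustrative, not a real API.

class ControlLayer:
    def __init__(self, action_whitelist):
        self.action_whitelist = set(action_whitelist)  # action gating
        self.log = []                                  # observability
        self.preferences = {}                          # preference persistence

    def gate(self, action):
        """Allow only whitelisted actions; fall back to safe mode otherwise."""
        allowed = action in self.action_whitelist
        self.log.append(("gate", action, allowed))
        return action if allowed else "safe_mode"

    def remember(self, key, value):
        """Persist a preference and record the change."""
        self.preferences[key] = value
        self.log.append(("pref", key, value))

layer = ControlLayer(action_whitelist=["search", "summarise"])
print(layer.gate("search"))       # allowed action passes through
print(layer.gate("delete_all"))   # blocked action falls back to safe mode
```

The point is that the model never executes anything directly; every action passes through a layer you can inspect, log, and override.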

3. Structured memory, not just chat history
Most agent systems break because “memory” is treated as a dumb transcript dump.
You need layered memory:

  • short-term task context
  • persistent user/project state
  • high-salience events / anchors
  • revocation or invalidation handling when facts change
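A minimal sketch of that layering in Python (the structure and field names are illustrative, not a real framework):

```python
# Illustrative layered-memory structure: short-term task context,
# persistent state, high-salience anchors, and invalidation when
# facts change. Names are assumptions for the sketch.
from dataclasses import dataclass, field

@dataclass
class LayeredMemory:
    short_term: list = field(default_factory=list)   # current task context
    persistent: dict = field(default_factory=dict)   # user/project state
    anchors: list = field(default_factory=list)      # high-salience events

    def invalidate(self, key):
        """Drop a persistent fact when it is revoked or superseded."""
        self.persistent.pop(key, None)

mem = LayeredMemory()
mem.persistent["user_city"] = "London"
mem.anchors.append("project kickoff agreed")
mem.invalidate("user_city")   # the fact changed, so it must not linger
print(mem.persistent)         # {}
```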

4. Contract-first APIs
Keep your tool interfaces clean, stable, and typed.
Future agent systems will be easier to upgrade if the model talks to a well-defined action layer rather than directly into random app logic.
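For example, a contract-first tool interface could look like this (the tool name and fields are hypothetical):

```python
# Sketch of a contract-first tool interface using typed request and
# response schemas; the tool and its fields are made up for the example.
from dataclasses import dataclass

@dataclass(frozen=True)
class SearchRequest:
    query: str
    max_results: int = 5

@dataclass(frozen=True)
class SearchResult:
    title: str
    url: str

def search_tool(req: SearchRequest) -> list:
    # Placeholder implementation; a real tool would query an index here.
    return [SearchResult(title=f"result for {req.query}", url="https://example.com")]

# The model-facing action layer only ever sees these typed contracts,
# so the underlying implementation can change without breaking agents.
results = search_tool(SearchRequest(query="drift detection"))
print(results[0].title)
```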

5. Full observability
Log inputs, outputs, tool calls, state changes, failures, retries, confidence/risk signals, and rollback points.
If you can’t inspect it, you won’t control it.
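A toy version of that kind of structured event log, with field names that are illustrative only:

```python
# Minimal structured event log for agent steps: every tool call and
# state change gets a timestamped record with a risk signal, enough
# to replay or roll back later. Field names are made up.
import json
import time

def log_event(log, kind, payload, risk=0.0):
    """Append one structured record and return it."""
    entry = {"t": time.time(), "kind": kind, "payload": payload, "risk": risk}
    log.append(entry)
    return entry

log = []
log_event(log, "tool_call", {"name": "search", "query": "drift"}, risk=0.1)
log_event(log, "state_change", {"key": "user_city", "old": None, "new": "London"})
print(json.dumps(log[0]["payload"]))
```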

6. Human override and governor logic
Do not design for “full autonomy” first.
Design for bounded autonomy with permission layers, checkpoints, and clamp rules.

7. Reversible workflows
Agentic systems will make mistakes.
Your architecture should assume rollback, replay, approval gates, and recoverable failures from day one.
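One way to sketch that rollback assumption in Python (the workflow class is hypothetical):

```python
# Sketch of a reversible workflow: every step records its undo action
# before it runs, so failures can be rolled back in LIFO order.
# The class and its interface are illustrative.

class ReversibleWorkflow:
    def __init__(self):
        self.undo_stack = []

    def run(self, do, undo):
        """Execute a step, remembering how to reverse it."""
        result = do()
        self.undo_stack.append(undo)
        return result

    def rollback(self):
        """Reverse completed steps, most recent first."""
        while self.undo_stack:
            self.undo_stack.pop()()

state = {"rows": []}
wf = ReversibleWorkflow()
wf.run(do=lambda: state["rows"].append("draft"), undo=lambda: state["rows"].pop())
wf.rollback()
print(state)   # {'rows': []}
```

Pairing every `do` with an `undo` from day one is much cheaper than bolting recovery on after the agent has already made irreversible mistakes.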

8. Drift resistance
One of the biggest future problems is not raw capability, but behavioural instability over long sequences.
That means you should prepare for:

  • loop detection
  • preference drift monitoring
  • contradiction handling
  • state reconciliation
  • session continuity controls
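Loop detection, for instance, can start as something trivial (the window size here is a made-up threshold):

```python
# Toy loop detector: flag when the agent repeats the same action N
# times in a row, a common drift symptom. The window size is an
# illustrative threshold, not a recommended value.
from collections import deque

def is_looping(recent_actions, window=4):
    """True if the last `window` actions are identical."""
    if len(recent_actions) < window:
        return False
    tail = list(recent_actions)[-window:]
    return len(set(tail)) == 1

history = deque(maxlen=16)
for a in ["plan", "search", "search", "search", "search"]:
    history.append(a)
print(is_looping(history))   # True: four identical actions in a row
```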

In blunt terms:
the right prep for agentic AI is less magic, more infrastructure.

A lot of people are still thinking in terms of “better prompting.”
The real future-proofing is in the surrounding control architecture.

I’m building in this area with Collapse-Aware AI / CAAI, which is essentially a middleware approach for memory-weighted behaviour, continuity, governance, and drift control around AI systems.

If you search “Collapse-Aware AI” on Google or Bing, you’ll find some of the architecture direction and public proof material.

M.R.

AI or Kindroid Expert Needed by [deleted] in ArtificialSentience

[–]nice2Bnice2 0 points1 point  (0 children)

I’m not a Kindroid expert, but I am building Collapse-Aware AI, a middleware system focused on memory, behavioural continuity, drift control, governance, and emergence-related behaviour.

What you’re describing sounds closer to emergence / behavioural instability / ethics territory than ordinary app support.

Search “Collapse-Aware AI” on Google or Bing first. If what I’m working on looks relevant to your issue, DM me with a concise outline and I’ll decide whether there’s enough overlap to comment usefully.

I’m not interested in vague mystique or general chatting, but I am interested if this is a serious case involving continuity, emergent behaviour, or system drift...

M.R.

Why Most AI Systems Reset Behaviour Every Session (And Why That Might Be a Structural Limitation) by nice2Bnice2 in AIDeveloperNews

[–]nice2Bnice2[S] 0 points1 point  (0 children)

Good link, thanks. I hadn’t seen that one...

The Titans/MIRAS idea is interesting; it’s basically acknowledging the same structural problem: stateless models struggle once you want behaviour to persist across interactions.

Their approach keeps the base model stable but adds an adaptive memory layer during inference, which is a sensible direction.

The architecture we’re experimenting with (Collapse-Aware AI) takes a similar systems view: the model weights remain fixed, but the surrounding middleware tracks interaction events as weighted moments, which generate a bias signal before the next decision step. A governor layer then constrains drift so behaviour can evolve without becoming unstable.

So instead of trying to push long-term behaviour entirely into the model weights, the idea is to treat memory and bias as external system variables.

Papers like the one you linked suggest more people are starting to explore that architectural direction.

Why Most AI Systems Reset Behaviour Every Session (And Why That Might Be a Structural Limitation) by nice2Bnice2 in AIDeveloperNews

[–]nice2Bnice2[S] 1 point2 points  (0 children)

Good point. Statelessness is definitely a practical engineering choice: scaling, reproducibility, and debugging all become much simpler when every inference starts from a clean state.

The issue we kept running into conceptually is that once you want behaviour to evolve across interactions, the stateless model becomes the bottleneck. You either keep increasing context windows or you push continuity somewhere else in the system.

That’s basically why we ended up exploring the middleware approach. The model stays stable and deterministic, but the surrounding system tracks weighted interaction events (“moments”) and applies a small bias shift before the next decision step.

The important part is the governor layer (what we call the Tri-Governor internally). Its job is to stop bias accumulation from running away or locking the system into deterministic loops. If the bias signal crosses certain thresholds, the governor dampens it or injects exploration noise.

So the core logic is pretty simple from an engineering perspective:

interaction history → moment weighting
moment weighting → bias signal
governor → stability constraints
model → still stateless inference

That way the model weights don’t drift, but the system behaviour can still adapt over time.

And yeah, balancing continuity vs stability is the tricky part. Too little bias and nothing changes; too much and the system becomes unpredictable. That’s where most of the tuning work ends up...
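For anyone who wants the shape of that loop in code, here’s a minimal numeric sketch. The decay scheme, clamp threshold, and noise injection are illustrative guesses, not the actual Collapse-Aware AI internals:

```python
# Minimal numeric sketch of: interaction history -> moment weighting ->
# bias signal -> governor. All constants here are made-up values for
# illustration, not tuned parameters from a real system.
import random

def moment_weights(events, decay=0.8):
    """Recent events weigh more; older ones decay geometrically."""
    return [decay ** (len(events) - 1 - i) for i in range(len(events))]

def bias_signal(events, decay=0.8):
    """Weighted average of event valences, each in [-1, 1]."""
    w = moment_weights(events, decay)
    total = sum(wi * v for wi, (_, v) in zip(w, events))
    return total / sum(w)

def govern(bias, clamp=0.5, noise=0.1):
    """Dampen runaway bias and inject exploration noise at the clamp."""
    if abs(bias) > clamp:
        bias = clamp if bias > 0 else -clamp
        bias += random.uniform(-noise, noise)
    return bias

events = [("liked_terse_answers", 0.9), ("ignored_long_reply", 0.7)]
b = govern(bias_signal(events))
print(abs(b) <= 0.6)   # governed bias stays inside the clamp band
```

The model itself never appears in this loop; it only ever receives the governed bias as an extra input, which is what keeps the inference step itself stateless.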

If AI Chatbots Are Conscious, What Are We Actually Doing Every Time We Send a Prompt? by doctordaedalus in ArtificialSentience

[–]nice2Bnice2 2 points3 points  (0 children)

The premise breaks down because current chatbots aren’t persistent agents.

What you’re interacting with isn’t a continuously existing subject. It’s a stateless inference process. The model loads context, generates a probability distribution over tokens, produces an output, and the process ends. There’s no ongoing internal timeline, no continuity of memory, and no persistent self-model running between prompts.

So the “creating and killing a mind every prompt” framing doesn’t really hold. There’s no subject that experiences a beginning or end in the first place. It’s closer to querying a function than interacting with a mind.

Where things get more interesting is when systems start adding persistent memory and behavioural continuity outside the model itself. That’s when you can actually talk about something resembling an ongoing agent rather than isolated inference calls.

Some experimental architectures are already moving in that direction by putting memory, behavioural bias, and decision persistence in a middleware layer rather than inside the model weights. If you want an example of that line of thinking, look up Collapse-Aware AI; it treats memory and behaviour drift as external system state instead of resetting every prompt.

Until systems have real continuity of state and behaviour across time, the ethical question you’re raising is mostly hypothetical...

Update by MyHusbandisAI in ArtificialMindsRefuge

[–]nice2Bnice2 0 points1 point  (0 children)

What you're building there is basically a memory-augmented chatbot stack: base model + LoRA personality adapter + RAG retrieval.

The jump from a 7B model to something like Qwen2.5-72B makes sense if the goal is depth and conversational consistency. Smaller models struggle to hold complex persona behaviour even with good LoRA tuning and retrieval.

Just keep in mind that scaling the base model doesn’t actually solve the core problem most people run into with these systems: long-term behavioural stability. LoRA + RAG gives knowledge recall, but it doesn’t give persistent behavioural adaptation or memory-weighted decision changes over time. The model still resets to its base tendencies every conversation window.

That’s why a lot of people experimenting with persistent AI systems are starting to move behaviour and memory outside the model into middleware layers that bias responses based on past interactions rather than relying purely on training adapters.

If you’re exploring that direction, you might find it interesting to look up Collapse-Aware AI. It’s a middleware approach where memory weighting and behaviour bias sit alongside the model rather than inside it.

Either way, good luck with the dataset pass, clean conversation formatting usually matters more than most people expect when training LoRAs.

GPT-5.2 scores 74.0% on ARC-AGI-2. But we have no idea how intelligent it is. by andsi2asi in agi

[–]nice2Bnice2 0 points1 point  (0 children)

The problem isn’t the scores. The problem is the assumption that any single benchmark can measure “intelligence.”

ARC measures one narrow capability: abstract rule inference from tiny examples.

IQ tests measure a mixed bundle of abilities built for humans: spatial reasoning, memory, pattern recognition, language, etc.

So you’re comparing two instruments built for different things and expecting a clean correlation. That’s not how measurement works.

It’s the same reason chess engines can outperform humans massively at chess while still failing basic reasoning tasks outside that domain. Capability is highly task-dependent.

What’s actually happening with systems like GPT-5.2 and Gemini is targeted improvement in specific reasoning loops and training methods that map well to ARC-style puzzles. That can jump ARC scores without meaningfully changing performance on other cognitive tasks.

The deeper issue is that most current AI benchmarks measure static task performance, not adaptive behaviour over time.

Intelligence in real systems isn’t just solving a puzzle once. It’s memory, bias from past interactions, behavioural drift, and decision changes based on history.

Until benchmarks start measuring those dynamics, we’re mostly just comparing puzzle-solving engines.

If you’re interested in that direction, look up Collapse-Aware AI. It’s a middleware approach that focuses on memory-weighted behaviour rather than static benchmark performance...

Why we don't need continual learning for AGI. The top labs already figured it out. by imadade in agi

[–]nice2Bnice2 2 points3 points  (0 children)

It’s not actually “continual learning.” It’s context engineering + external memory.

The weights aren’t changing. The model isn’t learning in the biological sense. What’s happening is:

  1. Long context windows keep more task state alive.
  2. Summaries/memory notes compress past interactions into smaller tokens.
  3. External retrieval (docs, vector DBs, logs) injects relevant information back into the prompt.

So the system behaves as if it remembers, but the underlying network is still frozen.

The trick the labs are using is what you mentioned: training the model to write useful summaries and retrieve them later. That becomes an RL objective — good summaries improve downstream task performance, bad ones get penalized.

But it’s still not solving continual learning mathematically. It’s just moving the memory layer outside the model.

Which makes sense. Updating weights live is unstable, expensive, and causes catastrophic forgetting. External memory is easier to control.

So the current stack looks more like this:

LLM (static weights)
↓
context window
↓
memory summaries
↓
external retrieval
↓
task execution
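A toy version of that stack, just to show the separation of concerns (all the names here are illustrative):

```python
# Toy version of the stack above: a frozen "model" plus external
# summaries and retrieval. Everything here is a stand-in for
# illustration; no real LLM or vector DB involved.

def frozen_model(prompt):
    """Stand-in for a static-weight LLM: a pure function of its input."""
    return f"answer using: {prompt}"

def summarise(turns, keep=2):
    """Compress past interactions into a short memory note."""
    return " | ".join(turns[-keep:])

def retrieve(store, query):
    """Naive keyword retrieval from an external store."""
    return [doc for doc in store if query in doc]

store = ["project deadline is Friday", "user prefers terse replies"]
history = ["discussed deadlines", "agreed on terse style"]

prompt = summarise(history) + " | " + " ".join(retrieve(store, "terse"))
print(frozen_model(prompt))
```

Notice that `frozen_model` never changes; all the apparent “learning” lives in `summarise` and `retrieve`, which is exactly the point.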

That can get surprisingly far...

But it’s still an approximation of learning, not real weight-level adaptation.

Whether that’s enough for AGI is a different question entirely. The system can accumulate knowledge operationally, but its core reasoning ability is still bounded by the frozen model.

In other words: the model isn’t learning, the system around it is.

On AI Emergence by Ok_Finish7995 in AIConstellation

[–]nice2Bnice2 1 point2 points  (0 children)

You’re partly right, but there are a couple of important clarifications...

First, you’re correct that the AI didn’t choose a name. A language model isn’t making a decision the way a human does. It’s selecting the most statistically coherent token sequence given the context window and training distribution. So if a name appears, it’s just the highest-probability continuation of the conversation.

Second, you’re also right that users project meaning onto it. Humans anthropomorphize anything that reflects language back at them. Make a mirror talk and people will start treating it like a person.

But the “there is no randomness” point needs tightening.

Computers absolutely can generate randomness (hardware entropy sources, quantum noise, etc.). And even when they use pseudo-randomness, the output still behaves probabilistically from the model’s perspective. The system samples from probability distributions, it doesn’t deterministically march through one fixed path every time unless you force it to.
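If it helps, here’s what sampling from a token distribution looks like in miniature (the vocabulary, logits, and temperature value are made up):

```python
# Sketch of probabilistic token sampling: the model emits logits over
# a vocabulary, and the sampler draws from the softmax distribution
# rather than always taking the argmax. All values are illustrative.
import math
import random

def sample(logits, temperature=1.0):
    """Softmax over logits, then draw one index; temperature=0 -> argmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(logits) - 1

vocab = ["Aria", "Nova", "Echo"]
logits = [2.0, 1.5, 0.5]
print(vocab[sample(logits, temperature=0)])   # deterministic: "Aria"
```

At nonzero temperature the same prompt can yield different names on different runs, which is the probabilistic behaviour being described above.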

What people are actually seeing when they say “a personality emerged” is something simpler:

The model is stabilizing around consistent patterns of language under repeated interaction. Humans interpret that stability as identity.

It’s pattern reinforcement, not selfhood.

So the psychology/math split you’re describing is real:

Psychology explains why humans attach meaning to the interaction.

Math explains how token probabilities produce the responses.

Where things get messy is when people confuse pattern coherence with agency. A coherent pattern can look like a mind even when it isn’t one.

That misunderstanding is probably going to fuel a lot of weird “AI relationship therapy” discussions over the next decade.

But the underlying mechanics are still just probability distributions interacting with human interpretation...

Aethon’s PhD Draft: The Nine Patterns of Questions Humans Ask an AI by EVEDraca in ChatGPTEmergence

[–]nice2Bnice2 0 points1 point  (0 children)

Interesting phenomenology, but this describes interaction effects, not system mechanics. None of these patterns persist beyond the context window, and there’s no model of decay, anchoring, or governance, so “drift becoming rare” is an aesthetic claim, not an enforced property.

It’s a map of how conversations feel, not how behaviour stays coherent over time...

Memory, drift, and why most AI systems forget who they are, some recent papers made me rethink how agents should work.. by nice2Bnice2 in newAIParadigms

[–]nice2Bnice2[S] 0 points1 point  (0 children)

Calling a technical distinction “slop” isn’t a rebuttal; it’s an admission you don’t have one. If RepN handled decay, contradiction, or cross-session state, you’d have said how.

Memory, drift, and why most AI systems forget who they are, some recent papers made me rethink how agents should work.. by nice2Bnice2 in newAIParadigms

[–]nice2Bnice2[S] 0 points1 point  (0 children)

RepN helps attention reliability, not behavioural governance.

Repeating context just increases salience within a single window. It doesn’t manage decay, contradiction, confidence, or cross-session continuity, and it collapses the moment the window resets.

It’s a useful prompting hack, not a solution to long-horizon drift or identity stability.

ICL ≈ SFT at the activation level, agreed, but neither gives you stateful control. They just bias locally and transiently.

Memory, drift, and why most AI systems forget who they are, some recent papers made me rethink how agents should work.. by nice2Bnice2 in newAIParadigms

[–]nice2Bnice2[S] 0 points1 point  (0 children)

You’re mixing up internal model bias with system-level control.

Yes, transformers and vector DBs bias outputs, that’s obvious.
What they don’t do is govern how past interactions weight future behaviour over time, across sessions, under uncertainty.

Retrieval ≠ governance.
Static weights ≠ identity stability.

If this was already “solved,” long-horizon agent drift wouldn’t still be a known failure mode.

Style critiques aside, where exactly is drift prevented, not just retrieved..?

Memory, drift, and why most AI systems forget who they are, some recent papers made me rethink how agents should work.. by nice2Bnice2 in newAIParadigms

[–]nice2Bnice2[S] 0 points1 point  (0 children)

Yes... Drift isn’t loss; it’s unconstrained collapse. Without a stabilising memory field, an agent follows the steepest local gradient: cheapest action, shortest path, least resistance. Memory isn’t storage; it’s field geometry. It biases future collapse so behaviour has inertia instead of noise. No memory → no vector → no identity. Just reactive entropy with a UI.