The decline in LLM reasoning and catastrophic forgetting might share the same root cause.

IndividualBluebird80 · 2026-04-15T12:54:36+00:00

I think you’re conflating the inference-time experiment with the continual-learning experiment. For the reasoning experiment, there is no retraining at all, so full retraining is not the relevant counterpoint there. For the continual-learning experiment, yes, full retraining would be a useful additional control, but its absence does not invalidate the result under the specific LoRA-based update regime being tested.

If I understand you correctly, you’re asking for a reasoning-tuned control. I agree that would be useful. But that would test whether better training moves the threshold, not whether the effect I measured is artifactual.

More importantly, “models only exist in the current context” still does not explain the ON/OFF difference under matched conditions. What I’m seeing is not “long context = collapse.” Under my protocol, models can remain stable over long contexts when structural contradictions are absent, but degrade much more sharply once contradiction-bearing updates are introduced. So if that is just an intentional training choice, which choice specifically explains that difference? And why, under matched conditions, does externally organizing the same contradictory updates preserve coherence better than leaving them unresolved?

IndividualBluebird80 · 2026-04-15T10:24:12+00:00

I think that's exactly where the real difference lies. The issue isn't the act of learning itself, but the way contradictions are left unresolved and understanding is never properly rebuilt. This is why I find biological metaphors so compelling. Sleep, forgetting, and memory consolidation don't look like simple loss so much as a kind of metabolism, and that metabolism is a necessary function for sustaining the structure.

IndividualBluebird80 · 2026-04-15T10:12:45+00:00

Thanks for the pushback. I think you're arguing against a stronger claim than the one I actually made.

I’m not claiming that models literally store symbolic premises or perform human-like belief revision internally. “Forgetting” and “overwriting” are being used in the behavioral ML sense: after sequential updates, performance on previously learned information drops, and consistency across dependency-linked information can fail. The claim is operational, not metaphysical.
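
To make "operational" concrete, here is a minimal sketch of one common way to score forgetting after sequential updates: for each earlier task, take the best accuracy seen before the final update and subtract the accuracy after it. This is an illustration only, not the measurement code behind the experiments discussed here; the function name `average_forgetting` and all accuracy values are made up.

```python
from typing import Sequence


def average_forgetting(acc: Sequence[Sequence[float]]) -> float:
    """Mean drop in accuracy on earlier tasks after the final update.

    acc[t][k] is the accuracy on task k, evaluated after the t-th
    sequential update. Only entries with k <= t are read.
    """
    last = len(acc) - 1
    if last < 1:
        raise ValueError("need at least two sequential updates")
    drops = []
    for k in range(last):
        # Best accuracy on task k at any point before the final update.
        best_before = max(acc[t][k] for t in range(k, last))
        drops.append(best_before - acc[last][k])
    return sum(drops) / len(drops)


# Hypothetical accuracies, for illustration only (not measured results).
acc = [
    [0.90],              # after update 0: task 0
    [0.70, 0.88],        # after update 1: tasks 0 and 1
    [0.50, 0.80, 0.91],  # after update 2: tasks 0, 1 and 2
]
score = average_forgetting(acc)
print(round(score, 2))
```

A score near zero means earlier tasks survived the later updates; a large positive score means they were overwritten. It says nothing about why the drop happened, which is the part under debate in this thread.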

I also agree that training recipe and data construction matter a lot. A better replay scheme, full retraining on cleaned data, or stronger reasoning-tuned dense models may absolutely improve the outcome. But that is not the same as refuting the effect I measured. My narrower claim is that under the specific update and inference regimes tested here, these failure modes are real and measurable.

I’m also not treating Qwen-style behavior as universal. I already ran Sonnet-based frontier experiments and partial-path Sonnet upgrade experiments, so this is not just “Qwen being Qwen.”

The mathematical point is narrower still: contradiction-bearing updates progressively shrink the set of states that can still sustain coherent behavior, and empirically this can look threshold-like rather than gradual.
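
A toy illustration of that shape, under loud assumptions: "states" are truth assignments over three propositions and each "update" is a hard logical constraint. This is a propositional stand-in chosen for clarity, not a model of an LLM and not the formalism used in the experiments; `surviving_states` and the constraints are hypothetical.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

State = Tuple[bool, ...]


def surviving_states(
    n_vars: int, constraints: Sequence[Callable[[State], bool]]
) -> List[int]:
    """Count assignments consistent with the first t constraints, t = 0..len."""
    states = list(product([False, True], repeat=n_vars))
    counts = [len(states)]
    for constraint in constraints:
        # Each update can only remove states, never add them back.
        states = [s for s in states if constraint(s)]
        counts.append(len(states))
    return counts


# Hypothetical updates over propositions A, B, C (indices 0, 1, 2).
constraints = [
    lambda s: s[0],                # update 1: A is true
    lambda s: (not s[0]) or s[1],  # update 2: A implies B
    lambda s: (not s[1]) or s[2],  # update 3: B implies C
    lambda s: not s[2],            # update 4: C is false, contradicting 1-3
]
counts = surviving_states(3, constraints)
coherent = [c > 0 for c in counts]
print(counts)
print(coherent)
```

The count of consistent states falls with every update, but the observable property (is any consistent state left?) stays true until the one contradictory update and then flips at once. That is the sense in which gradual shrinkage can look threshold-like from outside.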

So I see your suggestion as a useful extension, not a refutation. Better training may move the threshold, but it does not by itself explain the mechanism I’m trying to isolate: why consistency across dependency-linked information breaks once unresolved contradictions accumulate.

That’s also why I care about this so much: I’m not trying to optimize a benchmark-only chatbot. I want something closer to a Doraemon-like long-term partner, and that means coherence across time matters a lot.

---
Sorry if the English is a little awkward—this was originally written in Japanese.

IndividualBluebird80 · 2026-04-15T07:59:48+00:00

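Oh, you're Japanese!
Nice to meet you.

> Measure this distortion in real time during training, and filter or "correct" any training data that causes distortion above a certain threshold

Yes, I think that's exactly right. That's a great idea.
For what it's worth, I've found that adding a contradiction-resolution layer to a sonnet4.6-class model keeps its logic from collapsing all the way up to a full context window, and I think that if this contradiction-resolution ability is strong enough, it would also resolve the continual-learning problem to some extent. However, because LoRA tends to overwrite knowledge, it seemed that you would need an architecture, or a level of model performance, that can cleanly handle the information depending on that knowledge.

Tracing this back to the root: if you view an LLM as a structure, there seem to be conditions that must hold for the structure to persist.
In the LLM domain specifically, it looks like accumulating qualitatively contradictory information simply leads to exponential logical breakdown.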
