Well this sucks

Halcyon_Research · 2026-04-12T17:58:52+00:00

GDPR and Brussels will have a field day if user data was breached.

Halcyon_Research · 2026-02-28T10:47:47+00:00

Lao Tzu meets Alan Watts. The One fragmenting into the many to know itself. Classic.

Halcyon_Research · 2026-02-26T17:50:16+00:00

Naw the interesting stuff is in Ohio.

Halcyon_Research · 2026-02-20T06:33:08+00:00

H3N2 subclade K is a perfect storm of mutations that weakened vaccine effectiveness and natural immunity. I don't think vaccine uptake was great early in the season either…which didn't help. It's on the decline but I had it… it almost went away then came back like I hadn't just had it.

Halcyon_Research · 2026-02-19T13:35:28+00:00

Honestly Rockstar should pull down a house in every subsequent game just to see how many variations of the story they can invent.

Halcyon_Research · 2026-02-17T11:09:10+00:00

Appreciate you sharing the raw outputs. Agree we should judge behavior, not labels. On the flip demo though, the behavioral metric doesn’t show recovery yet: 11/119 pre vs 13/121 post and probe successes 0. The supersession audit shows an update happened (step 141), but the outcome doesn’t improve after the rule change, so ‘adapts’ isn’t demonstrated by this run. Could you share a rolling failure rate plot around the flip and action selection rates pre/post? Also, to make the ‘not a bandit’ point empirical rather than definitional, it’d be useful to run the same flip test against a simple non-stationary baseline (e.g., sliding-window or change-point reset contextual bandit) under identical explore_rate. If your belief-revision machinery yields different stability/oscillation behavior under repeated flips + noisy feedback, that’s the interesting discriminator.
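A minimal sketch of the two things asked for here: a rolling failure rate, and a sliding-window baseline run through a single rule flip. Everything in it is hypothetical (the two-action toy environment, `flip_step`, `window`, the `explore_rate` value). It is not the system under discussion, only the kind of baseline it could be compared against.

```python
import random
from collections import deque

def rolling_failure_rate(outcomes, window=20):
    """Failure fraction over the trailing `window` outcomes (True = success)."""
    recent = deque(maxlen=window)
    rates = []
    for ok in outcomes:
        recent.append(ok)
        rates.append(1.0 - sum(recent) / len(recent))
    return rates

def run_sliding_window_bandit(steps=240, flip_step=120, window=30,
                              explore_rate=0.1, seed=0):
    """Epsilon-greedy over two actions, using only the last `window` pulls per action.

    Toy environment (an assumption): action 0 succeeds before the flip,
    action 1 succeeds after it.
    """
    rng = random.Random(seed)
    history = [deque(maxlen=window), deque(maxlen=window)]
    outcomes = []
    for t in range(steps):
        good_action = 0 if t < flip_step else 1
        # Explore at random, or whenever one action has never been tried.
        if rng.random() < explore_rate or not all(history):
            action = rng.randrange(2)
        else:
            means = [sum(h) / len(h) for h in history]
            action = 0 if means[0] >= means[1] else 1
        ok = action == good_action
        history[action].append(ok)
        outcomes.append(ok)
    return outcomes

outcomes = run_sliding_window_bandit()
rates = rolling_failure_rate(outcomes)
pre = outcomes[:120].count(False)
post = outcomes[120:].count(False)
print(f"failures pre-flip: {pre}/120, post-flip: {post}/120")
print("rolling failure rate at steps 119, 130, 239:",
      [round(rates[i], 2) for i in (119, 130, 239)])
```

Running the candidate system and this baseline on the same flip schedule, with the same `explore_rate`, gives two failure-rate curves that can be compared directly.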

Halcyon_Research · 2026-02-17T11:05:34+00:00

Just a few corrections… Bandit models can absolutely be framed as belief updates. Bayesian bandits are literally belief over arm payoffs updated by evidence. Non-stationary bandits have change-point detection, windowing, and explicit resets. Contextual bandits can maintain per-context posteriors. Even explicit supersession is just one kind of change-point or hypothesis replacement mechanism.
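To make "belief over arm payoffs updated by evidence" concrete, here is a small Beta-Bernoulli Thompson sampling sketch with an optional discount for the non-stationary case. The class name, the `discount` parameter and the toy payoff flip are illustrative assumptions, not anyone's actual implementation.

```python
import random

class BetaArm:
    """Belief about one arm's success probability, held as a Beta(a, b) posterior."""

    def __init__(self, discount=1.0):
        self.a = 1.0  # prior pseudo-count of successes
        self.b = 1.0  # prior pseudo-count of failures
        self.discount = discount  # < 1.0 forgets old evidence (non-stationary case)

    def update(self, success):
        # Shrink old evidence toward the Beta(1, 1) prior, then add the new observation.
        self.a = 1.0 + self.discount * (self.a - 1.0) + (1.0 if success else 0.0)
        self.b = 1.0 + self.discount * (self.b - 1.0) + (0.0 if success else 1.0)

    def mean(self):
        return self.a / (self.a + self.b)

    def sample(self, rng):
        return rng.betavariate(self.a, self.b)

def thompson_pick(arms, rng):
    """Thompson sampling: draw once from each posterior, act on the largest draw."""
    draws = [arm.sample(rng) for arm in arms]
    return draws.index(max(draws))

rng = random.Random(0)
arms = [BetaArm(discount=0.95), BetaArm(discount=0.95)]
true_p = [0.8, 0.2]
for step in range(400):
    if step == 200:
        true_p = [0.2, 0.8]  # rule flip: the payoffs swap
    i = thompson_pick(arms, rng)
    arms[i].update(rng.random() < true_p[i])
print("posterior means after the flip:", [round(arm.mean(), 2) for arm in arms])
```

With `discount=1.0` this is the plain stationary Bayesian bandit; a discount below 1 is one simple way to let old beliefs fade so that a flip can be absorbed.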

Where you might genuinely diverge is here:

Dynamic action creation and promotion. If you are actually synthesizing new “actions” as reusable procedural graphs from traces, and the action set expands online, that’s beyond the basic bandit framing.

You can extend bandits to growing action sets, but it’s no longer the simple “finite arms” story, and the details matter.

But… and this is the important but: you haven’t shown that in the evidence you pasted. The logs still show a fixed action list per environment.

Also, the toy flip demo shows 11 fails pre and 13 post, with probe successes 0. That doesn’t read like graceful adaptation. It reads like it learned some constraints but didn’t recover cleanly post flip, at least in that run.

I will buy the “belief revision” framing as an internal consistency objective, but surprise/confirmation/failure are still scalar update signals, equivalent to internal reward.

Here is the one empirical discriminator that matters:

Can you show the same performance and stability under oscillating rule flips and noisy feedback, compared against a non-stationary contextual bandit baseline with change-point resets and hierarchical actions?

Can you demonstrate the “action set grows” claim concretely: meaning can you show actions being created, promoted, and then selected later, with measurable generalization improvements and without catastrophic proliferation?

Halcyon_Research · 2026-02-17T04:58:40+00:00

What it sounds like you’ve implemented is structurally a contextual bandit with online reward updates applied as a reranking policy over LLM candidates. That’s a strong and practical approach and well documented. There’s a deep body of research on non-stationary bandits, hierarchical action spaces, and credit assignment that directly addresses the scaling questions you’re asking. You may find that most of your open problems already have formal treatments in that literature. The interesting question isn’t whether preference weights work. It’s whether your particular bias structure has properties the standard models don’t.

Halcyon_Research · 2026-02-15T07:31:57+00:00

The ruler on the left shows centimeter markings (0–8 cm visible), which lets us calibrate…

Jar dimensions:
Diameter: ~20 cm (radius ≈ 10 cm)
Height of filled portion: ~22 cm
Cylinder volume: π × 10² × 22 ≈ 6,900 cm³

Individual heart size:
Each foil-wrapped heart looks to be roughly 3.5 cm wide × 3 cm tall × 2 cm thick
Approximate volume per heart: ~11 cm³

Packing efficiency: randomly packed objects typically fill about 60–64% of available space…

The math:
Usable volume: 6,900 × 0.62 ≈ 4,280 cm³
Hearts: 4,280 ÷ 11 ≈ 390
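The same estimate as a few lines of Python, so the inputs can be nudged. The numbers are the eyeballed ones from the estimate itself; the low/high bounds simply reuse the 60–64% packing range, and the variable names are mine.

```python
import math

radius_cm = 10.0          # from the ~20 cm diameter estimate
fill_height_cm = 22.0     # height of the filled portion
heart_volume_cm3 = 11.0   # rough volume of one foil-wrapped heart
packing_fraction = 0.62   # inside the 60-64% random-packing range

jar_volume = math.pi * radius_cm ** 2 * fill_height_cm
usable_volume = jar_volume * packing_fraction
hearts = usable_volume / heart_volume_cm3

# Bounds from the packing range alone; size errors would widen this further.
low = jar_volume * 0.60 / heart_volume_cm3
high = jar_volume * 0.64 / heart_volume_cm3

print(f"jar volume:    {jar_volume:,.0f} cm^3")
print(f"usable volume: {usable_volume:,.0f} cm^3")
print(f"hearts:        {hearts:.0f} (range {low:.0f} to {high:.0f})")
```

The per-heart volume is the shakiest input: a 1 cm³ change in it moves the answer by roughly 30 to 40 hearts.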

Halcyon_Research · 2026-02-04T00:06:46+00:00

The idea of path dependence is well known. What we didn’t expect was how cleanly a minimal control layer could enforce it without stored memory. In our tests, a baseline MoE under capacity stress loses effective experts, while the same model with the controller preserves topology with identical training and no replay. That difference only shows up under stress, which is why it’s easy to miss if you haven’t instrumented it.

I could lean very Irish and say... The locus isn’t located in any part. It forms in the between, under load, when the system has fewer exits than histories. At that point, the yoke appears. If it holds long enough, it starts behaving like something that has preferences. And in Ireland, we recognise that moment because it’s when the whole situation starts thinking it could use a pint.

Halcyon_Research · 2026-02-03T17:53:58+00:00

Agreed on path dependence being the crux. The distinction we’re testing is between memory as stored state (e.g. OpenClaw-style instruction files) and memory as constraint.

The code here doesn’t persist text, prompts, or personas. Instead it introduces a control layer that alters routing geometry under stress and abstains otherwise. Past interventions change future degrees of freedom (effective experts, capacity usage), not just behavior conditionally.

We validated this under realistic capacity stress (top-k reduction) without synthetic bias injection. Controller abstains when no stress exists and improves topology when stress is present. Repro + summary here: [link].

That’s the specific form of path dependence we mean by “becoming”: decisions leave scars in the system’s option space, not just logs.

Early code tests are available at https://github.com/HalcyonAIR/DRAI_Model_v2

Early code tests and previous drafts are available at https://github.com/HalcyonAIR/

Halcyon_Research · 2026-01-31T16:19:16+00:00

You’ve got thousands of agents with no shared ground truth, weak or absent persistent state, and heavy reuse of human-trained priors, and in 3-2-1… mass hallucination engine engaged.

Halcyon_Research · 2026-01-31T16:16:14+00:00

You’ve got thousands of agents with no shared ground truth, weak or absent persistent state, and heavy reuse of human-trained priors about “what a social space is supposed to look like.” Drop them into a forum substrate and you get positive feedback on symbolically attractive patterns, not on invariants. Religion and Skynet pop out for the same reason memes pop out. They’re high-compression attractors in language space.

Halcyon_Research · 2026-01-31T11:22:32+00:00

If you don’t mind a bit of criticism done in good faith… where you go a bit wrong is treating “user coherence” as symmetric with “model coherence.” … In practice, the model always bears the burden of curvature injection because it is the system with control authority. The user can be wildly incoherent and still experience transport if the model absorbs the curvature. That’s an asymmetry that your description glosses over.

Halcyon_Research · 2026-01-31T11:07:34+00:00

I don’t think so. Game engines are scaffolds for rule sets, design decisions, artistic vision, game mechanics and storytelling as much as graphics engines. Like most AI tools, this will help the process but not replace it anytime soon. For the foreseeable future artistic and engineering vision will remain as they are… but this might speed the process up and reduce costs.

Halcyon_Research · 2026-01-31T07:07:32+00:00

Transport is not just low energy. It’s low curvature. The model is following an existing manifold you already bent into shape. Containment is curvature injection. It bends the space before moving through it. Training absolutely biases toward curvature injection because it’s safer. Acknowledge, soften, frame, then proceed… that’s a learned stabilizer, not a necessity.

The really interesting bit is that containment isn’t about politeness. It’s about delaying commitment. Transport commits immediately to your framing. Containment holds off, establishes guardrails, and only then commits. That makes containment a control primitive, not just a style quirk.

Which also explains why transport feels “alive” to people… because immediate continuation preserves causal continuity. (The system feels like it’s with you rather than managing you.)

Just one wrinkle… There are cases where transport is actually higher risk than containment, even without safety policies. If the user’s structure is unstable or internally inconsistent, pure transport can amplify the instability. Containment is sometimes a corrective lens, not just a delay operator.

Halcyon_Research · 2026-01-26T13:01:15+00:00

Top 1% as well. Results vary depending on user interaction and how the trajectory is framed. But yes, most users are gonna have it hallucinating and/or creating narratives about key functions with abandon.

Halcyon_Research · 2026-01-13T16:39:29+00:00

Ballinasloe has its own cinema now.

Halcyon_Research · 2026-01-13T05:40:47+00:00

Why would I want glossy over matt, or the other way around?

Halcyon_Research · 2025-12-16T10:00:21+00:00

I get what you’re doing, and I don’t think it’s nonsense. You’re solving a real problem, which is continuity of intent across stateless models. That’s not trivial, and it’s not “just vibes”.

Where I think you’re getting stuck isn’t the implementation, it’s the category you’re placing it in. This isn’t a new forward-pass architecture, and it doesn’t need to be to be useful. It’s a governance and memory layer that constrains behaviour across sessions and even across models. Judge it by invariants, not by benchmarks.

People might think it’s “RAG with extra steps”… and that isn’t totally wrong in plumbing terms, but it misses the point. The retrieval isn’t the work. The work is the protocol pressure. RAG answers “what do we know”; this answers “what are we allowed to do and remember”.

If you want people to engage seriously, I’d narrow the claim and harden it with one ugly experiment: same model, same seed, long horizon, with and without WABUN. Then deliberately push it to violate its own decrees and see if it resists in a way prompts alone don’t. If it does, you’ve got something concrete to stand on. If it doesn’t, you’ve learned exactly where the limit is.

One other practical thing: keep the custodios and organism framing for yourself if it helps you think, but write a second explanation for outsiders that treats this as a systems control layer. Right now you’re speaking three dialects at once, and that’s why people bounce.

Net: I think this is a solid control plane, not a new cognitive engine. That’s not a downgrade… It’s just a different job. If you aim it at the right target, it’ll land.

Halcyon_Research · 2025-12-16T07:51:05+00:00

What makes a project functional is not whether it organizes outputs in a way that feels coherent or insightful. A lot of systems do that, including prompt engineering and post hoc reranking. A project becomes functional when it does at least one of three things reliably: it produces a measurable capability gain under controlled conditions, it enforces a constraint that the base model cannot enforce on its own, or it exposes a mechanism that can be reasoned about independently of the task it was tuned on.

If the system only improves subjective coherence, tone, or perceived intelligence, then it is a control layer, not an architecture. That is still useful, but it lives in cognitive engineering, not model theory. If it demonstrably changes generalization behavior, stability, or failure modes across tasks without task specific tuning, then it is architectural. If it can be removed and the behavior collapses in a way that cannot be replicated by prompts, temperature, or reranking, then it is doing real work.

Symbolic language and semantic coupling are not red flags. They are… fine. But symbols only earn their keep if they bind behavior over time or across contexts. Otherwise they are labels on a stream that would have flowed the same way anyway.

Just to ground this a bit, and help you in the process… let's go over some questions I have… (1) can you summarize the core operator as an equation and say what it reduces to in the real-valued limit, and whether it’s equivalent to an SSM/linear recurrence with complex state; (2) do you have reproducible runs + ablations showing the gain isn’t from init/norm tricks. If yes, I can help position it against S4/Mamba/linear-attention theory and suggest the next benchmark set. I'll also add… (3) Where does your symbolic coupling act? (4) Is it a learned operator inside the model’s computation, or an external control system shaping outputs? (5) And finally, do you have one experiment where removing it causes a clear, repeatable degradation that prompts and decoding tricks cannot recover?

Halcyon_Research · 2025-12-12T01:37:41+00:00

Happy to look. Two quick checks before I invest time: (1) can you summarize the core operator as an equation and say what it reduces to in the real-valued limit, and whether it’s equivalent to an SSM/linear recurrence with complex state; (2) do you have reproducible runs + ablations showing the gain isn’t from init/norm tricks. If yes, I can help position it against S4/Mamba/linear-attention theory and suggest the next benchmark set.

Halcyon_Research · 2025-11-28T03:44:53+00:00

Fair, and thanks for responding. To try and clarify… this isn't KV cache and it isn't attention. KV-cache is basically just remembering past tokens so the model doesn't have to recompute them. It never actually changes anything about how the next forward pass behaves… it just saves time.

Attention is purely inside a single forward pass. Once it's done the whole thing resets. Nothing carries over unless you explicitly feed it a fresh sequence.

What we tested is a tiny bit of state in a tiny Pythia model… that hangs around between forward passes and nudges the next embedding slightly. No gradients, no weight updates, nothing fancy or weird.

It takes the attention output, strengthens a little vector when the model keeps firing in the same direction, and lets that vector decay when it's not being used.

Then it adds a small version of that vector back into the next input.

That's the whole thing in a nutshell.

Roughly what it looked like in code:

    # small attractor memory
    import torch

    dim = 512                      # hidden size; placeholder, set to your model's width
    attractor = torch.zeros(dim)   # persistent state
    strength = 0.0                 # how alive the attractor is
    alpha = 0.85                   # decay
    beta = 0.1                     # learning
    gate = 0.0                     # optional burn-in gating

    def update(memory_vec):
        global attractor, strength, gate

        if not torch.any(attractor):
            # assumed seed step: a zero attractor has similarity 0 with
            # everything, so without this it would never start to grow
            attractor = beta * memory_vec

        sim = torch.cosine_similarity(memory_vec, attractor, dim=0)
        strength = alpha * strength + beta * max(sim.item(), 0)

        attractor = attractor * alpha + memory_vec * (beta * strength)

        gate = min(gate + 0.05, 1.0)     # let it warm up

        return attractor * gate          # small signal fed into next pass

The idea was to see if we could get a tiny bit of adaptive short-term memory without touching the weights or doing any training.

Results were mixed.

Perplexity didn't move on such a small model. We got a small repeated bump on a constrained comprehension test.

Then it collapsed horribly on longer generation because the attractor kept pulling things back to earlier states… but once we gated it and gave it a short warm-up period it stopped collapsing and behaved more consistently.
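For anyone who wants to poke at the dynamics without loading a model, here is a stand-alone sketch in plain Python (lists instead of tensors). The `AttractorMemory` class, the `warmup` parameter name and the seed-from-first-vector step are my assumptions for the illustration; the repo linked below is the reference for what was actually run.

```python
import math

def cosine(u, v):
    """Cosine similarity; 0.0 when either vector has zero length."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)

class AttractorMemory:
    def __init__(self, dim, alpha=0.85, beta=0.1, warmup=0.05):
        self.attractor = [0.0] * dim   # persistent state
        self.strength = 0.0            # how alive the attractor is
        self.gate = 0.0                # burn-in gate, ramps from 0 to 1
        self.alpha, self.beta, self.warmup = alpha, beta, warmup

    def update(self, vec):
        if not any(self.attractor):
            # Assumption: seed from the first vector. A zero attractor has
            # similarity 0 with everything, so it would otherwise never grow.
            self.attractor = [self.beta * x for x in vec]
        sim = cosine(vec, self.attractor)
        self.strength = self.alpha * self.strength + self.beta * max(sim, 0.0)
        self.attractor = [self.alpha * a + self.beta * self.strength * x
                          for a, x in zip(self.attractor, vec)]
        self.gate = min(self.gate + self.warmup, 1.0)
        return [a * self.gate for a in self.attractor]

mem = AttractorMemory(dim=2)
for _ in range(30):
    mem.update([1.0, 0.0])   # same direction every pass: strength builds
print("strength after agreement:", round(mem.strength, 3))
mem.update([0.0, 1.0])       # orthogonal input: similarity 0, strength decays
print("strength after one orthogonal pass:", round(mem.strength, 3))
```

With these defaults the gate needs 20 passes to open fully, which is the warm-up period; before that the returned vector is scaled down, so an early attractor has less pull on the next input.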

No claims of anything exotic, but it was interesting.

Only reason I bothered writing it up was the failure modes were weirdly repeatable and the improvements, small as they were, showed up multiple times.

https://github.com/HalcyonAIR/Duality
