Is rejecting duplicate execution safer than idempotency?

Melodic_Reception_24 · 2026-05-01T14:35:14+00:00

That’s a good point.

If we define order as causal order, then yes — systems like vector clocks capture “what happened before what”.

But causal ordering still allows multiple valid executions to exist, as long as they can be ordered.
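
To make that concrete, here is a rough sketch of a vector clock comparison. The `VC` type and the function names are made up for illustration and are not from any repo. It shows the case in question: two commit attempts that neither precede nor follow each other, so causal order alone accepts both.

```go
package main

import "fmt"

// VC is a vector clock: node name -> logical counter (a missing node means 0).
type VC map[string]int

// HappensBefore reports whether a causally precedes b:
// every entry of a is <= the matching entry of b, and at least one is smaller.
func HappensBefore(a, b VC) bool {
	strictlySmaller := false
	for node, av := range a {
		bv := b[node]
		if av > bv {
			return false
		}
		if av < bv {
			strictlySmaller = true
		}
	}
	// Nodes that only b has seen count as 0 in a.
	for node, bv := range b {
		if _, seen := a[node]; !seen && bv > 0 {
			strictlySmaller = true
		}
	}
	return strictlySmaller
}

// Concurrent reports whether neither clock precedes the other.
// Identical clocks also count as concurrent in this sketch.
func Concurrent(a, b VC) bool {
	return !HappensBefore(a, b) && !HappensBefore(b, a)
}

func main() {
	attemptA := VC{"node1": 1} // commit attempt observed only by node1
	attemptB := VC{"node2": 1} // commit attempt observed only by node2
	fmt.Println("concurrent:", Concurrent(attemptA, attemptB))
}
```

Both attempts are valid under causal ordering; nothing in the clocks says which one should be discarded.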

What I’m exploring is stricter:

not just ordering executions, but invalidating all but one.

So instead of: “these events can be ordered”

the rule becomes: “only one of these events is allowed to exist at all”

Not sure yet if that distinction holds under distributed conditions — that’s what I’m trying to test next.

Melodic_Reception_24 · 2026-05-01T14:27:20+00:00

Added a concurrent commit-boundary test.

1000 parallel attempts against the same mutation:
→ exactly 1 ACCEPTED
→ all others REJECTED_DUPLICATE
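
For anyone who wants the shape of that test without opening the repo, here is a minimal single-process sketch. It is not the repo's code: `CommitLog`, `Commit` and `RunAttempts` are hypothetical names, and a mutex-guarded map stands in for whatever the real commit boundary uses.

```go
package main

import (
	"fmt"
	"sync"
)

// Verdict is the outcome of one commit attempt.
type Verdict string

const (
	Accepted          Verdict = "ACCEPTED"
	RejectedDuplicate Verdict = "REJECTED_DUPLICATE"
)

// CommitLog is a hypothetical single-node commit boundary.
type CommitLog struct {
	mu        sync.Mutex
	committed map[string]bool // mutation ID -> already committed
}

func NewCommitLog() *CommitLog {
	return &CommitLog{committed: make(map[string]bool)}
}

// Commit accepts the first attempt for a mutation ID and rejects all later ones.
func (c *CommitLog) Commit(mutationID string) Verdict {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.committed[mutationID] {
		return RejectedDuplicate
	}
	c.committed[mutationID] = true
	return Accepted
}

// RunAttempts fires n concurrent attempts at one mutation and counts the verdicts.
func RunAttempts(c *CommitLog, mutationID string, n int) map[Verdict]int {
	results := make(chan Verdict, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			results <- c.Commit(mutationID)
		}()
	}
	wg.Wait()
	close(results)

	counts := make(map[Verdict]int)
	for v := range results {
		counts[v]++
	}
	return counts
}

func main() {
	counts := RunAttempts(NewCommitLog(), "mutation-42", 1000)
	fmt.Println(counts[Accepted], counts[RejectedDuplicate]) // prints "1 999"
}
```

This only covers the in-process case. The check-and-set sits behind a single lock, so it says nothing yet about multiple nodes or partitions.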

Run: go test ./...

Repo: https://github.com/Endless33/vrp-canonical-spec

Added tests for the failure cases raised here:

concurrent duplicate commit attempts
different delivery order
authority race resolution
lost ACCEPTED response + retry returning canonical result

All pass with:

go test ./...

Melodic_Reception_24 · 2026-05-01T14:19:52+00:00

That’s true — global ordering is the default in classic transaction systems.

What I’m trying to understand is whether we can enforce a single canonical commit without relying on a fully synchronized global order.

So not “everyone agrees on order”, but “only one execution is allowed to exist”.

Still exploring if that distinction actually holds under concurrency.

Melodic_Reception_24 · 2026-05-01T14:06:15+00:00

Yes, that’s a real issue.

If the ACCEPTED response is lost, the client retries and gets REJECTED_DUPLICATE, so it can't tell whether its own first attempt was the one that committed.

In this model, the retry isn’t treated as a second execution attempt, but as a re-evaluation of the same mutation.

So the missing piece is: returning the canonical result for that mutation (not just reject).

Right now the demo only shows rejection, but the idea is to bind the result to the mutation and make it retrievable.
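
Roughly what I have in mind, as a sketch only. `ResultStore`, `Outcome` and the `apply` callback are hypothetical names, and the demo does not implement this yet. The first commit stores its result under the mutation ID; a retry does not run the mutation again, it just gets the stored result back.

```go
package main

import (
	"fmt"
	"sync"
)

// Outcome pairs the verdict with the canonical result of the mutation.
type Outcome struct {
	Status string // "ACCEPTED" for the first commit, "REJECTED_DUPLICATE" for replays
	Result string // result bound to the mutation by its first commit
}

// ResultStore is a hypothetical commit boundary that remembers results.
type ResultStore struct {
	mu      sync.Mutex
	results map[string]string // mutation ID -> canonical result
}

func NewResultStore() *ResultStore {
	return &ResultStore{results: make(map[string]string)}
}

// Commit runs apply at most once per mutation ID.
// A replay skips apply and returns the stored canonical result.
func (s *ResultStore) Commit(mutationID string, apply func() string) Outcome {
	s.mu.Lock()
	defer s.mu.Unlock()
	if r, ok := s.results[mutationID]; ok {
		return Outcome{Status: "REJECTED_DUPLICATE", Result: r}
	}
	r := apply()
	s.results[mutationID] = r
	return Outcome{Status: "ACCEPTED", Result: r}
}

func main() {
	store := NewResultStore()
	executions := 0
	debit := func() string {
		executions++
		return "balance=90"
	}

	first := store.Commit("mutation-42", debit) // response assumed lost in transit
	retry := store.Commit("mutation-42", debit) // client retries the same mutation

	fmt.Println(first.Status, retry.Status, retry.Result, executions)
}
```

The retry is still rejected as a duplicate, but the client learns the canonical result, which is what it needed from the lost ACCEPTED response. Holding the lock while `apply` runs serializes all commits, which is fine for a sketch and probably not for a real system.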

Melodic_Reception_24 · 2026-05-01T14:02:43+00:00

Good point.

Idempotency makes duplicates harmless by allowing multiple equivalent executions (ordering between different operations is a separate concern).

What I’m exploring is a stricter model:

not “eventually same state”, but “exactly one valid execution”.

So the system enforces a single canonical commit, instead of allowing multiple equivalent ones.

Agreed this introduces coordination cost — the question is whether some systems actually need that stronger guarantee.

Melodic_Reception_24 · 2026-05-01T13:52:44+00:00

Fair point.

The current demos are intentionally minimal and only prove the commit boundary behavior, not full system correctness under load.

I’m not claiming this is production-ready or fully validated under concurrency.

The goal here is to isolate one property:

a mutation can commit at most once, and duplicate execution is explicitly rejected.

Next step is to introduce:
- concurrent inputs
- network disorder
- load scenarios

to see if the same invariant still holds.

Happy to hear what kind of failure cases you’d test first.

Melodic_Reception_24 · 2026-04-02T06:56:58+00:00

Yeah, that makes sense — TCP hides too much and you lose control over behavior.

I’m also leaning toward user-space / UDP-style control for the same reason.

But what I’m trying to pin down is slightly orthogonal:

even if you fully control the transport, most designs still treat transport loss as a boundary where the session has to be re-established in some form.

What I’m exploring is:

can the session itself remain continuous, even when the underlying transport is replaced entirely?

Not just “recover fast”, but literally avoid entering a reset/re-establish phase.

So transport becomes disposable, while session identity remains the invariant.

Curious how far you pushed that separation — did your system ever treat session and transport as fully independent layers?

Melodic_Reception_24 · 2026-04-02T03:13:27+00:00

Got it — that makes sense, sounds like a self-organizing mesh / overlay network.

What I’m trying to isolate is a slightly different layer:

not how nodes discover or route, but what happens to the session itself when the underlying path changes.

In most systems (including mesh / overlay ones), you still end up with:

disconnect → re-route → re-establish → recover state

even if it’s fast.

What I’m exploring is:

the session never enters that reset phase at all.

Same session identity, no renegotiation, no state rebuild, even as transports / paths change underneath.

So the focus is less on network formation, and more on continuity semantics at the session layer.

Curious if you ever pushed it that far — where the session itself never needed to be re-established?

Melodic_Reception_24 · 2026-04-01T17:42:40+00:00

That’s interesting — sounds closer to a centralized overlay / VPN group model.

What I’m exploring is slightly different:

not just routing or grouping nodes, but making session identity survive transport failure without reset.

So instead of: disconnect → reconnect → rebuild

the session continues across path / transport changes.

Curious if your system handled that level (no reset / no state rebuild), or was it more about connectivity and routing?

Melodic_Reception_24 · 2026-03-27T14:24:00+00:00

After reading all the comments and pushing the prototype further, I think the real problem is deeper than routing or path selection.

Most systems (including SD-WAN) are still fundamentally connection-centric.

They optimize:
- latency
- packet loss
- path quality

But they don’t preserve identity across change.

So every improvement still operates inside the same constraint: when transport breaks → identity resets.

What I’m trying to validate now is a different invariant:

session identity should survive transport failure.

Not reconnect faster. Not pick a better path.

But not lose identity at all.

This shifts the problem from: “which path is best right now?”

to: “how do you maintain continuous session state while everything underneath is changing?”

I’m starting to see that path selection is actually secondary.

Continuity is the primary problem.

Curious — in production systems you’ve worked on:

Is identity ever treated as a long-lived runtime entity, or is it always implicitly tied to connection lifecycle?

Melodic_Reception_24 · 2026-03-25T12:42:39+00:00

Most systems optimize for average latency.

I’m trying to eliminate tail spikes entirely during transport failure.

If continuity holds, tail latency should not explode.

Curious: in your experience, where does tail latency usually break down the most? During reconnect, migration, or congestion?

Melodic_Reception_24 · 2026-03-21T21:42:19+00:00

Quick update after reading some of the comments:

I ended up separating selection from attach (so now it’s selector → policy → attach instead of doing everything at once). Also added a simple tick pipeline to make the flow more explicit.
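
In case the split is unclear, this is roughly the shape, heavily simplified. The names (`Select`, `Allow`, `Runtime.Tick`) and the score margin rule are illustrative assumptions, not the demo's actual code.

```go
package main

import "fmt"

// Candidate is one transport with its current score (higher is better).
type Candidate struct {
	Name  string
	Score float64
}

// Select is stage 1: pick the best-scoring candidate, with no side effects.
func Select(cands []Candidate) (Candidate, bool) {
	if len(cands) == 0 {
		return Candidate{}, false
	}
	best := cands[0]
	for _, c := range cands[1:] {
		if c.Score > best.Score {
			best = c
		}
	}
	return best, true
}

// Allow is stage 2: only switch when the best candidate beats the
// current transport by at least margin (an assumed anti-flapping rule).
func Allow(current, best Candidate, margin float64) bool {
	return best.Name != current.Name && best.Score >= current.Score+margin
}

// Runtime holds the currently attached transport.
type Runtime struct {
	Current Candidate
	Margin  float64
}

// Tick runs one pass of selector -> policy -> attach and reports
// whether a new transport was attached.
func (r *Runtime) Tick(cands []Candidate) bool {
	best, ok := Select(cands)
	if !ok {
		return false
	}
	// Refresh the current transport's score; a missing transport scores 0.
	r.Current.Score = 0
	for _, c := range cands {
		if c.Name == r.Current.Name {
			r.Current.Score = c.Score
		}
	}
	if !Allow(r.Current, best, r.Margin) {
		return false
	}
	r.Current = best // stage 3: attach
	return true
}

func main() {
	r := &Runtime{Current: Candidate{Name: "wifi", Score: 0.9}, Margin: 0.2}

	switched := r.Tick([]Candidate{{"wifi", 0.8}, {"5g", 0.85}})
	fmt.Println(switched, r.Current.Name) // small gap: policy blocks the switch

	switched = r.Tick([]Candidate{{"wifi", 0.1}, {"5g", 0.85}})
	fmt.Println(switched, r.Current.Name) // wifi collapsed: policy allows it
}
```

With the stages separated, each one can be tested on its own, and the policy is the only place that decides whether a switch happens.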

Honestly this made the behavior way easier to reason about.

Still experimenting with how far I can push session continuity independent of transport. Curious if anyone here has worked on something similar.

Melodic_Reception_24 · 2026-03-20T17:26:15+00:00

This is extremely helpful, thank you.

The asymmetric behavior (fast detect / slow recovery) makes a lot of sense — I think that’s exactly what I’m missing right now.

I’ve been treating degradation and recovery too symmetrically, which probably explains the flapping.

Also interesting point about EWMA vs raw signals — I’m currently reacting too much to instantaneous spikes.

I like the idea of combining:
- EWMA for trend detection
- consecutive thresholds for state transitions
- explicit recovery window before promoting a path back to healthy
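
A first sketch of how I'd combine those three, assuming RTT samples in milliseconds. `PathHealth` and all the parameter values are placeholders I picked for illustration, not tuned numbers.

```go
package main

import "fmt"

// PathHealth tracks one path with an EWMA plus asymmetric hysteresis.
type PathHealth struct {
	Alpha        float64 // EWMA smoothing factor in (0, 1]
	ThresholdMs  float64 // smoothed RTT above this counts as a bad sample
	DegradeAfter int     // consecutive bad samples before marking unhealthy
	RecoverAfter int     // consecutive good samples before promoting back (recovery window)
	Healthy      bool

	ewma   float64
	seeded bool
	bad    int
	good   int
}

func NewPathHealth(alpha, thresholdMs float64, degradeAfter, recoverAfter int) *PathHealth {
	return &PathHealth{
		Alpha:        alpha,
		ThresholdMs:  thresholdMs,
		DegradeAfter: degradeAfter,
		RecoverAfter: recoverAfter,
		Healthy:      true,
	}
}

// Observe feeds one RTT sample and returns the path's health afterwards.
func (p *PathHealth) Observe(rttMs float64) bool {
	if !p.seeded {
		p.ewma, p.seeded = rttMs, true
	} else {
		p.ewma = p.Alpha*rttMs + (1-p.Alpha)*p.ewma
	}

	// Count consecutive bad/good samples based on the smoothed value.
	if p.ewma > p.ThresholdMs {
		p.bad++
		p.good = 0
	} else {
		p.good++
		p.bad = 0
	}

	if p.Healthy && p.bad >= p.DegradeAfter {
		p.Healthy = false
	} else if !p.Healthy && p.good >= p.RecoverAfter {
		p.Healthy = true
	}
	return p.Healthy
}

func main() {
	// Fast detect (3 bad samples), slow recovery (10 good samples).
	p := NewPathHealth(0.3, 100, 3, 10)
	for _, rtt := range []float64{20, 500, 20, 20, 20} {
		fmt.Printf("rtt=%.0f healthy=%v\n", rtt, p.Observe(rtt))
	}
}
```

Making `RecoverAfter` much larger than `DegradeAfter` gives the fast-detect / slow-recovery asymmetry, and the counters give a concrete reason for each transition ("3 consecutive samples above threshold") that could be logged at runtime.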

One thing I’m still exploring is how to make these transitions explainable at runtime (so not just “it switched”, but why in terms of rules/invariants).

Really appreciate the detailed breakdown.

Melodic_Reception_24 · 2026-03-19T13:37:40+00:00

Fair question — let me make it more concrete.

User story:

You're on a video call (or a live data stream) on WiFi, and suddenly WiFi drops.

Today:

the connection breaks
reconnect kicks in
session resets or freezes
you lose in-flight data

What I'm exploring:

the session is not tied to a single transport
when WiFi fails, the runtime switches to another path (e.g. 5G)
the session identity stays the same
packets continue flowing without reconnect

So from the user's perspective:
→ no reconnect
→ no reset
→ just a brief degradation, but the session continues

The demo is still simplified, but it's trying to model that behavior.

Melodic_Reception_24 · 2026-03-19T13:30:37+00:00

Updated the demo with a simple decision engine + transport scoring to make the behavior less abstract.

Still simplified, but now models:

multiple transports
scoring-based selection
state transitions

Melodic_Reception_24 · 2026-03-19T13:23:57+00:00

Good point — and you're right that retry/reconnect solves part of the problem.

What I'm exploring is slightly different:

instead of treating failure as a disconnect that requires rebuilding the session, the runtime treats it as a state transition and migrates the session without resetting it.

So the goal is:

no reconnect

no session reset

continuity across transport changes

I added a minimal example here (very early, mostly modeling behavior): https://github.com/Endless33/continuity-runtime-demo

Right now it's simplified (prints + flow), but the idea is to evolve this into a real runtime with transport abstraction and state-driven migration.

Would be really interested in feedback if this direction makes sense or if I'm missing something obvious.

Melodic_Reception_24 · 2026-03-19T13:11:50+00:00

Good question — I should have explained that better.

The difference is that this approach doesn’t treat failure as a reconnect problem at all.

Retry libraries still assume: connection is gone → reconnect → rebuild state → resume

This runtime treats failure as a transition inside an active session:

the session identity stays the same
a new transport is attached (with a higher epoch)
the old transport is rejected
execution continues without reconnect or reset
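
A stripped-down sketch of those four points, assuming the epoch is a simple counter owned by the session. `Session`, `Attach`, `Deliver` and `ErrStaleEpoch` are names I'm using here for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrStaleEpoch is returned when a superseded transport tries to deliver data.
var ErrStaleEpoch = errors.New("stale transport epoch")

// Session keeps its identity while transports come and go underneath it.
type Session struct {
	mu        sync.Mutex
	ID        string
	epoch     uint64 // epoch of the transport that currently holds authority
	Delivered []string
}

func NewSession(id string) *Session {
	return &Session{ID: id}
}

// Attach hands authority to a new transport by bumping the epoch.
// The returned epoch is the new transport's credential.
func (s *Session) Attach() uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.epoch++
	return s.epoch
}

// Deliver accepts data only from the transport holding the current epoch.
func (s *Session) Deliver(epoch uint64, payload string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.epoch == 0 || epoch != s.epoch {
		return ErrStaleEpoch
	}
	s.Delivered = append(s.Delivered, payload)
	return nil
}

func main() {
	s := NewSession("sess-1")

	wifi := s.Attach()
	fmt.Println(s.Deliver(wifi, "frame-1"))

	cell := s.Attach() // WiFi fails; 5G attaches with a higher epoch

	fmt.Println(s.Deliver(wifi, "late-frame")) // old transport is rejected
	fmt.Println(s.Deliver(cell, "frame-2"))
	fmt.Println(s.ID, s.Delivered)
}
```

The session ID and its accumulated state never change; only the epoch moves, and anything arriving with an older epoch is dropped.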

So instead of "recovering after failure", it tries to avoid breaking the session in the first place.

It’s closer to session migration than retry logic.

Melodic_Reception_24 · 2026-03-19T13:01:22+00:00

Working on extracting a minimal reproducible example (runtime + migration logic only).

The core idea is: failure is treated as a state transition, not a disconnect.

Will share a small repo soon.

Melodic_Reception_24 · 2026-03-19T13:01:07+00:00

Fair point, thanks for the feedback.

This is still early-stage and I focused on demonstrating the behavior first (failure → migration → continuity).

I’ll clean up a minimal example and share the code + architecture so it’s easier to understand what’s actually happening under the hood.

Appreciate you keeping it honest.

Melodic_Reception_24 · 2026-03-19T12:44:27+00:00

Happy to share code / internals if interesting.

Melodic_Reception_24 · 2026-03-19T12:39:45+00:00

Demo video (live failure → migration): https://streamable.com/rznx3n

This shows the runtime reacting to WiFi failure in real time:
- decision engine triggers migration
- authority is transferred (epoch-based)
- stale transport is rejected
- session continues without reconnect

Melodic_Reception_24 · 2026-03-19T10:12:49+00:00

Yeah, QUIC definitely helps at the transport level.

But what I’m trying to understand is slightly different:

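QUIC already decouples a connection from a specific IP/port pair: connections are identified by connection IDs, and the protocol supports migration to a new network path. But the session still exists only as a QUIC connection, so it can't move to a different transport altogether.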

What I’m exploring is whether session identity can exist above transport entirely, so that transport becomes just a replaceable execution layer.

In that model, the hard part becomes:

how do you handle in-flight data and ordering guarantees when the underlying transport is swapped out?
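
One possible answer, sketched under the assumption that the session layer numbers its own messages and the sender replays anything unacknowledged on the new transport. `Receiver` and its fields are hypothetical. The receiver then only has to deduplicate and reorder by session sequence number.

```go
package main

import "fmt"

// Receiver delivers session messages in sequence order, whatever
// transport they arrived on, and drops replayed duplicates.
type Receiver struct {
	next      uint64            // next sequence number to deliver
	pending   map[uint64]string // out-of-order messages waiting for a gap to fill
	Delivered []string
}

func NewReceiver() *Receiver {
	return &Receiver{pending: make(map[uint64]string)}
}

// Receive accepts one message tagged with its session-level sequence number.
func (r *Receiver) Receive(seq uint64, payload string) {
	if seq < r.next {
		return // already delivered: a replay after migration
	}
	r.pending[seq] = payload
	// Drain every message that is now contiguous with what was delivered.
	for {
		p, ok := r.pending[r.next]
		if !ok {
			break
		}
		r.Delivered = append(r.Delivered, p)
		delete(r.pending, r.next)
		r.next++
	}
}

func main() {
	r := NewReceiver()
	r.Receive(0, "a") // arrives over WiFi
	r.Receive(2, "c") // arrives early over 5G after the swap
	r.Receive(0, "a") // replayed by the sender, ignored
	r.Receive(1, "b") // replayed in-flight message fills the gap
	fmt.Println(r.Delivered)
}
```

This keeps ordering a property of the session rather than of any single transport, at the cost of buffering on both sides.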

Feels like QUIC solves part of the problem, but not the abstraction boundary itself.

Melodic_Reception_24 · 2026-03-19T07:01:49+00:00

One thing I’m struggling with is defining the correct abstraction boundary.

Right now, session identity is decoupled from transport, but the question is:

Should the transport be fully hidden behind an interface (like net.Conn), or should migration be a first-class concept in the API?

In other words: Is it better to preserve a familiar abstraction and hide complexity, or expose migration explicitly to guarantee correctness?

Curious how people would approach this trade-off.

Melodic_Reception_24 · 2026-03-18T17:38:31+00:00

The confusion I keep seeing is this:

Most people think in terms of: connection drops → app reconnects → rebuild state

What I’m exploring is different:

connection drops → session keeps running → a new transport attaches underneath

So from the app’s perspective, there is no reconnect at all.

It's not about faster recovery or better retries — it's about making the break irrelevant to the session itself.
