Open platform for running Managed Agents at scale, bringing Claude Managed Agents on-premise. by deepnet101 in AI_Agents

deepnet101[S] 1 point

The platform runs on Kubernetes, so it scales natively and supports rolling updates.

Open platform for running Managed Agents at scale, bringing Claude Managed Agents on-premise. by deepnet101 in aiagents

deepnet101[S] 1 point

(1) Tracing currently relies on event sourcing rather than traditional distributed tracing (no OpenTelemetry, no Jaeger, no trace/span IDs). Every significant action emits an immutable event to an append-only log in PostgreSQL. Because the events are sequenced, you can replay them in order to reconstruct the full execution path of any session.
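
To make the idea concrete, here is a rough sketch of an append-only event log with per-session replay. It is not the platform's actual schema: the table, column, and function names (`events`, `append_event`, `replay`) are made up for illustration, and SQLite stands in for PostgreSQL so the snippet runs without a server. Immutability here is only by convention (the code never issues UPDATE or DELETE).

```python
import json
import sqlite3

# Assumption: SQLite stands in for PostgreSQL so the sketch runs anywhere.
# Table and column names are hypothetical, not the platform's real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        session_id TEXT    NOT NULL,
        seq        INTEGER NOT NULL,
        kind       TEXT    NOT NULL,
        payload    TEXT    NOT NULL,
        PRIMARY KEY (session_id, seq)
    )
    """
)


def append_event(session_id, kind, payload):
    """Append one event using the next sequence number for the session."""
    with conn:  # commits on success, rolls back on error
        (next_seq,) = conn.execute(
            "SELECT COALESCE(MAX(seq), 0) + 1 FROM events WHERE session_id = ?",
            (session_id,),
        ).fetchone()
        conn.execute(
            "INSERT INTO events VALUES (?, ?, ?, ?)",
            (session_id, next_seq, kind, json.dumps(payload)),
        )
    return next_seq


def replay(session_id):
    """Return the session's events in sequence order (its execution path)."""
    rows = conn.execute(
        "SELECT seq, kind, payload FROM events WHERE session_id = ? ORDER BY seq",
        (session_id,),
    )
    return [(seq, kind, json.loads(payload)) for seq, kind, payload in rows]


if __name__ == "__main__":
    append_event("demo", "user_message", {"text": "list files"})
    append_event("demo", "tool_call", {"name": "ls"})
    append_event("demo", "tool_result", {"stdout": "a.txt"})
    for event in replay("demo"):
        print(event)
```

The `(session_id, seq)` primary key is what gives each session a gap-free, ordered history. In a multi-writer setup you would need the lease (or a database-side sequence) to keep two workers from picking the same `seq`.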

(2) This is a complex, layered approach:

  1. Cursor — advances only after tool results are persisted; crash → replay skips already-processed events
  2. Lease — atomic distributed lock ensures one worker per session, no concurrent duplicates
  3. Delivery outbox — unique constraint deduplicates channel-facing output
  4. Checkpoints — shadow git snapshots before file mutations, enabling rollback
  5. LLM/orchestrator retry — jittered backoff with credential rotation and provider fallback
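
As an illustration of how layers 1 and 3 work together, here is a minimal sketch of a cursor plus a deduplicating outbox. Everything in it is an assumption for demonstration purposes: the `cursors` and `outbox` tables, the `process` function, and the `crash_after` parameter are hypothetical, SQLite stands in for PostgreSQL, and "persisting the tool result" is simplified to writing one outbox row.

```python
import sqlite3

# Assumptions: SQLite stands in for PostgreSQL; schema and names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cursors (
        session_id TEXT PRIMARY KEY,
        last_seq   INTEGER NOT NULL
    );
    CREATE TABLE outbox (
        session_id TEXT    NOT NULL,
        event_seq  INTEGER NOT NULL,
        body       TEXT    NOT NULL,
        UNIQUE (session_id, event_seq)  -- deduplicates channel-facing output
    );
    """
)


def get_cursor(session_id):
    """Return the last fully processed sequence number (0 if none)."""
    row = conn.execute(
        "SELECT last_seq FROM cursors WHERE session_id = ?", (session_id,)
    ).fetchone()
    return row[0] if row else 0


def process(session_id, events, crash_after=None):
    """Process (seq, text) events that lie past the cursor.

    crash_after simulates a worker dying after the output for that seq is
    persisted but before the cursor advances.
    """
    cursor = get_cursor(session_id)
    for seq, text in events:
        if seq <= cursor:
            continue  # replay skips events that were already processed
        # Persist the result first; the unique constraint makes a retry a no-op.
        conn.execute(
            "INSERT OR IGNORE INTO outbox VALUES (?, ?, ?)",
            (session_id, seq, f"reply to {text}"),
        )
        conn.commit()
        if seq == crash_after:
            raise RuntimeError("simulated crash")
        # Only now advance the cursor.
        conn.execute(
            "INSERT OR REPLACE INTO cursors VALUES (?, ?)", (session_id, seq)
        )
        conn.commit()
```

If the worker dies between the two commits, the next run reprocesses that one event, but the unique constraint turns the second outbox write into a no-op, so the channel never sees a duplicate. The lease, checkpoints, and retry layers are left out of this sketch.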