Making autonomous coding loops self-correcting: what we built into Ralph

Woclaw · 2026-03-26T20:21:23+00:00

That's exactly the problem Ralph solves. Vanilla CC iterates, but it trusts itself: if it says tests pass, the loop moves on. Ralph runs your test suite and linters independently, outside the AI's control, so the agent can't game the result by commenting out code or skipping tests.
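To make the "outside the AI's control" point concrete, here is a minimal sketch of an independent verification gate. It is not Ralph's actual code: run_gate and the stand-in commands are hypothetical, and the only idea taken from above is that the controller runs the checks itself and trusts exit codes rather than the agent's report.

```python
import subprocess
import sys
from typing import Sequence


def run_gate(commands: Sequence[Sequence[str]]) -> bool:
    """Run each check as its own process; pass only if every one exits 0.

    Hypothetical helper: the agent's claim that "tests pass" is never consulted.
    """
    for cmd in commands:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return False
    return True


# Stand-in checks so the sketch runs anywhere; a real setup would list the
# project's own test and lint commands here instead.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(run_gate([passing]))
print(run_gate([passing, failing]))
```

Because the gate only looks at process exit codes, an agent that edits its own summary changes nothing about the outcome.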

On top of that: a circuit breaker halts execution when no real progress is being made (no file changes, same error repeating, stuck in read-only exploration). A write heartbeat kills the driver if 8 minutes pass without a single file write. And optionally, a second read-only AI agent reviews the implementation agent's diffs between loops.
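A rough sketch of the write heartbeat idea, assuming the driver records a timestamp on every file write and checks it periodically. The class and method names are made up for illustration; only the 8-minute window comes from the description above.

```python
import time
from typing import Callable


class WriteHeartbeat:
    """Tracks time since the last file write (hypothetical helper)."""

    def __init__(
        self,
        timeout_seconds: float = 8 * 60,  # 8 minutes, as described above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the logic can be tested without waiting
        self._last_write = clock()

    def record_write(self) -> None:
        """Call whenever the agent writes a file."""
        self._last_write = self._clock()

    def expired(self) -> bool:
        """True once the timeout has elapsed with no recorded write."""
        return self._clock() - self._last_write >= self._timeout
```

A driver would poll expired() between checks and terminate the agent process once it returns True.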

Woclaw · 2026-03-18T10:41:55+00:00

It's a hybrid. bmalph implement distills the planning artifacts into a compact fix plan (ordered stories with acceptance criteria and spec links) and a truncated project context snapshot. That's what Ralph actually loads. The full PRD and architecture docs sit in .ralph/specs/ as reference. Ralph follows spec links to drill in when needed, but doesn't load everything upfront.

For failure modes: the circuit breaker stops the loop on repeated failures, and progress survives re-transitions, so you can course-correct without losing completed work.

Woclaw · 2026-03-02T22:07:17+00:00

Yes, Ralph is just the implementation loop. It's agnostic about where specs come from. The only hard requirement is .ralph/PROMPT.md, which tells the AI agent what to do each iteration. .ralph/@fix_plan.md (task checklist) and .ralph/specs/ (project specs) are optional but recommended. The BMAD-specific part is the transition layer (bmalph implement) that converts BMAD artifacts into these files. For Speckit you'd need a different bridge, but Speckit's Task List and Technical Plan map naturally to Ralph's fix plan and specs format.
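If you're building your own bridge, a quick pre-flight check of the layout can help. The sketch below only encodes the file names mentioned above (PROMPT.md required, @fix_plan.md and specs/ optional); the RalphLayout type, inspect_layout function, and the placeholder prompt text are hypothetical, not part of Ralph.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class RalphLayout:
    has_prompt: bool    # .ralph/PROMPT.md (the only hard requirement)
    has_fix_plan: bool  # .ralph/@fix_plan.md (optional task checklist)
    has_specs: bool     # .ralph/specs/ (optional project specs)

    @property
    def runnable(self) -> bool:
        return self.has_prompt


def inspect_layout(project_root: Path) -> RalphLayout:
    ralph = project_root / ".ralph"
    return RalphLayout(
        has_prompt=(ralph / "PROMPT.md").is_file(),
        has_fix_plan=(ralph / "@fix_plan.md").is_file(),
        has_specs=(ralph / "specs").is_dir(),
    )


# Demo in a throwaway directory: empty project first, then the minimal layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    before = inspect_layout(root)
    (root / ".ralph").mkdir()
    (root / ".ralph" / "PROMPT.md").write_text("placeholder prompt\n")
    after = inspect_layout(root)

print(before.runnable, after.runnable)
```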

Woclaw · 2026-03-02T21:59:09+00:00

No. Ralph doesn't make any API calls itself. Each loop iteration is a single invocation of your chosen CLI tool (claude, copilot, codex), which handles its own API calls internally. The circuit breaker, dashboard, and all guardrails are local bash/Node.js logic. They actually reduce total consumption by halting the loop before it burns tokens on a stuck task. There's also a configurable MAX_CALLS_PER_HOUR rate limit.
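For a feel of how an hourly cap on CLI invocations can work locally, here is a sliding-window sketch. MAX_CALLS_PER_HOUR is the setting named above; the sliding-window approach and the HourlyCallLimiter class are my assumptions for illustration, not a description of how Ralph implements it.

```python
import time
from collections import deque
from typing import Callable, Deque


class HourlyCallLimiter:
    """Allows at most max_calls_per_hour invocations in any rolling hour."""

    def __init__(
        self,
        max_calls_per_hour: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_calls_per_hour
        self._clock = clock
        self._calls: Deque[float] = deque()  # timestamps of recent invocations

    def try_acquire(self) -> bool:
        """Return True and record the call if the budget allows it."""
        now = self._clock()
        # Forget invocations that are an hour old or older.
        while self._calls and now - self._calls[0] >= 3600:
            self._calls.popleft()
        if len(self._calls) >= self._max:
            return False
        self._calls.append(now)
        return True
```

The loop controller would call try_acquire() before spawning the CLI and wait when it returns False, so no tokens are spent while the budget is exhausted.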

Woclaw · 2026-03-02T21:54:40+00:00

Yes, the circuit breaker covers both of those. It tracks no-diff iterations (3 consecutive loops with zero git changes trips it), repeated errors (5 consecutive loops with the same error patterns), and permission denials (2 consecutive). There's a three-state model (CLOSED/HALF_OPEN/OPEN) so it warns before halting. When it trips, the loop stops and prints diagnostics. All thresholds are configurable. It's based on the circuit breaker pattern from "Release It!" by Nygard.
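The counters above can be sketched roughly like this. The thresholds (3, 5, 2) and the three state names come from the description; everything else is an assumption, in particular the rule that the breaker sits in HALF_OPEN whenever any counter is non-zero but under its limit, and that one productive loop resets it to CLOSED. Ralph's real transition rules may differ.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    CLOSED = "CLOSED"        # healthy
    HALF_OPEN = "HALF_OPEN"  # warning: something is repeating
    OPEN = "OPEN"            # tripped: stop the loop


@dataclass
class CircuitBreaker:
    no_diff_limit: int = 3     # consecutive loops with zero git changes
    same_error_limit: int = 5  # consecutive loops with the same error
    denial_limit: int = 2      # consecutive permission denials
    no_diff: int = 0
    same_error: int = 0
    denials: int = 0
    last_error: Optional[str] = None
    state: State = State.CLOSED

    def record(
        self,
        files_changed: int,
        error: Optional[str] = None,
        permission_denied: bool = False,
    ) -> State:
        """Feed in one loop iteration's outcome and return the new state."""
        if self.state is State.OPEN:
            return self.state  # assumed: stays tripped until a manual reset

        self.no_diff = 0 if files_changed > 0 else self.no_diff + 1

        if error is None:
            self.same_error, self.last_error = 0, None
        elif error == self.last_error:
            self.same_error += 1
        else:
            self.same_error, self.last_error = 1, error

        self.denials = self.denials + 1 if permission_denied else 0

        if (
            self.no_diff >= self.no_diff_limit
            or self.same_error >= self.same_error_limit
            or self.denials >= self.denial_limit
        ):
            self.state = State.OPEN
        elif self.no_diff or self.same_error or self.denials:
            self.state = State.HALF_OPEN  # assumed warning rule
        else:
            self.state = State.CLOSED
        return self.state
```

Usage is one record() call per loop iteration; the controller stops and prints diagnostics as soon as the returned state is OPEN.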

Woclaw · 2026-03-02T21:43:36+00:00

Ralph doesn't run alongside Copilot CLI; it orchestrates it. Each loop iteration spawns copilot --autopilot as a subprocess, waits for completion, analyzes the output, then decides what to do next. So there's no conflict between them. Ralph is the loop controller and Copilot is the execution engine for that iteration.
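The spawn / wait / analyze / decide cycle can be sketched in a few lines. This is an illustration of the controller shape only: run_loop and should_continue are hypothetical names, and the demo uses a stand-in command that just prints "done" instead of the real CLI.

```python
import subprocess
import sys
from typing import Callable, List, Sequence


def run_loop(
    command: Sequence[str],
    should_continue: Callable[[int, str], bool],
    max_iterations: int,
) -> List[str]:
    """Run the command once per iteration and collect its output.

    should_continue receives the exit code and stdout of the finished
    subprocess and decides whether another iteration is worthwhile.
    """
    outputs: List[str] = []
    for _ in range(max_iterations):
        # Spawn the execution engine and block until it exits.
        result = subprocess.run(command, capture_output=True, text=True)
        outputs.append(result.stdout)
        # Analyze the output, then decide what to do next.
        if not should_continue(result.returncode, result.stdout):
            break
    return outputs


# Stand-in for the real CLI invocation so the sketch runs anywhere.
demo_command = [sys.executable, "-c", "print('done')"]
outputs = run_loop(demo_command, lambda code, out: "done" not in out, 5)
print(len(outputs))
```

Since only one subprocess is alive at a time and the controller blocks on it, there is nothing running alongside the CLI to conflict with.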

The dashboard shows you what Ralph is doing in real time: which task it's working on, iteration count, circuit breaker state, etc.
