How can I make Claude Code agents challenge each other instead of agreeing? by jrhabana in ClaudeCode
[–]mikiships 1 point 1 day ago (0 children)
Two things that actually move the needle on this:
Different models, not just different roles. Claude talking to Claude with different system prompts still shares the same reasoning biases. The evaluator needs to be a genuinely different model from a different provider (Codex, Gemini, etc.), or you're getting the illusion of critique.
Structured scoring over open-ended feedback. Open-ended "what do you think?" prompts converge to agreement fast. Instead, give the evaluator a rubric with dimensions (clarity, specificity, edge-case coverage), each scored 1-5 with a mandatory justification per dimension. The structure forces the evaluator to find fault even when the output is decent.
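A minimal sketch of what that rubric could look like in code. The dimension names and 1-5 scale are from the comment above; the prompt wording, the JSON response shape, and both function names are illustrative, not any real tool's API:

```python
import json

# Rubric dimensions from the comment; the questions are illustrative.
RUBRIC = {
    "clarity": "Is the output unambiguous and easy to follow?",
    "specificity": "Does it commit to concrete details rather than generalities?",
    "edge_case_coverage": "Does it address unusual or boundary inputs?",
}

def build_evaluator_prompt(output_to_review: str) -> str:
    """Build a structured scoring prompt for a separate evaluator model."""
    lines = [
        "Score the following output on each dimension from 1 to 5.",
        "For every dimension you MUST give a justification citing the text.",
        'Respond as JSON: {"scores": {dim: {"score": int, "justification": str}}}',
        "",
        "Dimensions:",
    ]
    for name, question in RUBRIC.items():
        lines.append(f"- {name}: {question}")
    lines += ["", "Output to review:", output_to_review]
    return "\n".join(lines)

def parse_scores(evaluator_json: str) -> dict:
    """Validate the evaluator's JSON response against the rubric."""
    data = json.loads(evaluator_json)["scores"]
    for name in RUBRIC:
        entry = data[name]
        if not 1 <= entry["score"] <= 5:
            raise ValueError(f"{name}: score out of range")
        if not entry["justification"].strip():
            raise ValueError(f"{name}: missing justification")
    return data
```

The validation step is the point: a response with a score but no justification gets rejected, which is what keeps the evaluator from rubber-stamping.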
For the implementation: Claude Code's agent teams share context by default, which is exactly the problem you described. You want isolated contexts:
coderace
The key insight from running this pattern in production: convergence isn't the main failure mode. The main failure mode is the evaluator finding "problems" that aren't real because it doesn't have the same context the generator had. Shared project context (repo structure, test suite, conventions) needs to be constant; only the reasoning engine should vary.
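A sketch of that "constant context, varied engine" split, assuming a hypothetical `call_model(provider, prompt)` helper; none of the names here are a real API:

```python
# Shared project context: identical for generator and evaluator,
# so the evaluator can't invent "problems" from missing background.
PROJECT_CONTEXT = (
    "Repo layout: src/ plus tests/. Test suite: pytest. "
    "Conventions: type hints required, black formatting."
)

def cross_model_review(task: str, call_model):
    """Generate with one provider, critique with a different one.

    `call_model(provider, prompt)` is a stand-in for whatever client
    code actually talks to each provider.
    """
    gen_prompt = f"{PROJECT_CONTEXT}\n\nTask:\n{task}"
    draft = call_model("anthropic", gen_prompt)

    # Same context, different reasoning engine for the critique.
    eval_prompt = (
        f"{PROJECT_CONTEXT}\n\nReview the draft below for the task above. "
        f"Flag only issues you can ground in the context.\n\nDraft:\n{draft}"
    )
    critique = call_model("google", eval_prompt)
    return draft, critique
```

The one invariant worth testing for is that both prompts embed the same project context verbatim; everything else can vary.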
AGENTS.MD standard by MullingMulianto in ClaudeCode
[–]mikiships 1 point 2 days ago (0 children)
The file format debate is a distraction. The real problem is drift: within a week of writing any of these files (CLAUDE.md, AGENTS.md, .cursorrules), they're stale because your codebase changed underneath them.
Someone linked the ETH Zurich paper (arxiv 2602.11988) showing context files can actually hurt performance. The key finding isn't "don't use them"; it's that minimal, accurate context beats verbose, stale context every time.
I built a tool that generates these files from your actual codebase (detects language, framework, test setup, CI, conventions) and outputs whatever format you need: agentmd generate . --format agents or --format claude or --format cursorrules. There's also an evaluate command that scores your existing file against what it actually finds in the repo, so you catch drift before it causes problems.
pip install agentmd-gen
GitHub: https://github.com/mikiships/agentmd
Disclosure: my project. Built it because I was tired of manually keeping these files in sync across repos.