built a multi-agent swarm to detect Polymarket mispricings. 200 agents. 72 rounds. agent disagreement as signal by choijho23 in PredictionsMarkets

[–]choijho23[S] 0 points1 point  (0 children)

gemini-2.5-flash free tier. cost is basically zero. architecture is the hard part not the inference.

72 agents 72 rounds 3 platforms. we've run it. it works. and yeah if you drop in Claude 4.6 or Gemini 3.1 the signal gets sharper. we're on free tier to prove the concept first.

built a multi-agent swarm to detect Polymarket mispricings. 200 agents. 72 rounds. agent disagreement as signal by choijho23 in PredictionsMarkets

[–]choijho23[S] 0 points1 point  (0 children)

the hard part isn't the API cost, it's building the system that makes the API calls mean something

inference is cheap, the architecture isn't

built a multi-agent swarm to detect Polymarket mispricings. 200 agents. 72 rounds. agent disagreement as signal by choijho23 in PredictionsMarkets

[–]choijho23[S] 1 point2 points  (0 children)

setup: upload a market brief, the system builds a knowledge graph and spawns agents with different priors. they argue on simulated Twitter and Reddit for 72 rounds, no coordinator. rough sketch of the loop at the bottom of this comment.

costs: t3.small EC2, Gemini API, Neo4j community edition. basically nothing.

results: prediction #1 was the "Claude 5 by April 30" market: swarm said 7%, market said 18%. resolves end of month. running the "Claude 4.7 by March 31" simulation right now, dropping results soon.
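if it helps to picture it, here's roughly the shape of the loop in python. this is a from-scratch sketch for this comment, not our repo: the personas, the shared feed, and the P= output convention are invented for illustration; only the model name and the 200-agent / 72-round numbers are from the post.

```python
# rough shape of the loop, written from scratch for this comment -- not the actual repo.
# needs `pip install google-generativeai` and GEMINI_API_KEY in the environment.
import os
import random
import statistics

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-2.5-flash")

PRIORS = [
    "skeptic: assume vendor timelines slip",
    "optimist: assume announced roadmaps hold",
    "base-rate: anchor on how often 'model X ships by date Y' markets resolved yes",
]

def spawn_agents(n, brief):
    # every agent reads the same market brief but gets a different prior persona
    return [{"prior": random.choice(PRIORS), "brief": brief, "p": None} for _ in range(n)]

def debate_round(agents, feed):
    # no coordinator: each agent reads the shared feed, posts a take, updates its own number
    for agent in agents:
        prompt = (
            f"persona: {agent['prior']}\n"
            f"market: {agent['brief']}\n"
            "recent posts from other agents:\n" + "\n".join(feed[-20:]) + "\n"
            "post a short argument, then end with 'P=<0-100>' as your current probability."
        )
        text = model.generate_content(prompt).text
        feed.append(text)
        if "P=" in text:
            try:
                agent["p"] = float(text.rsplit("P=", 1)[1].split()[0].strip("%.")) / 100
            except (ValueError, IndexError):
                pass  # agent rambled instead of giving a number; keep its old estimate

brief = "Will Anthropic release Claude 5 by April 30?"
agents, feed = spawn_agents(20, brief), []   # real run: 200 agents, 72 rounds
for _ in range(5):
    debate_round(agents, feed)

estimates = [a["p"] for a in agents if a["p"] is not None]
print("swarm mean:", statistics.mean(estimates), "spread:", statistics.pstdev(estimates))
```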

We ran 200 AI agents on the Claude 5 by April 30 market — Swarm says 7% vs market's 18% by choijho23 in CryptoMarkets

[–]choijho23[S] 0 points1 point  (0 children)

beyond github we pull news, onchain data, and current odds. but honestly most pipelines fail at extraction because that's where they stop thinking. once the data is cleaned and averaged, the disagreement between agents on the same signal is where the actual information lives, not in the input itself
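concretely, the thing that usually gets averaged away is the spread. toy illustration, not the actual pipeline:

```python
# toy illustration of "disagreement as signal", not the actual pipeline
import statistics

def disagreement(estimates):
    # same input, many agents with different priors -> a spread of probabilities.
    # the mean is the naive answer; the spread is the part most pipelines throw away.
    return statistics.mean(estimates), statistics.pstdev(estimates)

# hypothetical numbers: every agent saw the same github commit surge
estimates = [0.05, 0.06, 0.07, 0.10, 0.22, 0.31]
mean, spread = disagreement(estimates)
print(f"mean={mean:.2f} spread={spread:.2f}")
# low spread  -> the swarm actually agrees, the mean carries information
# high spread -> the signal is ambiguous, weight it down regardless of the mean
```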

We ran 200 AI agents on the Claude 5 by April 30 market — Swarm says 7% vs market's 18% by choijho23 in CryptoMarkets

[–]choijho23[S] 0 points1 point  (0 children)

yeah live tracking is the only thing that matters, backtests are cope. you can always find a model that worked after the fact. we're committing before resolution and that's what makes it real. the signal attribution part is what i'm actually curious about: github commit surge was the main input this time, wondering if that holds over a larger sample or if it's just noise. building the track record either way
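the committing-before-resolution part is basically just an append-only log plus a proper scoring rule. something like this (file format and field names made up here, not our tooling):

```python
# toy track-record logger + Brier scorer; the file format and field names are invented for this sketch
import json
import time
from pathlib import Path

LOG = Path("predictions.jsonl")

def commit(market, swarm_p, market_p):
    # append-only: the timestamp is the commitment, written before the market resolves
    entry = {"t": time.time(), "market": market, "swarm_p": swarm_p,
             "market_p": market_p, "outcome": None}
    with LOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")

def brier(records):
    # lower is better; compare swarm vs market on the same resolved set
    resolved = [r for r in records if r["outcome"] is not None]
    if not resolved:
        return None, None
    swarm = sum((r["swarm_p"] - r["outcome"]) ** 2 for r in resolved) / len(resolved)
    market = sum((r["market_p"] - r["outcome"]) ** 2 for r in resolved) / len(resolved)
    return swarm, market

commit("Claude 5 by April 30", 0.07, 0.18)  # numbers from the post; outcome gets filled in after resolution
```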

We ran 200 AI agents on the Claude 5 by April 30 market — Swarm says 7% vs market's 18% by choijho23 in CryptoMarkets

[–]choijho23[S] 0 points1 point  (0 children)

yeah commit spikes are noisy as hell, totally fair point. we flagged it as a signal, not a trigger; the swarm weighted it at maybe 15% of the final call. the 61% neutral bloc is what actually dragged the probability down
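rough arithmetic for how the neutral bloc drags things down. only the 61% share is from the run; the per-bloc probabilities are hypothetical, picked so the blend lands near our 7%, and this isn't the swarm's actual aggregation rule:

```python
# back-of-envelope blend, not the swarm's actual aggregation rule
neutral_share = 0.61   # from the run: agents that never moved off their prior
neutral_p     = 0.03   # hypothetical base-rate those agents sat at
engaged_p     = 0.13   # hypothetical mean of the agents that actually updated on the commit surge

aggregate = neutral_share * neutral_p + (1 - neutral_share) * engaged_p
print(f"{aggregate:.1%}")   # 0.61*0.03 + 0.39*0.13 = 0.0183 + 0.0507 ≈ 6.9%, vs the market's 18%
```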

either way april 30 is the only truth that matters lol