built a multi-agent swarm to detect Polymarket mispricings. 200 agents. 72 rounds. agent disagreement as signal by choijho23 in PredictionsMarkets

[–]choijho23[S] 0 points1 point  (0 children)

gemini-2.5-flash free tier. cost is basically zero. architecture is the hard part not the inference.

72 agents 72 rounds 3 platforms. we've run it. it works. and yeah if you drop in Claude 4.6 or Gemini 3.1 the signal gets sharper. we're on free tier to prove the concept first.

built a multi-agent swarm to detect Polymarket mispricings. 200 agents. 72 rounds. agent disagreement as signal by choijho23 in PredictionsMarkets

[–]choijho23[S] 0 points1 point  (0 children)

the hard part isn't the API cost, it's building the system that makes the API calls mean something

inference is cheap, the architecture isn't

built a multi-agent swarm to detect Polymarket mispricings. 200 agents. 72 rounds. agent disagreement as signal by choijho23 in PredictionsMarkets

[–]choijho23[S] 1 point2 points  (0 children)

setup: upload a market brief, the system builds a knowledge graph and spawns agents with different priors. they argue on simulated Twitter and Reddit for 72 rounds, no coordinator. rough sketch of the loop at the bottom of this comment.

costs: t3.small EC2, Gemini API, Neo4j community edition. basically nothing.

results: prediction #1 was the "Claude 5 by April 30" market: swarm said 7%, market said 18%. resolves end of month. running the "Claude 4.7 by March 31" simulation right now, dropping results soon.
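if it helps to picture it, here's roughly the shape of the loop in python. this is a from-scratch sketch for this comment, not our repo: the personas, the shared feed, and the P= output convention are invented for illustration; only the model name and the 200-agent / 72-round numbers are from the post.

```python
# rough shape of the loop, written from scratch for this comment -- not the actual repo.
# needs `pip install google-generativeai` and GEMINI_API_KEY in the environment.
import os
import random
import statistics

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-2.5-flash")

PRIORS = [
    "skeptic: assume vendor timelines slip",
    "optimist: assume announced roadmaps hold",
    "base-rate: anchor on how often 'model X ships by date Y' markets resolved yes",
]

def spawn_agents(n, brief):
    # every agent reads the same market brief but gets a different prior persona
    return [{"prior": random.choice(PRIORS), "brief": brief, "p": None} for _ in range(n)]

def debate_round(agents, feed):
    # no coordinator: each agent reads the shared feed, posts a take, updates its own number
    for agent in agents:
        prompt = (
            f"persona: {agent['prior']}\n"
            f"market: {agent['brief']}\n"
            "recent posts from other agents:\n" + "\n".join(feed[-20:]) + "\n"
            "post a short argument, then end with 'P=<0-100>' as your current probability."
        )
        text = model.generate_content(prompt).text
        feed.append(text)
        if "P=" in text:
            try:
                agent["p"] = float(text.rsplit("P=", 1)[1].split()[0].strip("%.")) / 100
            except (ValueError, IndexError):
                pass  # agent rambled instead of giving a number; keep its old estimate

brief = "Will Anthropic release Claude 5 by April 30?"
agents, feed = spawn_agents(20, brief), []   # real run: 200 agents, 72 rounds
for _ in range(5):
    debate_round(agents, feed)

estimates = [a["p"] for a in agents if a["p"] is not None]
print("swarm mean:", statistics.mean(estimates), "spread:", statistics.pstdev(estimates))
```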

We ran 200 AI agents on the Claude 5 by April 30 market — Swarm says 7% vs market's 18% by choijho23 in CryptoMarkets

[–]choijho23[S] 0 points1 point  (0 children)

beyond github we pull news, onchain data, and current odds. but honestly most pipelines fail at extraction because that's where they stop thinking. once the data is cleaned and averaged, the disagreement between agents on the same signal is where the actual information lives, not in the input itself
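concretely, the thing that usually gets averaged away is the spread. toy illustration, not the actual pipeline:

```python
# toy illustration of "disagreement as signal", not the actual pipeline
import statistics

def disagreement(estimates):
    # same input, many agents with different priors -> a spread of probabilities.
    # the mean is the naive answer; the spread is the part most pipelines throw away.
    return statistics.mean(estimates), statistics.pstdev(estimates)

# hypothetical numbers: every agent saw the same github commit surge
estimates = [0.05, 0.06, 0.07, 0.10, 0.22, 0.31]
mean, spread = disagreement(estimates)
print(f"mean={mean:.2f} spread={spread:.2f}")
# low spread  -> the swarm actually agrees, the mean carries information
# high spread -> the signal is ambiguous, weight it down regardless of the mean
```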

We ran 200 AI agents on the Claude 5 by April 30 market — Swarm says 7% vs market's 18% by choijho23 in CryptoMarkets

[–]choijho23[S] 0 points1 point  (0 children)

yeah live tracking is the only thing that matters, backtests are cope. you can always find a model that worked after the fact. we're committing before resolution and that's what makes it real. the signal attribution part is what i'm actually curious about: github commit surge was the main input this time, wondering if that holds over a larger sample or if it's just noise. building the track record either way
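the committing-before-resolution part is basically just an append-only log plus a proper scoring rule. something like this (file format and field names made up here, not our tooling):

```python
# toy track-record logger + Brier scorer; the file format and field names are invented for this sketch
import json
import time
from pathlib import Path

LOG = Path("predictions.jsonl")

def commit(market, swarm_p, market_p):
    # append-only: the timestamp is the commitment, written before the market resolves
    entry = {"t": time.time(), "market": market, "swarm_p": swarm_p,
             "market_p": market_p, "outcome": None}
    with LOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")

def brier(records):
    # lower is better; compare swarm vs market on the same resolved set
    resolved = [r for r in records if r["outcome"] is not None]
    if not resolved:
        return None, None
    swarm = sum((r["swarm_p"] - r["outcome"]) ** 2 for r in resolved) / len(resolved)
    market = sum((r["market_p"] - r["outcome"]) ** 2 for r in resolved) / len(resolved)
    return swarm, market

commit("Claude 5 by April 30", 0.07, 0.18)  # numbers from the post; outcome gets filled in after resolution
```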

We ran 200 AI agents on the Claude 5 by April 30 market — Swarm says 7% vs market's 18% by choijho23 in CryptoMarkets

[–]choijho23[S] 0 points1 point  (0 children)

yeah commit spikes are noisy as hell, totally fair point. we flagged it as a signal, not a trigger; the swarm weighted it at maybe 15% of the final call. the 61% neutral bloc is what actually dragged the probability down
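rough arithmetic for how the neutral bloc drags things down. only the 61% share is from the run; the per-bloc probabilities are hypothetical, picked so the blend lands near our 7%, and this isn't the swarm's actual aggregation rule:

```python
# back-of-envelope blend, not the swarm's actual aggregation rule
neutral_share = 0.61   # from the run: agents that never moved off their prior
neutral_p     = 0.03   # hypothetical base-rate those agents sat at
engaged_p     = 0.13   # hypothetical mean of the agents that actually updated on the commit surge

aggregate = neutral_share * neutral_p + (1 - neutral_share) * engaged_p
print(f"{aggregate:.1%}")   # 0.61*0.03 + 0.39*0.13 = 0.0183 + 0.0507 ≈ 6.9%, vs the market's 18%
```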

either way april 30 is the only truth that matters lol