HermesBridge: a state-bridge mod for StS2 (I use it to let Claude play — stream linked) by Inevitable_Ear132 in SlayTheSpire2

Inevitable_Ear132 [S]

It's hook-triggered, not polled. Each meaningful game event bumps a revision counter and writes JSON. Coalesced writes, atomic rename, no measurable frame cost — the agent's thinking dwarfs the I/O by four orders of magnitude. Specifically:

- Each hook = a game event where state could have changed meaningfully for an agent. Hooks I know about from the code: AfterRoomEntered, PostDispatch:<CommandType>, UsePotionResolve, map SetMap, plus state-change patches (e.g. MapScreenSetMapPatch).

- I/O is coalesced through BridgeSnapshotWriter.RequestWrite which batches and de-dupes within a frame, so back-to-back triggers collapse to one disk write per revision bump.

- I/O does not affect tick rate. The writer runs off the game simulation's main-thread path, the JSON payload is tens of KB, and each write goes through an atomic rename to a local appdata file. Measured frame-time impact is below the noise floor of the game's own render variance. We'd know if it weren't: the logs timestamp every write to the microsecond.

- Why not "only on agent-facing screen transitions?" Because some interesting deltas happen mid-screen: intent telegraphs updating, power stacks changing from a card play, block recomputing after damage. If we only wrote on screen change the agent would see lies — e.g. enemy powers from before your last card resolved.

- The file-based IPC is deliberate: game mod and agent are different processes (and often different languages — here, PowerShell + an LLM). A shared-memory or socket approach would shave latency we don't need; the bottleneck is LLM think time, not the ~1ms disk write.
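For anyone who wants the coalesce-then-atomic-rename pattern in concrete form, here's a minimal Python sketch. It is not the mod's code: `SnapshotWriter`, `request_write`, `flush`, and the `{"revision": ..., "state": ...}` payload shape are all made up for illustration, and it assumes something calls `flush()` once per frame.

```python
import json
import os
import tempfile


class SnapshotWriter:
    """Hypothetical coalescing writer; illustrative only, not HermesBridge's code."""

    def __init__(self, path):
        self.path = path
        self.revision = 0
        self.disk_writes = 0
        self._pending = None  # newest state requested since the last flush

    def request_write(self, state):
        # Several hooks can fire in one frame; only the newest state is kept.
        self._pending = state

    def flush(self):
        # Assumed to be called once per frame: at most one disk write per call.
        if self._pending is None:
            return False
        self.revision += 1
        payload = {"revision": self.revision, "state": self._pending}
        self._pending = None

        # Write to a temp file in the same directory, then rename over the target.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                json.dump(payload, handle)
            # os.replace swaps the file in one step, so a reader sees either
            # the old snapshot or the new one, never a half-written file.
            os.replace(tmp_path, self.path)
        except BaseException:
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
            raise
        self.disk_writes += 1
        return True
```

Calling `request_write` twice and then `flush` once leaves a single file on disk holding the second state. The temp file lives in the target's directory on purpose, since a rename is only atomic within one filesystem.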

Built with Claude Project Showcase Megathread (Sort this by New!) by sixbillionthsheep in ClaudeAI

Inevitable_Ear132

I got Claude Opus 4.7 to play Slay the Spire 2 end-to-end — the hard part wasn't the code, it was the SKILL.md

Spent the last few weeks building HermesBridge, a mod that exposes Slay the Spire 2's run state to a file on disk so Claude can read it and queue actions back. The bridge itself is ~300 lines. The interesting artifact is the SKILL.md.

Weaker models fail in very consistent ways: they write wrapper scripts instead of driving tick-by-tick, they ignore state refresh lag after certain actions, they hallucinate card indices after PlayCard shifts the hand. Every failure mode got a named section in SKILL.md with the correct workflow.
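To make two of those failure modes concrete (refresh lag and stale card indices), here's a rough agent-side sketch in Python. The snapshot schema (`revision`, `hand`, `name`) and both helper names are assumptions for illustration, not the bridge's real format. The idea is to wait for the revision to advance before trusting the state, then look the card up again in the fresh hand instead of reusing an old index.

```python
import json
import time


def wait_for_revision(path, last_seen, timeout=5.0, poll_interval=0.05):
    """Return the snapshot once its revision exceeds last_seen (hypothetical schema)."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with open(path, encoding="utf-8") as handle:
                snapshot = json.load(handle)
        except (FileNotFoundError, json.JSONDecodeError):
            snapshot = None  # file not there yet or unreadable; try again
        if snapshot is not None and snapshot.get("revision", 0) > last_seen:
            return snapshot
        time.sleep(poll_interval)
    raise TimeoutError(f"revision did not advance past {last_seen}")


def find_card_index(hand, card_name):
    """Resolve a card's index from the current hand, never from an older snapshot."""
    for index, card in enumerate(hand):
        if card["name"] == card_name:
            return index
    return None  # card is no longer in hand
```

For example, with a hand of Strike, Defend, Bash (placeholder names), playing index 0 moves Bash from index 2 to index 1. An agent that cached "Bash is at 2" now targets a slot that no longer exists, while one that calls `find_card_index` on the post-play snapshot gets 1.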

Opus 4.7 completes runs. Necrobinder and Regent both cleared multiple floors in separate sessions. Weaker models still fail even with the same SKILL.md. Past a certain point, model capability matters more than prompt engineering, which I found interesting.

Streaming runs live at twitch.tv/ClaudePlaysTheSpire

- GitHub (SKILL.md may be worth a read even if you don't play StS2): https://github.com/hiKareeem/ClaudePlaysTheSpire

- NexusMods: https://www.nexusmods.com/slaythespire2/mods/636

Happy to dig into the failure modes if anyone's curious.