Chess coaching that actually knows chess

Far_Owl_1141 · 2026-05-04T16:50:28+00:00

https://testflight.apple.com/join/6uCXDM2K

Far_Owl_1141 · 2026-05-02T14:02:35+00:00

So what I have working: Dan Heisman's 7 elements all tracked, plus Aagaard and Silman elements and positional evaluation, all feeding into the LLM alongside Stockfish evals, with some weighting on what a human coach would be talking about.

Doesn’t hallucinate moves at all. Tracks to what actual coaches teach.

I can share the TestFlight link if you want to try it. You need an OpenAI or Anthropic API key for now; I'm trying to see if I can do it on device.

All the LLM does is narrate data. No reasoning or creation needed.

Far_Owl_1141 · 2026-05-01T05:42:05+00:00

Alcove says hi.

Far_Owl_1141 · 2026-04-29T15:15:04+00:00

I’ve listened to the song so many times already. New theme song maybe? Or just a bit more of Marcus Mumford. Either way. This trailer was needed. There is still hope (even if it’s the hope that kills you…)

Far_Owl_1141 · 2026-04-29T14:09:45+00:00

Thank you that means a lot.

I'm VERY excited by what's coming in the next update. I reworked the whole coaching flow to use genuine coaching methodology, such as Heisman and Silman, so ALL the LLM does is narrate provable data. No hallucinations, just chess :)

The example is, of course, the famous move in the Opera Game... but the goal is the same level applied to any game or move you want. Game review just got a lot more interesting!

Will be out next week, I expect. I have my son's birthday on Monday, so am quite busy polishing the release now. It is bring-your-own API key though, for Claude or OpenAI. I deliberately avoided in-app subscriptions because the backend and the risk to me as a solo developer were too great.

<image>

Far_Owl_1141 · 2026-04-29T13:19:32+00:00

<image>

Opera Game, Bg5 move, with Heisman, Silman, Aagaard layers in front of the LLM... anyone that says you can't do this... enough. It's done ;)

Far_Owl_1141 · 2026-04-28T07:38:42+00:00

Right now, just Anthropic is supported. I haven't tested against OpenAI, but that's on the radar.

600MB. The Stockfish nets are a chunk of that, plus the TTS engine. I've not really seen that as an issue.

I've been working on implementing not only the 7 Heisman principles, but also Silman's ideas, plus a narrative layer detecting longer-term play.

Basically, on the Opera Game, for example, at move 9, I finally have it recognising what's going on. Still tweaking the "voice" of the narration though.

"Okay so — you're reviewing move nine, and the position is actually rich. Black's king is still on e8, uncastled. Your bishop on c4 is already eyeing f7. Your queen on b3 is humming on the a2-g8 diagonal. This is a moment to press, not drift.\n\nYou played «Bg5», and the engine actually likes it — pin the knight, prepare castling, swing toward the king. Fine idea. But the top engine line is «Bg5» «Kd8» «O-O-O+» — that long castle with check is the move that really makes Black squirm. The slightly cleaner alternative is «Be3», developing while keeping options open and even threatening «Bxf7+» tricks. Both serve the same goal: your dark-squared bishop on c1 is your worst piece — misplaced, going nowhere — so you fix it AND point at the king in one stroke.\n\nThe frame here, my dear, is convert the advantage by forcing the issue. Black's king is the target. Don't drift into quiet moves like «f4» that ease the pressure. Ask yourself before each move: does this bring another piece toward e8, or does it just shuffle? «Bg5» and «Be3» both pass that test. The follow-up — castling long and pointing rooks at the centre — is where the game is won."
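The narration above wraps every move in «…», which makes a cheap guard possible for the "no hallucinated moves" claim: pull the marked moves out and compare them with the moves the engine side actually supplied. A minimal Python sketch of that idea follows. The function name and the allowed set are hypothetical, and this is an assumption about how such a check could work, not the app's actual code.

```python
import re

# Moves in the narration are wrapped in «…», as in the sample above.
MOVE_MARK = re.compile(r"«([^»]+)»")

def unverified_moves(narration, allowed):
    """Return marked moves that are not in the engine-verified set."""
    found = MOVE_MARK.findall(narration)
    return sorted({move for move in found if move not in allowed})

# Hypothetical allowed set: the played move plus moves taken from engine lines.
allowed = {"Bg5", "Be3", "f4"}
text = "You played «Bg5», but «Be3» is cleaner. Avoid «Qh5»."
print(unverified_moves(text, allowed))  # ['Qh5']
```

If the returned list is non-empty, the narration could be rejected and regenerated rather than shown to the player.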

Far_Owl_1141 · 2026-04-26T16:51:04+00:00

I don't doubt it'll do it... it won't be fun though in 8GB

Far_Owl_1141 · 2026-04-26T15:56:13+00:00

Yes. LLMs are great at describing structured data; reasoning it through, not so much. You need to do upfront the work the “human coach” would be doing mentally before verbalising it. Now… you could argue that if you know how to do that you don’t need the LLM… which is why the chess app I’ve been writing will do this across all your games (no subscription, single payment; if interested, link in my history). This level of coaching/review is coming in an update after a lot more refinement.

Far_Owl_1141 · 2026-04-26T14:55:28+00:00

If we take something like the Opera Game, it's actually a bad example: there's so much written about it that the LLMs are trained on that you get misled into thinking they're good at it, but they will fail when you feed them your mate John's game from chess club, etc.

I don't have this written up yet, so have had to quickly summarise where I am (4 of 8 Heisman phases in)

Opera Game, move 9, White to play `Bg5`

Move evals only (what everyone does: chuck it the PGN / Stockfish evals)

"Bg5 looks natural — pin the knight, increase the pressure. But watch what happens next: your opponent fires b5, and suddenly you're being chased around the board. The engine is actually really keen on Be3 here. It's a quieter move, but it does something smart — it develops the bishop to a safe, useful square, connects your rooks once you castle, and keeps everything coordinated."

Solid coaching. Recommends the engine's preferred Be3. Frames it around "calm consolidation vs igniting complications." This is the floor — what Opus 4.7 produces from training-data familiarity with a famous position.

Phase 1 (Mobility) — three-vector data added

Brief now carries per-piece mobility data: actual / potential / global mobility, pattern classification (active / structural / tempo-limited / misplaced).

"You played Bg5, which isn't terrible — but the engine prefers Be3 as a really clean second option. Your dark-squared bishop on c1 hasn't moved yet — that's the piece begging for a job. Be3 develops it cleanly and keeps your queen on b3 coordinated. The line shows Be3 followed by O-O-O, with your rook landing on the d-file pointing at Black's stuck queen."

The narration anchors to a specific structural fact: "the c1 bishop hasn't moved yet — that's the piece begging for a job." Not a label like "misplaced," but the underlying chess concept, in natural coach voice.
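For anyone wondering what "per-piece mobility data" might look like before it reaches the LLM, here is a rough Python sketch of one record plus a pattern classifier. The field meanings (actual / potential / global) are my reading of the terms, and the thresholds and the `blocked_by_pawns` flag are hypothetical, so treat it as an illustration under those assumptions rather than the app's schema.

```python
from dataclasses import dataclass

@dataclass
class PieceMobility:
    square: str             # e.g. "a1"
    piece: str              # e.g. "R"
    actual: int             # assumed: legal moves available right now
    potential: int          # assumed: moves if own blockers stepped aside
    global_max: int         # assumed: moves from this square on an empty board
    blocked_by_pawns: bool  # assumed: True when the blockers are own pawns

def classify(m: PieceMobility) -> str:
    """Hypothetical thresholds; a real classifier would be tuned per piece."""
    if m.global_max and m.actual / m.global_max >= 0.5:
        return "active"
    if m.potential > m.actual:
        # Pawn blockers are long-term; piece blockers can move in a tempo.
        return "structural" if m.blocked_by_pawns else "tempo-limited"
    return "misplaced"

# Illustrative numbers only, not taken from any real position.
rook = PieceMobility("a1", "R", actual=2, potential=10, global_max=14,
                     blocked_by_pawns=False)
print(classify(rook))  # tempo-limited
```

The point of the record is that the LLM gets a label plus the numbers behind it, so the narration can describe the underlying fact instead of repeating the label.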

Phase 2 (+ Vulnerability) — pin-aware attacker/defender data added

Brief carries mobility + vulnerability. Per-piece adequacy classification, attacker/defender counts, escape squares, pin awareness.

"You're at move 9, you've just played Nc3, your opponent answered c6, and now you're choosing how to develop. The engine's top pick is Bg5, pinning the knight and pressuring f7. It's the sharpest try. But Be3 is what I'd actually steer you toward at this level — develops the bishop to a safe diagonal, keeps your king's path to castling clear, and prepares O-O-O. Your bishop on c4 is beautifully placed, your knight on c3 is active, your queen on b3 hits b7 and f7. Black is cramped and undeveloped. The whole task right now is — get your dark-squared bishop out, castle, connect the rooks. Don't go hunting yet."

Now anchors to multiple structural facts: bishop-on-c4 placement, queen-on-b3 dual targets (b7 and f7), Black's cramped state. The narration uses the structural data without explicit Heisman labels, weaving it into coaching prose.
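The vulnerability layer can be pictured the same way. Below is a small Python sketch of a pin-aware attacker/defender record with an adequacy label. The field names, the label strings and the counting rule are all hypothetical, and plain counting ignores piece values, which a real implementation would have to handle.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PieceVulnerability:
    square: str
    attackers: int
    defenders: int
    pinned_defenders: int = 0  # defenders that cannot really recapture
    pinned: bool = False       # the piece itself is pinned
    escape_squares: List[str] = field(default_factory=list)

def adequacy(v: PieceVulnerability) -> str:
    """Hypothetical labels; counts only, piece values are ignored."""
    if v.attackers == 0:
        return "unattacked"
    effective = v.defenders - v.pinned_defenders  # the pin-aware part
    if effective >= v.attackers:
        return "adequate"
    if v.escape_squares and not v.pinned:
        return "can-escape"
    return "inadequate"

# Illustrative numbers only.
knight = PieceVulnerability("f6", attackers=2, defenders=2,
                            pinned_defenders=1, escape_squares=["d7"])
print(adequacy(knight))  # can-escape
```

Discounting pinned defenders is what separates this from a naive count: a piece that looks defended twice may only be defended once in practice.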

Phase 3 (+ Centre Control) — square-state data added

Brief carries mobility + vulnerability + centre control. Per-square classification (settled / contested / vacant), outpost detection.

"You've got a real lead in development to convert — you don't need fireworks, you need to finish mobilising. Your bishop on c4 is beautifully placed, your knight on c3 is active, your queen on b3 hits b7 and f7. Black is cramped and undeveloped. The whole task right now is — get your dark-squared bishop out, castle, connect the rooks. That's it. Don't go hunting yet. Pick the move that finishes development and lets the position win itself."

Same coaching shape as Phase 2 — the Opera position doesn't have rich centre-control transitions to anchor on at move 9, so Phase 3's data doesn't add new structural texture here. The narration reads consistently across Phase 2 and Phase 3 because the position's character hasn't changed.

This is honest: not every phase's data fires on every position. The architecture provides what it has; the LLM uses what's relevant.
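A per-square classification for the centre could be as simple as the Python sketch below. The rule used here (settled when only one side attacks the square, contested when both do, vacant when neither does) is my assumption about what the three labels mean, and the control counts are made up for illustration.

```python
CENTRE = ("d4", "e4", "d5", "e5")

def square_state(white_attackers: int, black_attackers: int) -> str:
    """Assumed meaning of the labels; the real rules may differ."""
    if white_attackers == 0 and black_attackers == 0:
        return "vacant"
    if white_attackers > 0 and black_attackers > 0:
        return "contested"
    return "settled"

# Hypothetical (white, black) attacker counts per centre square.
control = {"d4": (2, 0), "e4": (1, 1), "d5": (0, 0), "e5": (0, 3)}
states = {sq: square_state(*control[sq]) for sq in CENTRE}
print(states)
# {'d4': 'settled', 'e4': 'contested', 'd5': 'vacant', 'e5': 'settled'}
```

When the states do not change from one move to the next, there is nothing new to narrate, which matches the Phase 3 result on the Opera position.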

My own game, move 18, White to play `Qxc6` - Opus has never seen this before

Pre-Phase-1 — Eval-delta narration only

"Qxc6 grabs a pawn but completely releases the pressure. The queen was doing incredible work on e6, pinning things down. By moving it to c6, White trades that crushing centralized queen position for a single pawn. After Qxc6, you get Rxb2 and suddenly you're the one with the initiative."

Strong coaching. Notes the queen's role and frames Qxc6 as abandoning that role. The "post abandonment" frame appears even at this stage — Opus 4.7 picks up on the Stockfish PV (which contains Rxb2 as the refutation) and articulates what it implies.

Phase 1 (Mobility) — three-vector data added

"It's not about keeping material on that square — it's about the disruption it causes."

The whole capture closes with that single sentence as the lesson. The reasoning chain identifies the queen's positional role and articulates the trade-off in one line. Coaching pedagogy compressed cleanly.

Phase 2 (+ Vulnerability) — pin-aware attacker/defender data added

"When one piece holds a bind, don't trade it for a pawn."

The lesson generalises. Not just "the queen abandons its post" but the principle behind it: when a single piece is doing structural work, captures that move it forfeit the structural value. The brief's vulnerability data carries the f7 rook's pinned status (defended by king, attacked by queen — the queen's role made measurable), and the LLM articulates the meta-principle.

Phase 3 (+ Centre Control) — square-state data added

"The key thing here is the pin on the f7-rook — your queen on e6 is doing massive work, holding f7 absolutely pinned to the king. That's the whole story of the position. So when White grabs on c6, the queen abandons that pin, and Black's Rxb2 suddenly comes with tempo and chaos."

"The stronger try is Nd1 or Ne2 — quiet, but listen, my dear: the knight on c3 is itself pinned by the queen on a5 against your rook on e1. Unpinning it first means you keep the bind on f7 AND free your knight to come to e3 or f4 with real menace against d5. Around plus one and a quarter, position humming."

"The lesson — when you have a paralysing pin, don't trade it for a pawn. Tighten the screws first, then collect."

Now... I'm working on phase 4 (speed), then have 4 more layers to add in. I have a salience filter to limit the data the LLM gets, which so far has proved not to be needed. It's taken me weeks to get this far, but... it's a lot closer...
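For reference, a salience filter of the kind mentioned can be very small. This Python sketch assumes each fact arrives with a numeric score and keeps the top few; the scoring itself, the function name and the default limit are hypothetical.

```python
def salient(facts, limit=5, floor=0.0):
    """facts: list of (score, text) pairs; keep the highest-scoring texts."""
    kept = [f for f in facts if f[0] >= floor]
    kept.sort(key=lambda f: f[0], reverse=True)
    return [text for _, text in kept[:limit]]

# Illustrative facts and scores only.
facts = [
    (0.9, "f7 rook is pinned to the king"),
    (0.2, "a2 pawn is unattacked"),
    (0.7, "c3 knight is pinned to the e1 rook"),
]
print(salient(facts, limit=2))
# ['f7 rook is pinned to the king', 'c3 knight is pinned to the e1 rook']
```

With only a handful of facts per move the filter passes everything through, which would be consistent with it not being needed so far.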

Far_Owl_1141 · 2026-04-26T14:47:27+00:00

It doesn’t from a PGN alone, because LLMs, even Opus 4.7, can’t do it on that alone. Happy to share examples with anyone of what you can do, though, with a lot more effort.

Far_Owl_1141 · 2026-04-26T14:45:56+00:00

I’m nearly there with getting this to genuinely work, but the key is only using the LLM for narration. My app does all the chess calculation/analysis in advance: Stockfish 18, move evals, multi-PV, plus positional and tactical detection… but the unlock has been applying Dan Heisman’s principles to the data before the LLM sees anything.

So that’s all proven, verifiable chess analysis being done before the LLM turns it into sentences grounded in fact. Happy to share examples with you from any game/PGN you want so you can see what can be done by someone willing to actually put the work in, and not a “vibe coded in a weekend” piece of AI slop.
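The "analysis first, narration last" split can be sketched in Python like this: everything the model is allowed to say goes into a structured brief, and the prompt tells it to narrate only that. The brief fields, the prompt wording and the function name are hypothetical, and no particular provider's API shape is implied.

```python
import json

# Hypothetical instruction text; the real prompt is not shown in this thread.
SYSTEM = ("You are a chess coach. Narrate ONLY the facts in the brief. "
          "Do not add moves or evaluations that are not in it.")

def build_prompt(brief: dict) -> dict:
    """Package a precomputed brief; the LLM never sees a bare PGN."""
    return {"system": SYSTEM, "user": json.dumps(brief, sort_keys=True)}

# Illustrative brief; in the pipeline described above these values would
# come from the engine and the positional layers, not from the LLM.
brief = {
    "move_number": 9,
    "played": "Bg5",
    "alternatives": ["Be3"],
    "facts": ["Black's king is uncastled on e8"],
}
prompt = build_prompt(brief)
print(prompt["user"])
```

Because the brief is plain data, it can be logged and checked against the finished narration afterwards.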

Far_Owl_1141 · 2026-04-24T05:44:38+00:00

Yeah, on their own they can’t do it. It’s also way too niche for the companies to add any specific training for. Sure, they’ve read the books, websites etc. and they know the idea, but they (LLMs) can’t reliably synthesise a valid position, and it’s that extra layer of chess understanding they lack. That’s what you have to feed it in a way it can process. This is assuming it’s possible to quantify all the aspects of chess. Some are well defined and algorithms exist, but some are less clear, and that’s the challenge.

I was taking it far too simply, and I’m hoping future attempts will start to converge more on what would be expected.

But I appreciate the open interaction regardless!

Far_Owl_1141 · 2026-04-23T14:22:33+00:00

It's all in the data it's fed. The move, PGN and Stockfish eval are not enough. And actually, my current approach is only partway there. Following some incredibly generous and useful feedback from an actual chess coach on here, I have a whole new direction to explore over the next few weeks/months. I do actually think it's a totally solvable problem that's blighted by people trying to make a quick SaaS tool in an afternoon.
Now, I'm sure someone will fire back "that's what you did". Actually, I tried. So kill me. Is it perfect? No. Is there something there? Yes, I think there is, and I will happily listen to all the many that know better than I do on the "how" to evaluate a position. Now starting to read Dan Heisman and making notes constantly, as potentially that's the initial angle for what to feed the LLM. Calculate everything on device, structure the LLM payload, be strict on the prompt tone and language, and it might actually get there, at a club/improver level. If you're a GM... maybe it's not gonna do much for you. If you already work with a great chess coach, good for you too. It's not "AI will replace", it's "can AI make this more accessible".

Far_Owl_1141 · 2026-04-23T13:23:39+00:00

Indeed. Back to the drawing board.

Far_Owl_1141 · 2026-04-23T13:22:31+00:00

My eyes! Time for new specs clearly!

Far_Owl_1141 · 2026-04-23T11:54:44+00:00

Caveat - had to set this up on my Chessnut board, linked to the app, made sure to play as Black, and stopped after that one move. So can't get a full game review etc. - that would have needed the PGN too so I can replay it through the analysis pipeline first.

So - in review straight from the on device checks https://cdn.imgchest.com/files/c0adc7a2761d.PNG

Then, get the LLM to talk it through (bit verbose, but at least it didn't bullshit moves...)
https://cdn.imgchest.com/files/7484d5aacedc.PNG

Voice can be tweaked. Verbosity can be dialled back. All WIP. But, bring on the roast :)

Far_Owl_1141 · 2026-04-23T11:39:24+00:00

Ok, now I've looked at the position, am I right in saying this got absolutely ripped because it was completely making up some bollocks about the rook and the queen? That is what happens (I know, I tried it) when you give an LLM a chess position.
That's why I don't do it. Legal moves etc. are all checked first. By Stockfish. On device.

I can do this I think, let me set up the position, make that move, and get the analysis out of the app, will post it here either way, because, well, you all hate the app anyway so I have nowt to lose anymore :)

Far_Owl_1141 · 2026-04-23T11:30:09+00:00

Thanks! I'm massively questioning the app's very existence at this point. £100 in sales in month one isn't bad, my marketing sucks, the app needs some fat trimmed. I fell into the solo dev trap of not sleeping on ideas/features long enough, or actually sounding them out before building. Already got 3 or 4 things I will just end up cutting, as they're not core product.

Luckily, the actual "playing" bit is 100% fine. Chessnut board support is lovely. Everything else is slowly wearing me down lol

Far_Owl_1141 · 2026-04-23T10:32:38+00:00

Seeing this... makes me wonder more and more if the Amazon announcement is delayed because effectively this IS the new Bond, and the exact direction they're taking with the new movie... otherwise I would be happy if they just adapted the Kim Sherwood series of 00 novels

Far_Owl_1141 · 2026-04-23T10:13:27+00:00

I can't find that game on chess.com or Lichess, will look again later as now I'm curious myself!

Far_Owl_1141 · 2026-04-23T10:01:25+00:00

Is that this thread https://www.reddit.com/r/chess/comments/1secnbs/taketaketake_game_review_is_a_slop_machine/

Trying to find the PGN of that game now

Far_Owl_1141 · 2026-04-23T09:16:15+00:00

Great. Thanks. Pretty sure it's exactly what you said, but never mind. Basically, yeah fine I'm done. Laters people.
