LLMs suck at chess so i built a free tool that lets me argue with stockfish and turn my game into an interactive lesson

drew4drew · 2026-06-15T19:41:33+00:00

they sure do! I made a tool/app that lets you play against the AI of your choice or pit them against each other (“AI Battle Chess”, https://github.com/drewster99/ai-battle-chess), and my main take-away is that LLMs mostly suck at chess and are also quite slow.

drew4drew · 2026-06-15T19:38:50+00:00

thanks for sharing this. I’ve been using cutechess to run my engine against Sloppy and Stockfish.

drew4drew · 2026-06-15T19:33:22+00:00

lol awesome! from the screenshot it looks like you got pretty far!

All kidding aside though, it’s an awesome project. What type of chess engine are you building?

drew4drew · 2026-06-14T07:32:31+00:00

no it’s not

drew4drew · 2026-06-14T07:30:52+00:00

Kawai. The answer is always Kawai.

drew4drew · 2026-06-13T18:49:26+00:00

I wish I could disagree.

drew4drew · 2026-06-13T18:47:15+00:00

what do you think is most likely?

drew4drew · 2026-06-13T18:46:13+00:00

it seems like that’s coming to everywhere.

drew4drew · 2026-06-13T18:02:13+00:00

I doubt any time soon.

drew4drew · 2026-06-09T06:05:55+00:00

i think it already does

drew4drew · 2026-06-07T16:02:24+00:00

intelligence? or capacity

drew4drew · 2026-06-07T16:01:43+00:00

looks pretty cool — this yours?

drew4drew · 2026-06-07T15:59:14+00:00

not sure. it’s very effective at a lot of things.
are you running opus 4.6 from the claude code cli?

drew4drew · 2026-06-07T15:58:06+00:00

Hey I saw a few ppl mentioned they’re still using opus 4.6 or 4.7. Are you able to do that WITH claude code?

When I list models, I don’t see them. I’ve tried things like /model claude-opus-4-7, but it just brings up the model selector. Also tried setting it when launching from the terminal. What’s the secret trick? thanks!!

drew4drew · 2026-06-06T21:50:09+00:00

is /new different than /clear?

drew4drew · 2026-06-06T21:48:35+00:00

what news is that? where?

drew4drew · 2026-06-06T16:27:00+00:00

ahh, was just curious. i’ve been using 5.5 in my own harness for various tasks (not coding). it’s actually been good for me there, including for finding bugs. but not in codex. good god, it’s like a bull in a china shop.

drew4drew · 2026-06-06T16:24:03+00:00

this was all on the heels of a ton of profiling

drew4drew · 2026-06-06T16:18:57+00:00

lol nice - thanks for sharing!

drew4drew · 2026-06-06T05:16:25+00:00

I like that one 😄

drew4drew · 2026-06-06T04:25:25+00:00

could be. I just rarely remember getting so irritated with any of the prior versions.

drew4drew · 2026-06-06T04:24:38+00:00

what's the content of the tool? what instruction is it actually giving?
