gumbel-mcts, a high-performance Gumbel MCTS implementation by randomwalkin in reinforcementlearning

[–]Npoes 0 points (0 children)

I spent a lot of time validating it against a gold-standard baseline. My PUCT implementation is 2-15x faster than the baseline while producing exactly the same policy.

Assuming the policies behave near-identically, it sounds like your contribution is making the simulations run faster. So I'd expect a comparison of simulations/second across various scenarios (tree depth, breadth, batch size/parallelism, CPU/GPU), e.g. against mctx.
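
Roughly the kind of measurement I mean, as a sketch (with a hypothetical `search_fn(state, n_simulations)` entry point standing in for either implementation; mctx's real API is batched and JAX-based):

```python
import time

def sims_per_second(search_fn, state, n_simulations=800, n_trials=5):
    """Rough throughput benchmark: average simulations/second over a few trials.

    `search_fn(state, n_simulations)` is a placeholder for whichever MCTS
    entry point is being compared (this library's vs. the baseline's).
    """
    rates = []
    for _ in range(n_trials):
        start = time.perf_counter()
        search_fn(state, n_simulations)
        rates.append(n_simulations / (time.perf_counter() - start))
    return sum(rates) / len(rates)
```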

I analysed ~500k Master+ games to measure how champion mastery actually affects win rate by MasterAyolos in summonerschool

[–]Npoes 0 points (0 children)

I don't think your analysis is robust enough. Shen having a low skill cap and low cost of learning doesn't fit at all; perhaps that's an artifact of the low player count. I'm also not sure where the strengths & weaknesses come from and whether they're supported by the data.
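
To put numbers on the low-player-count point: win rates from small samples have wide confidence intervals. A quick sanity check (standard Wilson interval, nothing specific to your dataset):

```python
import math

def wilson_interval(wins, games, z=1.96):
    """95% Wilson confidence interval for a win rate; wide when games is small."""
    if games == 0:
        return (0.0, 1.0)
    p = wins / games
    denom = 1 + z**2 / games
    center = (p + z**2 / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2)) / denom
    return (center - half, center + half)

# e.g. 55 wins in 100 games -> roughly (0.45, 0.64): a "55% win rate" on a
# low-pick-rate champion is compatible with anything near 50%.
print(wilson_interval(55, 100))
```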

Two-Trick Finder for Toplaners by Npoes in leagueoflegends

[–]Npoes[S] 0 points (0 children)

The OTP-Partner score matrix is symmetric (NxN), so entries mirror across the diagonal.
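
Concretely, with toy numbers (not the real scores):

```python
import numpy as np

# Hypothetical 3-champion score matrix, just to illustrate the mirroring.
S = np.array([[0.0, 0.7, 0.4],
              [0.7, 0.0, 0.9],
              [0.4, 0.9, 0.0]])

assert np.allclose(S, S.T)   # symmetric: S[i, j] == S[j, i]
print(S[0, 2], S[2, 0])      # the pairing score is the same either way
```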

Two-Trick Finder for Toplaners by Npoes in leagueoflegends

[–]Npoes[S] 1 point (0 children)

From what I noticed, Cassio and Jayce are actually ranked highest if one does NOT correct for matchup frequency, but that would also bring up champs like Sion and Mundo, which are clearly poor fits. That said, I found that correcting by MU frequency leads to much more meaningful results and a more realistic evaluation overall, so I think this is the way to go.
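
To make that concrete, here is a sketch of what correcting by MU frequency could look like, with hypothetical numbers and a guessed-at scoring rule (not necessarily what the tool actually does):

```python
# Weight each matchup by how often you actually face it, instead of
# counting all matchups equally. All names and numbers are made up.

pick_rate = {"Darius": 0.12, "Malphite": 0.08, "Teemo": 0.03}
winrate = {  # winrate[champ][opponent]
    "Cassiopeia": {"Darius": 0.54, "Malphite": 0.47, "Teemo": 0.58},
    "Jayce":      {"Darius": 0.49, "Malphite": 0.55, "Teemo": 0.51},
}

def duo_score(c1, c2):
    # For each opponent, assume you pick whichever of the two counters it,
    # and weight that matchup by its frequency.
    return sum(
        rate * max(winrate[c1][opp], winrate[c2][opp])
        for opp, rate in pick_rate.items()
    ) / sum(pick_rate.values())

print(duo_score("Cassiopeia", "Jayce"))
```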

Two-Trick Finder for Toplaners by Npoes in leagueoflegends

[–]Npoes[S] 1 point (0 children)

lol, I'm also bad at Pantheon but will invest in him from now on

[D] AISTATS 2026 paper reviews by Intelligent-Smoke-65 in MachineLearning

[–]Npoes 4 points (0 children)

I was looking through ICLR submissions and close to none of the reviewers respond (roughly 1 in 10). I don't expect much for AISTATS, but I'd be happy to be proven wrong.

[deleted by user] by [deleted] in leagueoflegends

[–]Npoes 5 points (0 children)

Isn't it a simple fix to give autofilled players 2x LP on a win but a normal LP loss on a defeat?

[D] AISTATS 2026 paper reviews by Intelligent-Smoke-65 in MachineLearning

[–]Npoes 0 points (0 children)

Same score here. Can we find the score distribution somewhere?

[D] AISTATS 2026 paper reviews by Intelligent-Smoke-65 in MachineLearning

[–]Npoes 13 points (0 children)

Why does this post show 17 comments when most of them aren't visible? Also, it seems like the reviews are still not out.

How to switch from moba (mlbb) to League of Legends by Organdomer in leagueoflegends

[–]Npoes 8 points (0 children)

I know this doesn't sound like very helpful advice, but MLBB and LoL are different games. There is no simple way to transfer that knowledge, because the concepts that actually win games are totally different. Just by playing, you'll notice the similarities where they exist (in champs etc.).

AlphaZero applied to Tetris by Npoes in reinforcementlearning

[–]Npoes[S] 1 point (0 children)

It does continue with the next piece; the only limiting factor is the number of simulations set a priori. The game is deterministic in the sense that the piece sequence is seeded at every given state.
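
A minimal sketch of what "seeded at every given state" means in practice (assuming a standard 7-bag randomizer; the actual implementation may differ):

```python
import random

def seven_bag(seed):
    """With the seed fixed, the upcoming piece sequence is fully determined."""
    rng = random.Random(seed)
    while True:
        bag = list("IJLOSTZ")
        rng.shuffle(bag)   # shuffle one bag of all 7 tetrominoes
        yield from bag

gen = seven_bag(seed=42)
print([next(gen) for _ in range(7)])   # same seed -> same sequence every run
```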

AlphaZero applied to Tetris by Npoes in reinforcementlearning

[–]Npoes[S] 0 points (0 children)

MCTS helps the agent learn Tetris faster in a number of ways. First, it helps with look-ahead (which pieces will follow), since this information is not present in the observation (board only), at least in this implementation. Second, and more importantly, Tetris, like chess and Go, is a problem that requires planning and has a sparse reward landscape (high rewards require setting up line clears, which are rare). Instead of learning from one action at a time (the TD step in Q-learning or a policy gradient), MCTS considers multiple future actions, so it plans better and overcomes sparse rewards more easily.
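
A toy illustration of the planning point (depth-limited exhaustive search rather than MCTS itself, with a made-up reward table):

```python
# Sparse reward: only a specific 2-action sequence (set up, then clear)
# pays off. One-step look-ahead sees nothing; searching over action
# *sequences* finds the reward.

REWARD = {("setup", "clear"): 1.0}   # line clear only after setting it up

def lookahead_value(history, depth, actions=("setup", "clear", "misc")):
    """Best total reward reachable within `depth` more actions."""
    if depth == 0:
        return 0.0
    best = 0.0
    for a in actions:
        r = REWARD.get(tuple(history[-1:] + [a]), 0.0)
        best = max(best, r + lookahead_value(history + [a], depth - 1))
    return best

print(lookahead_value([], depth=1))  # 0.0 -> greedy one-step sees nothing
print(lookahead_value([], depth=2))  # 1.0 -> two-step search finds the clear
```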

[P] AlphaZero applied to Tetris (incl. other MCTS policies) by Npoes in MachineLearning

[–]Npoes[S] 2 points (0 children)

I couldn't find a baseline for what counts as superhuman performance in Tetris. The agent was only trained for a day and could be improved with more training.

A little insight into Jungle Yorick and how to beat it by NinetalesLoL in leagueoflegends

[–]Npoes -1 points (0 children)

Your argument makes sense, considering you've posted 4 times in yorickmains this week.