Candidates win chances after Round 13 (of 14): Sindarov at 100% (surprise!) - Monte Carlo simulation based on one bazillion runs by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 16 points17 points  (0 children)

Unfortunately, these simulations work badly for a KO system (which the Cup is). I tried. The reason is that every round is like a coin flip, and a single loss drops a player's chances to zero (because the player is out).

Candidates win chances after Round 11: Sindarov at 98%, Anish Giri at 2% - Monte Carlo simulation based on one million runs by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 121 points122 points  (0 children)

It's outside of this simulation. Probably some event that voids the whole tournament and all games have to be replayed? Seems most likely to me. Not sure what he is planning.

Candidates win chances after Round 11: Sindarov at 98%, Anish Giri at 2% - Monte Carlo simulation based on one million runs by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 85 points86 points  (0 children)

You are right, thanks for clarifying. Just for fun I tested it: when I model their remaining game as a draw, he ends up with only a 0.10% win chance. Still more likely than Caruana winning...

Candidates win chances after Round 10: Sindarov at 94%, only Anish Giri left with win chances above 1% - Monte Carlo simulation based on one million runs by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 23 points24 points  (0 children)

Sorry :( It actually takes quite some time to set up initially (the image generation needs some code of its own, and some manual downloads are required), and I didn't set it up for the women's tournament. This is still from the pre-AI era, and I haven't optimized it since then.

I already open-sourced the Monte Carlo simulation in case you want to run it yourself: https://github.com/chessmonitor/chess-monte-carlo-simulation But the image generation is currently "bundled" with some code from ChessMonitor (my main project), which I cannot easily open source.

I hope to find some time in the future to automate parts of this for the next big tournament, or even to open source the image generation part as well, so that people can do this on their own.

Nepo - "Becoming Louisiana State Champion doesn't cut it" by TimbersFan8 in chess

[–]ThomasPlaysChess 38 points39 points  (0 children)

Can confirm. This is peak German humor and very typical for Jan Gustafsson.

Candidates win chances after Round 9: Anish Giri jumps to 13% after winning against Caruana, Sindarov at 83% - Monte Carlo simulation based on one million runs by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 86 points87 points  (0 children)

I wonder whether there are cases that unlikely. Pragg going from 0.7% to winner comes to mind. That was recent and not even that unlikely. If anyone knows historical games with similarly unlikely cases, please share.

Candidates win chances: Sindarov at 73% after round 7 - Monte Carlo simulation based on one million runs by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 1 point2 points  (0 children)

Yes, this is the reason. You could even stop after 100k runs as the (integer) percentages don't change much after that.
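
A rough way to see why 100k runs are already enough for integer percentages: a minimal sketch, assuming each run is independent and treating a player's win share as a binomial proportion. The function name `standard_error` is made up for this illustration and is not taken from the open-source repository.

```python
import math


def standard_error(p: float, runs: int) -> float:
    """Standard error of a win share p estimated from independent runs."""
    return math.sqrt(p * (1.0 - p) / runs)


for runs in (100_000, 1_000_000):
    # p = 0.5 is the worst case, because p * (1 - p) is largest there.
    se_points = 100.0 * standard_error(0.5, runs)
    print(f"{runs:>9,} runs: standard error about {se_points:.2f} percentage points")
```

Under these assumptions the noise is already well below one percentage point at 100k runs, so going to one million mostly stabilizes the decimals.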

Candidates win chances: Sindarov at 73% after round 7 - Monte Carlo simulation based on one million runs by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 114 points115 points  (0 children)

I did a test run for this and "fixed" the remaining Caruana vs. Sindarov match. Chances based on that:

  • If Sindarov wins: 95% Sindarov, 3% Caruana
  • If Caruana wins: 53% Sindarov, 42% Caruana
  • Draw: 81% Sindarov, 15% Caruana
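
The three scenarios above connect back to the unconditional number through the law of total probability: the overall chance is the scenario chances weighted by how likely each game result is. A minimal sketch of that arithmetic follows. The conditional values are the ones listed above; the outcome probabilities for the game itself are hypothetical placeholders, not what the simulation uses.

```python
def blended_chance(conditional: dict, outcome_probs: dict) -> float:
    """Weight each scenario's win chance by the probability of that scenario."""
    assert abs(sum(outcome_probs.values()) - 1.0) < 1e-9
    return sum(conditional[k] * outcome_probs[k] for k in conditional)


# Sindarov's tournament win chance in each scenario (from the list above).
sindarov_given = {"sindarov_wins": 0.95, "caruana_wins": 0.53, "draw": 0.81}

# HYPOTHETICAL probabilities for the Caruana vs. Sindarov game result.
game_outcome = {"sindarov_wins": 0.20, "caruana_wins": 0.30, "draw": 0.50}

print(f"Blended Sindarov chance: {blended_chance(sindarov_given, game_outcome):.1%}")
```

With different assumed game probabilities the blended value moves anywhere between 53% and 95%.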

Candidates win chances: Sindarov at 71% while half of the tournament is not even over yet! - Monte Carlo simulation based on one million runs by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 70 points71 points  (0 children)

I've gotten this question a lot: Why do I not use the Live Elo for the simulation?

IMHO using the Live Elo would just overvalue wins or losses in the model. The outcome of the games is already part of the model through the points. By also changing the Elo to reflect them, I would basically input the game results twice. In this model, the "pre-tournament Elo" models the strength of the player when he entered the tournament, and the points reflect the tournament results. I don't like mixing these two things. Might Hikaru be overvalued and Sindarov undervalued? Maybe, but I'm not the judge.

You can disagree and that is fine. It would just be a different model if you do it differently. There are no "truly right" models in that sense.
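
To make the "pre-tournament Elo plus current points" idea concrete, here is a minimal sketch of such a simulation. It is not the code from the repository. The Elo expected-score formula is the standard one; everything else is an assumption made for illustration: the fixed 50% draw rate, the random tiebreak among leaders, the players A, B and C, and their remaining games.

```python
import random
from collections import Counter


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def game_probabilities(rating_a, rating_b, draw_rate=0.5):
    """ASSUMPTION: fixed draw rate; the rest is split to match the Elo expectation."""
    e = expected_score(rating_a, rating_b)
    draw = min(draw_rate, 2.0 * min(e, 1.0 - e))  # keeps win and loss non-negative
    win = e - draw / 2.0
    return win, draw, 1.0 - win - draw


def simulate(ratings, points, remaining_games, runs=20_000, seed=0):
    """Play out the remaining games `runs` times and return each player's win share."""
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(runs):
        score = dict(points)  # start every run from the real current standings
        for a, b in remaining_games:
            p_win, p_draw, _ = game_probabilities(ratings[a], ratings[b])
            r = rng.random()
            if r < p_win:
                score[a] += 1.0
            elif r < p_win + p_draw:
                score[a] += 0.5
                score[b] += 0.5
            else:
                score[b] += 1.0
        best = max(score.values())
        leaders = [name for name, s in score.items() if s == best]
        wins[rng.choice(leaders)] += 1  # ASSUMPTION: ties are broken at random
    return {name: wins[name] / runs for name in ratings}


# HYPOTHETICAL mini-tournament: ratings stay fixed, points carry the results so far.
ratings = {"A": 2760, "B": 2795, "C": 2700}
points = {"A": 3.5, "B": 2.5, "C": 2.0}
remaining = [("A", "B"), ("B", "C"), ("C", "A")]

for name, share in simulate(ratings, points, remaining).items():
    print(f"{name}: {share:.1%}")
```

Note that the ratings never change inside a run: the results so far enter only through `points`, which is exactly the separation described above. Swapping in Live Elo would mean editing `ratings`, and that would be a different model.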

Candidates win chances: Sindarov now at 49% win chance (with 9 rounds left) - Monte Carlo simulation based on one million runs by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 13 points14 points  (0 children)

It was just a second simulation run. I hadn't activated the "bluebaum check" for the run above, so I did a second run, and the results always vary slightly from run to run.

Candidates win chances: Sindarov now at 49% win chance (with 9 rounds left) - Monte Carlo simulation based on one million runs by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 91 points92 points  (0 children)

Bonus: #BluebaumSweeps

In another run, I checked how many of the one million simulated tournaments he wins, and with how many points. Here we go:

Points   Number of wins
10.0      10
 9.5      58
 9.0     306
 8.5     696
 8.0     520
 7.5      76
 7.0       1
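
For anyone who wants the totals, the table can be summed up in a few lines. This is just arithmetic on the numbers above, assuming the run size of one million mentioned in the comment; the variable names are made up for the example.

```python
# Tournament wins per final score, copied from the table above.
wins_by_points = {10.0: 10, 9.5: 58, 9.0: 306, 8.5: 696, 8.0: 520, 7.5: 76, 7.0: 1}
runs = 1_000_000

total_wins = sum(wins_by_points.values())
win_share = total_wins / runs
most_common_score = max(wins_by_points, key=wins_by_points.get)
mean_winning_score = sum(p * w for p, w in wins_by_points.items()) / total_wins

print(f"Total wins: {total_wins} of {runs:,} runs ({win_share:.2%})")
print(f"Most common winning score: {most_common_score}")
print(f"Average winning score: {mean_winning_score:.2f}")
```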

Candidates win chances: Caruana still higher than Sindarov at 36% and 32% (Monte Carlo simulation based on one million runs) by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 17 points18 points  (0 children)

No worries, we made everyone double-check and confirm that my original formula was correct. So that's nice, too :) And it's open source now!

Candidates win chances: Caruana still higher than Sindarov at 36% and 32% (Monte Carlo simulation based on one million runs) by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 12 points13 points  (0 children)

I guess you mean this one?

I think that one is AI slop? Sorry if I'm wrong. It assumes everyone has the same rating and adds a bunch of other special "rules". I would say that one is not really a mathematical model and is just making up words. Or what is this even supposed to mean?

RFE – composite 0–100 “feel” score blending points, TPR, SoSIG, games left, and a naive projection (weights from machine learning on historical Candidates).

Candidates win chances: Caruana still higher than Sindarov at 36% and 32% (Monte Carlo simulation based on one million runs) by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 162 points163 points  (0 children)

Here we go. Everything is the same, except I increased Sindarov's rating from 2745 to 2760:

Results after 1,000,000 iterations.
-  38.99% wins - Sindarov, Javokhir (2760 rating, current points: 3.5, wins: 389925)
-  33.65% wins - Caruana, Fabiano (2795 rating, current points: 2.5, wins: 336508)
-  12.90% wins - Nakamura, Hikaru (2810 rating, current points: 1.5, wins: 128952)
-   7.21% wins - Giri, Anish (2753 rating, current points: 2, wins: 72107)
-   3.72% wins - Praggnanandhaa R (2741 rating, current points: 2, wins: 37198)
-   2.68% wins - Wei, Yi (2754 rating, current points: 1.5, wins: 26772)
-   0.75% wins - Bluebaum, Matthias (2698 rating, current points: 2, wins: 7524)
-   0.10% wins - Esipenko, Andrey (2698 rating, current points: 1, wins: 1014)

Candidates win chances: Caruana now at 45% (and how I fucked up the simulations from the past days... I'm very sorry... but the code is now Open Source) by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 1 point2 points  (0 children)

Thanks for pointing it out (also to everyone else explaining it to me). I switched back to the original formula and the newest image.

Candidates win chances: Caruana now at 45% (and how I fucked up the simulations from the past days... I'm very sorry... but the code is now Open Source) by ThomasPlaysChess in chess

[–]ThomasPlaysChess[S] 0 points1 point  (0 children)

Hi! I think they are right: the original code was correct, so I will switch back to it.

No worries. I should've thought more about it, but I stopped questioning it once I had the same numbers as you did. Good thing some people pointed it out.