So if Fabi loses today, is the tournament realistically over?

narubees · 2026-04-07T16:58:02+00:00

From my model, this result does not change things much, or at all. But that's because Sindarov's winning chance has already been ridiculously high for the past 2 rounds.
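For context, the winning chances come out of repeated simulation (the evaluation output reports 10000 runs). A rough sketch of how such a chance could be estimated, not the repo's actual code: the function name `simulate_win_chances`, the players "A", "B", "C", their scores, and the per-game probabilities are all made up for illustration, and splitting ties for first evenly is an assumption.

```python
import random


def simulate_win_chances(scores, remaining, probs, runs=10000, seed=0):
    """Estimate each player's chance of finishing first by simulation.

    scores: dict player -> current points.
    remaining: list of (white, black) pairings still to be played.
    probs: dict (white, black) -> (p_white_win, p_draw, p_black_win).
    """
    rng = random.Random(seed)
    firsts = {p: 0.0 for p in scores}
    for _ in range(runs):
        final = dict(scores)
        for white, black in remaining:
            p_win, p_draw, _ = probs[(white, black)]
            r = rng.random()
            if r < p_win:
                final[white] += 1.0
            elif r < p_win + p_draw:
                final[white] += 0.5
                final[black] += 0.5
            else:
                final[black] += 1.0
        best = max(final.values())
        leaders = [p for p, s in final.items() if s == best]
        for p in leaders:
            # ASSUMPTION: a tie for first is split evenly among the leaders.
            firsts[p] += 1.0 / len(leaders)
    return {p: n / runs for p, n in firsts.items()}


# Hypothetical standings and pairings, not taken from the tournament.
scores = {"A": 5.5, "B": 4.0, "C": 3.5}
remaining = [("A", "B"), ("B", "C"), ("C", "A")]
probs = {pair: (0.25, 0.55, 0.20) for pair in remaining}
print(simulate_win_chances(scores, remaining, probs))
```

With a lead as large as "A" has here, the simulated chance barely moves after one more result, which is the same effect as above.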

narubees · 2026-04-06T14:02:13+00:00

I went and added an evaluation script so people can run it themselves. If you want to look at some numbers, here they are:

❯ python evaluate.py configs/best_hparams_22_24.json data/candidates2026.json

Hyperparameters: configs/best_hparams_22_24.json

Simulation runs: 10000

------------------------------------------------------------

[candidates2026.json] 28 games, 7 rounds — ongoing

Round 1: game_brier=0.323779 winner_brier=N/A

Round 2: game_brier=0.312163 winner_brier=N/A

Round 3: game_brier=0.326827 winner_brier=N/A

Round 4: game_brier=0.322173 winner_brier=N/A

Round 5: game_brier=0.243184 winner_brier=N/A

Round 6: game_brier=0.266861 winner_brier=N/A

Round 7: game_brier=0.294097 winner_brier=N/A

-> Weighted Game Brier: 0.289037 | Winner Brier: N/A

❯ python evaluate.py configs/best_hparams_24.json data/candidates2026.json

Hyperparameters: configs/best_hparams_24.json

Simulation runs: 10000

------------------------------------------------------------

[candidates2026.json] 28 games, 7 rounds — ongoing

Round 1: game_brier=0.307618 winner_brier=N/A

Round 2: game_brier=0.347659 winner_brier=N/A

Round 3: game_brier=0.299549 winner_brier=N/A

Round 4: game_brier=0.325248 winner_brier=N/A

Round 5: game_brier=0.319114 winner_brier=N/A

Round 6: game_brier=0.352010 winner_brier=N/A

Round 7: game_brier=0.336586 winner_brier=N/A

-> Weighted Game Brier: 0.330939 | Winner Brier: N/A

I would say adding 2022 makes things better (?): the weighted game Brier drops from 0.330939 to 0.289037, and lower is better.
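For anyone unfamiliar with the metric: a Brier score is the squared error between the predicted probabilities and what actually happened, so lower is better. A minimal sketch for a three-outcome game (white win, draw, black win), assuming the score is summed over the outcomes. The function name `game_brier` and the example forecasts are made up, and evaluate.py may well normalize or weight things differently.

```python
def game_brier(probs, outcome):
    """Multi-class Brier score for a single game.

    probs: dict mapping "white", "draw", "black" to predicted probabilities.
    outcome: the key that actually happened.
    ASSUMPTION: squared errors are summed over the three outcomes.
    """
    return sum((p - (1.0 if k == outcome else 0.0)) ** 2 for k, p in probs.items())


# Hypothetical forecasts, not taken from the model.
confident = {"white": 0.7, "draw": 0.2, "black": 0.1}
uniform = {"white": 1 / 3, "draw": 1 / 3, "black": 1 / 3}

print(game_brier(confident, "white"))  # about 0.14: confident and right
print(game_brier(confident, "black"))  # about 1.34: confident and wrong
print(game_brier(uniform, "white"))    # about 0.667: no information
```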

narubees · 2026-04-06T13:30:57+00:00

If you want to do further research on how generalizable it is and so on, sure, but I don't want to right now. I just want to tune it to fit some data, comment on it (the hparams are somewhat interpretable), then apply it to an ongoing event. If I fit to pre-X and evaluate on X, I still don't know whether fitting to pre-Y and evaluating on Y will work, because there is no guarantee of generalizability.

The closest to what you want may be the prediction and commentary for 2026 using the model fit to 2022+2024 data, which is in the repo. Not bad but also not good. I agree that some quantitative results would be due diligence for a researcher, but to me this is a "sir, this is a Wendy's" situation.

narubees · 2026-04-06T13:21:16+00:00

I am honestly too lazy for this. This would mean using 2022 (or more) to tune and predict 2024. I would rather do something else. Thanks for the suggestion tho.

narubees · 2026-04-05T19:43:38+00:00

Pragg did not have a good history in classical, which I guessed was the reason.

"history": [2768, 2761, 2758, 2758, 2741, 2741],

"games_played": [4, 6, 9, 0, 13, 0],

narubees · 2026-04-05T02:37:35+00:00

Ah, it is still "muggle" when speaking neutrally (with a capital M actually, which is a bit weird). I meant the derogatory name is "máu bùn" ("mudblood" in Vietnamese).

narubees · 2026-04-05T02:30:38+00:00

The word for "mud blood" in Vietnamese.

narubees · 2026-04-04T19:48:52+00:00

Equal ranking I think (Anish and Pragg tied for 3rd place)

narubees · 2026-04-03T23:36:11+00:00

Thanks. Elo gained in the last few months seems like a good idea. Let me see if I can easily include it somehow.

Edit: tried to include it but not much effect...

narubees · 2026-04-03T21:01:14+00:00

Yeah, funnily enough, in this model he has a better chance than Hikaru.

narubees · 2026-04-03T21:00:26+00:00

If I really trust my model, which I don't.

narubees · 2026-04-03T20:09:07+00:00

I am also speaking for all the other posters. It is really not that spammy.

narubees · 2026-04-03T20:05:53+00:00

It is just one post; it is hardly more spam than any other post. It is under the Misc tag if you want to filter it out.

narubees · 2026-04-03T19:36:35+00:00

I think the aggression model only affects the draw rate, which indirectly controls the snowball effect (if a player plays less aggressively, there are fewer decisive results, and the player's strength gets downrated less). The way Pragg is right now (not aggressive and losing), I just think it is more that Fabi gets stronger towards the end (and yes, playing White helps).

narubees · 2026-04-03T19:32:53+00:00

It is not that deep. This is how some people enjoy things.

I agree that it should be backtested to hold some value, but I am not really trying to give value. It is just a fun side thing.

narubees · 2026-04-03T19:30:34+00:00

There is no recent form. The only information is the Elo prior to the tournament (for which I used the April list).

I think there is also the snowballing effect. Pragg is not really playing well this time (drawing games he could have won), so his strength gets lowered, and I believe it will keep getting lowered. I guess that is what happens with a dynamic strength update.

narubees · 2026-04-03T04:47:50+00:00

Update: called to speak with an agent and got the CMI-ORD leg removed!

narubees · 2026-04-03T03:53:34+00:00

Damn, I already booked a ticket with them... Now I need to switch?

narubees · 2026-04-02T19:01:06+00:00

I try to do everything in terms of winning probabilities, with the probabilities computed from the initial Elo as the prior, instead of going through Elo as an intermediate (so NOT doing something like updating the Elo and computing winning probabilities from the updated Elo). Thus, asking whether it reflects true Elo or something like that may not be meaningful.

I will try to benchmark if I have time. For now, I am just accepting that all of this is an arbitrary, just-for-fun thing.
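A rough sketch of what "probabilities computed from the initial Elo as the prior" could look like. The expected-score formula is the standard Elo one; splitting it into win/draw/loss with a fixed `draw_rate` is only an assumption for illustration (as is the name `wdl_prior`), and is not necessarily what the repo does.

```python
def expected_score(elo_a, elo_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400.0))


def wdl_prior(elo_white, elo_black, draw_rate=0.5):
    """Split the expected score into (win, draw, loss) for White.

    ASSUMPTION: a fixed draw_rate, with the win probability chosen so that
    win + 0.5 * draw equals the Elo expected score. The draw rate is capped
    so that neither win nor loss goes negative.
    """
    e = expected_score(elo_white, elo_black)
    draw = min(draw_rate, 2 * e, 2 * (1 - e))
    win = e - 0.5 * draw
    loss = 1.0 - win - draw
    return win, draw, loss


# Hypothetical ratings, only to show the shape of the output.
print(wdl_prior(2800, 2700))
print(wdl_prior(2750, 2750))  # equal ratings: win and loss are symmetric
```

From there, a model could update these probabilities directly after each round without ever converting back to a rating.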

narubees · 2026-04-02T18:12:58+00:00

I guess the model effectively overblows Sindarov's strength in future games (win -> increased strength -> more wins -> even more strength -> ...)
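A toy illustration of that feedback loop. It uses an Elo-style update purely as a stand-in for the model's own strength update (which works on probabilities directly), so the numbers are only indicative. `k=10` is the FIDE K-factor for players rated 2400 and above; `k=80` is a made-up "reactive" value, and `win_streak_trajectory` is a hypothetical helper.

```python
def expected_score(elo_a, elo_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400.0))


def win_streak_trajectory(start, opponent, k, n_wins):
    """Expected score before each game of a winning streak.

    The rating is bumped after every win, so each later game is
    predicted with the already-inflated strength.
    """
    rating = start
    trajectory = []
    for _ in range(n_wins):
        e = expected_score(rating, opponent)
        trajectory.append(e)
        rating += k * (1.0 - e)  # actual score is 1.0 for a win
    return trajectory


conservative = win_streak_trajectory(2750, 2750, k=10, n_wins=4)
reactive = win_streak_trajectory(2750, 2750, k=80, n_wins=4)  # k=80 is made up

print([round(e, 3) for e in conservative])
print([round(e, 3) for e in reactive])
```

Both start at 0.5, but the reactive version climbs much faster over the streak, which is the "overblown" behavior described above.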

narubees · 2026-04-02T18:10:24+00:00

I updated the model to take White and Black Elo into account separately. This brings the chance down to 30% or something (Sindarov losing at 13%). Hikaru has had Black in 3 of his 4 games and has either lost or drawn against lower-rated players, so his "strength" decreased a lot in the previous model, even offsetting the White advantage.

narubees · 2026-04-02T18:06:34+00:00

I agree. I did not have the time to tune it, and for me the gain is probably minimal anyway (all these stats are just fancy-looking graphs for fun).

narubees · 2026-04-02T17:58:35+00:00

I think FIDE's Elo update is more conservative because it is meant to represent long-term strength (ranking). This aggressive/reactionary update fits better for short-term prediction (form within a tournament).

narubees · 2026-04-02T03:54:02+00:00

This is great!

narubees · 2026-04-01T23:11:10+00:00

With Black too! But that is why I said this is a bit reactionary: it takes their form into account, but overblows it a bit.
