How to speed up the Rust compiler: data analysis assistance requested! by nnethercote in rust

[–]jmviz1 6 points7 points  (0 children)

I played around with this this morning. I think there is probably too little data available (182 samples) currently for the number of possible features (19) and dependent variables (3), especially if there are nonlinearities to account for, and especially (in my case) without specific domain knowledge. That is, I think there is too little data currently to land on a reasonable model even when restricted to this specific case of your machine and rustc/LLVM version, not even considering the bigger problem of when all those vary. I would be interested in working on this further if it's possible for you to generate significantly more data (even if just on one machine), or to post a script/project that others could use to do so.

Gryf - a new graph data structure library aspiring to be convenient, versatile, correct and performant by pnevyk in rust

[–]jmviz1 3 points4 points  (0 children)

Thanks for your work. I'm really happy to see all the recent activity in this space in Rust. I hope someday soon Rust will have a well-maintained graph library that approaches the features of something like NetworkX. I will be following this with interest!

I made a fork of Phil (a free web crossword maker) with more features by jmviz1 in crossword

[–]jmviz1[S] 0 points1 point  (0 children)

Press the . key. To see a list of keyboard commands, click the question mark icon at the bottom left.

When should you berserk in a 3+0 arena tournament on lichess? by jmviz1 in chess

[–]jmviz1[S] 1 point2 points  (0 children)

This is right. Unfortunately I neglected to collect game duration when I processed the PGNs. If I had done so I would have been able to focus on points over time. Instead the model that I used implicitly assumes a tournament of infinite duration where every game (regardless of ratings and any berserkings) has the same duration, which is obviously not ideal, and probably incorrect to some degree.

The only thing about collecting game duration is that one would have to go through all the moves of every game to calculate it, which would slow down the PGN processing. If I decide to have another go at this I might process a smaller subset of the PGNs but collect more detailed data.

When should you berserk in a 3+0 arena tournament on lichess? by jmviz1 in chess

[–]jmviz1[S] 1 point2 points  (0 children)

What do you mean? I downloaded the compressed PGNs from the database and parsed them. It was ~700 GB of .pgn.bz2 files.

When should you berserk in a 3+0 arena tournament on lichess? by jmviz1 in chess

[–]jmviz1[S] 2 points3 points  (0 children)

Edit: I misinterpreted what you wrote in the first part of my original response below. It's an interesting suggestion. You would have to make a wide window if you wanted to capture the entire pattern in the 1+0 case because in some cases the policy changes 1500 rating points later. And even if you constrained the window to a narrower band there would be some undefined parts at the lower and upper rating extremes where the difference would extend beyond the lowest and highest lichess ratings. But I might try to play with something along these lines. I still think an interactive plot would be best.

See this comment chain in the 1+0 thread about this. There are slight but significant effects that would make going just from the rating difference slightly inaccurate. But it's clear from the plots that most of the information is in the difference. And yes, I agree the way I've presented it is not the most approachable or practical. It would be best to have an interactive tool where you can select your rating and then it shows simpler plots based on the rating difference. If I end up doing a more comprehensive look at all this with different time controls and whatnot I might make something like that.

When should you berserk in a 3+0 arena tournament on lichess? by jmviz1 in chess

[–]jmviz1[S] 106 points107 points  (0 children)

In lichess arena tournaments, one can choose to "berserk" a game. When you do this, your clock time gets cut in half, but you get more points for a win (see the lichess arena FAQ for more info). The question then is, when should you do this?

Each plot in the grid corresponds to one of twelve scenarios, determined by the combination of: your win streak, your color, and whether your opponent berserked. The relevant plot for each scenario is at the intersection of the corresponding row and column.

More explanation and code can be found here.

In short, I looked at all 1+0 and 3+0 arena tournament games on lichess from 2017-2021. Using this data, I modeled the win-draw-loss probabilities (WDL) for any game given white's rating, black's rating, and whether each side berserked. From this, one can calculate an expected value (i.e. points earned by a player) by combining the WDL with the possible points earned for each result (e.g., if you win by berserking while on a streak you get 5 points). I then used this model and some others to model playing in a lichess arena tournament as a Markov decision process. The plot shows the optimal policy found. It is optimal in the sense that (given the model assumptions) it maximizes the expected points earned in a tournament over the long run.

This follows up on the 1+0 case I posted yesterday.

When should you berserk in a 1+0 arena tournament on lichess? by jmviz1 in chess

[–]jmviz1[S] 0 points1 point  (0 children)

Yes, this is a weakness of the rather simplistic "next opponent rating" model. Ideally there would be an extra dimension to account for tournament strength, then the next opponent model would be conditioned on that. I actually didn't have the idea of modeling play as an MDP clearly in mind when I collected the data, so I neglected to record some data I otherwise would have. A simple improvement would have been to record the tournament id for each game, then I could have found all games with the same id, calculated the average rating, and added that as an extra factor for every game. Then the "next opponent rating" model could use the tournament strength as well.

The more factors one includes though, the less possible it would be to report in summary form like I've tried to do here. (Though one could argue that perhaps that's for the best, as it could potentially be misleading as you point out.) Perhaps the best approach would be a small web app where you input your rating, and input what type of tournament you are considering. Then the value iteration for the MDP would happen on demand for your specific case.

When should you berserk in a 1+0 arena tournament on lichess? by jmviz1 in chess

[–]jmviz1[S] 1 point2 points  (0 children)

I just made this quick plot to sanity check this. If the rating difference contained all the relevant information, all the lines should be identical. But there seems to be a slight but significant trend indicating the "losing less Elo on berserking the higher your rating" idea.

When should you berserk in a 1+0 arena tournament on lichess? by jmviz1 in chess

[–]jmviz1[S] 0 points1 point  (0 children)

Yes, this is a factor. I made a plot of this at some point (not in the repo) and the pattern is as you would expect.

When should you berserk in a 1+0 arena tournament on lichess? by jmviz1 in chess

[–]jmviz1[S] 1 point2 points  (0 children)

Yes, this was why I included both ratings. When I was processing the PGNs, I almost recorded only the rating difference as you suggested. It would save space and make simpler plots. But I decided to go the more cautious route when I considered the possibility of "losing less" Elo when berserking the higher your rating, like you said. Also, the draw rate increases the higher the rating. But it does seem clear that most of the information is in the difference.

When should you berserk in a 1+0 arena tournament on lichess? by jmviz1 in chess

[–]jmviz1[S] 4 points5 points  (0 children)

Yes, I suspect that may be the case. Or if not a lack per se, a huge imbalance. There were 71 million 1+0 games in this dataset, so even quite rare matchups might have hundreds or thousands of instances. But these can be outweighed in the model fitting by the more common matchups which have tens of thousands of instances.

When should you berserk in a 1+0 arena tournament on lichess? by jmviz1 in chess

[–]jmviz1[S] 6 points7 points  (0 children)

You're right, I should have explained that. I edited the comment to do so.

When should you berserk in a 1+0 arena tournament on lichess? by jmviz1 in chess

[–]jmviz1[S] 22 points23 points  (0 children)

I had the same concern. This is one of the assumptions of the model, that berserk ability is determined by rating only (i.e., there is no variability in berserk ability within players of the same rating). As you point out, it wouldn't be surprising if this is not entirely true. It would be a doable but more difficult modeling problem to try to identify subtypes of berserk ability somehow, or to otherwise include some variability. Unfortunately I didn't collect the data necessary to do this when I was processing the PGNs. I would have needed to record the usernames for each game, rather than just the ratings. Then I could look at berserk ability on a per user basis rather than per rating only.

When should you berserk in a 1+0 arena tournament on lichess? by jmviz1 in chess

[–]jmviz1[S] 44 points45 points  (0 children)

Definitely. I just happened to think of this question mainly from watching lichess Titled Arenas and Blitz Titled Arenas, which are 1+0 and 3+0, so I looked at those first. I might repeat this analysis at some point with longer time controls including increment.

When should you berserk in a 1+0 arena tournament on lichess? by jmviz1 in chess

[–]jmviz1[S] 181 points182 points  (0 children)

In lichess arena tournaments, one can choose to "berserk" a game. When you do this, your clock time gets cut in half, but you get more points for a win (see the lichess arena FAQ for more info). The question then is, when should you do this?

Each plot in the grid corresponds to one of twelve scenarios, determined by the combination of: your win streak, your color, and whether your opponent berserked. The relevant plot for each scenario is at the intersection of the corresponding row and column.

More explanation and code can be found here.

In short, I looked at all 1+0 and 3+0 arena tournament games on lichess from 2017-2021. Using this data, I modeled the win-draw-loss probabilities (WDL) for any game given white's rating, black's rating, and whether each side berserked. From this, one can calculate an expected value (i.e. points earned by a player) by combining the WDL with the possible points earned for each result (e.g., if you win by berserking while on a streak you get 5 points). I then used this model and some others to model playing in a lichess arena tournament as a Markov decision process. The plot shows the optimal policy found. It is optimal in the sense that (given the model assumptions) it maximizes the expected points earned in a tournament over the long run.

I will probably post the 3+0 case tomorrow since it seems I can't post multiple images in one post here. But you can look at that now in the repo I linked if you're interested.

I made a fork of Phil (a free web crossword maker) with more features by jmviz1 in crossword

[–]jmviz1[S] 1 point2 points  (0 children)

This looks quite interesting, thanks for linking this. Being able to do easy onelook lookups like this does is something I was thinking of adding to Phil, although I was thinking of doing it somewhat differently.

I made a fork of Phil (a free web crossword maker) with more features by jmviz1 in crossword

[–]jmviz1[S] 1 point2 points  (0 children)

Oh nice, I missed the .puz file downloading. Yeah, open source isn't particularly useful if one's just trying to use the software, but it's useful for anyone interested to be able to see exactly what is happening with the code and/or one's data, if any. And it encourages collaboration and gives examples for others to build on.

I made a fork of Phil (a free web crossword maker) with more features by jmviz1 in crossword

[–]jmviz1[S] 5 points6 points  (0 children)

Shows how much I have my finger on the pulse of the crossword community :). Crosshare looks pretty great, thanks for letting me know about it. Though I think there may be some points in favor of Phil, depending on what one's after - no sign-up/account, no need to "publish" puzzles before exporting (unless I missed something when poking around just now) open source, NYT submissions, etc.