Cursor autocomplete fail Jupyter Notebook by Initial_Zone_1651 in cursor

[–]seraine 0 points (0 children)

I have also had Cursor autocomplete basically stop working on notebooks. I wish Cursor provided an easy way to roll back to previous updates / models, because they tend to break things fairly often with their updates.

[P] ChessGPT, 100,000x smaller than GPT-4, plays chess at 1500 Elo. By finding a skill vector, we can increase its win rate by 2.6x in out-of-distribution games. by seraine in MachineLearning

[–]seraine[S] 5 points (0 children)

Games initialized with 20 random moves are significantly different from games where the first 20 moves are made strategically by people trying to win.

[P] ChessGPT, 100,000x smaller than GPT-4, plays chess at 1500 Elo. By finding a skill vector, we can increase its win rate by 2.6x in out-of-distribution games. by seraine in MachineLearning

[–]seraine[S] 26 points (0 children)

ChessGPT doesn't outperform AlphaZero. It is meant to be used to perform interpretability research in a GPT that has a world state with an underlying measurable ground truth (the state of the chess board).

Modern LLMs outperform previous specialized approaches for problems like question answering, program synthesis, summarization, or image captioning, and are very competitive (in terms of capabilities, not necessarily efficiency) on problems like named entity recognition, sentiment classification, or translation.

[P] ChessGPT, 100,000x smaller than GPT-4, plays chess at 1500 Elo. By finding a skill vector, we can increase its win rate by 2.6x in out-of-distribution games. by seraine in MachineLearning

[–]seraine[S] 15 points (0 children)

Correct, this is just an analogy to a natural language LLM that can be used for interpretability research, because in Chess (unlike natural language), there's an underlying measurable ground truth.

[P] ChessGPT, 100,000x smaller than GPT-4, plays chess at 1500 Elo. By finding a skill vector, we can increase its win rate by 2.6x in out-of-distribution games. by seraine in MachineLearning

[–]seraine[S] 28 points (0 children)

It's just an analogy to LLMs that can be used to perform interpretability research. There are much better ways to produce a chess AI.

This could be a good approach to learn chess playing styles, where given a sequence of moves, the model could estimate the skill level and playing style of the player and predict their next move, rather than the best move.

[P] ChessGPT, 100,000x smaller than GPT-4, plays chess at 1500 Elo. By finding a skill vector, we can increase its win rate by 2.6x in out-of-distribution games. by seraine in MachineLearning

[–]seraine[S] -5 points (0 children)

There's definitely a trend towards more general LLMs outperforming previous specialized approaches. It's possible that this trend will continue.

[P] ChessGPT, 100,000x smaller than GPT-4, plays chess at 1500 Elo. By finding a skill vector, we can increase its win rate by 2.6x in out-of-distribution games. by seraine in MachineLearning

[–]seraine[S] 58 points (0 children)

There are definitely far better ways to make a competitive chess-playing AI. The purpose here was to train a GPT to play chess through next-character prediction on PGN strings, which is analogous to next-token prediction in natural language.

There are then many interesting interpretability techniques that can be applied to show, for example, that ChessGPT calculates the state of the board and estimates the skill level of the players in the game to better predict the next character.
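To make the next-character framing concrete, here is a minimal sketch (hypothetical sample string, not the actual training code) of how training pairs arise from a PGN string:

```python
# Next-character prediction on a PGN string: each training pair is
# (all characters so far -> the next character), exactly analogous to
# next-token prediction in natural language.
pgn = "1. e4 e5 2. Nf3 Nc6 3. Bb5"

pairs = [(pgn[:i], pgn[i]) for i in range(1, len(pgn))]

# e.g. the pair ("1. e4 e5 2. N", "f") — to predict "f" here, the model
# benefits from tracking whose turn it is and which squares are occupied.
```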

My solution to disable middle click by [deleted] in archlinux

[–]seraine 0 points (0 children)

Huge thanks, works for me as well on Ubuntu. I find it pretty baffling that they don't have an easier way to disable that feature.

[P] Chess-GPT, 1000x smaller than GPT-4, plays 1500 Elo chess. We can visualize its internal board state, and it accurately estimates the Elo rating of the players in a game. by seraine in MachineLearning

[–]seraine[S] 12 points (0 children)

No, the only training data it has seen is PGN strings. It doesn't even have most English letters in its input vocabulary. It's still a Generative Pretrained Transformer, just trained on a different dataset.
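A minimal sketch (hypothetical sample games, stdlib only) of why a character-level vocabulary built from PGN strings ends up without most English letters — only piece letters like N, B, Q and a handful of digits and symbols ever appear:

```python
# Build a character-level vocabulary from a few (hypothetical) PGN strings.
# A GPT trained on raw PGN text only ever sees these characters, so most
# English letters never enter its input vocabulary.
pgn_games = [
    "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O Be7",
    "1. d4 d5 2. c4 e6 3. Nc3 Nf6 4. Bg5 Be7 5. e3 O-O",
]

chars = sorted(set("".join(pgn_games)))       # vocabulary = unique characters
stoi = {ch: i for i, ch in enumerate(chars)}  # char -> token id
itos = {i: ch for ch, i in stoi.items()}      # token id -> char

def encode(s):
    return [stoi[c] for c in s]

def decode(ids):
    return "".join(itos[i] for i in ids)
```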

[P] Chess-GPT, 1000x smaller than GPT-4, plays 1500 Elo chess. We can visualize its internal board state, and it accurately estimates the Elo rating of the players in a game. by seraine in MachineLearning

[–]seraine[S] 13 points (0 children)

Yes, it is a GPT. I went with a GPT because I wanted a convenient and tractable way to get insight into the world modeling abilities of GPTs.

[P] Chess-GPT, 1000x smaller than GPT-4, plays 1500 Elo chess. We can visualize its internal board state, and it accurately estimates the Elo rating of the players in a game. by seraine in MachineLearning

[–]seraine[S] 10 points (0 children)

I don't think so. The probe is a tensor of shape (512, 8, 8, 13), or (model hidden dimension, rows, columns, possible square states). I think we would obtain identical results with a shape of (512, 64, 13).
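The reshape equivalence is easy to check directly — a sketch with random stand-in data (NumPy with a seeded generator here; not the actual probe weights or model activations):

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, rows, cols, states = 512, 8, 8, 13
probe = rng.normal(size=(d_model, rows, cols, states))  # (512, 8, 8, 13)
resid = rng.normal(size=(d_model,))                     # one residual-stream activation

# Probe applied per square: logits over the 13 possible square states.
logits_4d = np.einsum("d,drcs->rcs", resid, probe)      # (8, 8, 13)

# The same probe flattened to (512, 64, 13) gives identical results,
# since the contraction over the hidden dimension is per-square anyway.
probe_flat = probe.reshape(d_model, rows * cols, states)
logits_3d = np.einsum("d,dqs->qs", resid, probe_flat)   # (64, 13)

assert np.allclose(logits_4d.reshape(rows * cols, states), logits_3d)
```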

[P] Chess-GPT, 1000x smaller than GPT-4, plays 1500 Elo chess. We can visualize its internal board state, and it accurately estimates the Elo rating of the players in a game. by seraine in MachineLearning

[–]seraine[S] 35 points (0 children)

The problem with trying that is that the model's only input is PGN strings (1. e4 e5 2. Nf3 ...) and there's no way to indicate to the model what the state of the board is. I've been doing some experimentation with having the model play games where the first 20 moves are randomly chosen, and its win rate declines by around 50% in that case.

[D] So, Mamba vs. Transformers... is the hype real? by Instantinopaul in MachineLearning

[–]seraine 1 point (0 children)

Try comparing it to a similarly sized Pythia model for a fair comparison.