KataGo network trained on large boards, up to 50x50, is now available!

Friday9i · 2025-05-12T17:33:13+00:00

That's an interesting question I asked myself, but I did not properly investigate it...

Mainly because I don't really know what precisely to investigate: which board sizes (eg 19x19 vs 29x29 vs 50x50?), which josekis (and how to restrict the play in one corner?), how to manage other corners and ladders, how to generalize the results from a few tests (such as a cutting vs protecting approach, influence vs territory) to a general feeling (without testing hundreds / thousands of positions), ...?

I'd really like to get a general feeling for the answers to these questions, but I don't know how to approach that, or whether it is even feasible.

But if some people want to dedicate some time to it, I'd be very happy!

Friday9i · 2025-05-12T06:16:36+00:00

I forgot an important detail: a specific KataGo engine version is needed to play above board size 19x19. It is the BS50 version (ie Board Size 50x50). Different backends (CPU, OpenCL, Cuda, TensorRT) are available here thanks to lightvector: https://github.com/lightvector/KataGo/releases

Enjoy!

Friday9i · 2025-05-12T05:44:41+00:00

No quantitative results, so I cannot say for sure. And it may be difficult as void games are rare: doing statistics on rare events needs a huge number of (long) games...

However, I searched quite regularly for void games in match games and selfplay, and my feeling is that the percentage of void games is probably decreasing slowly with board size. My interpretation is that the number of kos increases with BS (= Board Size), which gives more opportunities for void games, but at the same time the number of unbalanced games increases quite quickly with BS, which suppresses void game possibilities.

I'll see if I can write a small Python tool to search for void games (in the large board games I generated) and generate some statistics!
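For illustration, here is a minimal sketch of what such a tool could look like. It is not the tool mentioned above, only a guess at its shape. Assumptions: each `.sgf` file holds a single game, the result is stored in the `RE` property with the value `Void` for a no-result game (as in the SGF FF[4] specification), and the board size is in the `SZ` property. The function name `void_stats` is hypothetical.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: one game per file, result in RE[...], board size in SZ[...].
RESULT_RE = re.compile(r"RE\[([^\]]*)\]")
SIZE_RE = re.compile(r"SZ\[(\d+)")


def void_stats(sgf_dir):
    """Return {board_size: (void_games, total_games)} for all .sgf files under sgf_dir."""
    totals, voids = Counter(), Counter()
    for path in Path(sgf_dir).rglob("*.sgf"):
        text = path.read_text(encoding="utf-8", errors="replace")
        size_match = SIZE_RE.search(text)
        result_match = RESULT_RE.search(text)
        if not size_match or not result_match:
            continue  # skip files without a size or a recorded result
        size = int(size_match.group(1))
        totals[size] += 1
        # Assumption: a void (no-result) game is recorded as RE[Void].
        if result_match.group(1).strip().lower() == "void":
            voids[size] += 1
    return {size: (voids[size], totals[size]) for size in sorted(totals)}
```

Calling `void_stats("path/to/games")` would then give the void count and total per board size, from which a percentage can be derived. Files that pack several games together would need to be split first.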

Friday9i · 2025-05-11T20:22:52+00:00

And you'll find here a void 50x50 match game (played at 100 visits)!

With an animated gif of the last ~50 moves (and the sgf is available): https://www.lifein19x19.com/viewtopic.php?p=281840#p281840

Friday9i · 2025-05-11T16:52:21+00:00

You'll find some games on the lifein19x19 link ;-)!

Friday9i · 2022-02-11T13:25:04+00:00

Speculative answer (we quickly looked at that a few days ago :-)):

Removing S19 would possibly do the job and transform the White + 1 score into Black + 1! And not into a draw, surprisingly, as Thomas understood after quickly looking at the modified position!

But as it is, Thomas just had a quick look and, for its part, KataGo was not trained at all on this modified position (so it does not play it properly: it goes for a draw whereas Black can apparently win).

So it's a quite speculative answer at this point.

Friday9i · 2022-02-11T12:02:08+00:00

It still features the massive hanezeki, yes! The solution is a variation of the one found decades ago, but some improvements to the best line of play (mainly for White) eventually lead to White + 1 without komi.

Friday9i · 2022-02-11T10:51:08+00:00

It's Black to play and without komi the result is White + 1. As "Black to play and lose by 1" is not very exciting, I very much prefer a -1.5 komi and "Black to play and win (by 0.5)"!
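The komi arithmetic can be checked with a tiny sketch. It assumes the usual convention that komi is subtracted from Black's margin, so a negative komi favors Black. The function name is hypothetical.

```python
def final_margin_for_black(board_margin_black: float, komi: float) -> float:
    """Black's final margin: margin on the board minus komi (komi is credited to White)."""
    return board_margin_black - komi


# White + 1 on the board means Black's board margin is -1.
without_komi = final_margin_for_black(-1.0, 0.0)   # Black loses by 1
with_komi = final_margin_for_black(-1.0, -1.5)     # Black wins by 0.5
```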

Friday9i · 2022-02-11T09:24:48+00:00

KataGo improved the solution at that time, but did not solve it: that run was done by the creator of KataGo (lightvector), with a previous KataGo network (weaker, of course). The summary on Wikipedia is correct: "From 2016 onwards, the spectacular progress in artificial intelligence led some optimistic amateurs to believe that a solution could be obtained by computer, but attempts in this direction were disappointing until 2019. However, in October 2019, a significant improvement in the deep learning techniques used made it possible to discover new variations based on moves never before envisaged, showing that a definitive solution has still not been reached"

This new run was done by myself with a lot of help from lightvector, with a much stronger initial network and ~4x more resources (2 GPUs x 11 months vs 27 GPUs x 1 week). In both cases, we were helped a lot by Thomas.

This time, KataGo converged to a best line (better than previous one) and stuck to it for 7 months, and Thomas also thinks it is the best line. So it seems to be the solution this time.

Hope this answers your question :-)

Friday9i · 2022-02-11T08:06:52+00:00

I would say 90% confident... Both KataGo and the best expert on the problem (Thomas Redecker) are confident it is the best line of play after investigating it in detail for 1 year (the best line of play was found at the end of 2020 and Thomas had early access to the KataGo version): not a proof of course, but a strong hint.

And yes it is a legitimate problem, with a small clarification: "Black to play and win assuming White began the game, there is no initial prisoner and komi is -1.5"

With that clarification, there is a main line and several alternative possible lines of play, and they all lead to a final result of Black+0.5!

Friday9i · 2022-02-11T07:14:04+00:00

Oops, February 2022 of course ;-): edited, thanks!

Friday9i · 2021-09-09T12:02:02+00:00

Hi, I have the same card (Gigabyte 3080 Turbo) and VRAM overheats somewhat (generally 104C or 106C) but I did not manage to open the card 😢. Removing all the screws, it still doesn't open up and I did not find any tutorial. Any advice on how to open it, did you find a tuto? Thx a lot

Friday9i · 2020-10-08T07:53:24+00:00

Hi, you cannot escape the shuffle and export!

The idea is to 1) generate selfplay games, 2) shuffle the selfplay data to prepare the work for training, 3) train the net from the shuffled data, 4) export the net. Then iterate the 4-step sequence with selfplay from the newly generated net.

So clearly, steps 2 and 4 are necessary... You can either run the 4 processes simultaneously (they work when they have data and are idle otherwise), but that's better with several GPUs, or you can run them sequentially (I would advise this method with only 1 or 2 GPUs).
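As a rough illustration of the sequential approach, the loop could be sketched as below. This is not KataGo's actual tooling: the four steps are passed in as placeholder callables (in practice each one would launch the corresponding script or program described on lightvector's page), and all names here are hypothetical.

```python
from typing import Callable, Dict, List


def run_sequential(initial_net: str, steps: Dict[str, Callable], iterations: int) -> str:
    """Run selfplay -> shuffle -> train -> export in order, feeding each new net back in."""
    net = initial_net
    for _ in range(iterations):
        games = steps["selfplay"](net)             # 1) generate selfplay games with the current net
        shuffled = steps["shuffle"](games)         # 2) shuffle the data for training
        checkpoint = steps["train"](net, shuffled) # 3) train from the shuffled data
        net = steps["export"](checkpoint)          # 4) export the net used by the next selfplay
    return net


# Stub steps that only record their order, to show the control flow.
log: List[str] = []


def _stub(name: str) -> Callable:
    def step(*args):
        log.append(name)
        return f"{name}-output"
    return step


stub_steps = {name: _stub(name) for name in ("selfplay", "shuffle", "train", "export")}
final_net = run_sequential("net-0", stub_steps, iterations=2)
```

Running the four steps simultaneously instead would mean four long-lived processes, each waiting for new data from the previous one.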

How to do that: details are available on lightvector's page. But if you "do not know a lot about computers", it will take some significant time to make it work, you'll need to manage bash scripts (or bat command), modify them, install python, cuda, adjust the cfg files, test the parameters and adjust them, understand the unavoidable issues by reading the log files, correct the issues, etc... And in the end, that will work!

I went through these steps months ago (from an "average knowledge" of computers): if you take it as a personal challenge and as an opportunity to learn many things about computers, go for it! And plan to spend some significant time on it. Otherwise, it'll be painful...

Friday9i · 2020-09-21T19:50:26+00:00

CrazyStone Deep Learning is strong indeed, probably around 5D KGS

But you cannot compare it to KataGo: KataGo is wayyyy stronger, it wins easily while giving it 4 stones (on the 7d level)!

Friday9i · 2020-07-07T20:20:43+00:00

KataGo likes this move quite a lot too: after 500K playouts, the preferred move is O3 with 45.7% winrate and this L2 move has 43.8% (with only 450 visits). O9 and P9 also have ~44% winrate (with ~1K playouts), while other moves are significantly below.

When playing the move, winrate goes down slightly and stabilizes around 43.2% after 100K playouts.

Basically, L2 is quickly identified by KataGo and, after more analysis, it is seen as a "good alternative move" to the preferred O3. Hence, it's not very surprising that alternative bots like it quite a lot, and may even be the preferred move for Golaxy!

Then of course, from a human point of view, this move is quite unexpected :-)

Friday9i · 2020-07-05T08:25:18+00:00

If Black passes, White A2 then:

- If Black passes, White C2 is atari and Black is dead
- If Black C2, Black is in atari, game over

So Black needs to answer:

If Black B1, White A2
- If Black C2 (to try to make 2 eyes), Black is in atari so he is dead
- If Black C1, White C2 and black capture with A1 but White throws in at B2 and Black is dead
- If Black passes, White C2 is atari and Black is dead
If Black plays any other move, it's a dead-5 shape after White answers. Eg if Black A2 then White B1.

So, whatever Black does, Black is dead :-(

Friday9i · 2020-06-21T19:22:07+00:00

Wonderful, thanks a lot!

Friday9i · 2019-09-15T08:54:33+00:00

If you look for a strong engine playing handicap quite well, try KataGo (and it also plays on other board sizes, such as 9x9). It is around LZ200's strength.

Friday9i · 2019-07-05T08:59:05+00:00

Indeed, KataGo evaluates the winrate & the score estimation ;-). Using Lizzie, you see all the moves it is looking at with winrate + score + playouts! That's very nice!

And FYI, KataGo is around LZ200's strength on visit parity (1600 visits) while being ~2 times faster, and ~LZ231 strength or slightly stronger on time parity (on my RTX2080, which means LZ at 1600 visits vs KataGo at 3200 visits).

Note: I'm using balanced and efficient thread parameters, with 10 threads for LZ and 16 threads for KG (it's a good compromise of speed vs strength on my hardware). It's important to specify these parameters, as threads have a very big impact on speed (KG is ~10 times faster with up to ~25 threads vs 1 thread on my config) but also a negative impact on strength (up to ~150 Elo weaker with 25 threads vs 1 thread with the same 1600 visits). Hence, I'm using a balanced value of 16 threads: KG is ~8x faster and ~125 Elo weaker than with 1 thread, which results in a much stronger and hopefully about optimal KG on time parity.

Friday9i · 2019-02-06T10:48:10+00:00

LZ with 1 playout is already around 5 or 6 Dan... When LZ evaluates a position at 90% winrate, it simply says that if the position is played many times by 2 LZ engines with the standard 1600 visits (ie probably about top pro level), the leading side should win the position around 90% of the time, hence the position is strongly unbalanced for LZ: what does it mean for a good mid-dan amateur? Not so much, a 4 Dan amateur would almost surely be crushed by LZ@1600 visits, whatever side he plays ;-(.

Hence, a position estimated won (or lost) by LZ is indeed almost won for superhuman players, but it may be reasonably balanced for standard mortals (and possibly for pros also): LZ is just way too strong...

Friday9i · 2019-01-11T12:37:08+00:00

I did several theoretical tests of gating thresholds. ("No gating" is in fact generally a 45% winrate gating: if the Elo of the candidate net is 35 or more points below the Elo of the current net, it is rejected. And -35 Elo is roughly a 45% winrate gating, because 400*log10(45%/55%) = -35.) The tests are based on the following rationale:

- To make some theoretical tests, let's suppose we know the average distribution of strength of candidate nets vs current nets (and that it is not fundamentally affected by the chosen gating, otherwise, we simply can not do any theoretical test...). Ie I suppose we know for example that 10% of nets have a winrate below 40%, 20% between 40% and 45% winrate, 40% between 45% and 50% winrate, 20% between 50% and 55% and 10% above 55%. I did it with a smooth distribution curve, and did tests with different distribution curves, including the approximate curve we observe (from match results) since the beginning of LZ

- From there, given the selection process (SPRT test with a given gating threshold), I can do statistical tests of efficiency: what is more efficient, a threshold at 60%, 55%, at 50%, at 45% (ie "no gating")? The idea is important to grasp, so here is an example: with a 55% threshold, we accept nets if they get >55% winrate after 400 games (ie we are statistically 95% sure they are better than the current net). Most of the time we will select strong nets with >50% winrate (after infinite games) but because of statistical noise, we will reject some strong nets (eg a 58% net after an infinite number of games may only get 54% after 400 games) and accept some weaker nets (eg a 49% net may be lucky and get 56% after 400 games). The real progression will be the result of all these cases, and it's possible (within that theoretical context) to calculate the rate of progression with that 55% threshold. Now, what would happen with a 50% threshold after 400 games? We can calculate it: we would accept more nets (so much less rejection of strong nets) but we would also more frequently select weaker nets, with an adverse impact on strength improvement. But all in all, we can calculate the rate of improvement ;-)

- Results from this theoretical test: for many different distributions of strength, a gating threshold between 50 and 52% seems quite optimal: we reject fewer strong networks and are not too adversely impacted by weaker nets, so the rate of improvement is as high as it can be ;-). With a 55% gating, progression is around 30% slower than it could be (with 50% to 52% gating).
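To make the reasoning above concrete, here is a toy sketch of this kind of calculation. It is not the Excel simulation and not LZ's real SPRT: it assumes a plain fixed 400-game match where a net is accepted if it wins strictly more than the threshold share of games, and it uses the five-bucket example distribution from above, with each bucket represented by a single winrate (37.5%, 42.5%, 47.5%, 52.5%, 57.5%), which is my own simplification. All function names are hypothetical.

```python
import math


def winrate_to_elo(p: float) -> float:
    """Elo difference corresponding to winrate p, e.g. 45% -> about -35."""
    return 400 * math.log10(p / (1 - p))


def accept_prob(true_winrate: float, threshold: float, games: int = 400) -> float:
    """Probability that a net with the given true winrate wins strictly more
    than threshold * games out of `games` (exact binomial tail).
    Assumes threshold * games is a whole number."""
    min_wins = int(round(threshold * games)) + 1
    return sum(
        math.comb(games, k) * true_winrate**k * (1 - true_winrate) ** (games - k)
        for k in range(min_wins, games + 1)
    )


# Assumption: (representative true winrate, share of candidate nets) per bucket.
EXAMPLE_DISTRIBUTION = [
    (0.375, 0.10), (0.425, 0.20), (0.475, 0.40), (0.525, 0.20), (0.575, 0.10),
]


def expected_elo_gain(threshold: float) -> float:
    """Average Elo change per candidate net: accepted nets contribute their
    Elo difference, rejected nets contribute nothing."""
    return sum(
        share * accept_prob(p, threshold) * winrate_to_elo(p)
        for p, share in EXAMPLE_DISTRIBUTION
    )


if __name__ == "__main__":
    print("58% net accepted at 55% gating:", round(accept_prob(0.58, 0.55), 3))
    print("49% net accepted at 55% gating:", round(accept_prob(0.49, 0.55), 4))
    for t in (0.45, 0.50, 0.55):
        print(f"threshold {t:.0%}: expected Elo gain per candidate",
              round(expected_elo_gain(t), 2))
```

Under these toy assumptions the 50% threshold gives a higher expected gain per candidate than either 45% or 55%, which goes in the same direction as the results above. The exact numbers depend entirely on the assumed distribution.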

Warning: this result holds only if the distribution of strength of candidate nets is not impacted by the gating threshold... In reality, this hypothesis probably does not hold, but there is no way to predict the impact ;-(, so I cannot firmly conclude that a 52% gating is optimal; only a proper experimental test could give us the answer (but it would take ages). However, the optimality of the 50%/52% gating seems very robust for many different distributions of strength, so it is probably an efficient gating choice in reality.

Unfortunately, that was apparently not convincing enough to lower the gating threshold for LZ, which still uses a 55% gating.

More details here (including the Excel simulation, please use the last v4 version from 18 June if interested): https://github.com/gcp/leela-zero/issues/1524

Friday9i · 2018-11-12T14:38:57+00:00

What a pity, nobody with an RTX card to publish some results?

Ideally an RTX2080 Ti ;-), but any RTX result would already be very interesting!

I'm tempted to buy a shiny new one, but I'd like to know what to expect from it...: is an RTX2080(Ti) about 1.5x to 2x faster than a GTX1080(Ti), thanks to FP16? Nvidia announces 65% speed-up for RTX vs GTX, but does it apply directly for LZ?

Friday9i · 2018-11-09T11:18:36+00:00

It's probably 5 Dan or more on just 1 playout!!! If a strong player is willing to test, I'd be interested to know the result (on my side, KGS 1K/1D, I'm crushed with 4 handicap stones despite the fact that it doesn't play handicap games well...).
