2018-08-31

Useful advice

It's not entirely relevant to Lc0, but many people who follow CCCC wonder how to disable the sound.

If you use the Chrome browser, you can mute the tab by right-clicking the tab's header and choosing "Mute site" from the menu. :)
The same works in Firefox, and possibly in other browsers too.


There is also a JavaScript snippet, written by the community, which shows the material difference.
Link: https://pastebin.com/raw/R3fY11zY
To use it, just paste the contents of that snippet into the JavaScript console while CCCC is open.
(In Chrome, that's F12 and then the Console tab.)
UPD: There is also a variant which shows the 50-move clock.
UPD2: And a variant which does that even better (it also resets on captures, not only on pawn moves).

Test20 progress

Test20 has been running for a few hours already, and several people have expressed concern that progress this time is slower than it was in previous runs.

Slower progress at the start is expected (per network, not so much per unit of time), for three reasons:

1. We generate networks much more frequently than in previous tests, so there are fewer games per network and less training per network.

2. Our training window is 500,000 games from the very beginning, and we generated 500,000 random games. We need 500,000 non-random games before the random games fully leave the training window. Until then, random games are still used for training.

3. Cpuct was changed to 5; training is expected to be slower with it at first.

(Credit to Tilps, who handles training, for this explanation.)

CCCC starts.



Chess.com Computer Chess Championship starts today.

24 engines will participate in a double round-robin tournament, playing each other twice, with 15 minutes per player per game plus a 5-second-per-move increment, and with pondering (thinking on the opponent's time) on. No opening books will be used in the first round; every engine will calculate all of its moves by itself.
Leela will play on four Tesla V100 GPUs, while the other engines will run on 46 threads of a 2 x Intel Xeon Platinum 8168 (2.70 GHz) system, which has 48 physical cores and 96 threads.

The hardware is very fast and the engines are among the strongest in existence, so the level of play will be amazing.
Every engine will play 46 games, so there will be 46 rounds.
After all games are completed, the top 8 of the 24 engines will advance to round 2.

2018-08-30

Training run reset

Update

After the restart, the initial training of test20 (from random games) is not working as expected.

Some network metrics are not moving the way we expected (for example, MSE loss was expected to drop suddenly, but it didn't; it actually jumped up instead, which can be followed here). Something is wrong with the training, and we are investigating.

The original plan for this contingency was to revert to test10 and investigate in the background while test10 training continued. However, the person who handles training has a personal emergency today, so it's not yet clear if/when the revert will happen.

For now, no training games are being generated by contributors; your GPUs are kept cool.

Update2

After more than a day of running the initial training, and some tweaking of training parameters, the initial network of test20 is taking the expected shape (e.g. MSE loss has dropped).
So, no revert to test10. We'll start the reinforcement learning stage of test20 soon, and after that short break clients are expected to start generating training games again.

Update3

test20 training has finally started! The first network trained from non-random self-play games will be id20058. Networks id20000–20056 were intermediate networks from the initial training, and id20057 is the final seed network.



As planned, we have concluded our test10 run, and now it is time for another one.
Test10 was undoubtedly a success, but it had reached its limit. The vote on Discord showed that the community wanted the reset as soon as possible, and that's what we did. :)



We used to keep network identifiers aligned with test numbers (e.g. test5 had network ids 5xx), but we had so many networks for test10 that they overflowed into ids 11xxx, so the next test is called test20.

At the current game rate, it is expected to take 6-7 weeks for test20 to become stronger than the latest networks from test10.

Changes

What didn't change

Before telling you what's new in the next run, let me list what was promised but didn't make it:
  • Weights quantization is not enabled.
    It is implemented, but we haven't tested it enough to confirm that it doesn't lead to weaker nets.
  • SWA (Stochastic Weight Averaging).
    The implementation turned out to be too slow; optimizations are needed.
  • Training multiple networks in parallel.
    With the frequent network generation we plan, the training pipeline won't be able to keep up with that.
    There are plans to employ several GPUs during training, but that's not implemented yet.
  • It's not main2, but rather test20.
    It's running on the test server, but at least we updated the server version.

What did change

And now, how test20 will be different from test10:
  • Cpuct will be equal to 5.
    That's the value DeepMind used in AlphaGo (they did not mention Cpuct values in the AlphaGo Zero and AlphaZero papers).
    This is expected to make Leela better at tactics and to add more variance to openings.
  • Rule50 bug fixed.
    Leela will be able to use information about the number of moves without captures or pawn moves.
  • Cache history bug fixed.
    We recently found a bug where a different transposition of the same position could be fetched from the NN cache, while in reality the NN can return different output depending on history. That has been fixed.
  • Better resign threshold handling.
    We'll watch at which eval value the probability of resigning correctly becomes 95%, and adjust the threshold dynamically.
  • Frequent network generation, ~40 networks per day.
    Test10 started with only ~4 networks per day.
  • Larger batch size in the training pipeline.
    This is closer to what DeepMind did for AlphaZero and should reduce overfitting.
  • Ghost Batch Normalization from the start.
    (I don't really know what it is.) Also closer to what DeepMind did, and it also prevents overfitting.
  • En passant + threefold repetition bug fixed.
    This was a minor bug which probably won't have much effect: after a pawn moved by two squares, the position was never counted towards threefold repetition.

2018-08-27

CCCC

As most of you are already aware, Leela will participate in the upcoming season of CCCC!

CCCC (chess.com computer chess competition) is a tournament where top chess engines compete in a variety of formats, settings, and time controls on high-end hardware. Chess.com has run computer chess competitions in the past, but this time CCCC features a really good, shiny brand-new interface which will make watching it even more fun (and this is also the first time Leela participates, which adds to the fun :-P).

Leela will run on four V100 GPUs. That is pretty good hardware, and we hope that Lc0 will be able to show interesting games against top chess engines.

The network that Lc0 will use is id11089.

Endgame tablebases will be disabled. We wrote earlier about endgame weirdness caused by supporting only WDL probes but not DTZ, and the LCZero community voted on Discord against using tablebases this season.


The games start on August 31st. The first CCCC season will be called CCCC 1: Rapid Rumble. It will be a round-robin tournament among 24 engines, with a 15+5 time control, ponder on, and no opening book.

Come watch and support Leela in CCCC chat!

Where to follow:

Lc0 v0.17.0 has been released.

v0.17.0 is out of "release candidate" status and is now fully released!

Can be downloaded here.

It has no changes relative to RC2. For the list of differences relative to v0.16, see posts for v0.17.0-rc1 and v0.17.0-rc2.

Everyone who contributes training games is now encouraged to switch to version v0.17.0.

After the network training reset (probably a few weeks from now), only v0.17 will be accepted, because v0.16 has the rule50 encoding bug.

2018-08-23

Tablebase support and Leela weirdness in endgame

As announced earlier, Leela now has partial endgame tablebase support.

The support in v0.17.0 is partial: only WDL tables are probed, not DTZ.
That means Leela can only query the tablebase for positions immediately after captures and pawn moves; for all other positions it has to think by itself.

While this improves playing strength on average, the lack of DTZ queries often causes weird endgame effects and losing play.

For example, Leela may reach a 7-man position with a considerable advantage (99% probability of win) and then "simplify" into a 6-man "won" position by simply giving up material. That 6-man position is "won" from the tablebase's point of view, so it has a 100% win probability, and Leela happily goes there.
However, after that move Leela has to play by itself, and the position may be really hard for Leela to win without tablebases. Not rarely this leads to drawn or lost games which Leela would have won or drawn had it played without tablebases at all.



The code for DTZ support is ready but not tested, so we are not releasing it in v0.17.0, which will be used in CCCC.

We are watching the CCCC test that is currently running and, depending on how it goes, may ask the CCCC team to disable tablebase usage for LCZero this season (but probably won't).

Test CCCC gauntlet with Leela is live!

As you know, we are releasing Lc0 v0.17.0 to participate in the next season of CCCC (chess.com computer chess competition), which will be the first season with the new updated design.
This version has support for pondering and partial support for endgame tablebases, both of which will be useful for CCCC.

The CCCC team kindly agreed to run a testing gauntlet between Lc0 and a bunch of other chess engines before the main event.

This gauntlet is LIVE right now!
Link to watch: http://chess.com/cccc

Enjoy!

Test10 learning rate has been lowered

The learning rate for the test10 training run has been lowered to 0.0002. Network id 11013 will be the first network trained with the new LR.

This is the last time we lower it for test10, to squeeze some more Elo out of it. The result is expected to be visible within a day or two.

Test10 will probably stay around for a few more weeks, and after that the plan is to do a reset and start a main2 run from scratch.

What will change after restart:
  • int8 quantization during training.
    That's how DeepMind did it. This will produce networks compatible with the TensorRT framework, which should considerably improve nps on supported hardware.
    We tried to quantize existing nets, but it doesn't really work that way: the Elo drop was about -300.
  • Training with Stochastic Weight Averaging.
    That will hopefully result in better network quality.
  • Rule50 plane.
    As I wrote in a few previous blog posts, it turned out that information about the 50-move rule counter was not available to the network. That will be fixed.
  • The value of the Cpuct constant will be increased during training.
    That may allow Leela to see tactics better.
  • It's possible that we'll train multiple network sizes in parallel, but recently training was really back-to-back, and we are not sure there will be capacity even for two networks.

2018-08-21

Lc0 v0.17.0-rc2 has been released.

The "Release Candidate 2" for the Lc0 version v0.17 has been published!
Available to download here.

Release candidate 1 was mostly bug-free, but there were still things to tweak:

  • The rule50 encoding bug was fixed.
  • The default batch size for OpenCL changed to 16.
    The up-to-5x speedup promised in RC1 should now be visible with default settings.
  • Time management constants were tweaked a bit.

Feel free to use this version for training, but it's not necessary. The "rule50" fix is expected to have neither a positive nor a negative effect on networks in the test10 run.
(Reason: all weights related to that plane have been equal to 0 for a long time due to regularization, and it's not really possible to recover from that state.)


We hope that no further changes will be needed and this release candidate will become the v0.17.0.

We've sent this version to CCCC organizers, and it's quite possible that they will have another test of Lc0 playing before the main event, so follow their news if you are interested! Links to CCCC:



2018-08-20

Rule50 encoding bug is found

We had numerous network-encoding issues in the past, and now, after a pretty long pause, we have found yet another one! :)

It turns out that the 50-moves-without-capture-or-pawn-move counter was located in the wrong place in the training data, so networks were trained without that information.

The bug has existed since the first version of lc0.exe, but wasn't present in lczero.exe (v0.10). That may explain the slight Elo drop when we fully switched to lc0.exe (v0.16).

This bug will be fixed in the upcoming v0.17.0.
It may, however, cause a slight Elo drop in the networks after that, as they need time to adapt.


And for the curious, here is what the bug was.

In the code:
struct V3TrainingData {
  uint32_t version;
  float probabilities[1858];
  uint64_t planes[104];
  uint8_t castling_us_ooo;
  uint8_t castling_us_oo;
  uint8_t castling_them_ooo;
  uint8_t castling_them_oo;
  uint8_t side_to_move;
  uint8_t move_count;  // Not used, always 0.
  uint8_t rule50_count;
  int8_t result;
};

Should be:
struct V3TrainingData {
  uint32_t version;
  float probabilities[1858];
  uint64_t planes[104];
  uint8_t castling_us_ooo;
  uint8_t castling_us_oo;
  uint8_t castling_them_ooo;
  uint8_t castling_them_oo;
  uint8_t side_to_move;
  uint8_t rule50_count;
  uint8_t move_count;  // Not used, always 0.
  int8_t result;
};

Spot the difference!

2018-08-19

Lc0 v0.17.0-rc1 has been released.

The release candidate of a new version of the Lc0 engine has been released.

v0.17.0-rc1

We expect to have a stable v0.17.0 release in one week, so that we can use it for CCCC. For now, you can either help us find bugs by trying RC1, or keep using v0.16.

Download and full changelog here.


Change highlights:

  • Syzygy Tablebases support.
    Only WDL probing for now, i.e. the engine only probes positions after pawn moves and captures.
  • Ponder support.
  • Batch support for the OpenCL backend, which gives up to a 5x speedup.

    UPD: It turned out that OpenCL batching is off by default, so the 5x speedup is not visible.
    To enable it, use a flag like --backend_opts=batch_size=16.
    Due to another issue, batch size 16 may require too much VRAM, in which case the engine doesn't start. If that happens, try lower values.
  • Windows CUDA version of Lc0 now includes all required .dlls.

Welcome to LCZero blog v2!

Welcome to the new Leela Chess Zero blog! The old blog is gone, and here it is, new and fresh. We will write new posts here.

You can find old posts here: http://archive.is/https://blog.lczero.org/*.