2018-12-08

TCEC Season 14. Leela promoted from 3rd division to div2....



Leela's big journey towards the premier division of TCEC has started!
TCEC Season 14 has been running for the last couple of weeks. Leela took part in the 3rd division, easily finishing in first place, and now plays in the 2nd division, trying to earn promotion to the 1st division.

2018-12-07

AlphaZero paper, and Lc0 v0.19.1

As everyone has already heard, DeepMind has published a detailed paper on AlphaZero!

The announcement can be found here. Scroll down the announcement to get links to the full paper text as well as supplementary materials (including PGNs of games and training pseudocode).

The paper contains additional details that were missing in the original preprint from one year before. There were some aspects that were implemented in Leela differently from AlphaZero, and I'm sure we'll find some more.

Differences found

So, what differences have we found so far? Here is the list!
  • In training games, only the first 15 moves (30 ply) are generated with temperature randomness.
    To explore more possibilities during training games, randomness (including random blunders) was added to move selection. The preprint said that this happens for all moves. The final paper also says so, but if you look into the pseudocode, it turns out that it's only applied during the first 15 moves!
    Training new networks with the 15-move temperature setting may help us improve endgame play. Leela will no longer wait for the opponent to blunder while keeping too high an eval for drawn positions.
  • When playing against Stockfish, AlphaZero used a new technique to ensure game diversity.
    What AlphaZero did was pick a random move with eval within 1% of the best move's eval, for the first 15 moves. Surprisingly, that improved AlphaZero's winrate in those games.
    We can try that too!
  • Action space turned out to be 0..1, not -1..1
    That's more of a technical detail than something that changes the algorithm. In the AlphaGo paper, a loss was encoded as 0 and a win as 1. When the AlphaZero preprint came out, they wrote that they changed MCTS action values to -1 for a loss, 0 for a draw and 1 for a win. But in the end it turned out that this understanding was not correct: a loss is still 0, and a draw is 0.5.
    As I mentioned, it doesn't change algorithms. However, it changes the meaning of some constants from the paper.
  • Cpuct is not a constant
    Cpuct is a constant that sets the balance between exploration and exploitation in the search algorithm. It turned out that this constant is not a constant! Its value grows as the search progresses!
    We had plans to do something along those lines, as there were problems which were seemingly caused by a constant Cpuct. Namely, with a large number of nodes Leela would often get stuck on one move and never switch.
  • First Play Urgency value is known now. It's -1!
    FPU is a fancy name for the eval assigned to a node that has never been visited. We used a value based on the parent node (assuming that the eval of the children is roughly the same as the parent's eval). It turned out that AlphaZero just treats unvisited nodes as lost (with very little confidence though).
  • When training a new network, positions from the last 1 000 000 games are used.
    We have used the last 500 000 games so far, as that was the number mentioned in previous papers.
  • DeepMind generated new networks 4 times less often than we do.
    We were worried that we did that too rarely. But it turned out that we were fine; in fact, it's fine to have 4 times fewer networks per day.
  • The network architecture has differences.
    See here for the context.
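
The first two points in the list can be sketched together in Python. This is only an illustration, not Lc0's or DeepMind's code: the function name pick_move, its parameters, and the choice of visit counts and percentage evals as inputs are all assumptions made for the example.

```python
import random


def pick_move(visits, evals, move_number, temperature=1.0,
              cutoff_move=15, value_cutoff=None, rng=random):
    """Pick a move index from root statistics (hypothetical helper).

    visits[i] is the visit count of move i and evals[i] is its eval
    in percent (0..100). Names and defaults are assumptions.
    """
    # The most visited move is what a deterministic engine would play.
    best = max(range(len(visits)), key=lambda i: visits[i])
    # Past the cutoff move (or with no temperature) play the best move.
    if move_number > cutoff_move or temperature <= 0 or visits[best] == 0:
        return best
    candidates = list(range(len(visits)))
    if value_cutoff is not None:
        # Keep only moves whose eval is within value_cutoff percentage
        # points of the best move's eval.
        candidates = [i for i in candidates
                      if evals[i] >= evals[best] - value_cutoff]
    # Sample proportionally to visits^(1/temperature).
    weights = [visits[i] ** (1.0 / temperature) for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With value_cutoff=1.0 and a move number inside the first 15 moves, only moves within one percentage point of the best move can be sampled; after move 15 the most visited move is always returned.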

v0.19.1-rc2

What do those findings mean for us?

We want to experiment with new settings in play and training, so we are urgently releasing a new version, Lc0 v0.19.1, with the missing parameters added (as a release candidate today; the full release will follow in the next few days). There are lots of parameters, and many of them are expected to be renamed or rethought for v0.20. So, please welcome the new parameters:


  • --temp-cutoff-move=X
    After move number X, the temperature is fixed to the value set in the --temp-endgame flag.
    To reproduce the a0 vs sf8 match, set this to 16.
  • --temp-endgame
    See above for the meaning. This parameter is mostly exposed for training experiments. Default is 0, and it makes sense to keep it like that for play.
  • --temp-value-cutoff=X
    Only moves with eval within X percentage points of the best move are considered during the temperature pick.
    Set to 1.0 to reproduce the a0 vs sf8 match.
  • --temperature
    This is an old flag, but set to 10.0 to reproduce settings of match a0 vs sf8.
  • --fpu-strategy
    Default is "reduction", old way of handling first play urgency. Set to "absolute" to play like AlphaZero!
  • --fpu-value=X
    Only used in "absolute" FPU mode. -1.0 is the default, and that's what DeepMind used.
  • --cpuct
    That used to be a constant, and it was equal to 3.4 for quite a long time in Lc0.
    Correct value from AlphaZero is 2.5, but it slows down nps (will investigate why), so for now default is 3.0
  • --cpuct-base
    That's the factor that defines how Cpuct grows. The value from the DeepMind paper is 19652, and that's now the default.
  • --cpuct-factor
    That's the multiplier of the growing part of Cpuct. Default value now is 2, and that's what DeepMind used (well, they didn't have that factor, but as our action space is 2 times larger, we have to scale this parameter).
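
For illustration, here is how the three Cpuct flags might combine into a single value, written as a small Python sketch. The formula follows the shape of the pseudocode published with the AlphaZero paper (a constant plus a logarithmic term in the parent's visit count). The function name and the exact way Lc0 combines the flags are assumptions, not a copy of the engine code.

```python
import math


def effective_cpuct(parent_visits, cpuct=3.0, cpuct_base=19652.0,
                    cpuct_factor=2.0):
    """Cpuct value for a node with the given visit count (hypothetical)."""
    # The growing part: close to zero for small trees, increasing
    # logarithmically once the visit count passes cpuct_base.
    growth = math.log((parent_visits + cpuct_base + 1.0) / cpuct_base)
    # Assumed combination of the flags: constant part plus scaled growth.
    return cpuct + cpuct_factor * growth


for n in (0, 1000, 100000, 1000000):
    print(n, round(effective_cpuct(n), 3))
```

With these defaults the value stays close to --cpuct for small searches and only starts to grow noticeably once the visit count is comparable to --cpuct-base.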

Those parameters will appear in today's release candidate v0.19.1-rc2, which will be available for download here. (Yesterday there was already v0.19.1-rc1 which had one new parameter, but rc2 will have more!)

Note that most of those parameters probably won't have an immediately useful effect. For them to be useful, new networks have to be trained using those parameters.

Also, all those parameters were added into RC2 in a bit of a hurry. It's very probable there will be RC3 with fixes for bugs that I just introduced. If you see a bug, please report!

2018-11-19

Lc0 v0.19.0 has been released.

v0.19.0 is finally out of "release candidate" status, and now is fully released!
It has been quite a long bugfixing run with 5 release candidates, but now all known issues seem to be resolved.

Can be downloaded here.

For the list of differences relative to v0.18, see the post for v0.19.0-rc1.

For people contributing training games, there's no need to rush to upgrade, it's fine to use v0.18.

2018-11-13

Where to play Leela online?

The play.lczero.org web site, where everyone could quickly play Lc0 online, has been down pretty often recently.

But even while it doesn't work, there are some options to play Leela online.

The easiest way is to play on lichess.
There is, for example, a bot called LeelaChess, which is the very first lichess bot.
Also there are other bots of different configurations and strength. Check the all-the-leelas lichess team and pick the one which is online. You are also welcome to host your own Leela and join that team.

If you know other ways to play Leela online (FICS, etc), please reply in comments, I'll add it to this post.

And of course you can always download Leela and set it up to play locally. This blog post describes how to do that.

UPD: Kontrachess has a way to play with LCZero. It seems to be a very nice-looking site! I did not try it myself though. (Initially I thought it was a paid site, but one of the site representatives said in the comments below that it's actually free.)

UPD2: Also NextChessMove has a number of options including different networks of Lc0. From what I can see, it is similar to what play.lczero.org was. It takes some time to get a move from a free version, but probably paid version is faster (again, I don't know anyone who tried that).

2018-11-03

Lc0 v0.19.0-rc1 (UPD: rc2) has been released.

The release candidate of a new Leela version has been released:

(v0.19.0-rc1)
Upd: we are releasing v0.19.0-rc2 immediately because, due to a mistake in the release procedure, rc1 reported its version as v0.19.0-dev rather than v0.19.0-rc1.

We expect the testing phase to last around 7-10 days, after which the proper v0.19.0 will be released.

Download here. Please test it thoroughly and report any bugs that you find.
Note: cuDNN builds for Windows are now compiled with CUDA 10. You may need to update your GPU driver to run them.

Please don't use release candidates to generate training games. We only use stable versions for that.

What's new:

Search algorithm changes

When visiting terminal nodes and collisions, instead of counting that as one visit, estimate how many subsequent visits will also go to the same node, and do a batch update.
That should slightly improve nps near terminal nodes and in multithread configurations. Command line parameters that control that:
  • --max-collision-events – number of collision events allowed per batch. Default is 32. This parameter is roughly equivalent to --allowed-node-collisions in v0.18.
  • --max-collision-visits – total number of estimated collisions per NN batch. Default is 9999.

Time management

  • Multiple changes have been made to make Leela track used time more precisely (in particular, the moment when Leela starts its timer is now much closer to the moment GUIs start theirs).
  • For smart pruning, Leela's timer only starts when the first batch comes from NN eval. That should help against instamoves, especially on non-even GPUs.
  • Also Leela stops the search quicker now when it sees that time is up (it could continue the search for hundreds of milliseconds after that, which caused time trouble if opponent moves very fast).
Those changes should help a lot in ultra-bullet configurations.

Better logging

Much more information is now written to the log file. That will allow us to diagnose problems more easily if they occur. To have the debug file written, add a command line option:
--logfile=/path/to/logfile
(or short option "-l /path/to/logfile", or corresponding UCI option "LogFile")
It's recommended to always have logging on, to make it easier to report bugs when they happen.

Configuration parameters change

A large part of parameter handling has been reworked. As a result:
  • All UCI parameters have been changed to have more "classical" look.
    E.g. was "Network weights file path", became "WeightsFile".
  • Much more detailed help is shown than before when you run
    ./lc0 --help
  • Some flags have been renamed, e.g.
    --futile-move-aversion
    is renamed back to
    --smart-pruning-factor.
  • After setting a parameter (using command line parameter or uci setoption command), uci command "uci" shows updated result. That way you can check the current option values.
  • Some command-line and UCI options are hidden now. Use --show-hidden command line parameter to unhide them. E.g.
    ./lc0 --show-hidden --help

Also, in selfplay mode the per-player configuration format has been changed (although probably no one knew about it anyway):
Was: ./lc0 selfplay player1: --movetime=14
Became: ./lc0 selfplay --player1.movetime=14

Other

  • "go depth X" uci command now causes search to stop when depth information in uci info line reaches X. Not that it makes much sense for it to work this way, but at least it's better than nothing.
  • Network file size can now be larger than 64MB.
  • There is now an experimental flag --ramlimit-mb. The engine tries to estimate how much memory it uses and stops search when tree size (plus cache size) reaches RAM limit. The estimation is very rough. We'll see how it performs and improve estimation later.
    In situations when search cannot be stopped (`go infinite` or ponder), `bestmove` is not automatically outputted. Instead, the search stops making progress and outputs a warning.
  • Benchmark mode has been implemented. To run it, use the following command line:
    ./lc0 benchmark
    This feature is pretty basic in the current version, but will be expanded later.
  • As Leela plays much weaker in positions without history, it is now able to synthesize history so that it does not blunder in custom FEN positions. There is a --history-fill flag for it. Setting it to "no" disables the feature, setting it to "fen_only" (default) enables it for all positions except the chess start position, and setting it to "always" enables it even for startpos.
  • Instead of outputting the current win estimation as a centipawn score approximation, Leela can now show its raw score. The flag that controls that is --score-type. Possible values:
    • centipawn (default) – approximate the win rate in centipawns, like Leela always did.
    • win_percentage – value from 0 to 100.0 which represents the expected score in percent.
    • Q – the same, but scales from -100.0 to 100.0 rather than from 0 to 100.0

2018-10-28

Lc0 training.




 If you are new to Leela (Lc0) Chess and have begun contributing games either using Google Cloud or some other online service or your own home computer, you may be wondering where all those games go and how training of Leela happens.

2018-10-19

Leela beats Fire promoting to Semi-Final of TCEC Cup!




Leela, in classic dramatic style, advanced to the TCEC Cup Semi-Finals and will face Stockfish today!
Meanwhile, in the CCCC blitz tournament she is still in 3rd place, ahead of Komodo, Ethereal and Fire and behind Stockfish and Houdini.

2018-10-12

CCCC Blitz is running.... Leela on top 3!




The CCCC blitz tournament is running, and so far Leela is performing well, staying steadily in the top 3.


Conditions for the tournament are:
33 engines play a 4x Round Robin tournament: each engine plays every other engine 4 times (2 with black and 2 with white), for a total of 128 games per engine, with no opening books or predefined positions.
This creates a problem, though: since each pair of engines meets twice with each colour, how will variety of play be assured so that games are not duplicated? Apparently the organizers rely on the non-determinism of multithreaded search (traditional engines that use more than one thread are not deterministic, and neither is Leela when she uses more than one CPU thread; Leela mainly uses the GPU for her search, but also uses the CPU). This is not a very wise decision, and predefined positions should be used for the second half of the Round Robin.

•Time control of 5 minutes per game plus 2 seconds added time per move.
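
The game counts for this format are easy to check with a few lines of Python. The function name is made up for the example, and the total for the whole tournament is derived here rather than taken from the organizers.

```python
def round_robin_games(engines, rounds):
    """Return (games per engine, total games) for a multi-round round robin."""
    # Each engine meets every other engine `rounds` times.
    games_per_engine = (engines - 1) * rounds
    # Every game involves two engines, so halve the sum over all engines.
    total_games = engines * games_per_engine // 2
    return games_per_engine, total_games


print(round_robin_games(33, 4))  # (128, 2112)
```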

2018-10-11

Draw in Chess. Some odd cases.




Chess is a game with 3 distinct results: White wins, Black wins, or it is a draw and nobody wins. A draw can be reached in many ways in chess. These are:

•Stalemate position. A position where the player to move does not have a legal move to play and his King is not in check. Game immediately ends as a draw.

2018-10-10

Understanding Training against Q as Knowledge Distillation.


Article by Cyanogenoid, member of the Leela Chess Zero development team.



Recently, Oracle investigated training the value head against not the game outcome z, but against the accumulated value q for a position that is obtained after exploring some number of nodes with UCT [Lessons From AlphaZero: Improving the Training Target.]. In this post, we describe some of the experiments with Knowledge Distillation (KD) and relate them to training against q.


Background

Knowledge Distillation (KD) [1 , 2] is a technique where there are two neural networks at play: a teacher network and a student network. The teacher network is usually a fixed, fully-trained network, perhaps of bigger size than the student network. Through KD, the goal is usually to produce a smaller student network than the teacher -- which allows for faster inference -- while still encoding the same "knowledge" within the network; the teacher teaches its knowledge to the student. When training the student network, instead of training with the dataset labels as targets (in our case this is the policy distribution and the value output), the student is trained to match the outputs of the teacher.
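
As a rough illustration of "training the student to match the outputs of the teacher", the Python sketch below computes a distillation loss for a single position. It is not the code used in the experiments: the function name, the cross-entropy and squared-error terms, and their equal weighting are all assumptions.

```python
import math


def distillation_loss(student_policy, student_value,
                      teacher_policy, teacher_value, eps=1e-12):
    """Loss of the student against the teacher's outputs (hypothetical).

    Policies are probability distributions over the same list of moves.
    Values are scalars on the same scale.
    """
    # Cross-entropy between the teacher's and the student's policy.
    policy_loss = -sum(t * math.log(s + eps)
                       for t, s in zip(teacher_policy, student_policy))
    # Squared error between the two value outputs.
    value_loss = (student_value - teacher_value) ** 2
    # Equal weighting of both terms is an assumption of this sketch.
    return policy_loss + value_loss


teacher = [0.7, 0.2, 0.1]
print(distillation_loss([0.6, 0.3, 0.1], 0.25, teacher, 0.3))
```

The loss is smallest when the student reproduces the teacher's policy and value exactly, which is what distillation aims for.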

2018-10-09

Leela promotes to round of 16 in TCEC Cup with 2 nice wins!



 Leela, after 6 consecutive draws in her series against Laser (a division 1 engine), won the last 2 games and advanced to the next round, where she will face Ethereal (a premier division engine), which easily beat Rodent (a division 4 engine) 5-0 and advanced too. The games against Ethereal will probably take place this Sunday.

 Leela's performances generally seem very odd: when she plays top engines like Stockfish and gets countless draws, performing only around 40 Elo short of Stockfish (as in the TCEC bonus games), you would expect her to crush weaker engines. But this doesn't really happen.
 It's a general observation that Leela underperforms against weaker engines and a good analysis of this can be found HERE where it was found that the usual Elo curve does not fit Leela's results well.