Neural Network training repository for the Black Marlin chess engine.
This repo is or was relied upon by a number of other engines, including but not limited to Viridithas, Svart, and Carp.
- Python 3
- Cargo (Rust)
- Numpy
- PyTorch
- Clone the repo with
git clone https://github.com/dsekercioglu/marlinflow
- Build the data parser, using native CPU optimisations for your system:
cd parse
cargo rustc --release -- -C target-cpu=native
- Locate the resulting .so/.dll in the target/release/ directory and move it to the trainer/ directory, renamed as libparse.so/libparse.dll.
- Create some directories for training output in the trainer/ directory:
In trainer/, do
mkdir nn
mkdir runs
- Decide upon the directory in which you want to store your training data. (Simply making a data/ directory inside trainer/ is a solid option.)
- Place your data file in the directory created in step 5. (If you don't have one, consult Getting Data.)
- In trainer/, run main.py with the proper command line arguments:
A typical invocation for training a network looks like this:
python main.py \
--data-root data \
--train-id net0001 \
--lr 0.001 \
--epochs 45 \
--lr-drop 30 \
--batch-size 16384 \
--wdl 0.3 \
--scale 400 \
--save-epochs 5
- --data-root is the directory created in step 5.
- --train-id is the name of the training run.
- --lr is the learning rate.
- --epochs is the number of epochs to train for.
- --lr-drop is the number of epochs after which the learning rate is dropped by a factor of 10.
- --batch-size is the batch size.
- --wdl is the weight of the WDL loss. (1.0 would train the network to only predict game outcome, while 0.0 would aim to predict only eval, and other values interpolate between the two.)
- --scale is the multiplier for the sigmoid output of the final neuron.
- --save-epochs n tells the trainer to save the network every n epochs.
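To make the --lr-drop and --wdl descriptions concrete, here is a minimal Python sketch of a step learning-rate schedule and an interpolated training target. It is not taken from the trainer's source: the function names are hypothetical, epochs are assumed to be counted from 0, and dividing the centipawn eval by the scale before the sigmoid is an assumption about how --scale is applied (a common convention in NNUE trainers).

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def learning_rate(epoch: int, base_lr: float = 0.001, lr_drop: int = 30) -> float:
    """Step schedule: divide the rate by 10 once `lr_drop` epochs have passed.

    Assumption: epochs are counted from 0, so epochs 0..lr_drop-1 use base_lr.
    """
    return base_lr if epoch < lr_drop else base_lr * 0.1


def blended_target(eval_cp: float, outcome: float,
                   wdl: float = 0.3, scale: float = 400.0) -> float:
    """Interpolate between game outcome (wdl=1.0) and squashed eval (wdl=0.0).

    Assumption: the eval is mapped into [0, 1] with sigmoid(eval_cp / scale).
    """
    eval_as_score = sigmoid(eval_cp / scale)
    return wdl * outcome + (1.0 - wdl) * eval_as_score
```

With the default --wdl 0.3, a position's target is 30% game result and 70% squashed evaluation, which matches the interpolation described in the option list.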
- Convert the resulting JSON network file into a format usable by your engine:
The trainer will output a number of files in the nn/ directory:
- Files of the form net0001_X are saved state_dict files, which you can ignore (unless you're aiming to resume a half-completed training run).
- net0001.json is what you're interested in: a JSON file containing the final weights of the network.
In order to use the network, you will need to convert the JSON file into a more usable format, and you will almost certainly want to quantise it. For simple perspective networks, this can be done with nnue-jsontobin, while for more complex networks like HalfKP and HalfKA (or ones you have designed yourself!) you will need to employ some elbow grease.
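As a rough illustration of the quantisation step, the sketch below multiplies each float weight by an integer factor, rounds it, and clamps it to the int16 range. Everything specific here is an assumption for illustration: the JSON layout (an object mapping parameter names to nested lists of floats), the key names, and the factor of 255 are hypothetical, and your engine's expected layout and factors may well differ.

```python
import json


def quantise(values, factor: int, lo: int = -32768, hi: int = 32767):
    """Recursively scale floats by `factor`, round, and clamp to [lo, hi]."""
    if isinstance(values, list):
        return [quantise(v, factor, lo, hi) for v in values]
    return max(lo, min(hi, round(values * factor)))


# Hypothetical network file contents; real key names and shapes will differ.
example_json = '{"ft.weight": [[1.0, -0.25], [0.004, 200.0]], "ft.bias": [0.2, -0.2]}'

network = json.loads(example_json)
quantised = {name: quantise(params, factor=255) for name, params in network.items()}
```

Clamping matters because a single outlier weight would otherwise wrap around when stored in a 16-bit integer.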
To train a network, you will need a large amount of training data. There are a number of possible sources for this data, the most common being to generate it with your own chess engine, which requires that you write some datagen code. It is recommended that your data generator produce data directly in the marlinflow data format rather than in the legacy text format (see Legacy Text Format): the marlinflow format is significantly more compact, and it avoids the conversion step that the text format requires.
To convert a file in the legacy text format into a data file, use marlinflow-utils, which is built in much the same way as the parser:
cd utils
cargo rustc --release -- -C target-cpu=native
The resulting binary will be in target/release/, and can be invoked as follows:
target/release/marlinflow-utils txt-to-data INPUT.txt --output OUTPUT.bin
Marlinflow accepts a specific text format for conversion into data files, with lines set out as follows:
<fen0> | <eval0> | <wdl0>
<fen1> | <eval1> | <wdl1>
Here, <fen> is a FEN string, <eval> is an evaluation in centipawns from white's point of view, and <wdl> is 1.0, 0.5, or 0.0, representing a win for white, a draw, or a win for black, respectively.
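If your data generator has to emit the legacy text format, a record can be written and read back along these lines. This is a minimal sketch: the function names are hypothetical, and treating the evaluation as an integer number of centipawns is an assumption.

```python
def format_record(fen: str, eval_cp: int, wdl: float) -> str:
    """Render one line as '<fen> | <eval> | <wdl>'."""
    if wdl not in (1.0, 0.5, 0.0):
        raise ValueError("wdl must be 1.0, 0.5, or 0.0")
    return f"{fen} | {eval_cp} | {wdl:.1f}"


def parse_record(line: str):
    """Split a line back into (fen, eval_cp, wdl); FEN strings never contain '|'."""
    fen, eval_text, wdl_text = (field.strip() for field in line.split("|"))
    return fen, int(eval_text), float(wdl_text)


START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
line = format_record(START_FEN, 25, 0.5)
```

Remember that both the evaluation and the result are from white's point of view, so an engine that scores positions relative to the side to move needs to flip the sign when black is to move.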
marlinflow-utils is a program that provides a number of utilities for working with marlinflow. These are as follows:
- txt-to-data converts a legacy text file into a data file.
- shuffle shuffles a data file. It is extremely important to shuffle your data before training, to prevent overfitting.
- interleave randomly interleaves data files. This allows you to cleanly combine data from multiple sources without requiring a re-shuffle, provided that the source files have already been shuffled.
- convert will convert an NNUE JSON file into the BlackMarlin NNUE format. (currently only supports HalfKP)
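The idea behind random interleaving can be sketched in a few lines of Python. This is not the marlinflow-utils implementation, only an in-memory illustration with a hypothetical function name: at each step a source is chosen with probability proportional to how many records it has left, so already-shuffled inputs combine into a uniformly mixed output without a full re-shuffle.

```python
import random


def interleave(sources, rng=None):
    """Randomly merge sequences, preserving the order within each source."""
    rng = rng or random.Random()
    positions = [0] * len(sources)
    remaining = [len(source) for source in sources]
    total = sum(remaining)
    merged = []
    while total > 0:
        # Choose a source with probability proportional to its remaining records.
        pick = rng.randrange(total)
        for index, count in enumerate(remaining):
            if pick < count:
                break
            pick -= count
        merged.append(sources[index][positions[index]])
        positions[index] += 1
        remaining[index] -= 1
        total -= 1
    return merged


merged = interleave([[1, 2, 3], ["a", "b"]], random.Random(0))
```

Because each source is consumed strictly in order, the mix is only as good as the shuffling of the inputs, which is why the source files need to be shuffled first.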