- AlphaGo Zero paper
- AlphaZero for Chess and Shogi
- Lessons from Implementing AlphaZero Blog Post
- A Simple Alpha(Go) Zero Tutorial
The latest models can be found here. Unzip the file and follow the instructions in the README.
The latest models can be found here if you want to continue learning from a given point. Check the README for the number of residual blocks.
Part of the training consists of generating games to train on. To speed this step up, multiple games are generated in parallel; several approaches are implemented. In parallel with this step, it is important to run the continuous learning (see below).
The currently preferred method uses files to communicate. Here are example commands to run it; other parameters can be found in janggi/parameters.py.
```
CUDA_VISIBLE_DEVICES=0 python3 inference_service_files.py --root_file_inference /tmp --n_residuals 40 --batch_size 16&
python3 game_generation_files.py --root_file_inference /tmp --n_iterations 200 --number_simulations 800 --n_processus 32 --n_episodes 32 --c_puct 1.0
```
The following methods are generally slower or deprecated.
game_generation_processes.py uses processes to generate games in parallel. First, we launch a process running the inference service, which, given the board features, provides the probability distribution over actions and the value of the board. Then, we start the game-generation processes. All processes communicate via shared memory (available since Python 3.8).
This setup batches inference requests together in order to maximize GPU usage.
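One way to batch requests over shared memory is to give each worker a fixed slot plus a ready flag; the service sweeps the flags and assembles every ready slot into one batch. A sketch with a hypothetical layout (slot sizes, flag encoding, and function names are illustrative, not what game_generation_processes.py actually does):

```python
from multiprocessing import shared_memory

N_WORKERS, N_FEATURES = 4, 8  # illustrative sizes

def create_buffers():
    """One float32 feature slot per worker plus a per-slot ready byte."""
    feats = shared_memory.SharedMemory(create=True, size=N_WORKERS * N_FEATURES * 4)
    flags = shared_memory.SharedMemory(create=True, size=N_WORKERS)
    return feats, flags

def worker_submit(feats, flags, slot, features):
    """Game-generation process: copy features into its slot, then raise the flag."""
    view = feats.buf.cast("f")
    for j, x in enumerate(features):
        view[slot * N_FEATURES + j] = x
    flags.buf[slot] = 1  # signal the inference service that this slot is ready

def service_collect_batch(feats, flags):
    """Inference process: gather every ready slot into one batch for the GPU."""
    view = feats.buf.cast("f")
    ready = [i for i in range(N_WORKERS) if flags.buf[i] == 1]
    batch = [list(view[i * N_FEATURES:(i + 1) * N_FEATURES]) for i in ready]
    for i in ready:
        flags.buf[i] = 0  # slot is free again for its worker
    return ready, batch
```

The flag must be raised only after the features are fully written, so the service never batches a half-filled slot.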
inference_service_files.py predicts values and probabilities from features written to files.
game_generation_files.py generates the games, writing feature files for prediction.
inference_service.py contains two inference services: one runs a web server with Flask, and the other uses sockets directly. To choose between the two, modify the last lines of the file (sockets by default).
game_generator.py generates the games, using for inference either the web server (ServicePredictorWeb) or the sockets (ServicePredictorSocket). To choose between the two, change the predictor in the main function (socket by default).
game_generator_not_nn.py generates games without a neural network, using a simple MCTS with an evaluation function based on the current score.
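In that variant, the value network is replaced by a hand-written leaf evaluation, while tree search still balances exploitation and exploration. A sketch of the two ingredients (the function names, the tanh squashing, and the exploration formula are illustrative choices, not necessarily those of game_generator_not_nn.py):

```python
import math

def score_evaluation(my_score, opponent_score):
    """Leaf evaluation from the running game score, squashed into [-1, 1]
    so it plays the same role as a value-network output."""
    return math.tanh((my_score - opponent_score) / 10.0)

def uct_select(children, c_puct=1.0):
    """Pick the child maximizing mean value plus an exploration bonus.
    Each child is a dict with 'visits' and 'value_sum' statistics."""
    total = sum(child["visits"] for child in children) or 1

    def uct(child):
        q = child["value_sum"] / child["visits"] if child["visits"] else 0.0
        u = c_puct * math.sqrt(math.log(total + 1) / (child["visits"] + 1))
        return q + u

    return max(children, key=uct)
```

The exploration term guarantees that rarely visited moves are eventually tried even when their current average value is low.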
train_model.py generates games sequentially and trains on them. It is slow; do not use it.
Once games have been generated, it is possible to train a new model on them.
train_model_supervised.py trains a model in a supervised way, given a file containing existing games in a specific format. See data/ for examples.
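Supervised training on recorded games typically minimizes an AlphaZero-style loss per position: cross-entropy between the predicted move distribution and the move actually played, plus squared error between the predicted value and the final game outcome. A minimal sketch of that loss (the function name and equal term weighting are assumptions; train_model_supervised.py may differ, e.g. by adding L2 regularization):

```python
import math

def alphazero_supervised_loss(policy_logits, policy_target, value_pred, value_target):
    """Cross-entropy on the move distribution plus MSE on the game outcome,
    computed for a single position."""
    # Numerically stable log-softmax over the raw policy logits.
    m = max(policy_logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in policy_logits))
    log_probs = [z - log_norm for z in policy_logits]
    policy_loss = -sum(t * lp for t, lp in zip(policy_target, log_probs))
    value_loss = (value_pred - value_target) ** 2
    return policy_loss + value_loss
```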
continuous_learning.py continuously trains and evaluates the model on the games generated by the other methods. This is the preferred method. Here is an example of how to launch it:
```
CUDA_VISIBLE_DEVICES=0 python3 -u continuous_learning.py --n_iterations 200 --number_simulations 800 --n_fights 30 --c_puct 1.0 --n_epoch 1 --learning_rate 0.001 --n_residuals 40 >> continuous_learning.txt
```
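The `--n_fights 30` parameter suggests the classic AlphaZero evaluation step: after training a candidate, it plays a number of evaluation games ("fights") against the current best model and replaces it only if it wins often enough. A sketch of one such iteration (the function names, the 55% threshold, and the exact promotion rule are assumptions, not necessarily what continuous_learning.py does):

```python
def continuous_learning_step(best_model, train_fn, fight_fn, n_fights=30, threshold=0.55):
    """One train/evaluate iteration.

    train_fn(model) returns a candidate trained on the latest generated games;
    fight_fn(candidate, best) plays one game and returns 1 if the candidate wins.
    """
    candidate = train_fn(best_model)
    wins = sum(fight_fn(candidate, best_model) for _ in range(n_fights))
    if wins / n_fights > threshold:
        return candidate  # candidate is promoted to best model
    return best_model     # keep the current best
```

Keeping the promotion threshold above 50% guards against replacing the best model on statistical noise.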
janggi/ contains code to run a janggi game and some useful functions.
ia/ contains code related to training, MCTS and neural networks.
data/ contains real-world games and some scripts used to generate them.