
Documentation for repository structure #956

Open
marcinbogdanski opened this issue Jan 9, 2020 · 3 comments
@marcinbogdanski

Hi

Firstly, thanks for all the hard work on AlphaZero reproduction!

Is it possible to run the C++ implementation locally (without a Google Cloud cluster)?

I'm trying to run Minigo C++ locally, mostly for learning purposes. After exploring the repo, it's not completely clear to me what the purpose of the different folders is, how the Python and C++ implementations interact, or what the main entry point for local training would be.

So far I figured (perhaps incorrectly):

  • cc - is this a fully standalone implementation? Can it be used to train a model, with concurrent_selfplay as the entry point? Does it talk to Python in any way?
  • cluster - looks like Kubernetes stuff; can it be ignored when running locally?
  • ml_perf - the script start_selfplay.sh calls the C++ concurrent_selfplay, but train.py calls Python? I guess this is an MLPerf wrapper and is not required when running locally?
  • rl_loop - looks like some kind of wrapper?
  • the .py files in the minigo folder - these look like the Python implementation? Is it fully independent, or does it talk to C++ in any way?

My current overall theory is that self-play (and test games?) can be run in either C++ or Python, but training the neural network can only be done in Python, and the wrappers take care of switching between the two at the right times?

It seems I need to bootstrap, then self-play/train in a loop. The follow-up issue is that there are multiple bootstrap, selfplay and train scripts across the repository, some of them wrappers around others, and it is not obvious to me which folder contains the "master" training loop for local C++ end-to-end execution (if that is possible at all?).

Thanks in advance; I will keep digging in the meantime.

@tommadams
Contributor

It certainly is possible to run the Minigo pipeline locally, though I have personally never done so :)

Your understanding of the codebase is correct. The selfplay is all done in C++ by concurrent_selfplay (the Python version still runs, but it is slower and has fewer features). Training is done in Python.
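
To make the interplay concrete, here's a hypothetical sketch of the overall reinforcement learning loop. It is only meant to illustrate how the two halves alternate, with all flags elided; the real orchestration lives in the scripts under ml_perf/scripts:

#!/bin/bash
# Hypothetical sketch, not the real ml_perf loop: it only shows how the
# two halves alternate. All flags are elided; see ml_perf/scripts for
# the actual invocations.
NUM_ITERATIONS=10
for i in $(seq 1 "$NUM_ITERATIONS"); do
  # Selfplay: the C++ binary plays games with the current model and
  # writes training examples to disk.
  ./bazel-bin/cc/concurrent_selfplay   # ...selfplay flags...
  # Training: Python reads those examples and produces the next model.
  python3 train.py                     # ...training flags...
done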

It sounds like the ml_perf directory is what you need: it's a self-contained benchmark that trains a small model that learns to play something that looks like 19x19 Go within a day or two on a VM with 8 V100 GPUs and 96 cores. If you want to try something smaller to start with, you can train a 9x9 model much more quickly (just change --board_size=19 to --board_size=9 in the instructions). You'll also have to bootstrap the training process using random games instead of the checkpoint that the benchmark instructions describe:

./ml_perf/scripts/bootstrap.sh \
        --board_size=9 \
        --base_dir=$BASE_DIR

Please let us know how you get on or if you have any questions; we'll be happy to help. Good luck!

@marcinbogdanski
Author

marcinbogdanski commented Jan 9, 2020

So, just to confirm: if I run the instructions from ml_perf/README.md but replace:

    # Download & extract bootstrap checkpoint.
    gsutil cp gs://minigo-pub/ml_perf/0.7/checkpoint.tar.gz .
    tar xfz checkpoint.tar.gz -C ml_perf/

    # Download and freeze the target model.
    mkdir -p ml_perf/target/
    gsutil cp gs://minigo-pub/ml_perf/0.7/target.* ml_perf/target/
    python3 freeze_graph.py --flagfile=ml_perf/flags/19/architecture.flags  --model_path=ml_perf/target/target

    # Set the benchmark output base directory.
    BASE_DIR=$(pwd)/ml_perf/results/$(date +%Y-%m-%d-%H-%M)

    # Bootstrap the training loop from the checkpoint.
    # This step also builds the required C++ binaries.
    # Bootstrapping is not considered part of the benchmark.
    ./ml_perf/scripts/init_from_checkpoint.sh \
        --board_size=19 \
        --base_dir=$BASE_DIR \
        --checkpoint_dir=ml_perf/checkpoints/mlperf07

with this (found in ml_perf/scripts/bootstrap.sh):

    # Set the benchmark output base directory.
    BASE_DIR=$(pwd)/ml_perf/results/$(date +%Y-%m-%d-%H-%M)

    # Plays selfplay games using a random model in order to bootstrap the
    # reinforcement learning training loop.
    # Example usage:
    ./ml_perf/scripts/bootstrap.sh \
        --board_size=19 \
        --base_dir=$BASE_DIR

then in theory I would at least be barking up the right tree?

@tommadams
Contributor

Yep, that looks like the correct tree.

I do recommend trying 9x9 before 19x19 though; it's around 10x faster:

BASE_DIR=$(pwd)/ml_perf/results/$(date +%Y-%m-%d-%H-%M)
./ml_perf/scripts/bootstrap.sh \
        --board_size=9 \
        --base_dir=$BASE_DIR

You may also want to change ml_perf/scripts/start_selfplay.sh to have the selfplay binary write SGF files after each game completes, for debugging purposes:

./bazel-bin/cc/concurrent_selfplay \
   --sgf_dir="${sgf_dir}/selfplay/\$MODEL/${device}" \
   etc...
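
(Note the escaped \$MODEL: the backslash stops the shell from expanding the variable when the script runs, so the literal $MODEL placeholder survives into the flag value, presumably to be replaced with the model name when the SGF files are written.)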
