The ultimate goal of this project is to create a competitive network that can play with all seven tetrominoes, which could be challenging [5]. The project uses a modified Netris (originally created by Mark H. Weaver) client, netris-env, for learning.
Current status:
- Experiment 3. For a single tetromino/piece "I" the model achieved the target after 2000 episodes.
- Experiment 9. For two tetrominoes/pieces "I" and "O" the model achieved the target after 4973 episodes. To accelerate learning a Netris solver was used, and the model has more layers than the one from experiment 3. Interestingly, the client never rotates the given tetromino. A short sketch of the DQN update such agents are trained with follows below.
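For reference, the experiments train a DQN agent (see [1], [2], [6], [7]). The snippet below is only an illustrative sketch of the temporal-difference target y = r + γ · max Q(s', a') that such an agent is trained towards; it is not code from experiment.py, and the function names, example numbers and discount value are made up.

```python
# Illustrative sketch of the DQN training target (Bellman backup), not code
# from experiment.py; all names and values here are made up.
import numpy as np

GAMMA = 0.95  # discount factor (illustrative value only)

def dqn_targets(rewards, next_q_values, dones):
    """Compute y = r + gamma * max_a' Q(s', a'), keeping only r for terminal states."""
    max_next_q = np.max(next_q_values, axis=1)            # best action value in the next state
    return rewards + GAMMA * max_next_q * (1.0 - dones)   # terminal transitions keep only the reward

# Example with two transitions and three possible actions
rewards = np.array([1.0, 0.0])
next_q = np.array([[0.2, 0.5, 0.1],
                   [0.0, 0.3, 0.7]])
dones = np.array([0.0, 1.0])                               # second transition ended the game
print(dqn_targets(rewards, next_q, dones))                 # targets: 1.475 and 0.0
```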
Download and compile netris-env
git clone https://github.com/MateuszJanda/netris
cd netris
./Configure
make
Download the project and prepare the environment for the robot
cd ~/
git clone https://github.com/MateuszJanda/netris-ai-robot.git
cd netris-ai-robot
virtualenv -p python3 venv
source venv/bin/activate
pip install -r requirements.txt
ln -s ~/netris/netris-env
Create a Docker container
docker run -v $PWD:/tmp -w /tmp --gpus all -it --name tf_netris --network host tensorflow/tensorflow:latest-gpu
Use the built-in *etris game environment.
docker start tf_netris
docker exec -it tf_netris python experiment.py --experiment 3 --use-gpu --local-env
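With --local-env the training loop interacts with an in-process Tetris simulator instead of the Netris server. The snippet below is only a rough sketch of that kind of reset/step loop; the class, method names and board encoding are made up for illustration and are not the actual API of experiment.py.

```python
# Made-up reset/step interface illustrating how an agent can interact with an
# in-process (local) environment instead of a networked Netris server.
import random

class ToyTetrisEnv:
    WIDTH, HEIGHT = 10, 20

    def reset(self):
        """Start a new game and return the initial (empty) flattened board."""
        self.drops = 0
        return [0] * (self.WIDTH * self.HEIGHT)

    def step(self, column):
        """Drop the current piece in `column`; return (state, reward, done)."""
        self.drops += 1
        reward = 1 if random.random() < 0.1 else 0   # pretend a line is sometimes cleared
        done = self.drops >= 100                     # pretend the game ends after 100 drops
        return [0] * (self.WIDTH * self.HEIGHT), reward, done

env = ToyTetrisEnv()
state, done = env.reset(), False
while not done:
    action = random.randrange(env.WIDTH)             # a trained agent would pick argmax_a Q(state, a)
    state, reward, done = env.step(action)
```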
By default Netris spawns a new robot every time a game ends, so to overcome this we need three elements:
- game environment/server (Netris itself), in this case a version tuned for faster learning: netris-env
- proxy robot (proxy.py), middleware that is run by the second Netris client and translates and passes communication between the Netris environment and the machine learning code (a rough sketch of this relay is shown after the list)
- machine learning code (experiment.py)
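The exact message format between the proxy and the machine learning code lives in proxy.py and experiment.py and is not reproduced here. The sketch below only illustrates the relay idea under two assumptions that this document does not confirm: that the Netris client talks to its robot over stdin/stdout, and that the learning process accepts newline-delimited messages on a TCP socket (port 9800, matching the commands below).

```python
# Hypothetical sketch only, not the actual proxy.py. Assumes stdin/stdout
# towards the Netris client and a newline-delimited TCP protocol towards the
# machine learning code on port 9800.
import socket
import sys

HOST, PORT = "localhost", 9800

def main():
    with socket.create_connection((HOST, PORT)) as sock:
        replies = sock.makefile("r")
        for line in sys.stdin:            # message coming from the Netris environment
            sock.sendall(line.encode())   # forward it to the machine learning code
            move = replies.readline()     # read the chosen move back
            sys.stdout.write(move)        # pass it on to the Netris client
            sys.stdout.flush()

if __name__ == "__main__":
    main()
```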
- On the first terminal run the Netris (environment) server:
./netris-env -w -u -i 0.1
- On the second terminal run the machine learning code.
- With GPU support (in guest/container):
docker start tf_netris
docker exec -it tf_netris python experiment.py --experiment 3 --use-gpu --proxy-env-port 9800
- Alternatively, you can run the DQN agent with CPU support (on the host):
python experiment.py --experiment 3 --proxy-env-port 9800
- On the third and fourth terminals run the proxy robot and debug logging (in this case the logging terminal is /dev/pts/3; you can check a terminal's device with the tty command). Note that the interval (-i) must match the value passed to the Netris environment server:
./netris-env -n -m -c localhost -i 0.1 -r 'python proxy.py --log-in-terminal /dev/pts/3 --port 9800'
- [1] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller. Playing Atari with Deep Reinforcement Learning. arXiv preprint arXiv:1312.5602, 2013.
- [2] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg & Demis Hassabis. Human-level control through deep reinforcement learning. Nature, 2015.
- [3] Harrison Kinsley. Training Deep Q Learning and Deep Q Networks (DQN) Intro and Agent Reinforcement Learning w/ Python Tutorial
- [4] Matt Stevens, Sabeek Pradhan. Playing Tetris with Deep Reinforcement Learning
- [5] Simón Algorta, Özgür Şimşek. The Game of Tetris in Machine Learning. arXiv preprint arXiv:1905.01652, 2019.
- [6] Q-learning. Wikipedia
- [7] Bellman equation. Wikipedia
- [8] https://www.quora.com/Artificial-Intelligence-What-is-an-intuitive-explanation-of-how-deep-Q-networks-DQN-work