Deep Q-Learning Project to play 2048. See this presentation for an introduction.
Install TensorFlow, Python, and pip. Then run:
pip install -r requirements.txt
To run the code, you'll need to update your PYTHONPATH:
source set_pythonpath.sh
Now, you should be able to run the tests:
py.test
All python source code lives in py_2048_rl.
This directory contains code to simulate the 2048 game itself.
For example, it provides a Game class that implements the game logic.
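To give a feel for the kind of logic such a class has to implement, here is a minimal sketch of the core 2048 move: sliding and merging a single row to the left. The function name merge_row_left, the use of raw tile values (2, 4, 8, ...) with 0 for empty cells, and the returned reward are assumptions made for illustration; they are not taken from the Game class in this repository.

```python
def merge_row_left(row):
    """Slide one row to the left and merge equal neighbours once.

    Hypothetical helper, not the repository's API. Returns the new row
    and the reward (the sum of all tiles created by merging).
    """
    tiles = [t for t in row if t != 0]  # drop empty cells
    merged = []
    reward = 0
    i = 0
    while i < len(tiles):
        # Two equal neighbours merge into one tile; each tile merges at most once.
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)
            reward += tiles[i] * 2
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    # Pad with empty cells so the row keeps its original length.
    merged += [0] * (len(row) - len(merged))
    return merged, reward


print(merge_row_left([2, 2, 4, 0]))  # ([4, 4, 0, 0], 4)
```

The other three directions can be expressed with the same helper by reversing or transposing the board before and after the call.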
The play module defines the Experience class, a play() function, and various strategies that can be passed as an argument to play().
This directory contains all code that has to do with the Deep Q-Learning algorithm itself. Here's a comprehensive list of the modules:
- replay_memory implements the Replay Memory. Main methods are add() to add an experience and sample() to sample a number of experiences.
- experience_collector implements a collect(strategy, num_games) function that plays a number of games, deduplicates & undersamples the experiences, and returns them.
- target_batch_computer is responsible for computing the target batch that is passed to the network.
- experience_batcher uses the ReplayMemory, ExperienceCollector and TargetBatchComputer to generate training batches for the neural network.
- model defines the Neural Network architecture and its training parameters (e.g. Learning Rate).
- learning glues everything together to implement the Deep Q-Learning algorithm.
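As an illustration of the add()/sample() interface described for the replay memory, the following is a minimal sketch of a fixed-capacity buffer. The class name ReplayMemorySketch, the capacity argument, and the choice to evict the oldest experience first are assumptions; the actual replay_memory module may store and sample experiences differently.

```python
import random
from collections import deque


class ReplayMemorySketch:
    """Hypothetical fixed-size replay memory, not the repository's class."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest item when full.
        self._buffer = deque(maxlen=capacity)

    def add(self, experience):
        """Store one experience."""
        self._buffer.append(experience)

    def sample(self, n):
        """Return up to n distinct experiences, chosen uniformly at random."""
        items = list(self._buffer)
        return random.sample(items, min(n, len(items)))

    def __len__(self):
        return len(self._buffer)


memory = ReplayMemorySketch(capacity=3)
for step in range(5):
    memory.add(step)  # any object can stand in for an experience here
batch = memory.sample(2)
```

Sampling uniformly from past experiences, rather than training on them in the order they were played, is what breaks the correlation between consecutive game states.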
Step 1 is to set various parameters.
For example, you might want to adjust
- The GAMMA value in target_value_computer.py
- The INIT_LEARNING_RATE or HIDDEN_SIZES in model.py
- The MIN_EPSILON in experience_batcher.py
- ...
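To see what a parameter such as GAMMA controls, here is a minimal sketch of the standard Q-learning target, reward + GAMMA * max Q(next_state, action), computed for a batch of experiences. The function name compute_targets, its arguments, and the value 0.9 are assumptions chosen for illustration; whether this repository computes its target batch in exactly this way is not confirmed here.

```python
GAMMA = 0.9  # illustrative discount factor, not the repository's setting


def compute_targets(rewards, next_q_values, game_over, gamma=GAMMA):
    """Hypothetical helper computing one Q-learning target per experience.

    rewards:       reward received for each experience
    next_q_values: for each experience, the Q-values of all actions
                   in the next state
    game_over:     True where the next state is terminal
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, game_over):
        # Terminal states have no future reward to discount.
        future = 0.0 if done else gamma * max(q_next)
        targets.append(reward + future)
    return targets


print(compute_targets([4, 0],
                      [[1.0, 2.0, 0.5, 0.0], [3.0, 1.0, 0.0, 0.0]],
                      [False, True],
                      gamma=0.5))  # [5.0, 0.0]
```

A GAMMA close to 1 makes the agent weigh future merges almost as heavily as the immediate reward, while a small GAMMA makes it greedy for the next move.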
Once that's done, you can simply run python py_2048_rl/learning/learning.py <train_dir>.
You can use TensorBoard to monitor your network's training by passing your train directory as the --logdir parameter.
Furthermore, have a look at py_2048_rl/analisis.py (for plotting a histogram of Q-Values) and py_2048_rl/play_game.py (for simulating one or more games given a particular model).