This is our final project for ECS 170, Spring 2018, at UC Davis.
Note to graders: All logic has been implemented, but a few bugs in the middle of the act/train step prevent the entire model from working. The environment and the observation/reward modifiers are fully functional individually; only this training bug is causing an issue.
We wanted to build a bot to micromanage army units. Our bot uses A2C (advantage actor-critic), a deep reinforcement learning algorithm, together with the FullyConv model to try to achieve this. More generally, we wanted the system to be customizable so that it can easily be adapted. This customization is done through config.json files (or strings) that define which observations, rewards, and actions the bot considers while training. Advanced training settings, map files, and more can be set the same way.
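As a rough illustration of the config-driven setup described above, here is a minimal sketch of loading such a config from either a JSON string or a file path. The key names (`observations`, `rewards`, `actions`, `training`, `map`), the example values, and the `load_config` helper are all assumptions made for illustration; the actual schema used by this project may differ.

```python
import json

# Hypothetical config layout: every key and value here is an assumption,
# not the project's real schema.
EXAMPLE_CONFIG = """
{
  "observations": ["player_relative", "unit_hit_points"],
  "rewards": {"damage_dealt": 1.0, "unit_lost": -0.5},
  "actions": ["no_op", "move_screen", "attack_screen"],
  "training": {"learning_rate": 0.0007, "discount": 0.99},
  "map": "example_map"
}
"""

# Sections this sketch treats as mandatory (also an assumption).
REQUIRED_KEYS = ("observations", "rewards", "actions")


def load_config(source):
    """Parse a config given as a JSON string or as a path to a .json file."""
    text = source.strip()
    if text.startswith("{"):
        # Looks like inline JSON, so parse it directly.
        config = json.loads(text)
    else:
        # Otherwise treat the argument as a file path.
        with open(source) as f:
            config = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("config is missing keys: " + ", ".join(missing))
    return config


config = load_config(EXAMPLE_CONFIG)
print(config["actions"])
```

Validating the required sections up front means a malformed config fails before any game instance is launched, rather than partway through training.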
We referred to these repos when considering designs for our project:
- Other pysc2 git repos as well, but the above are the primary ones
: general agent, model, and runner files (for A2C and FullyConv)
: Modify and reformat observations, actions, and rewards
: environment manager
: map files
: test files (some unittest, some manual)
: images and other miscellaneous files
: some experiments with observations
: weekly check-in files
Some of our branches contain additional experimental code that we left out to avoid clutter.
- install pysc2 version 2.0 (from their git repo)
- follow instructions on pysc2 website to install Starcraft 2 (latest)
- install tensorflow gpu (1.8.0)
- install numpy (1.14.2, should come with tensorflow)
- run:
(on Python 3.5+)
Note that while all code is logically complete, there are a few unresolved bugs in the trainer.
To run our test files, uncomment line 30 in
#test_env(env, config)
This tester shows that we can correctly set up the pysc2 environment and launch StarCraft II game instances. The actions here are hard-coded, but observations are parsed and rewards are modified correctly, and both can be fed into the model.
Actual unit tests can also be found in the tests/
folder.
We tested and trained on a simplified map. This map has no buildings, fog of war, or resource collection. The tanks on this map are forced to remain stationary, and the other units are kept equal between the two players. For our project, we were interested in unit micromanagement rather than other aspects of gameplay, so this map lets us focus our model. Note that our AI can run any map file with a built-in reward and can also apply our own reward calculations to any map.
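To make the idea of applying our own reward calculations on top of a map's built-in reward more concrete, here is a minimal sketch. The `modified_reward` function, the stat names, and the weights are hypothetical and chosen only for illustration; they are not the reward modifiers implemented in this repo.

```python
def modified_reward(map_reward, prev_stats, curr_stats, weights):
    """Add weighted per-step stat changes to the map's built-in reward.

    All stat names and weights are illustrative assumptions.
    """
    shaped = 0.0
    for name, weight in weights.items():
        # Change in this stat since the previous step (missing stats count as 0).
        delta = curr_stats.get(name, 0) - prev_stats.get(name, 0)
        shaped += weight * delta
    return map_reward + shaped


# Hypothetical example: reward damage dealt, penalize damage taken.
weights = {"enemy_health": -0.01, "own_health": 0.01}
prev = {"enemy_health": 500, "own_health": 400}
curr = {"enemy_health": 450, "own_health": 390}

reward = modified_reward(0.0, prev, curr, weights)
print(reward)
```

Because the shaped term is simply added to whatever the map returns, the same function works for maps with or without a meaningful built-in reward.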
Here are some screenshots of part of this map:
another screenshot:
We also made a map where our agent and the built-in agent have the same
number of units, all zerglings:
another screenshot: