The is the implementation of Deep Deterministic Policy Gradient (DDPG) using PyTorch. Part of the utilities functions such as replay buffer and random process are from keras-rl repo. Contributes are very welcome.
- Python 3.4
- PyTorch 0.1.9
- OpenAI Gym
Training : results of two environment and their training curves:
- Pendulum-v0
$ ./main.py --debug
- MountainCarContinuous-v0
$ ./main.py --env MountainCarContinuous-v0 --validate_episodes 100 --max_episode_length 2500 --ou_sigma 0.5 --debug
Testing :
$ ./main.py --mode test --debug