ppo-pytorch

This is a PyTorch implementation of the Proximal Policy Optimization (PPO) algorithm, designed for easy modification and high performance on continuous control tasks.
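For readers new to PPO, the clipped surrogate objective from the PPO paper listed under Reference can be sketched as follows. This is a minimal, framework-free illustration and not the code in this repository's ppo.py; the function name `clipped_surrogate_loss` and the default `clip_eps=0.2` are assumptions chosen for the example.

```python
import math


def clipped_surrogate_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Return the negated mean clipped surrogate objective for a batch.

    All three inputs are equal-length sequences of floats. The name and the
    clip_eps default are illustrative assumptions, not this repository's API.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Keep the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    # Negate so that minimizing the loss maximizes the objective.
    return -total / len(advantages)


# With identical old and new policies the ratio is 1, so the loss is -mean(advantages).
loss_same_policy = clipped_surrogate_loss([0.0, 0.0], [0.0, 0.0], [1.0, 3.0])
print(loss_same_policy)
```

The clipping removes the incentive to move the ratio far from 1 in the direction that would increase the objective, which is what keeps each policy update conservative.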

Requirements

pytorch 1.4.0
tensorboard
numpy
tqdm
gym
baselines
pybullet (optional)

Setup

You can use the provided requirements.txt file to install the necessary dependencies.

$ pip install -r requirements.txt
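After installing, a quick check like the one below can confirm which dependencies are importable. It is only a sketch: the helper `missing_modules` is hypothetical, and the import names (for example `torch` for the pytorch package) are assumed to match the packages in the Requirements list.

```python
import importlib.util

# Import names assumed for the packages listed under Requirements.
REQUIRED = ["torch", "tensorboard", "numpy", "tqdm", "gym", "baselines"]
OPTIONAL = ["pybullet"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing required:", missing_modules(REQUIRED))
    print("missing optional:", missing_modules(OPTIONAL))
```

An empty list for the required modules suggests the environment is ready; this does not verify installed versions.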

Training PPO agents

For example, to train a PPO agent on the PyBullet Ant locomotion task using 12 processes, run:

$ python train.py --task-id=AntBulletEnv-v0 --num-processes=12 --num-env-steps=5000000
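The three flags in the command above could be parsed with `argparse` roughly as follows. This is a hypothetical sketch and not the contents of arguments.py: only the flag names come from the command, while the default values and help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the three flags shown in the training command."""
    parser = argparse.ArgumentParser(description="PPO training options (illustrative)")
    # Defaults below are placeholders chosen for this example only.
    parser.add_argument("--task-id", type=str, default="AntBulletEnv-v0",
                        help="Gym environment id to train on")
    parser.add_argument("--num-processes", type=int, default=1,
                        help="number of parallel environment processes")
    parser.add_argument("--num-env-steps", type=int, default=1_000_000,
                        help="total number of environment steps to train for")
    return parser


# Parse the same arguments as the example command.
args = build_parser().parse_args(
    ["--task-id=AntBulletEnv-v0", "--num-processes=12", "--num-env-steps=5000000"]
)
print(args.task_id, args.num_processes, args.num_env_steps)
```

Note that `argparse` converts dashes in flag names to underscores, so `--num-env-steps` is read as `args.num_env_steps`.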

You can also monitor training and tune hyperparameters using TensorBoard:

$ tensorboard --logdir=log

Here is what it looks like:

[TensorBoard screenshots: reward, action]

Experimental Results

Training for 5M steps takes about half an hour on a six-core MacBook Pro.
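As a rough back-of-the-envelope check, that timing corresponds to the throughput computed below. The per-process figure assumes the run used the 12 processes from the example training command, which is not stated for this benchmark.

```python
total_steps = 5_000_000        # "5M training steps"
wall_clock_seconds = 30 * 60   # "about half an hour"

steps_per_second = total_steps / wall_clock_seconds
print(f"{steps_per_second:.0f} environment steps per second")  # prints 2778 ...

# Assumption: 12 worker processes, as in the example command.
num_processes = 12
per_process = steps_per_second / num_processes
print(f"{per_process:.0f} steps per second per process")
```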

[Result plots: HalfCheetahBulletEnv, AntBulletEnv]

Reference

John Schulman, Philipp Moritz, Sergey Levine, Michael I. Jordan, and Pieter Abbeel. High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438, 2015.

John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.

https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail

fengredrum/ppo-pytorch
