Each environment (gym-gazebo2 Env) has its own optimal hyperparameters, which allow the agent to learn faster and reach a better policy. The following table lists the best parameters we have found so far.
Please open a new issue and share your results if you find better parameters!
These values come from `baselines/ppo2/defaults.py`.
Environment | num_layers | num_hidden | nsteps | nminibatches | lr | cliprange |
---|---|---|---|---|---|---|
MARA | 2 | 16 | 1024 | 4 | `lambda f: 3e-3 * math.e**(-0.001918*update)` | 0.25 |
MARA Collision | | | | | | |
MARA Orient | | | | | | |
MARA Collision Orient | | | | | | |
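
The `lr` entry for MARA is an exponentially decaying schedule rather than a constant. The snippet below is a minimal sketch of what that expression evaluates to. Note that the lambda in the table names its argument `f` while the body refers to `update`; the sketch assumes `update` is the index of the current policy update. The names `learning_rate` and `mara_hyperparams` are hypothetical and only used for illustration; they are not taken from `defaults.py`.

```python
import math

INITIAL_LR = 3e-3       # starting learning rate, from the MARA row
DECAY_RATE = 0.001918   # exponential decay constant, from the MARA row


def learning_rate(update: int) -> float:
    """Learning rate at a given policy update (assumed to be the update index)."""
    return INITIAL_LR * math.e ** (-DECAY_RATE * update)


# Hypothetical container for the MARA row of the table.
mara_hyperparams = {
    "num_layers": 2,
    "num_hidden": 16,
    "nsteps": 1024,
    "nminibatches": 4,
    "lr": learning_rate,
    "cliprange": 0.25,
}

if __name__ == "__main__":
    # Print the schedule at a few update indices to see how quickly it decays.
    for update in (0, 100, 500, 1000):
        print(update, mara_hyperparams["lr"](update))
```

Under this assumption the learning rate starts at 3e-3 and halves roughly every 361 updates, since ln(2) / 0.001918 is about 361.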