This repo contains the code needed to reproduce the results in the paper Accelerating Natural Gradient with Higher-Order Invariance (ICML 2018, Stockholm, Sweden) by Yang Song, Jiaming Song, and Stefano Ermon, Stanford AI Lab.
In this work, we propose using midpoint integrators and geodesic corrections to improve the invariance of natural gradient optimization. With our methods, we obtain faster convergence for deep neural network training and higher sample efficiency for deep reinforcement learning.
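For intuition, here is a minimal sketch (not the implementation used in this repo) contrasting a plain natural gradient step with a midpoint-integrator step on the natural gradient flow dθ/dt = -F(θ)^{-1} ∇L(θ). The callables `grad_fn` and `fisher_fn`, which return the loss gradient and the Fisher information matrix, are hypothetical placeholders, and the geodesic correction (a second-order term involving the connection coefficients) is omitted:

```python
import numpy as np

def natural_gradient(theta, grad_fn, fisher_fn):
    # F(theta)^{-1} grad L(theta): solve the linear system rather than inverting F.
    return np.linalg.solve(fisher_fn(theta), grad_fn(theta))

def vanilla_ng_step(theta, grad_fn, fisher_fn, lr):
    # Plain natural gradient = explicit Euler discretization of the flow.
    return theta - lr * natural_gradient(theta, grad_fn, fisher_fn)

def midpoint_ng_step(theta, grad_fn, fisher_fn, lr):
    # Classical midpoint (second-order Runge-Kutta) integrator: take a half step,
    # then use the natural gradient evaluated at the midpoint for the full step.
    mid = theta - 0.5 * lr * natural_gradient(theta, grad_fn, fisher_fn)
    return theta - lr * natural_gradient(mid, grad_fn, fisher_fn)
```

The half step makes the update agree with the exact flow to second order in the learning rate, which is what improves invariance.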
The synthetic experiments and deep reinforcement learning experiments are implemented in Python 3 and TensorFlow. The deep neural network training experiments are written in MATLAB 2015b. To run the deep reinforcement learning experiments, users need a valid MuJoCo license.
For a simple objective where the natural gradient ODE can be solved accurately, midpoint integrators and geodesic corrections give much more invariant optimization trajectories than vanilla natural gradient.
To reproduce Figure 2 in the paper, run
```
python synth/gamma_experiment.py
```
This part trains deep autoencoders and classifiers on the CURVES, MNIST, and FACES datasets. The code is based on James Martens' MATLAB implementation for Deep Learning via Hessian-free Optimization (original code).
To download all datasets, run
```
cd mat/
wget www.cs.toronto.edu/~jmartens/mnist_all.mat
wget www.cs.toronto.edu/~jmartens/newfaces_rot_single.mat
wget www.cs.toronto.edu/~jmartens/digs3pts_1.mat
```
Then launch MATLAB in the directory `mat/`. Experiments can be run by calling

```
nnet_experiments(dataset, algorithm, runName)
```
where the arguments are:

- `dataset`: a string; one of 'CURVES', 'MNIST', 'FACES', or 'MNIST_classification'.
- `algorithm`: a string; one of 'ng', 'geo', 'mid', 'geo_faster', or 'adam'.
- `runName`: a string; the name of the log files.
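For example, `nnet_experiments('MNIST', 'mid', 'mnist_mid_run')` should train on MNIST with the midpoint integrator and write logs under the run name `mnist_mid_run` (the run name here is just an illustration; any string works).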
Model-free reinforcement learning on continuous control tasks with ACKTR, based on OpenAI Baselines. Our code can be installed by running

```
pip install -e rl/
```
The commands below run the RL algorithms tested in the paper:
- ACKTR: `python -m baselines.acktr.run_mujoco --env=Walker2d-v2 --seed=1 --mom=0.0 --lr=0.03 --alg=sgd`
- Midpoint Integrator: `python -m baselines.acktr.run_mujoco --env=Walker2d-v2 --seed=1 --mom=0.0 --lr=0.03 --alg=mid`
- Geodesic Correction: `python -m baselines.acktr.run_mujoco --env=Walker2d-v2 --seed=1 --mom=0.0 --lr=0.03 --alg=geo`
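Other MuJoCo environments and random seeds can presumably be selected via the `--env` and `--seed` flags; `--lr` and `--mom` appear to set the learning rate and momentum.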
If you find the idea or code useful for your research, please consider citing our paper:
```
@inproceedings{song2018accelerating,
  title={Accelerating Natural Gradient with Higher-Order Invariance},
  author={Song, Yang and Song, Jiaming and Ermon, Stefano},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2018},
}
```