Running Reinforcement Learning with Model-Agnostic Meta-Learning (MAML) TRPO and PPO

This tutorial assumes a completely fresh installation of Ubuntu. If the dependencies are already installed, run only the "Activate Virtual Environment", "Install the Given Requirements", and "Run the Relevant Python File" steps. This repo is inspired by Moritz Schneider's implementation of MAML TRPO, schneimo/maml-rl-tf2 (TensorFlow), and by the rlworkgroup's implementation of MAML PPO, rlworkgroup/garage (PyTorch / TensorFlow).

Clone the Repo

Update `apt` Packages

sudo apt update

Use `apt` to Install `git`

sudo apt install git

Create a Personal Access Token on GitHub

From your GitHub account, go to Settings → Developer Settings → Personal Access Tokens → Tokens (Classic) → Generate New Token → Generate New Token (Classic) → Add a relevant "Note" → Select Scope of "Repo" → Fill out the Remainder of the Form → Generate Token, then copy the generated token. It will look something like ghp_randomly_generated_personal_access_token.

Git Clone Using your Personal Access Token

git clone https://ghp_randomly_generated_personal_access_token@github.com/ChinemeremChigbo/maml-ppo.git
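
Typing the token straight into the command can leave it in the shell history. A minimal sketch of one alternative, assuming the token is held in an environment variable; the variable name `GITHUB_TOKEN` and the helper `build_clone_url` are illustrative and not part of the repo:

```shell
# build_clone_url: print the clone URL using the token in GITHUB_TOKEN.
# GITHUB_TOKEN is an assumed variable name chosen for this sketch.
build_clone_url() {
    if [ -z "${GITHUB_TOKEN:-}" ]; then
        echo "GITHUB_TOKEN is not set" >&2
        return 1
    fi
    printf 'https://%s@github.com/ChinemeremChigbo/maml-ppo.git\n' "$GITHUB_TOKEN"
}

# Usage (bash; prompts for the token without echoing it):
#   read -r -s -p "Token: " GITHUB_TOKEN && export GITHUB_TOKEN
#   git clone "$(build_clone_url)"
```

Note that git still records the full URL, token included, as the remote in the clone's `.git/config`.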

Navigate into the `git` Repo Folder

cd maml-ppo/

Get `Python 3.7.16`

Install `Python` Requirements

sudo apt install curl build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev

Install `pyenv` Using `curl` and `bash`

curl https://pyenv.run | bash

Update `~/.bashrc` with the Relevant Lines

printf "%s\n" '' 'export PATH="$HOME/.pyenv/bin:$PATH"' 'eval "$(pyenv init -)"' 'eval "$(pyenv virtualenv-init -)"' >> ~/.bashrc
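
Because `>>` appends unconditionally, rerunning the command above adds the same lines to `~/.bashrc` a second time. The following is a sketch of an idempotent alternative; `append_once` is a hypothetical helper name, not something `pyenv` or the repo provides:

```shell
# append_once FILE LINE: append LINE to FILE only if FILE does not already
# contain an identical line (-x whole line, -F fixed string, no regex).
append_once() {
    file=$1
    line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage, mirroring the lines added above:
#   append_once ~/.bashrc 'export PATH="$HOME/.pyenv/bin:$PATH"'
#   append_once ~/.bashrc 'eval "$(pyenv init -)"'
#   append_once ~/.bashrc 'eval "$(pyenv virtualenv-init -)"'
```

The same helper would also work for the `mujoco` lines added to `~/.bashrc` later in this tutorial.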

Reload the `~/.bashrc`

source ~/.bashrc

Install `Python 3.7.16`

pyenv install 3.7.16

Get `mujoco150`

Install `mujoco` Requirements

sudo apt install ffmpeg patchelf unzip libosmesa6-dev libgl1-mesa-glx libglfw3

Download `mujoco150`

wget https://www.roboti.us/download/mjpro150_linux.zip

Download the `mujoco` License

wget https://www.roboti.us/file/mjkey.txt

Unzip the `mujoco150` Zip Archive

unzip mjpro150_linux.zip

Remove the `mujoco150` Zip Archive

rm mjpro150_linux.zip

Make a `mujoco` Directory in the Current User's Folder

mkdir -p $HOME/.mujoco

Move `mujoco150` to the `mujoco` Folder

mv mjpro150 $HOME/.mujoco

Move the `mujoco` License to the `mujoco` Folder

mv mjkey.txt $HOME/.mujoco

Update `~/.bashrc` with the Relevant Lines

printf "%s\n" '' 'export LD_LIBRARY_PATH=$HOME/.mujoco/mjpro150/bin' 'export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python' >> ~/.bashrc

Reload the `~/.bashrc`

source ~/.bashrc
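
Before moving on, it may help to confirm that the files ended up where the steps above put them. The helper below is only a sketch; `check_mujoco_layout` is a hypothetical name, and the checks simply mirror the paths used in this section:

```shell
# check_mujoco_layout [ROOT]: verify the layout created above.
# ROOT defaults to $HOME/.mujoco. Returns 0 only if every check passes.
check_mujoco_layout() {
    root=${1:-$HOME/.mujoco}
    rc=0
    [ -d "$root/mjpro150/bin" ] || { echo "missing: $root/mjpro150/bin" >&2; rc=1; }
    [ -f "$root/mjkey.txt" ] || { echo "missing: $root/mjkey.txt" >&2; rc=1; }
    # LD_LIBRARY_PATH is colon-separated, so match the entry between colons.
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$root/mjpro150/bin:"*) ;;
        *) echo "LD_LIBRARY_PATH lacks $root/mjpro150/bin" >&2; rc=1 ;;
    esac
    return $rc
}

# Usage:
#   check_mujoco_layout && echo "mujoco layout looks right"
```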

Create a Virtual Environment

Use `Python 3.7.16`

pyenv local 3.7.16

Make the Virtual Environment Folder

python3 -m venv env3.7

Activate Virtual Environment

source env3.7/bin/activate
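
A quick sanity check at this point can catch a wrong interpreter before the requirements are installed. This sketch assumes the environment folder is named `env3.7` as above; the helper names `in_expected_venv` and `version_matches` are made up for illustration:

```shell
# in_expected_venv NAME: succeed if a virtual environment is active and its
# directory is called NAME (the activate script sets VIRTUAL_ENV).
in_expected_venv() {
    [ -n "${VIRTUAL_ENV:-}" ] && [ "$(basename "$VIRTUAL_ENV")" = "$1" ]
}

# version_matches EXPECTED ACTUAL: succeed if ACTUAL, the output of
# `python3 --version`, reports exactly version EXPECTED.
version_matches() {
    [ "$2" = "Python $1" ]
}

# Usage:
#   in_expected_venv env3.7 \
#       && version_matches 3.7.16 "$(python3 --version)" \
#       && echo "environment looks right"
```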

Install wheel

pip install wheel==0.40.0

Run the Relevant `Python` File

Install the Given Requirements

python3 -m pip install -r requirements.txt

Run the `main_trpo.py` File

To run this, first uncomment the mujoco requirement in requirements.txt and rerun the previous step.
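
If you would rather script the uncommenting than edit the file by hand, the following is a sketch. It assumes the requirement is a line commented out with a leading `#` and containing the word `mujoco`; the exact line in requirements.txt is not shown in this tutorial, and `uncomment_lines` is a hypothetical helper:

```shell
# uncomment_lines FILE PATTERN: print FILE with the leading "#" (and the
# whitespace around it) stripped from commented lines containing PATTERN.
# PATTERN is a basic regular expression and must not contain "/".
uncomment_lines() {
    sed "/$2/s/^[[:space:]]*#[[:space:]]*//" "$1"
}

# Usage: write to a temporary file, then replace the original.
#   uncomment_lines requirements.txt mujoco > requirements.tmp \
#       && mv requirements.tmp requirements.txt
#   python3 -m pip install -r requirements.txt
```

Writing to a temporary file first keeps the original intact if `sed` fails partway.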

python3 main_trpo.py --env-name HalfCheetahDir-v1 --num-workers 20 --fast-lr 0.1 --max-kl 0.01 --fast-batch-size 5 --meta-batch-size 10 --num-layers 2 --hidden-size 100 --num-batches 1 --gamma 0.99 --tau 1.0 --cg-damping 1e-5 --ls-max-steps 10 --save-iters 1

Run the `experiments.py` File to Test `main_trpo`

python3 experiments.py

Run the `main_maml_ppo.py` File to Generate a Pretrained Pickled Model

python3 main_maml_ppo.py --epochs=1 --episodes_per_task=1

Run the `main_cav_ppo.py` File to Test PPO with 2 CAV Pairs from Scratch

python3 main_cav_ppo.py --epochs=1 --episodes_per_task=1

Run the `main_cav_maml_ppo.py` File to Test MAML PPO with 2 CAV Pairs Starting from the Pretrained Model

python3 main_cav_maml_ppo.py --epochs=1

Run the `test_2CAV_BFoptimal_Kaige.py` File

Note that you can replace 2 with whichever CAV test is required (the repo includes test files for 2 through 6 CAVs).

python3 test_2CAV_BFoptimal_Kaige.py
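
To run several of the CAV tests in one go, the file name can be derived from the CAV count. This is a sketch that assumes scripts for 2 through 6 CAVs follow the same naming pattern as the command above; `cav_test_script` is a hypothetical helper:

```shell
# cav_test_script N: print the test script name for N CAVs, following the
# test_<N>CAV_BFoptimal_Kaige.py pattern. Accepts N from 2 to 6 only.
cav_test_script() {
    case "$1" in
        2|3|4|5|6) printf 'test_%sCAV_BFoptimal_Kaige.py\n' "$1" ;;
        *) echo "no test script for '$1' CAVs" >&2; return 1 ;;
    esac
}

# Usage: run every CAV test in turn, stopping at the first failure.
#   for n in 2 3 4 5 6; do
#       python3 "$(cav_test_script "$n")" || break
#   done
```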

References

@misc{garage,
  author = {The garage contributors},
  title = {Garage: A toolkit for reproducible reinforcement learning research},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/rlworkgroup/garage}},
  commit = {be070842071f736eb24f28e4b902a9f144f5c97b}
}

@article{DBLP:journals/corr/FinnAL17,
  author    = {Chelsea Finn and Pieter Abbeel and Sergey Levine},
  title     = {Model-{A}gnostic {M}eta-{L}earning for {F}ast {A}daptation of {D}eep {N}etworks},
  journal   = {International Conference on Machine Learning (ICML)},
  year      = {2017},
  url       = {http://arxiv.org/abs/1703.03400}
}

