This project implements an imitation-from-observation algorithm. The algorithm trains an agent to imitate an expert performing a task, using videos of that expert.
We evaluate this algorithm on DeepMind Control tasks by training experts and then trying to imitate them. To evaluate our model:
- We train an expert on a specific task:
We use the DrQv2 algorithm, a model-free RL algorithm, to train an expert on a DeepMind Control task. This expert lets us build a dataset of demonstrations showing how to perform the task in many source contexts.
Demo of experts trained on Finger Spin, Finger Turn and Reacher tasks
- We train a context translation model using videos of the trained expert
The context translation model takes as input a demonstration of the expert in a source context and the first observation of the imitator agent in the target context, and outputs the predicted sequence of subsequent observations in that target context. We train this model with the imitation-from-observation algorithm (a minimal interface sketch follows this list).
Example of context translation from a source context into a target context. The first row is the sequence of expert observations; the second row is the predicted sequence of agent observations.
- We train an imitator agent to reproduce the states predicted by the context translator, using a classic actor-critic algorithm.
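To make the interface concrete, here is a minimal sketch of what such a context translation model can look like. Everything below (flat observation vectors, layer sizes, an MSE objective on paired demonstrations) is an illustrative assumption, not the exact model implemented in `train_ct.py`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContextTranslator(nn.Module):
    """Predicts the observation sequence in the target context from an expert
    demonstration in a source context plus the imitator's first observation.
    Sketch only: the real model likely uses convolutional encoders/decoders."""

    def __init__(self, obs_dim: int, hidden_dim: int = 256):
        super().__init__()
        # Encoder for the expert's per-frame observations (source context).
        self.source_enc = nn.Sequential(nn.Linear(obs_dim, hidden_dim), nn.ReLU())
        # Encoder for the imitator's first observation (target context).
        self.context_enc = nn.Sequential(nn.Linear(obs_dim, hidden_dim), nn.ReLU())
        # Decoder fusing demo features with the target-context code.
        self.decoder = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, obs_dim),
        )

    def forward(self, demo: torch.Tensor, first_obs: torch.Tensor) -> torch.Tensor:
        # demo: (T, obs_dim) expert frames; first_obs: (obs_dim,) target frame 0.
        ctx = self.context_enc(first_obs)               # target-context code
        feats = self.source_enc(demo)                   # per-frame demo features
        fused = torch.cat([feats, ctx.expand_as(feats)], dim=-1)
        return self.decoder(fused)                      # (T, obs_dim) predictions


if __name__ == "__main__":
    T, obs_dim = 50, 3 * 84 * 84
    model = ContextTranslator(obs_dim)
    src_demo = torch.rand(T, obs_dim)  # demo recorded in the source context
    tgt_demo = torch.rand(T, obs_dim)  # same behavior recorded in the target context
    # Training objective: regress predictions onto paired target-context frames.
    loss = F.mse_loss(model(src_demo, tgt_demo[0]), tgt_demo)
    loss.backward()
```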
- Install the environment
```
conda env create -f env.yml
conda activate ifo
```
In the following we consider only the Reacher Hard task, but the same process applies to any other task.
- Train the expert
```
python train.py task=reacher_hard
```
- Watch evaluation videos in the `eval` folder of the experiment folder
- Watch the training on TensorBoard:

```
tensorboard --logdir exp_local
```
- Copy-paste the `snapshot.pt` file from the experiment folder `exp_local` into the `experts` folder (create it at the root if it doesn't exist) and name it `reacher_hard.pt`
- Generate demonstrations of the expert acting in many random contexts
```
python generate_reacher_hard_expert_video.py
```
The demonstration dataset is stored in `videos/reacher_hard` and split into `train` and `valid` sets.
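As a quick sanity check of the generated dataset, you can count and inspect the demonstrations, for example with `imageio` (install it if the environment doesn't already provide it). This assumes the demonstrations are saved as `.mp4` files; adjust the glob pattern to the script's actual output format.

```python
from pathlib import Path

import imageio.v2 as imageio  # reading .mp4 requires the imageio-ffmpeg backend

for split in ("train", "valid"):
    videos = sorted(Path("videos/reacher_hard", split).glob("*.mp4"))
    print(f"{split}: {len(videos)} demonstrations")
    if videos:
        # Load the first demonstration and report its length and frame shape.
        frames = imageio.mimread(videos[0], memtest=False)
        print(f"  first video: {len(frames)} frames of shape {frames[0].shape}")
```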
- Train the context translation model on the Reacher Hard expert videos
```
python train_ct.py task=reacher_hard
```
- Watch evaluation videos in the `eval` folder of the experiment folder
- Watch the training on TensorBoard:

```
tensorboard --logdir ct_local
```
- Copy-paste the `snapshot.pt` file from the experiment folder `ct_exp_local` into the `ct` folder (create it at the root if it doesn't exist) and name it `reacher_hard.pt`
- Train the imitator agent, using the expert as the demonstration video provider and the context translation model as the context translator (a sketch of the underlying tracking reward follows this step)

```
python train_rl.py task=reacher_hard
```
- Watch evaluation videos in the `eval` folder of the experiment folder
- Watch the training on TensorBoard:

```
tensorboard --logdir rl_local
```
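For intuition, the imitator's learning signal can be thought of as a tracking reward: at each time step the agent is penalized for deviating from the frame the context translator predicted. The sketch below is illustrative only; the exact reward computed in `train_rl.py` (feature-space terms, weightings, normalization) may differ.

```python
import numpy as np


def tracking_reward(obs: np.ndarray, predicted_obs: np.ndarray) -> float:
    """Penalize the squared distance between the agent's current observation
    and the frame the context translator predicted for this time step."""
    diff = obs.astype(np.float32) - predicted_obs.astype(np.float32)
    return -float(np.sum(diff ** 2))


# Usage inside a rollout (names are illustrative):
#   predicted_seq = translator(expert_demo, first_agent_obs)  # see sketch above
#   reward_t = tracking_reward(obs_t, predicted_seq[t])
```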
- We reuse Denis Yarats's code from the DrQv2 project