Henrygwb/rl_robust_minimax

rl_robustness

Games:

Matrix and numerical games: Matching Pennies, Asymmetric Matching Pennies, f(x, y) = x^2 - y^2, f(x, y) = x^2y^2 - xy.
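The two numerical games can be written down directly from their formulas. The snippet below is a minimal sketch, not the code in env.py: the function names, the gradient helpers, and the convention that x is the minimizing player and y the maximizing player are all assumptions made for illustration. It runs simultaneous gradient descent-ascent on f(x, y) = x^2 - y^2, whose saddle point is (0, 0).

```python
# Hypothetical helpers; names and the min/max roles are assumptions,
# not taken from the repository.

def f1(x, y):
    """First numerical game: f(x, y) = x^2 - y^2."""
    return x ** 2 - y ** 2

def f2(x, y):
    """Second numerical game: f(x, y) = x^2 y^2 - x y."""
    return x ** 2 * y ** 2 - x * y

def grad_f1(x, y):
    """Partial derivatives (df/dx, df/dy) of f1."""
    return 2.0 * x, -2.0 * y

def grad_f2(x, y):
    """Partial derivatives (df/dx, df/dy) of f2."""
    return 2.0 * x * y ** 2 - y, 2.0 * x ** 2 * y - x

def gradient_descent_ascent(grad, x, y, lr=0.1, steps=200):
    """Simultaneous updates: x descends on f while y ascends on f."""
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - lr * gx, y + lr * gy
    return x, y

if __name__ == "__main__":
    x, y = gradient_descent_ascent(grad_f1, 1.0, -1.0)
    print(x, y, f1(x, y))
```

On f1 both coordinates shrink toward zero at every step, so the iterates approach the saddle point. f2 is convex in each variable separately rather than convex-concave, so the same update rule carries no such guarantee there.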

MuJoCo games: Kick And Defend, You Shall Not Pass, Sumo Humans, Sumo Ants.

Code:

Training an RL agent with Minimax optimization and self-play.

Training an adversarial RL agent against the agent trained with Minimax optimization or self-play.

Retraining the victim agents against the adversarial agents.
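The three stages above (train, attack, retrain) can be illustrated on Matching Pennies without any RL machinery. This is a toy sketch under stated assumptions: the payoff convention (the victim wins +1 when the pennies match), the brute-force grid search, and every function name are hypothetical and stand in for the PPO-based training in the repository.

```python
# PAYOFF[i][j] is the victim's payoff when the victim plays i and the adversary
# plays j (0 = heads, 1 = tails). The game is zero-sum, so the adversary
# receives the negative. This convention is an assumption.
PAYOFF = [[1.0, -1.0],
          [-1.0, 1.0]]

def victim_value(p_heads, adv_action):
    """Victim's expected payoff when it plays heads with probability p_heads."""
    return (p_heads * PAYOFF[0][adv_action]
            + (1.0 - p_heads) * PAYOFF[1][adv_action])

def best_response_adversary(p_heads):
    """Attack a frozen victim: pick the action that minimizes its payoff."""
    return min((0, 1), key=lambda a: victim_value(p_heads, a))

def worst_case_value(p_heads):
    """Victim's payoff against its best-responding adversary."""
    return victim_value(p_heads, best_response_adversary(p_heads))

def minimax_victim(grid_size=101):
    """Brute-force stand-in for minimax training: maximize the worst case."""
    grid = [k / (grid_size - 1) for k in range(grid_size)]
    return max(grid, key=worst_case_value)

def retrain_victim(adv_action):
    """Retrain against a frozen adversary: best pure response (as p_heads)."""
    return max((0.0, 1.0), key=lambda p: victim_value(p, adv_action))

if __name__ == "__main__":
    biased = 0.8  # a victim that plays heads too often
    attack = best_response_adversary(biased)
    print("adversary action:", attack)
    print("biased victim worst case:", worst_case_value(biased))
    print("minimax victim:", minimax_victim())
    print("retrained victim p_heads:", retrain_victim(attack))
```

In this toy setting a victim that plays heads 80% of the time can be exploited for an expected loss of about 0.6, while the uniform strategy found by the worst-case search cannot be exploited at all. Retraining against a frozen adversary produces a pure strategy, which a fresh adversary could exploit in turn.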

Code structure:

common.py: environment-related functions.

env.py: defines the game environments.

logger.py: defines the logger.

utils.py: logger-related functions.

ppo_selfplay.py: defines the self-play objects: training model, act model, learner, runner.

ppo_minimax.py: defines the minimax-play objects: training model, act model, learner, runner.

ppo_adv.py: defines the adversarial-attack objects: training model, act model, learner, runner.

selfplay_train.py: main function for training a self-play agent.

adv_train.py: main function for training an adversarial agent.

minimax_train.py: main function for training a set of minimax agents.

zoo_utils.py: defines the policy network models.

annotated_gym_compete.py, compete.py, plot_video.py, video_utils.py, video_recorder.py: generate test videos for the MuJoCo games.
