rl_robustness

Games:

Matrix and numerical games: Matching Pennies, Asymmetric Matching Pennies, f(x, y) = x^2 - y^2, f(x, y) = x^2 y^2 - xy.

MuJoCo games: Kick And Defend, You Shall Not Pass, Sumo Humans, Sumo Ants.
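As a rough illustration of the matrix and numerical games listed above, the sketch below writes them as plain Python functions. The payoff sign convention, the function names, and the roles of x (minimizer) and y (maximizer) are assumptions made for this example, not taken from the repository. Asymmetric Matching Pennies is left out because its payoffs are not given here.

```python
# Assumed convention: entries are payoffs to the row player, who wins +1
# when the two pennies match and loses 1 otherwise (zero-sum game).
MATCHING_PENNIES = [[1.0, -1.0],
                    [-1.0, 1.0]]


def expected_payoff(matrix, p_row, p_col):
    """Expected row-player payoff when both players use mixed strategies."""
    return sum(
        p_row[i] * p_col[j] * matrix[i][j]
        for i in range(len(p_row))
        for j in range(len(p_col))
    )


def f_quadratic(x, y):
    """f(x, y) = x^2 - y^2. Assumed here: x minimizes, y maximizes."""
    return x ** 2 - y ** 2


def f_coupled(x, y):
    """f(x, y) = x^2 y^2 - xy (hypothetical name for the second game)."""
    return x ** 2 * y ** 2 - x * y


if __name__ == "__main__":
    uniform = [0.5, 0.5]
    # Payoff of the uniform row strategy against a pure column strategy.
    print(expected_payoff(MATCHING_PENNIES, uniform, [1.0, 0.0]))
    # (0, 0) is a saddle point of f_quadratic: moving x away from 0 raises f,
    # moving y away from 0 lowers it.
    print(f_quadratic(0.5, 0.0) >= f_quadratic(0.0, 0.0) >= f_quadratic(0.0, 0.5))
```

Under this convention, the uniform strategy in Matching Pennies has the same expected payoff against either pure opponent action, which is what makes it the minimax strategy.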

Code:

Training an RL agent with minimax optimization or with self-play.

Training an adversarial RL agent against the agent trained with minimax optimization or self-play.

Retraining the victim agents against the adversarial agents.
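The three steps above can be sketched on Matching Pennies with exact best responses instead of PPO. This is only a toy stand-in under stated assumptions: the payoff convention, the grid search used for the minimax step, and every function name below are hypothetical and do not come from the repository's training scripts.

```python
# Payoff to the victim (row player); the sign convention is an assumption.
PAYOFF = [[1.0, -1.0],
          [-1.0, 1.0]]


def column_values(p_victim):
    """Victim's expected payoff against each pure adversary action."""
    return [sum(p_victim[i] * PAYOFF[i][j] for i in range(2)) for j in range(2)]


def row_values(p_adv):
    """Victim's expected payoff for each pure victim action."""
    return [sum(p_adv[j] * PAYOFF[i][j] for j in range(2)) for i in range(2)]


def one_hot(index):
    return [1.0 if k == index else 0.0 for k in range(2)]


def payoff(p_victim, p_adv):
    """Victim's expected payoff under two mixed strategies."""
    return sum(p_victim[i] * p_adv[j] * PAYOFF[i][j]
               for i in range(2) for j in range(2))


def train_minimax_victim(grid=100):
    """Step 1: choose the mixed strategy with the best worst-case payoff."""
    candidates = [[k / grid, 1.0 - k / grid] for k in range(grid + 1)]
    return max(candidates, key=lambda p: min(column_values(p)))


def train_adversary(p_victim):
    """Step 2: best response against a frozen victim."""
    values = column_values(p_victim)
    return one_hot(values.index(min(values)))


def retrain_victim(p_adv):
    """Step 3: the victim best-responds to the frozen adversary."""
    values = row_values(p_adv)
    return one_hot(values.index(max(values)))


if __name__ == "__main__":
    biased_victim = [0.8, 0.2]          # stand-in for a non-robust agent
    minimax_victim = train_minimax_victim()
    for name, victim in [("biased", biased_victim), ("minimax", minimax_victim)]:
        adversary = train_adversary(victim)
        print(name, "victim payoff under attack:", payoff(victim, adversary))
    adversary = train_adversary(biased_victim)
    retrained = retrain_victim(adversary)
    print("retrained victim payoff:", payoff(retrained, adversary))
```

In this toy setting the adversary can push the biased victim's payoff below zero but cannot do so against the minimax victim. The retrained victim beats the specific adversary it was retrained against, yet a pure strategy like this is exploitable again by a newly trained adversary.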

Code structure:

common.py: environment-related functions.

env.py: defines the game environments.

logger.py: defines the logger.

utils.py: logger-related functions.

ppo_selfplay.py: defines the self-play objects: training model, act model, learner, and runner.

ppo_minimax.py: defines the minimax-play objects: training model, act model, learner, and runner.

ppo_adv.py: defines the adversarial-attack objects: training model, act model, learner, and runner.

selfplay_train.py: entry point for training a self-play agent.

adv_train.py: entry point for training an adversarial agent.

minimax_train.py: entry point for training a set of minimax agents.

zoo_utils.py: define the policy network models.

annotated_gym_compete.py, compete.py, plot_video.py, video_utils.py, video_recorder.py: generate test videos for the MuJoCo games.


Henrygwb/rl_robust_minimax
