GitHub - adishs/icml2020_rl-policy-teaching_code

ICML 2020 -- Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning

Prerequisites:

Python 3
Matplotlib
NumPy
SciPy
CVXPY
itertools (part of the Python standard library; no separate installation needed)
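
Before running the scripts, it can help to confirm that the prerequisites are importable. The following is a minimal sketch and not part of the repository; the helper name missing_packages is hypothetical, and the lowercase import names are assumed to match the packages listed above.

```python
import importlib.util

# Import names for the packages listed under Prerequisites.
# itertools ships with Python itself, so it should always be found.
REQUIRED = ["matplotlib", "numpy", "scipy", "cvxpy", "itertools"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All prerequisites are importable.")
```

Any package reported as missing can usually be installed with pip under the same lowercase name.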

Running the code

To reproduce the results, run the following scripts.

For the Chain environment

For the online attack

python teaching_online.py

For the offline attack when varying the parameter $\overline{R}(s_0, .)$

python teaching_offline_vary_c.py

For the offline attack when varying the parameter $\epsilon$

python teaching_offline_vary_epsilon.py

To measure how long it takes to solve problems P1, P2, P3, and P4 for |S| = 4, 10, 50, and 100, run:

python teaching_time_table.py
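
For readers who want to take similar measurements on their own code, a timing table of this kind can be sketched as below. This is an illustration only: time_solver and placeholder_solve are hypothetical names, the placeholder workload is not one of the problems P1 to P4, and taking the best of several repeats is an assumption rather than the method used in teaching_time_table.py.

```python
import time


def time_solver(solve, sizes, repeats=3):
    """Return {size: best wall-clock seconds over `repeats` calls of solve(size)}."""
    timings = {}
    for size in sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            solve(size)
            best = min(best, time.perf_counter() - start)
        timings[size] = best
    return timings


def placeholder_solve(num_states):
    """Hypothetical stand-in workload; NOT one of the problems P1-P4."""
    total = 0.0
    for s in range(num_states):
        for t in range(num_states):
            total += (s - t) ** 2
    return total


if __name__ == "__main__":
    # The state-space sizes mirror those named in the README.
    for size, seconds in time_solver(placeholder_solve, [4, 10, 50, 100]).items():
        print(f"|S| = {size:>3}: {seconds:.6f} s")
```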

==========================================

For the Gridworld environment

For the online attack

python teaching_online_grid.py

For the offline attack when varying the parameter $\overline{R}(s_0, .)$

python teaching_offline_vary_c_grid.py

For the offline attack when varying the parameter $\epsilon$

python teaching_offline_vary_epsilon_grid.py
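
To run every experiment in one go, the individual commands can be wrapped in a small driver. The sketch below is not part of the repository; run_all is a hypothetical helper, and the script names follow the file names in the repository's file listing. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Script names as they appear in the repository's file listing.
CHAIN_SCRIPTS = [
    "teaching_online.py",
    "teaching_offline_vary_c.py",
    "teaching_offline_vary_epsilon.py",
    "teaching_time_table.py",
]
GRID_SCRIPTS = [
    "teaching_online_grid.py",
    "teaching_offline_vary_c_grid.py",
    "teaching_offline_vary_epsilon_grid.py",
]


def run_all(scripts, dry_run=True):
    """Run each script with the current interpreter; return the commands used."""
    commands = [[sys.executable, script] for script in scripts]
    for command in commands:
        if dry_run:
            print("would run:", " ".join(command))
        else:
            # check=True stops at the first script that exits with an error.
            subprocess.run(command, check=True)
    return commands
```

Calling run_all(CHAIN_SCRIPTS + GRID_SCRIPTS, dry_run=False) from the repository root would execute the scripts one after another.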

Results

After running the above scripts, new plots are created in the plots/env_chain or plots/env_grid directory, depending on the environment.

In the main function, the variable number_of_iterations sets the number of runs over which the results are averaged. Set it to a smaller value for faster execution, at the cost of noisier averages.
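
The trade-off behind number_of_iterations can be illustrated with a minimal sketch of averaging over independent runs. The code below is not taken from the repository: average_over_runs and noisy_run are hypothetical, and the Gaussian noise model is an assumption chosen only to show that fewer runs give a less precise average.

```python
import random
import statistics


def average_over_runs(run_once, number_of_iterations, seed=0):
    """Call run_once(rng) number_of_iterations times; return (mean, standard error)."""
    rng = random.Random(seed)
    results = [run_once(rng) for _ in range(number_of_iterations)]
    mean = statistics.fmean(results)
    # The standard error shrinks roughly as 1/sqrt(number_of_iterations).
    if number_of_iterations > 1:
        stderr = statistics.stdev(results) / number_of_iterations ** 0.5
    else:
        stderr = 0.0  # a single run gives no spread estimate
    return mean, stderr


def noisy_run(rng):
    """Hypothetical single run: a true value of 1.0 plus Gaussian noise."""
    return 1.0 + rng.gauss(0.0, 0.5)


if __name__ == "__main__":
    for n in (5, 50, 500):
        mean, stderr = average_over_runs(noisy_run, n)
        print(f"number_of_iterations={n}: mean={mean:.3f}, stderr={stderr:.3f}")
```

Run time grows linearly with the number of runs, while the standard error of the average shrinks only with its square root, so a small value is a reasonable choice for a quick check.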

Folders and files

code-attacker
plots
MDPSolver.py
README.md
env_chain.py
env_gridworld.py
learner.py
plot_chain.py
plot_grid.py
teacher.py
teaching_offline_vary_c.py
teaching_offline_vary_c_grid.py
teaching_offline_vary_epsilon.py
teaching_offline_vary_epsilon_grid.py
teaching_online.py
teaching_online_grid.py
teaching_time_table.py
