ICML 2020 -- Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning
Prerequisites:
Python 3
Matplotlib
NumPy
SciPy
CVXPY
itertools (part of the Python standard library, no installation needed)
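Before running anything, it can help to confirm that the packages listed above are importable. The snippet below is a minimal sketch, not part of the repository: the helper name missing_modules is hypothetical, and the import names (matplotlib, numpy, scipy, cvxpy, itertools) are assumed to be the usual ones for these packages.

```python
import importlib.util

# Import names assumed for the packages listed above.
REQUIRED = ["matplotlib", "numpy", "scipy", "cvxpy", "itertools"]

def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All prerequisites found.")
```

Any package reported as missing can then be installed with your usual package manager.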
To get results, you will need to run the following scripts:
python teaching_online.py
python teaching_offline_vary_c.py
python teaching_offline_vary_eps.py
To see how long it takes to solve problems P1, P2, P3, and P4 when |S|=4, |S|=10, |S|=50, and |S|=100, run:
python teaching_time_table.py
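For readers who want to collect similar timings for their own solver, the following is a rough sketch of a timing loop over several state-space sizes. It is not the code in teaching_time_table.py: solve_placeholder is a hypothetical stand-in workload (it does not solve P1 to P4), and time_solver and its repeats parameter are illustrative names.

```python
import time

def solve_placeholder(num_states):
    """Hypothetical stand-in workload; NOT one of the problems P1-P4."""
    return sum(i * j for i in range(num_states) for j in range(num_states))

def time_solver(solve, sizes, repeats=3):
    """Return the best wall-clock time (in seconds) of `solve` for each size."""
    timings = {}
    for n in sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            solve(n)
            best = min(best, time.perf_counter() - start)
        timings[n] = best
    return timings

timings = time_solver(solve_placeholder, [4, 10, 50, 100])
for n, seconds in timings.items():
    print(f"|S|={n:>3}: {seconds:.6f} s")
```

Taking the best of a few repeats reduces the effect of unrelated system load on the reported numbers.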
For the grid environment, run:
python teaching_online_grid.py
python teaching_offline_vary_c_grid.py
python teaching_offline_vary_eps_grid.py
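The scripts listed above can also be launched one after another from a small driver. The sketch below is an assumption about how one might do this, not a file shipped with the code: build_commands and run_all are hypothetical helpers, and the driver is assumed to be run from the directory that contains the scripts. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Script names as listed in this README, in the order they appear.
SCRIPTS = [
    "teaching_online.py",
    "teaching_offline_vary_c.py",
    "teaching_offline_vary_eps.py",
    "teaching_time_table.py",
    "teaching_online_grid.py",
    "teaching_offline_vary_c_grid.py",
    "teaching_offline_vary_eps_grid.py",
]

def build_commands(scripts, python=sys.executable):
    """Build one [interpreter, script] command per script."""
    return [[python, script] for script in scripts]

def run_all(scripts, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(scripts):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first script that exits with an error.
            subprocess.run(cmd, check=True)

run_all(SCRIPTS, dry_run=True)
```

Calling run_all(SCRIPTS, dry_run=False) would actually execute the scripts in sequence.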
After running the above scripts, new plots will be created in the plots/env_chain or plots/env_grid directory, depending on the environment.
In the main function, the variable number_of_iterations denotes the number of runs used to average the results. Set a smaller number for faster execution.
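To illustrate the trade-off, here is a minimal sketch of averaging a noisy result over number_of_iterations runs. It does not reproduce the repository's main function: run_once is a hypothetical stand-in for a single experiment, and average_over_runs is an illustrative helper. Fewer runs finish sooner but give a noisier average.

```python
import random
import statistics

def run_once(seed):
    """Hypothetical stand-in for one experiment run; returns a noisy scalar."""
    rng = random.Random(seed)
    return 1.0 + rng.gauss(0.0, 0.1)

def average_over_runs(number_of_iterations):
    """Return (mean, standard error) of the result over the given number of runs."""
    results = [run_once(seed) for seed in range(number_of_iterations)]
    mean = statistics.fmean(results)
    if number_of_iterations > 1:
        stderr = statistics.stdev(results) / number_of_iterations ** 0.5
    else:
        stderr = 0.0  # a single run gives no estimate of spread
    return mean, stderr

for n in (5, 50):
    mean, stderr = average_over_runs(n)
    print(f"number_of_iterations={n}: mean={mean:.3f}, stderr={stderr:.3f}")
```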