This repository contains codes for paper: The Unreasonable Effectiveness of Greedy Algorithms for Multi-Armed Bandits with Many Arms by Mohsen Bayati, Nima Hamidi, Ramesh Johari, and Khashayar Khosravi.
The scripts containing the implementation of various algorithms discussed in the paper are divided into four files discussed in the following:
-
mmab.py
: this class contains the implementation of algorithms discussed in the paper for the stochastic multi-armed bandit problem. -
cmmab.py
: this class contains the implementation of algorithms discussed in the paper for the contextual multi-armed bandit problem. -
montecarlo.py
: this file runs the monte carlo simulations for the stochastic case and generates the plots for the regret and profile of pulls. -
contextual_montecarlo_datatest.py
: this file runs the monte carlo simulations for the contextual case using the Letter Recognition Dataset and generates the plots for the regret.
Requires python 3, numpy, scipy, and plotly (with orca for static image export).
The paper mainly relies on synthetic simulations, but there exist one simulation using the Letter Recognition Dataset. This dataset is publicly available, but please include the proper citation (described in the link below) if you wish to use this dataset: Letter Recognition Dataset.
If you wish to use our repository in your work, please cite our paper:
Mohsen Bayati, Nima Hamidi, Ramesh Johari, and Khashayar Khosravi. The Unreasonable Effectiveness of Greedy Algorithms for Multi-Armed Bandit with Many Arms, arXiv preprint arXiv:2002.10121
A shorter version of our paper appeared at NeurIPS 2020.
BibTex:
@article{bayati2020unreasonable,
title={The Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms},
author={Bayati, Mohsen and Hamidi, Nima and Johari, Ramesh and Khosravi, Khashayar},
journal={arXiv preprint arXiv:2002.10121},
year={2020}
}
Any question about the scripts can be directed to the authors via email.
For generating the figures in the paper execute the following codes:
-
Figure 1:
python3 montecarlo.py -T 20000 20000 -k 1000 3000
. -
Figure 2:
python3 contextual_montecarlo_datatest.py -T 8000 8000 -k 300 300 -d 2 6
. -
Figure 3:
python3 contextual_montecarlo_datatest.py -T 8000 8000 -k 8 8 -d 2 6
.