
Multi-armed Bandits Exploration

This is a bandit experiment that implements different exploration techniques for the 10-armed testbed described in Reinforcement Learning: An Introduction by Sutton & Barto.

The exploration techniques covered include (a minimal sketch of each selection rule follows the list):

  • ε-greedy
  • Optimistic Initialization
  • UCB Exploration
  • Boltzmann (Softmax) Exploration
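
The sketch below is not the repository's code; it is a minimal illustration, assuming sample-average value estimates on a 10-armed testbed, of how each selection rule can be written. Names such as `q_est`, `counts`, `epsilon`, `c`, and `tau` are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy(q_est, epsilon=0.1):
    """With probability epsilon pick a random arm, otherwise the greedy arm."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_est)))
    return int(np.argmax(q_est))

def ucb(q_est, counts, t, c=2.0):
    """Upper-Confidence-Bound selection: greedy value plus an exploration bonus."""
    bonus = c * np.sqrt(np.log(t + 1) / (counts + 1e-8))  # huge bonus for untried arms
    return int(np.argmax(q_est + bonus))

def softmax(q_est, tau=0.1):
    """Boltzmann exploration: sample an arm with probability proportional to exp(Q / tau)."""
    prefs = np.exp((q_est - np.max(q_est)) / tau)  # subtract max for numerical stability
    probs = prefs / prefs.sum()
    return int(rng.choice(len(q_est), p=probs))

# Optimistic initialization is not a separate selection rule: start the estimates
# high (e.g. q_est = np.full(10, 5.0)) and act greedily, so early disappointment
# in every arm drives exploration on its own.
```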

The experiment also compares the different exploration techniques and draws conclusions about which works best in different settings.
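
As a rough illustration of how such a comparison can be run (again, not the repository's code), the sketch below averages reward curves over many independent testbeds with Gaussian rewards, assuming the selection functions from the previous sketch. `run_bandit`, `steps`, and `runs` are illustrative names.

```python
import numpy as np

def run_bandit(select, steps=1000, runs=200, k=10, seed=0):
    """Average the reward at each step over many randomly sampled testbeds."""
    rng = np.random.default_rng(seed)
    avg_reward = np.zeros(steps)
    for _ in range(runs):
        q_true = rng.normal(0.0, 1.0, k)           # true arm values for this testbed
        q_est, counts = np.zeros(k), np.zeros(k)   # sample-average estimates
        for t in range(steps):
            a = select(q_est, counts, t)
            r = rng.normal(q_true[a], 1.0)         # noisy reward from the chosen arm
            counts[a] += 1
            q_est[a] += (r - q_est[a]) / counts[a]
            avg_reward[t] += r
    return avg_reward / runs

# Example: compare epsilon-greedy and UCB on the same testbed distribution.
eg_curve  = run_bandit(lambda q, n, t: epsilon_greedy(q, 0.1))
ucb_curve = run_bandit(lambda q, n, t: ucb(q, n, t, c=2.0))
```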