A pure NumPy/Python implementation of training an RL agent on a k-armed bandit, using epsilon-greedy and upper confidence bound (UCB) policies for exploration.
samlanka/RL
About
Bandits in Numpy
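The repository's own code is not shown here, so the following is only a minimal sketch of the two exploration policies the description names. It assumes a stationary bandit with unit-variance Gaussian rewards and sample-average value estimates. The function name `run_bandit` and its parameters (`true_means`, `steps`, `epsilon`, `c`, `seed`) are hypothetical and chosen for illustration, not taken from the repository.

```python
import numpy as np


def run_bandit(true_means, steps=2000, policy="eps_greedy",
               epsilon=0.1, c=2.0, seed=0):
    """Run one agent on a stationary Gaussian k-armed bandit (illustrative sketch).

    Returns the value estimates Q, the pull counts N, and the total reward.
    """
    rng = np.random.default_rng(seed)
    true_means = np.asarray(true_means, dtype=float)
    k = true_means.size
    Q = np.zeros(k)  # sample-average estimate of each arm's value
    N = np.zeros(k)  # number of times each arm has been pulled
    total = 0.0

    for t in range(1, steps + 1):
        if policy == "eps_greedy":
            # Explore uniformly with probability epsilon, otherwise exploit.
            if rng.random() < epsilon:
                a = int(rng.integers(k))
            else:
                a = int(np.argmax(Q))
        elif policy == "ucb":
            # Pull every arm once first so the bonus term is well defined.
            untried = np.flatnonzero(N == 0)
            if untried.size > 0:
                a = int(untried[0])
            else:
                bonus = c * np.sqrt(np.log(t) / N)
                a = int(np.argmax(Q + bonus))
        else:
            raise ValueError(f"unknown policy: {policy!r}")

        # Assumed reward model: Gaussian with unit variance around the arm's mean.
        reward = rng.normal(true_means[a], 1.0)
        N[a] += 1
        Q[a] += (reward - Q[a]) / N[a]  # incremental sample-average update
        total += reward

    return Q, N, total


if __name__ == "__main__":
    means = [0.0, 1.0, 3.0]  # hypothetical arm means
    for name in ("eps_greedy", "ucb"):
        Q, N, total = run_bandit(means, policy=name)
        print(name, "pulls per arm:", N.astype(int), "total reward:", round(total, 1))
```

Epsilon-greedy keeps exploring at a fixed rate for the whole run, while the UCB bonus shrinks for an arm as its pull count grows, so exploration concentrates on arms whose estimates are still uncertain.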