This repository contains code replicating the exercises and examples from the book Reinforcement Learning: An Introduction (2nd Edition) by Richard S. Sutton and Andrew G. Barto. Code is written in Julia or Python. Having both implementations for a chapter is ideal, but time constraints mean this is not always possible. Where available, the .tex files and generated PDFs for the presentation slides are included in the relevant chapter folders.
When contributing, please add your name to the list of contributors and the MIT license. Remember to pull before pushing, and work inside the folder for your chapter. If you are adding a new chapter, create its folder first, then add the relevant code from Shangtong Zhang's repository so that we can compare against it. If you've never used git before, email me and I can help.
Contributors:
Python code for the Tic-Tac-Toe example, with Julia code replicating and extending those results. Python code from Shangtong Zhang, Julia code and slides written by Gabe.
Python code for the 10-armed bandit example, with Julia code replicating those results. Python code from Shangtong Zhang, Julia code and slides written by Gabe.
Python code for the finite Markov decision process example, with slides explaining the code and environment. Python code from Shangtong Zhang, slides written by Finn. Julia code extending the Python code written by Gabe.
Python code for dynamic programming examples (car rental and gambler's problem), with slides explaining the code and environment. Python code from Shangtong Zhang, slides written by Finn. Julia code extending the Python code written by Gabe.
Python code for the Monte Carlo examples (blackjack), with slides explaining the code and environment. Python code from Shangtong Zhang, extended by Ruqing and Yurou. Slides written by Ruqing and Yurou. Julia code extending the Python code written by Gabe.
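For readers new to the material, here is a rough idea of what the chapter code looks like, using the 10-armed bandit example as the illustration. This is a minimal sketch and is not taken from the Python or Julia code in this repository. The function name `run_bandit` and its parameters are hypothetical. It assumes an epsilon-greedy agent with sample-average estimates on a stationary bandit with Gaussian rewards.

```python
import random


def run_bandit(n_arms=10, steps=1000, epsilon=0.1, seed=0):
    """Run one epsilon-greedy agent on a stationary Gaussian bandit.

    Hypothetical helper for illustration only; not part of this repository.
    """
    rng = random.Random(seed)
    # True action values, drawn once from a unit normal distribution.
    true_values = [rng.gauss(0.0, 1.0) for _ in range(n_arms)]
    estimates = [0.0] * n_arms  # sample-average value estimates
    counts = [0] * n_arms       # number of times each arm was pulled
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            action = rng.randrange(n_arms)
        else:
            # Exploit: pick a greedy arm, breaking ties at random.
            best = max(estimates)
            greedy = [a for a, q in enumerate(estimates) if q == best]
            action = rng.choice(greedy)

        # Reward is noisy around the chosen arm's true value.
        reward = rng.gauss(true_values[action], 1.0)
        counts[action] += 1
        # Incremental update of the sample average for this arm.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    return true_values, estimates, counts, total_reward / steps
```

Calling `run_bandit(epsilon=0.0)` and `run_bandit(epsilon=0.1)` with the same seed gives a quick way to compare greedy and exploring agents on one bandit instance. Averaging over many seeds would be needed for a meaningful comparison.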