- Homework assignments and some algorithms from my reinforcement learning class
- Monte Carlo (MC) and temporal-difference (TD) methods:
  - MC-nonstationary
  - MC-incremental
  - TD(0)
  - TD-forward(0.5)
  - TD-backward(0.5)
- Sarsa (state-action-reward-state-action) and Q-learning, on Windy Gridworld with and without King’s Moves respectively (from the Reinforcement Learning textbook):
  - Sarsa (done)
  - Q-learning
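
For orientation, here is a minimal sketch of how tabular Sarsa and Q-learning differ on a Windy Gridworld. It is not the code from these assignments. The grid layout (7x10, start `(3, 0)`, goal `(3, 7)`, the per-column wind strengths, reward of -1 per step) is assumed to match the usual textbook setup, and every name (`step`, `train`, `greedy_path_length`, `KING_MOVES`, and so on) as well as the hyperparameter defaults are hypothetical choices made for illustration.

```python
import random

# Assumed layout: 7x10 grid, wind pushes the agent upward by WIND[col] cells.
ROWS, COLS = 7, 10
WIND = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]
START, GOAL = (3, 0), (3, 7)
FOUR_MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]                  # up, down, left, right
KING_MOVES = FOUR_MOVES + [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # plus diagonals


def step(state, move):
    """Apply a move plus the wind of the current column; every step costs -1."""
    row, col = state
    d_row, d_col = move
    new_row = min(max(row + d_row - WIND[col], 0), ROWS - 1)
    new_col = min(max(col + d_col, 0), COLS - 1)
    next_state = (new_row, new_col)
    return next_state, -1, next_state == GOAL


def epsilon_greedy(q, state, n_actions, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = [q[(state, a)] for a in range(n_actions)]
    return values.index(max(values))  # ties go to the lowest action index


def train(method, moves, episodes=500, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular control; method is 'sarsa' or 'q_learning' (hypothetical labels)."""
    if method not in ("sarsa", "q_learning"):
        raise ValueError(f"unknown method: {method}")
    rng = random.Random(seed)
    n = len(moves)
    q = {((r, c), a): 0.0 for r in range(ROWS) for c in range(COLS) for a in range(n)}
    for _ in range(episodes):
        state = START
        action = epsilon_greedy(q, state, n, epsilon, rng)
        done = False
        while not done:
            next_state, reward, done = step(state, moves[action])
            if method == "sarsa":
                # On-policy: bootstrap from the action that will actually be taken.
                next_action = epsilon_greedy(q, next_state, n, epsilon, rng)
                bootstrap = q[(next_state, next_action)]
            else:
                # Off-policy: bootstrap from the best action in the next state.
                bootstrap = max(q[(next_state, a)] for a in range(n))
            target = reward + (0.0 if done else gamma * bootstrap)
            q[(state, action)] += alpha * (target - q[(state, action)])
            if method == "q_learning":
                # Behaviour action is chosen after the update.
                next_action = epsilon_greedy(q, next_state, n, epsilon, rng)
            state, action = next_state, next_action
    return q


def greedy_path_length(q, moves, max_steps=1000):
    """Follow the greedy policy from START; returns max_steps if the goal is not reached."""
    state, steps = START, 0
    while state != GOAL and steps < max_steps:
        values = [q[(state, a)] for a in range(len(moves))]
        state, _, _ = step(state, moves[values.index(max(values))])
        steps += 1
    return steps


if __name__ == "__main__":
    for name, moves in (("sarsa", KING_MOVES), ("q_learning", FOUR_MOVES)):
        q_table = train(name, moves)
        print(name, "greedy path length:", greedy_path_length(q_table, moves))
```

The only difference between the two methods in this sketch is the bootstrap term: Sarsa uses the value of the action it is about to take, while Q-learning uses the maximum over actions in the next state. Passing `KING_MOVES` or `FOUR_MOVES` to `train` switches between the two action sets.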