Exercise 04

In this exercise, we use the included racetrack_environment to write our first reinforcement learning algorithm: Monte-Carlo learning, in both an on-policy and an off-policy variant.

Tasks:

  1. policy evaluation using first-visit Monte-Carlo
  2. on-policy epsilon-greedy control using first-visit Monte-Carlo
  3. off-policy epsilon-greedy control with weighted importance sampling Monte-Carlo
  4. extra challenge
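
As a rough orientation for tasks 1 and 2, the following minimal sketch shows first-visit Monte-Carlo averaging of returns and epsilon-greedy action selection. It does not use the racetrack_environment API: the episode format (a list of `(state, reward)` pairs) and the function names `first_visit_mc_evaluation` and `epsilon_greedy_action` are assumptions made purely for illustration.

```python
import random
from collections import defaultdict


def first_visit_mc_evaluation(episodes, gamma=1.0):
    """Estimate V(s) by averaging first-visit returns.

    Assumed (hypothetical) format: each episode is a list of
    (state, reward) pairs, where reward is received after leaving state.
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        # Record the time step at which each state is first visited.
        first_visit = {}
        for t, (state, _) in enumerate(episode):
            first_visit.setdefault(state, t)
        g = 0.0
        # Walk backwards so the discounted return accumulates step by step.
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            g = gamma * g + reward
            # Only the first visit of a state contributes to its average.
            if first_visit[state] == t:
                returns_sum[state] += g
                returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Toy data, not taken from the racetrack: a reward of -1 per step.
toy_episodes = [[("A", -1), ("B", -1), ("A", -1)], [("B", -1)]]
values = first_visit_mc_evaluation(toy_episodes)
greedy = epsilon_greedy_action([0.1, 0.9, 0.3], epsilon=0.0)
```

In the toy data, state "A" occurs twice in the first episode, but only the return from its first occurrence enters the average. This is what distinguishes first-visit from every-visit Monte-Carlo.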

(Source: https://media.giphy.com/media/UqZ4imFIoljlr5O2sM/giphy.gif)