Exercise 05

In this exercise we will revisit the included racetrack_environment to have a look at temporal difference (TD) algorithms.

Tasks:

  1. policy evaluation using TD learning
  2. on-policy epsilon-greedy control using TD learning
  3. off-policy epsilon-greedy control using TD learning → Q-learning
  4. using double Q-learning in stochastic environments
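
To make the four tasks concrete, the tabular update rules behind them can be sketched as follows. This is a minimal illustration and not the exercise's reference solution. It assumes states and actions are plain integer indices into Python lists, and every function and parameter name (`td0_update`, `sarsa_update`, `q_learning_update`, `double_q_update`, `alpha`, `gamma`) is hypothetical rather than taken from racetrack_environment.

```python
import random

def td0_update(V, s, r, s_next, done, alpha=0.1, gamma=1.0):
    """Task 1, TD(0) policy evaluation: move V[s] toward r + gamma * V[s_next]."""
    target = r + (0.0 if done else gamma * V[s_next])
    V[s] += alpha * (target - V[s])

def epsilon_greedy(q_row, epsilon, rng=random):
    """Random action with probability epsilon, otherwise a greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])

def sarsa_update(Q, s, a, r, s_next, a_next, done, alpha=0.1, gamma=1.0):
    """Task 2, on-policy control: bootstrap from the action actually taken next."""
    target = r + (0.0 if done else gamma * Q[s_next][a_next])
    Q[s][a] += alpha * (target - Q[s][a])

def q_learning_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=1.0):
    """Task 3, off-policy control: bootstrap from the greedy action value."""
    target = r + (0.0 if done else gamma * max(Q[s_next]))
    Q[s][a] += alpha * (target - Q[s][a])

def double_q_update(QA, QB, s, a, r, s_next, done,
                    alpha=0.1, gamma=1.0, rng=random):
    """Task 4, double Q-learning: one table selects the action, the other evaluates it."""
    # Flip a coin to decide which table gets updated in this step.
    if rng.random() < 0.5:
        QA, QB = QB, QA
    # Select the greedy action with the table being updated ...
    best = max(range(len(QA[s_next])), key=lambda b: QA[s_next][b])
    # ... but take its value from the other table to reduce maximization bias.
    target = r + (0.0 if done else gamma * QB[s_next][best])
    QA[s][a] += alpha * (target - QA[s][a])
```

The only difference between the SARSA and Q-learning sketches is the bootstrap term: `Q[s_next][a_next]` follows the behavior policy, while `max(Q[s_next])` assumes greedy behavior from the next state onward. The double Q-learning variant keeps two tables so that action selection and action evaluation use independent estimates, which matters most when rewards or transitions are noisy.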