This project finds the optimal policy for escaping a maze with three reinforcement learning methods: one model-based (Policy Iteration) and two model-free (Q-learning and SARSA).

Start state: top-left corner. Goal state: bottom-right corner.
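As a rough illustration of the model-based approach, here is a minimal Policy Iteration sketch in Python. It is not the project's code: the 4x4 `MAZE` layout, the reward of -1 per move, the discount factor `GAMMA`, and the helper names `step`, `policy_iteration`, and `greedy_path` are all assumptions made for this example. Only the start (top-left) and goal (bottom-right) positions come from the description above.

```python
# Hypothetical 4x4 maze: 0 = free cell, 1 = wall. The layout is an assumption.
MAZE = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
START, GOAL = (0, 0), (3, 3)                  # top-left, bottom-right
GAMMA = 0.9                                   # assumed discount factor


def step(state, action):
    """Known deterministic model: hitting a wall or the border leaves the agent in place."""
    if state == GOAL:
        return state, 0.0  # the goal is absorbing and costs nothing
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])) or MAZE[r][c] == 1:
        r, c = state
    return (r, c), -1.0  # assumed reward of -1 per move


def policy_iteration(tol=1e-8):
    states = [(r, c) for r in range(len(MAZE))
              for c in range(len(MAZE[0])) if MAZE[r][c] == 0]
    policy = {s: 0 for s in states}   # start with "always go up"
    value = {s: 0.0 for s in states}
    while True:
        # Policy evaluation: sweep until the value function stops changing.
        while True:
            delta = 0.0
            for s in states:
                nxt, reward = step(s, ACTIONS[policy[s]])
                new_v = reward + GAMMA * value[nxt]
                delta = max(delta, abs(new_v - value[s]))
                value[s] = new_v
            if delta < tol:
                break
        # Policy improvement: switch only when an action is strictly better.
        stable = True
        for s in states:
            returns = []
            for a in ACTIONS:
                nxt, reward = step(s, a)
                returns.append(reward + GAMMA * value[nxt])
            best = max(range(len(ACTIONS)), key=lambda i: returns[i])
            if returns[best] > returns[policy[s]] + 1e-6:
                policy[s] = best
                stable = False
        if stable:
            return policy, value


def greedy_path(policy, max_steps=50):
    """Follow the policy from START until GOAL (or give up after max_steps)."""
    path, state = [START], START
    while state != GOAL and len(path) <= max_steps:
        state, _ = step(state, ACTIONS[policy[state]])
        path.append(state)
    return path


if __name__ == "__main__":
    policy, value = policy_iteration()
    print(greedy_path(policy))
```

Policy Iteration needs the transition model (`step` here) up front, which is what makes it model-based. The sketch alternates full evaluation of the current policy with greedy improvement until no action changes.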

Policy Iteration result:

[Image: Policy Iteration result]

Q-Learning result:

[Image: Q-learning result]

SARSA result:

[Image: SARSA result]
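The two model-free methods differ only in the target they bootstrap from: Q-learning uses the best action value in the next state, while SARSA uses the value of the action the agent actually takes next. The sketch below shows that difference in tabular form. It is an illustration under stated assumptions, not the project's implementation: the `MAZE` layout, the reward of -1 per move, the hyperparameters, the exploration schedule (epsilon annealed linearly to zero over the first half of training), and the function names are all hypothetical.

```python
import random

# Hypothetical 4x4 maze: 0 = free cell, 1 = wall. The layout is an assumption.
MAZE = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
START, GOAL = (0, 0), (3, 3)                  # top-left, bottom-right


def step(state, action):
    """Environment step: returns (next_state, reward, done). Walls block movement."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])) or MAZE[r][c] == 1:
        r, c = state
    return (r, c), -1.0, (r, c) == GOAL  # assumed reward of -1 per move


def epsilon_greedy(q, state, epsilon, rng):
    """Random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return q[state].index(max(q[state]))


def train(method, episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.1,
          max_steps=200, seed=0):
    """Tabular learning; method is "q_learning" or "sarsa"."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(len(MAZE))
         for c in range(len(MAZE[0])) if MAZE[r][c] == 0}
    for episode in range(episodes):
        # Assumed schedule: anneal exploration to zero over the first half.
        eps = epsilon * max(0.0, 1.0 - 2.0 * episode / episodes)
        state = START
        action = epsilon_greedy(q, state, eps, rng)
        for _ in range(max_steps):
            nxt, reward, done = step(state, ACTIONS[action])
            # Sampled before the update; SARSA bootstraps from this action.
            # Q-learning is off-policy, so reusing it as the next behavior
            # action is a harmless simplification.
            nxt_action = epsilon_greedy(q, nxt, eps, rng)
            if done:
                target = reward
            elif method == "q_learning":
                target = reward + gamma * max(q[nxt])         # off-policy target
            else:
                target = reward + gamma * q[nxt][nxt_action]  # on-policy target
            q[state][action] += alpha * (target - q[state][action])
            if done:
                break
            state, action = nxt, nxt_action
    return q


def greedy_path(q, max_steps=50):
    """Follow the greedy policy from START until GOAL (or give up)."""
    path, state = [START], START
    while state != GOAL and len(path) <= max_steps:
        best = q[state].index(max(q[state]))
        state, _, _ = step(state, ACTIONS[best])
        path.append(state)
    return path


if __name__ == "__main__":
    for method in ("q_learning", "sarsa"):
        print(method, greedy_path(train(method)))
```

In a maze like this one, where no move carries extra risk, both methods would be expected to settle on the same shortest route. The on-policy versus off-policy distinction matters more in environments where exploratory moves can be costly.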