taxi-v3-learning

In this project, we tried two learning algorithms for hierarchical RL on the Taxi-v3 environment from OpenAI Gym, SMDP Q-Learning and Intra-Option Q-Learning, and contrasted them with two other methods that rely on hardcoding based on human understanding. We conclude that, for this problem, the solutions learned by the machine are superior to the human-designed ones: our algorithms outperform even the Hardcoded Agent. Intra-Option Q-Learning outperforms SMDP Q-Learning because it makes better use of each (s, a, r, s') sample (similar to experience replay). We also demonstrate that state compression is highly effective at improving model performance.
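
To make the sample-efficiency difference concrete, here is a minimal sketch of the two update rules in their standard textbook form. It is not taken from this repository's code: the function names, the dictionary-based Q-table, and the `(policy, beta)` option representation are all assumptions chosen for illustration, and option policies are assumed to be deterministic.

```python
from collections import defaultdict


def smdp_q_update(Q, s, o, rewards, s_next, n_options, alpha=0.1, gamma=0.9):
    """One SMDP Q-learning update, applied once option `o` has terminated.

    `rewards` holds the per-step rewards collected while the option ran
    from state `s` until it terminated in `s_next` (hypothetical interface).
    """
    k = len(rewards)
    # Discounted return accumulated over the k primitive steps of the option.
    discounted = sum((gamma ** i) * r for i, r in enumerate(rewards))
    best_next = max(Q[(s_next, o2)] for o2 in range(n_options))
    target = discounted + (gamma ** k) * best_next
    Q[(s, o)] += alpha * (target - Q[(s, o)])


def intra_option_q_update(Q, s, a, r, s_next, options, alpha=0.1, gamma=0.9):
    """One intra-option Q-learning update from a single primitive transition.

    `options` is a list of (policy, beta) pairs (an assumed representation):
    policy(state) returns a primitive action, and beta(state) returns the
    probability that the option terminates in that state.
    """
    best_next = max(Q[(s_next, o2)] for o2 in range(len(options)))
    targets = {}
    for o, (policy, beta) in enumerate(options):
        if policy(s) != a:
            continue  # this option would not have taken action a in state s
        b = beta(s_next)
        # Continue with the same option, or terminate and pick the best one.
        u = (1.0 - b) * Q[(s_next, o)] + b * best_next
        targets[o] = r + gamma * u
    # Apply the updates only after all targets are computed.
    for o, target in targets.items():
        Q[(s, o)] += alpha * (target - Q[(s, o)])


# Usage: a defaultdict serves as a Q-table with all values initialised to 0.
Q_smdp = defaultdict(float)
smdp_q_update(Q_smdp, s=0, o=0, rewards=[-1, -1, 20], s_next=1,
              n_options=2, alpha=0.5, gamma=0.5)

toy_options = [
    (lambda s: 0, lambda s: 1.0),  # takes action 0, always terminates
    (lambda s: 0, lambda s: 0.0),  # takes action 0, never terminates
    (lambda s: 1, lambda s: 1.0),  # takes action 1, always terminates
]
Q_intra = defaultdict(float)
intra_option_q_update(Q_intra, s=0, a=0, r=-1, s_next=1,
                      options=toy_options, alpha=0.5, gamma=0.9)
```

In this sketch, the SMDP rule updates one Q-value per completed option, while the intra-option rule updates every option consistent with the action taken, on every primitive step. That is the sense in which a single (s, a, r, s') sample is used more fully.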