Reinforcement-Learning-CS6900

Programming assignments

Assignment 1

Contains implementations of epsilon-greedy, UCB, and softmax action selection for multi-armed bandit problems on the 10-armed testbed
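
For orientation, here is a minimal sketch of epsilon-greedy on a 10-armed Gaussian testbed. It is not the code from this assignment. The function name `run_epsilon_greedy`, its parameters, and the unit-variance reward model are assumptions made for illustration.

```python
import random


def run_epsilon_greedy(n_arms=10, steps=1000, epsilon=0.1, seed=0):
    """Run one epsilon-greedy bandit episode (hypothetical helper, illustration only)."""
    rng = random.Random(seed)
    # Assumed testbed: true values q*(a) ~ N(0, 1), rewards ~ N(q*(a), 1).
    true_values = [rng.gauss(0.0, 1.0) for _ in range(n_arms)]
    estimates = [0.0] * n_arms  # sample-average estimates Q(a)
    counts = [0] * n_arms       # number of times each arm was pulled
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick a greedy arm, breaking ties at random.
            best = max(estimates)
            arm = rng.choice([a for a, q in enumerate(estimates) if q == best])

        reward = rng.gauss(true_values[arm], 1.0)
        counts[arm] += 1
        # Incremental sample-average update of the estimate.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return true_values, estimates, counts, total_reward / steps
```

Calling `run_epsilon_greedy(epsilon=0.1)` returns the true arm values, the learned estimates, the pull counts, and the average reward per step, so runs with different `epsilon` values can be compared.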

Assignment 2

Contains implementations of SARSA, SARSA(λ), Monte Carlo policy gradient, and linear function approximators
gym_pdw contains a custom puddle world environment built with OpenAI Gym
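
As a rough illustration of the tabular SARSA update, the sketch below applies one on-policy temporal-difference step to a dictionary-backed Q-table. It is not the code from this assignment. The helper name `sarsa_update`, the `(state, action)` key layout, and the default step size and discount are assumptions.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9, done=False):
    """Apply one SARSA update in place (hypothetical helper, illustration only)."""
    # On-policy target: bootstrap from the action actually chosen in next_state.
    if done:
        target = reward
    else:
        target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Q-table that defaults unseen (state, action) pairs to 0.0.
q_table = defaultdict(float)
```

States and actions can be any hashable values, for example grid coordinates and integer action ids in a puddle world.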

Assignment 3

Contains implementations of DQN for the CartPole problem and SMDP Q-learning on a four-room grid environment
grid-worlds contains a custom four-room grid environment built with OpenAI Gym
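
As a rough illustration of SMDP Q-learning, the sketch below updates the value of an option after it has run for several primitive steps. It is not the code from this assignment. The helper name `smdp_q_update`, the `(state, option)` key layout, and the representation of an option's outcome as a list of per-step rewards are assumptions.

```python
from collections import defaultdict


def smdp_q_update(q, state, option, rewards, next_state, options,
                  alpha=0.1, gamma=0.9, done=False):
    """Apply one SMDP Q-learning update in place (hypothetical helper)."""
    k = len(rewards)  # number of primitive steps the option ran for
    # Discounted return accumulated while the option was executing.
    discounted = sum((gamma ** i) * r for i, r in enumerate(rewards))
    # Bootstrap from the best option in the state where the option terminated.
    bootstrap = 0.0 if done else max(q[(next_state, o)] for o in options)
    target = discounted + (gamma ** k) * bootstrap
    q[(state, option)] += alpha * (target - q[(state, option)])
    return q[(state, option)]


# Q-table over (state, option) pairs, defaulting to 0.0.
q_table = defaultdict(float)
```

The only difference from one-step Q-learning is that the bootstrap term is discounted by `gamma ** k` rather than `gamma`, since `k` time steps elapse between decisions.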