- FrozenLake (GridWorld)
- WindyGridWorld (from Sutton's book)
DQN (NIPS 2013) uses Experience Replay Memory and a CNN.
- CartPole (Classic Control): for CartPole, no CNN is used; the agent learns from the sensor information instead.
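
For readers unfamiliar with the Experience Replay Memory mentioned above, here is a minimal sketch of the idea: store transitions in a fixed-size buffer and sample random mini-batches from it for training. It is not the code used in this repository; the names `ReplayMemory`, `Transition`, `push`, and `sample` are assumptions made for illustration.

```python
import random
from collections import deque, namedtuple

# Hypothetical container for one environment step (names are illustrative).
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayMemory:
    """Fixed-size buffer that discards the oldest transition when full."""

    def __init__(self, capacity):
        # deque with maxlen drops items from the left once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks the temporal
        # correlation between consecutive transitions.
        batch = random.sample(list(self.buffer), batch_size)
        # Regroup a list of transitions into one Transition of tuples.
        return Transition(*zip(*batch))

    def __len__(self):
        return len(self.buffer)
```

In a CartPole-style setting, `state` would be the low-dimensional sensor vector; for Atari it would be a stack of preprocessed frames fed to the CNN.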
DQN (Nature 2015) uses Experience Replay Memory, a Target Network, and a CNN.
- CartPole (Classic Control)
- Breakout (Atari)
- Breakout (Atari)
- This code is written in PyTorch and is more efficient in memory use and training.
- episodic
- one-step
- n-step
- CartPole (Classic Control) (uses a single thread instead of multiple threads)
- CartPole (Classic Control) (uses multiprocessing in PyTorch)
- Super Mario Bros (uses multiprocessing in PyTorch)