By: NVIDIA Deep Learning Institute and Unity.
The main projects use rich simulation environments from Unity ML-Agents; several of the smaller exercises use OpenAI Gym environments instead.
- The Taxi Problem: Trained a taxi to pick up and drop off passengers.
- Navigation: Trained an agent to collect yellow bananas while avoiding blue bananas.
- Continuous Control: Trained a robotic arm to reach target locations.
- Collaboration and Competition: Trained a pair of agents to play tennis!
- Dynamic Programming: Implemented Dynamic Programming algorithms such as Policy Evaluation, Policy Improvement, Policy Iteration, and Value Iteration (value-iteration sketch below).
- Monte Carlo: Implemented Monte Carlo methods for prediction and control (sketched below).
- Temporal-Difference: Implemented Temporal-Difference methods such as Sarsa, Q-Learning, and Expected Sarsa (Q-learning sketch below).
- Discretization: Learned how to discretize continuous state spaces and solve the Mountain Car environment.
- Tile Coding: Implemented a method for discretizing continuous state spaces that enables better generalization (sketched below).
- Deep Q-Network: Explored how to use a Deep Q-Network (DQN) to navigate a space vehicle without crashing.
- Hill Climbing: Used hill climbing with adaptive noise scaling to balance a pole on a moving cart (sketched below).
- Cross-Entropy Method: Used the cross-entropy method to train a car to navigate a steep hill (sketched below).
- REINFORCE: Learned how to use Monte Carlo Policy Gradients to solve a classic control task (sketched below).
- Proximal Policy Optimization: Explored how to use Proximal Policy Optimization (PPO) to solve a classic reinforcement learning task.
- Deep Deterministic Policy Gradients: Explored how to use Deep Deterministic Policy Gradients (DDPG) with OpenAI Gym environments.
- Finance: Trained an agent to discover optimal trading strategies (tutorial from the NVIDIA Deep Learning Institute).
- AlphaZero Tic Tac Toe: Trained an agent to play Tic Tac Toe using the AlphaZero algorithm.
- Multi-Agents: Trained multiple agents to solve the Physical Deception problem.
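
For the Dynamic Programming project, a minimal value-iteration sketch is shown below. It assumes the MDP model is available as a transition table `P[s][a]` of `(probability, next_state, reward, done)` tuples, the layout Gym's toy-text environments expose through their `P` attribute; the two-state MDP at the bottom is purely illustrative.

```python
# Minimal value-iteration sketch, assuming a tabular MDP model
# P[s][a] = list of (prob, next_state, reward, done) tuples.
import numpy as np

def value_iteration(P, gamma=0.99, theta=1e-8):
    n_states = len(P)
    V = np.zeros(n_states)
    while True:
        delta = 0.0
        for s in range(n_states):
            # One-step lookahead: value of the best action from state s.
            q = [sum(p * (r + gamma * V[s2] * (not done))
                     for p, s2, r, done in P[s][a]) for a in P[s]]
            best = max(q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Greedy policy with respect to the converged value function.
    policy = {s: int(np.argmax([sum(p * (r + gamma * V[s2] * (not done))
                                    for p, s2, r, done in P[s][a]) for a in P[s]]))
              for s in range(n_states)}
    return V, policy

# Tiny two-state example MDP (illustrative values only).
P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 1.0, False)]},
    1: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 1.0, False)]},
}
V, policy = value_iteration(P)
print(V, policy)
```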
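
For the Monte Carlo project, the sketch below illustrates first-visit MC prediction on a toy random-walk chain. The chain, its reward scheme, and the episode count are stand-ins chosen so the example runs on its own; they are not the project's actual environment or settings.

```python
# Minimal first-visit Monte Carlo prediction sketch on a random-walk chain:
# states 0..6, start at 3, terminate at either end, reward +1 only when
# reaching the right end (state 6).
import random
from collections import defaultdict

def run_episode():
    """Random policy on the chain; returns a list of (state, reward) pairs."""
    s, episode = 3, []
    while 0 < s < 6:
        s2 = s + random.choice([-1, 1])
        r = 1.0 if s2 == 6 else 0.0
        episode.append((s, r))
        s = s2
    return episode

def mc_prediction(n_episodes=50_000, gamma=1.0):
    returns = defaultdict(list)
    for _ in range(n_episodes):
        episode = run_episode()
        # Record the index of the first visit to each state.
        first_visit = {}
        for t, (s, _) in enumerate(episode):
            first_visit.setdefault(s, t)
        # Walk backwards accumulating the return G_t.
        G = 0.0
        for t in reversed(range(len(episode))):
            s, r = episode[t]
            G = gamma * G + r
            if first_visit[s] == t:
                returns[s].append(G)
    return {s: sum(v) / len(v) for s, v in returns.items()}

print(mc_prediction())  # true values are 1/6, 2/6, ..., 5/6 for states 1..5
```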
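
For the Temporal-Difference project, the sketch below shows the tabular Q-learning (sarsamax) update on a toy corridor. The corridor and all hyperparameters are illustrative stand-ins for an actual Gym environment.

```python
# Minimal tabular Q-learning sketch on a toy corridor: move left or right,
# -1 reward per step, episode ends at state 5.
import random
from collections import defaultdict

N_STATES, GOAL, ACTIONS = 6, 5, (0, 1)   # action 0 = left, 1 = right

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, -1.0, s2 == GOAL           # next state, reward, done

def q_learning(n_episodes=500, alpha=0.1, gamma=1.0, eps=0.1):
    Q = defaultdict(lambda: [0.0, 0.0])
    for _ in range(n_episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection.
            a = random.choice(ACTIONS) if random.random() < eps \
                else max(ACTIONS, key=lambda act: Q[s][act])
            s2, r, done = step(s, a)
            # Q-learning update: bootstrap off the greedy next-state value.
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

Q = q_learning()
print([max(Q[s]) for s in range(N_STATES)])  # roughly -5, -4, -3, -2, -1, 0
```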
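
For the Tile Coding project, the sketch below builds several offset tilings over a continuous 2-D state space so that nearby states share most of their active tiles. The bounds correspond to Gym's Mountain Car state space; the number of tilings and bins are illustrative.

```python
# Minimal tile-coding sketch: several overlapping grids ("tilings"), each
# offset slightly, so a continuous state activates one tile per tiling.
import numpy as np

def create_tilings(low, high, n_tilings=8, bins=10):
    low, high = np.asarray(low, float), np.asarray(high, float)
    width = (high - low) / bins
    # Each tiling is shifted by a fraction of one tile width per dimension.
    offsets = [low - width * i / n_tilings for i in range(n_tilings)]
    return [(off, width) for off in offsets]

def encode(state, tilings, bins=10):
    """Return the active tile coordinates (one tuple per tiling) for a state."""
    state = np.asarray(state, float)
    active = []
    for off, width in tilings:
        idx = np.clip(((state - off) // width).astype(int), 0, bins - 1)
        active.append(tuple(idx))
    return active

tilings = create_tilings(low=[-1.2, -0.07], high=[0.6, 0.07])  # Mountain Car bounds
print(encode([-0.5, 0.0], tilings))   # one (position, velocity) tile per tiling
```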
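
For the Hill Climbing project, the sketch below shows hill climbing with adaptive noise scaling: keep a perturbation when it improves the return and shrink the noise, otherwise widen the search. The real project scores a policy by balancing the pole for an episode; here `evaluate()` is a stand-in objective so the loop is self-contained.

```python
# Minimal hill-climbing sketch with adaptive noise scaling.
import numpy as np

rng = np.random.default_rng(0)
TARGET = rng.normal(size=8)              # hidden "best weights" for the demo

def evaluate(w):
    """Stand-in for one episode's return: higher is better."""
    return -np.sum((w - TARGET) ** 2)

def hill_climbing(n_iters=200, noise=1.0, noise_min=1e-3, noise_max=2.0):
    best_w = np.zeros_like(TARGET)
    best_return = evaluate(best_w)
    for _ in range(n_iters):
        w = best_w + noise * rng.normal(size=best_w.shape)  # perturb weights
        ret = evaluate(w)
        if ret > best_return:            # improvement: keep it and shrink noise
            best_w, best_return = w, ret
            noise = max(noise_min, noise / 2)
        else:                            # no improvement: widen the search
            noise = min(noise_max, noise * 2)
    return best_w, best_return

w, ret = hill_climbing()
print(ret)   # moves toward 0 as w approaches the hidden target
```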
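
For the Cross-Entropy Method project, the sketch below samples a population of candidate weights, keeps the elite fraction with the highest returns, and refits the sampling mean to them. `score()` is again a stand-in for an episode's return, so the loop runs without an environment.

```python
# Minimal cross-entropy-method sketch with a fixed sampling width.
import numpy as np

rng = np.random.default_rng(0)
TARGET = rng.normal(size=6)

def score(w):
    return -np.sum((w - TARGET) ** 2)        # stand-in episodic return

def cem(n_iters=50, pop_size=50, elite_frac=0.2, sigma=0.5):
    n_elite = int(pop_size * elite_frac)
    mean = np.zeros_like(TARGET)
    for _ in range(n_iters):
        # Sample a population of candidate weights around the current mean.
        pop = mean + sigma * rng.normal(size=(pop_size, mean.size))
        returns = np.array([score(w) for w in pop])
        elite = pop[np.argsort(returns)[-n_elite:]]   # best candidates
        mean = elite.mean(axis=0)                     # refit the sampling mean
    return mean

print(score(cem()))   # near 0 once the mean has settled around the hidden target
```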
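
For the REINFORCE project, the sketch below implements the Monte Carlo policy-gradient loss in PyTorch on the same toy corridor as the Q-learning sketch. The network size, learning rate, and episode count are illustrative, not the project's settings.

```python
# Minimal REINFORCE (Monte Carlo policy gradient) sketch on a toy corridor:
# one-hot states, move left or right, -1 reward per step, goal at state 5.
import torch
import torch.nn as nn

N_STATES, GOAL, GAMMA = 6, 5, 1.0
policy = nn.Sequential(nn.Linear(N_STATES, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)

def one_hot(s):
    x = torch.zeros(N_STATES)
    x[s] = 1.0
    return x

def run_episode(max_steps=50):
    s, log_probs, rewards = 0, [], []
    for _ in range(max_steps):
        dist = torch.distributions.Categorical(logits=policy(one_hot(s)))
        a = dist.sample()
        log_probs.append(dist.log_prob(a))
        s = max(0, s - 1) if a.item() == 0 else min(GOAL, s + 1)
        rewards.append(-1.0)
        if s == GOAL:
            break
    return log_probs, rewards

for episode in range(500):
    log_probs, rewards = run_episode()
    # Discounted return G_t for every time step, computed backwards.
    returns, G = [], 0.0
    for r in reversed(rewards):
        G = r + GAMMA * G
        returns.insert(0, G)
    returns = torch.tensor(returns)
    # Policy-gradient loss: -sum_t log pi(a_t|s_t) * G_t
    loss = -(torch.stack(log_probs) * returns).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(sum(run_episode()[1]))   # tends toward -5 (the shortest path) with training
```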