By: NVIDIA Deep Learning Institute and Unity.
The main projects use rich simulation environments from Unity ML-Agents; several of the smaller exercises use OpenAI Gym environments instead.
- The Taxi Problem: Trained a taxi to pick up and drop off passengers.
- Navigation: Trained an agent to collect yellow bananas while avoiding blue bananas.
- Continuous Control: Trained a robotic arm to reach target locations.
- Collaboration and Competition: Trained a pair of agents to play tennis!
- Dynamic Programming: Implemented Dynamic Programming algorithms such as Policy Evaluation, Policy Improvement, Policy Iteration, and Value Iteration (value-iteration sketch below).
- Monte Carlo: Implemented Monte Carlo methods for prediction and control (sketched below).
- Temporal-Difference: Implemented Temporal-Difference methods such as Sarsa, Q-Learning, and Expected Sarsa (Q-learning sketch below).
- Discretization: Learned how to discretize continuous state spaces and solve the Mountain Car environment.
- Tile Coding: Implemented a method for discretizing continuous state spaces that enables better generalization (sketched below).
- Deep Q-Network: Explored how to use a Deep Q-Network (DQN) to navigate a space vehicle without crashing.
- Hill Climbing: Used hill climbing with adaptive noise scaling to balance a pole on a moving cart (sketched below).
- Cross-Entropy Method: Used the cross-entropy method to train a car to navigate a steep hill (sketched below).
- REINFORCE: Learned how to use Monte Carlo Policy Gradients to solve a classic control task (sketched below).
- Proximal Policy Optimization: Explored how to use Proximal Policy Optimization (PPO) to solve a classic reinforcement learning task.
- Deep Deterministic Policy Gradients: Explored how to use Deep Deterministic Policy Gradients (DDPG) with OpenAI Gym environments.
- Finance: Trained an agent to discover optimal trading strategies (tutorial from the NVIDIA Deep Learning Institute).
- AlphaZero Tic Tac Toe: Trained an agent to play Tic Tac Toe using the AlphaZero algorithm.
- Multi-Agents: Trained multiple agents to solve the Physical Deception problem.
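
For the Dynamic Programming project, a minimal value-iteration sketch is shown below. It assumes the MDP model is available as a transition table `P[s][a]` of `(probability, next_state, reward, done)` tuples, the layout Gym's toy-text environments expose through their `P` attribute; the two-state MDP at the bottom is purely illustrative.

```python
# Minimal value-iteration sketch, assuming a tabular MDP model
# P[s][a] = list of (prob, next_state, reward, done) tuples.
import numpy as np

def value_iteration(P, gamma=0.99, theta=1e-8):
    n_states = len(P)
    V = np.zeros(n_states)
    while True:
        delta = 0.0
        for s in range(n_states):
            # One-step lookahead: value of the best action from state s.
            q = [sum(p * (r + gamma * V[s2] * (not done))
                     for p, s2, r, done in P[s][a]) for a in P[s]]
            best = max(q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Greedy policy with respect to the converged value function.
    policy = {s: int(np.argmax([sum(p * (r + gamma * V[s2] * (not done))
                                    for p, s2, r, done in P[s][a]) for a in P[s]]))
              for s in range(n_states)}
    return V, policy

# Tiny two-state example MDP (illustrative values only).
P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 1.0, False)]},
    1: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 1.0, False)]},
}
V, policy = value_iteration(P)
print(V, policy)
```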
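
For the Monte Carlo project, the sketch below illustrates first-visit MC prediction on a toy random-walk chain. The chain, its reward scheme, and the episode count are stand-ins chosen so the example runs on its own; they are not the project's actual environment or settings.

```python
# Minimal first-visit Monte Carlo prediction sketch on a random-walk chain:
# states 0..6, start at 3, terminate at either end, reward +1 only when
# reaching the right end (state 6).
import random
from collections import defaultdict

def run_episode():
    """Random policy on the chain; returns a list of (state, reward) pairs."""
    s, episode = 3, []
    while 0 < s < 6:
        s2 = s + random.choice([-1, 1])
        r = 1.0 if s2 == 6 else 0.0
        episode.append((s, r))
        s = s2
    return episode

def mc_prediction(n_episodes=50_000, gamma=1.0):
    returns = defaultdict(list)
    for _ in range(n_episodes):
        episode = run_episode()
        # Record the index of the first visit to each state.
        first_visit = {}
        for t, (s, _) in enumerate(episode):
            first_visit.setdefault(s, t)
        # Walk backwards accumulating the return G_t.
        G = 0.0
        for t in reversed(range(len(episode))):
            s, r = episode[t]
            G = gamma * G + r
            if first_visit[s] == t:
                returns[s].append(G)
    return {s: sum(v) / len(v) for s, v in returns.items()}

print(mc_prediction())  # true values are 1/6, 2/6, ..., 5/6 for states 1..5
```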
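
For the Temporal-Difference project, the sketch below shows the tabular Q-learning (sarsamax) update on a toy corridor. The corridor and all hyperparameters are illustrative stand-ins for an actual Gym environment.

```python
# Minimal tabular Q-learning sketch on a toy corridor: move left or right,
# -1 reward per step, episode ends at state 5.
import random
from collections import defaultdict

N_STATES, GOAL, ACTIONS = 6, 5, (0, 1)   # action 0 = left, 1 = right

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, -1.0, s2 == GOAL           # next state, reward, done

def q_learning(n_episodes=500, alpha=0.1, gamma=1.0, eps=0.1):
    Q = defaultdict(lambda: [0.0, 0.0])
    for _ in range(n_episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection.
            a = random.choice(ACTIONS) if random.random() < eps \
                else max(ACTIONS, key=lambda act: Q[s][act])
            s2, r, done = step(s, a)
            # Q-learning update: bootstrap off the greedy next-state value.
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

Q = q_learning()
print([max(Q[s]) for s in range(N_STATES)])  # roughly -5, -4, -3, -2, -1, 0
```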
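
For the Tile Coding project, the sketch below builds several offset tilings over a continuous 2-D state space so that nearby states share most of their active tiles. The bounds correspond to Gym's Mountain Car state space; the number of tilings and bins are illustrative.

```python
# Minimal tile-coding sketch: several overlapping grids ("tilings"), each
# offset slightly, so a continuous state activates one tile per tiling.
import numpy as np

def create_tilings(low, high, n_tilings=8, bins=10):
    low, high = np.asarray(low, float), np.asarray(high, float)
    width = (high - low) / bins
    # Each tiling is shifted by a fraction of one tile width per dimension.
    offsets = [low - width * i / n_tilings for i in range(n_tilings)]
    return [(off, width) for off in offsets]

def encode(state, tilings, bins=10):
    """Return the active tile coordinates (one tuple per tiling) for a state."""
    state = np.asarray(state, float)
    active = []
    for off, width in tilings:
        idx = np.clip(((state - off) // width).astype(int), 0, bins - 1)
        active.append(tuple(idx))
    return active

tilings = create_tilings(low=[-1.2, -0.07], high=[0.6, 0.07])  # Mountain Car bounds
print(encode([-0.5, 0.0], tilings))   # one (position, velocity) tile per tiling
```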
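
For the Hill Climbing project, the sketch below shows hill climbing with adaptive noise scaling: keep a perturbation when it improves the return and shrink the noise, otherwise widen the search. The real project scores a policy by balancing the pole for an episode; here `evaluate()` is a stand-in objective so the loop is self-contained.

```python
# Minimal hill-climbing sketch with adaptive noise scaling.
import numpy as np

rng = np.random.default_rng(0)
TARGET = rng.normal(size=8)              # hidden "best weights" for the demo

def evaluate(w):
    """Stand-in for one episode's return: higher is better."""
    return -np.sum((w - TARGET) ** 2)

def hill_climbing(n_iters=200, noise=1.0, noise_min=1e-3, noise_max=2.0):
    best_w = np.zeros_like(TARGET)
    best_return = evaluate(best_w)
    for _ in range(n_iters):
        w = best_w + noise * rng.normal(size=best_w.shape)  # perturb weights
        ret = evaluate(w)
        if ret > best_return:            # improvement: keep it and shrink noise
            best_w, best_return = w, ret
            noise = max(noise_min, noise / 2)
        else:                            # no improvement: widen the search
            noise = min(noise_max, noise * 2)
    return best_w, best_return

w, ret = hill_climbing()
print(ret)   # moves toward 0 as w approaches the hidden target
```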
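
For the Cross-Entropy Method project, the sketch below samples a population of candidate weights, keeps the elite fraction with the highest returns, and refits the sampling mean to them. `score()` is again a stand-in for an episode's return, so the loop runs without an environment.

```python
# Minimal cross-entropy-method sketch with a fixed sampling width.
import numpy as np

rng = np.random.default_rng(0)
TARGET = rng.normal(size=6)

def score(w):
    return -np.sum((w - TARGET) ** 2)        # stand-in episodic return

def cem(n_iters=50, pop_size=50, elite_frac=0.2, sigma=0.5):
    n_elite = int(pop_size * elite_frac)
    mean = np.zeros_like(TARGET)
    for _ in range(n_iters):
        # Sample a population of candidate weights around the current mean.
        pop = mean + sigma * rng.normal(size=(pop_size, mean.size))
        returns = np.array([score(w) for w in pop])
        elite = pop[np.argsort(returns)[-n_elite:]]   # best candidates
        mean = elite.mean(axis=0)                     # refit the sampling mean
    return mean

print(score(cem()))   # near 0 once the mean has settled around the hidden target
```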
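
For the REINFORCE project, the sketch below implements the Monte Carlo policy-gradient loss in PyTorch on the same toy corridor as the Q-learning sketch. The network size, learning rate, and episode count are illustrative, not the project's settings.

```python
# Minimal REINFORCE (Monte Carlo policy gradient) sketch on a toy corridor:
# one-hot states, move left or right, -1 reward per step, goal at state 5.
import torch
import torch.nn as nn

N_STATES, GOAL, GAMMA = 6, 5, 1.0
policy = nn.Sequential(nn.Linear(N_STATES, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)

def one_hot(s):
    x = torch.zeros(N_STATES)
    x[s] = 1.0
    return x

def run_episode(max_steps=50):
    s, log_probs, rewards = 0, [], []
    for _ in range(max_steps):
        dist = torch.distributions.Categorical(logits=policy(one_hot(s)))
        a = dist.sample()
        log_probs.append(dist.log_prob(a))
        s = max(0, s - 1) if a.item() == 0 else min(GOAL, s + 1)
        rewards.append(-1.0)
        if s == GOAL:
            break
    return log_probs, rewards

for episode in range(500):
    log_probs, rewards = run_episode()
    # Discounted return G_t for every time step, computed backwards.
    returns, G = [], 0.0
    for r in reversed(rewards):
        G = r + GAMMA * G
        returns.insert(0, G)
    returns = torch.tensor(returns)
    # Policy-gradient loss: -sum_t log pi(a_t|s_t) * G_t
    loss = -(torch.stack(log_probs) * returns).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(sum(run_episode()[1]))   # tends toward -5 (the shortest path) with training
```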