By: NVIDIA Deep Learning Institute and Unity.
All of the projects use rich simulation environments from Unity ML-Agents.
The Taxi Problem: Trained a taxi to pick up and drop off passengers.
Navigation: Trained an agent to collect yellow bananas while avoiding blue bananas.
Continuous Control: Trained a robotic arm to reach target locations.
Collaboration and Competition: Trained a pair of agents to play tennis!
Dynamic Programming: Implemented Dynamic Programming algorithms such as Policy Evaluation, Policy Improvement, Policy Iteration, and Value Iteration.
Monte Carlo: Implemented Monte Carlo methods for prediction and control.
Temporal-Difference: Implemented Temporal-Difference methods such as Sarsa, Q-Learning, and Expected Sarsa.
Discretization: Learned how to discretize continuous state spaces, and solve the Mountain Car environment.
Tile Coding: Implemented a method for discretizing continuous state spaces that enables better generalization.
Deep Q-Network: Explored how to use a Deep Q-Network (DQN) to navigate a space vehicle without crashing.
Hill Climbing: Used hill climbing with adaptive noise scaling to balance a pole on a moving cart.
Cross-Entropy Method: Used the cross-entropy method to train a car to navigate a steep hill.
REINFORCE: Learned how to use Monte Carlo Policy Gradients to solve a classic control task.
Proximal Policy Optimization: Explored how to use Proximal Policy Optimization (PPO) to solve a classic reinforcement learning task.
Deep Deterministic Policy Gradients: Explored how to use Deep Deterministic Policy Gradients (DDPG) with OpenAI Gym environments.
Finance: Trained an agent to discover optimal trading strategies (tutorial from the NVIDIA Deep Learning Institute).
AlphaZero Tic Tac Toe: Trained an agent to play Tic Tac Toe using the AlphaZero algorithm.
Multi-Agents: Trained an agent to solve the Physical Deception problem.
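As an illustration of the Discretization item above, here is a minimal sketch of uniform-grid discretization for a two-dimensional continuous state. It is not the code from the lab itself. The helper names `create_uniform_grid` and `discretize` are hypothetical, and the bounds are assumed to match the Mountain Car state space (position in [-1.2, 0.6], velocity in [-0.07, 0.07]).

```python
import numpy as np


def create_uniform_grid(low, high, bins=(10, 10)):
    """Return, for each dimension, the interior split points of a uniform grid."""
    return [
        np.linspace(low[d], high[d], bins[d] + 1)[1:-1]
        for d in range(len(bins))
    ]


def discretize(sample, grid):
    """Map a continuous sample to a tuple of integer bin indices."""
    return tuple(int(np.digitize(s, g)) for s, g in zip(sample, grid))


# Assumed Mountain Car bounds: (position, velocity).
low, high = [-1.2, -0.07], [0.6, 0.07]
grid = create_uniform_grid(low, high, bins=(10, 10))

state = discretize([-0.5, 0.01], grid)
print(state)  # (3, 5)
```

The resulting tuple can index a tabular Q-function, for example a NumPy array of shape (10, 10, number of actions), so that tabular methods such as Q-Learning apply to the continuous environment.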