A collection of notes on AWS DeepRacer Scholarship Challenge 2019.
Contributions are always welcome!
- Lesson 1: Welcome! Important Details on the AWS DeepRacer Scholarship
- Lesson 2: Get Started with AWS DeepRacer
- Lesson 3: Test Drive DeepRacer
- Lesson 4: Reinforcement Learning
- Lesson 5: Tuning Your Model
- Lesson 6: DeepRacer in the Real World
- Lesson 7: The League
- the car's tech specs, assembly and calibration
- the basics of reinforcement learning and its use in AWS DeepRacer
- building, training, and deploying your racing model using AWS on both simulated and real-world tracks
- how to compete in the DeepRacer League
- AWS DeepRacer:
- is fully autonomous
- is a 1/18th-scale racing car
- utilizes RL to learn driving habits
- Types of machine learning include:
- Supervised learning
- Unsupervised learning
- Reinforcement learning
- Unboxing Recap
AWS DeepRacer includes the following in its box:
- Car chassis (with compute and camera modules)
- Car body
- Car battery
- Car battery charger
- Car battery power adapter
- Compute module power bank
- Power bank’s connector cable
- Power adapter and power cord for compute module and power bank
- 4 pins (car chassis pins)
- 12 pins (spare parts)
- USB-A to micro USB cable
You can read more in the Getting Started Guide.
Other than the vehicle chassis, the vehicle also includes:
- an HD DeepLens camera
- expansion ports
- HDMI port
- microUSB port
- USB-C port
- on/off switch
- reset button
- LEDs
- Place your car on a block (secure it with duct tape or similar) to keep it in place while the wheels move.
- Get the IP address your vehicle was assigned when it was set up to use Wi-Fi.
- Sign in to the AWS DeepRacer console with that IP address as instructed here.
- Select Calibration, then Calibrate Steering Angle from the console (see here).
- Calibrate your center steering first. Remember: due to Ackermann steering geometry, only one wheel will actually point straight when you calibrate center steering.
- Calibrate your left steering by choosing the value where your vehicle wheels will turn no further left.
- Calibrate your right steering by choosing the value where your vehicle wheels will turn no further right.
- Place your car on a block (secure it with duct tape or similar) to keep it in place while the wheels move.
- Get the IP address your vehicle was assigned when it was set up to use Wi-Fi.
- Sign in to the AWS DeepRacer console with that IP address as instructed here.
- Select Calibration, then Calibrate Speed from the console (see here).
- Calibrate your stopped speed by moving the bar until the wheels are no longer moving.
- Calibrate the forward motion by moving the bar and checking the direction the wheels turn. If they turn clockwise, the forward direction is set. If not, select the “Reverse direction” button to switch the direction the wheels turn to go forward.
- Calibrate the forward maximum speed. Typically, it’s best not to go too far above 2 m/s, as that is the highest speed the simulator provides in its action space.
- Calibrate the maximum backward speed, which should essentially mirror the maximum forward speed (same magnitude, negative direction).
- Sign in to the console
- Click the Get Started / Create Model button on the right
- Create resources, if they aren't created yet
- Name and describe your model
- Choose an environment (re:Invent 2018 for now)
- Select action space
- Select reward function
- Tune hyperparameters
- Set stop condition
- decreasing the reward as the vehicle strays outside given distance markers from the center line (see the sketch below)
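For reference, a minimal sketch of this marker idea, along the lines of the console's default "follow the center line" example (the `params` keys shown, `track_width` and `distance_from_center`, are part of DeepRacer's documented reward-function input):

```python
def reward_function(params):
    '''Reward the agent for staying close to the center line.'''
    track_width = params['track_width']                    # width of the track
    distance_from_center = params['distance_from_center']  # distance from the center line

    # Three markers at increasing distances from the center line
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    # Give a higher reward the closer the car is to the center
    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely close to off track

    return float(reward)
```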
- Access and navigate the AWS DeepRacer console
- Identify the steps used to build a model in the AWS DeepRacer console
- Use the basic reward function in AWS DeepRacer when configuring a model
- Use the AWS DeepRacer simulator to train and evaluate a model
- Accessed and navigated within the AWS DeepRacer console
- Identified the steps that go into building a model in the AWS DeepRacer console
- Used the basic reward function in AWS DeepRacer to configure a model
- Used the AWS DeepRacer simulator to train and evaluate a model
- Supervised learning trains a model by providing a label for each input, such as classifying whether or not an image contains a bulldog.
- Unsupervised learning can use techniques like clustering to find relationships between different points of data, such as in detecting new types of fraud.
- Reinforcement learning improves through trial and error, using reward feedback from previous iterations.
- Agent - the entity exhibiting certain behaviors (actions) based on its environment. In our case, it’s our AWS DeepRacer car.
- Actions - what the agent chooses to do at certain places in the environment, such as turning, going straight, going backward, etc. Actions can be discrete or continuous.
- States - describe where in the environment the agent resides (a specific location) or what is going on in the environment (for a robotic vacuum, perhaps that its current location is also clean). By taking actions, the agent moves from one state to a new state. States can be partial or absolute.
- To begin the cycle, the agent chooses an action given its starting state in the environment. It then transitions to a new state, where it receives some reward. This continues as a cycle of choosing a new action, moving to a new state, receiving a reward, and so on, for the rest of a given training episode (a minimal version of this loop is sketched below).
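A minimal sketch of that loop, using a toy stand-in environment rather than the DeepRacer simulator (all names here are hypothetical):

```python
import random

class ToyTrack:
    """Toy environment: a 1-D 'track' where the agent moves toward a finish line."""

    def reset(self):
        return 0  # starting state: position 0

    def step(self, state, action):
        next_state = state + action           # move along the track
        reward = 1.0 if action == 1 else 0.0  # reward forward progress
        done = next_state >= 10               # episode ends at the finish line
        return next_state, reward, done

env = ToyTrack()
state, total_reward, done = env.reset(), 0.0, False
while not done:                     # one training episode
    action = random.choice([0, 1])  # agent chooses an action (0 = coast, 1 = forward)
    state, reward, done = env.step(state, action)
    total_reward += reward          # reward feedback accumulates over the episode
print("episode reward:", total_reward)
```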
- Discount factor - determines the extent to which future rewards contribute to the overall sum of expected rewards. At a factor of zero, DeepRacer would only care about the very next action and its reward; with a factor of one, it weighs future rewards as well (see the discounted-return formula after this list).
- Policy - determines what action the agent takes given a particular state. Policies are split between stochastic and deterministic policies (contrasted in the sketch after this list). Note that policy functions are often denoted by the symbol π.
- Stochastic - determines a probability for choosing a given action in a particular state (e.g. an 80% chance to go straight, 20% chance to turn left)
- Deterministic - directly maps each state to a single action.
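The discounted return that the discount factor controls is the standard RL quantity

$$G_t = r_{t+1} + \gamma r_{t+2} + \gamma^2 r_{t+3} + \dots = \sum_{k=0}^{\infty} \gamma^k \, r_{t+k+1}$$

so γ = 0 keeps only the immediate reward, while γ near 1 weighs distant rewards almost as heavily as immediate ones.

And a toy contrast between the two policy types (the action names and probabilities here are hypothetical, not DeepRacer's actual action space):

```python
import random

ACTIONS = ["straight", "left", "right"]

def stochastic_policy(state):
    # Samples from a probability distribution over actions,
    # e.g. an 80% chance to go straight, 20% to turn left.
    return random.choices(ACTIONS, weights=[0.8, 0.2, 0.0])[0]

def deterministic_policy(state):
    # Always maps the same state to the same action.
    return "straight" if state == "on_center_line" else "left"
```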
- Convolutional neural network - a neural network whose strength is determining some output from an image or set of images. In our case, it is used with the input image from the DeepRacer vehicle (whether real or simulated).
- Value functions indicate which actions you should take to maximize rewards over the long term (the expected rewards when starting from some given state). These are often represented with the capital letter V (see the formal definition after this list).
- AWS DeepRacer uses the Proximal Policy Optimization (PPO) algorithm, which trains two networks:
- Policy network (aka actor network) - decides which action to take given an input image
- Value network (aka critic network) - estimates the cumulative result given the input image
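The value function mentioned above has the standard definition: the expected discounted return when starting from state $s$ and following policy $\pi$,

$$V^{\pi}(s) = \mathbb{E}_{\pi}\left[\sum_{k=0}^{\infty} \gamma^{k} \, r_{t+k+1} \,\middle|\, s_t = s\right]$$

which is what the critic network estimates from the input image.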
- Our agent is AWS DeepRacer itself, or more specifically the neural networks controlling the vehicle.
- The environment is where the agent performs its actions; in this case, the track.
- The action is what the agent decides to do based on the current state - changing speed, turning, etc.
- The state is the situation the agent finds itself in within the environment at a given point in time.
- The reward is feedback given to the agent based on its action from the previous state. It is rewarded for doing well or penalized for doing poorly.
- You learned about how RL differs from supervised and unsupervised learning
- You walked through an example of how an RL-trained agent behaves
- You went in depth on the main aspects of an RL model, such as agents, actions, environments, states, and rewards
- You saw how each of these aspects apply to AWS DeepRacer
- Finally, you investigated some of the other use cases of RL