Description
Deep reinforcement learning algorithms are covered, ranging from Deep Q-Networks (DQN) to Deep Deterministic Policy Gradients (DDPG). Use these ideas to train agents to walk, drive, or perform other complex tasks, and you'll build a solid portfolio of deep reinforcement learning projects.
Syllabus:
Course 1: Foundations of Reinforcement Learning
Introduction to RL
- A friendly introduction to reinforcement learning.
The RL Framework: The Problem
- Learn how to define Markov Decision Processes to solve real-world problems.
The RL Framework: The Solution
- Learn about policies and value functions.
- Derive the Bellman equations.
Dynamic Programming
- Write your own implementations of iterative policy evaluation, policy improvement, policy iteration, and value iteration.
Monte Carlo Methods
- Implement classic Monte Carlo prediction and control methods.
- Learn about greedy and epsilon-greedy policies.
- Explore solutions to the Exploration-Exploitation Dilemma.
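For readers who want a concrete picture of an epsilon-greedy policy before starting the lesson, here is a minimal sketch. It is not taken from the course materials; the function name `epsilon_greedy_action` and its parameters are illustrative assumptions.

```python
import random


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Pick an action index from a list of action-value estimates.

    Hypothetical helper for illustration only.
    With probability epsilon the agent explores (uniform random action);
    otherwise it exploits (action with the highest estimated value).
    """
    if rng.random() < epsilon:
        # Explore: every action is equally likely.
        return rng.randrange(len(q_values))
    # Exploit: greedy action (ties go to the lowest index).
    return max(range(len(q_values)), key=lambda a: q_values[a])


# epsilon = 0 reduces to a purely greedy policy.
greedy_choice = epsilon_greedy_action([0.1, 0.9, 0.3], epsilon=0.0)
```

Setting `epsilon` to 0 gives the greedy policy, and setting it to 1 gives uniform random exploration; values in between trade off the two.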
Temporal-Difference Methods
- Learn the difference between the Sarsa, Q-Learning, and Expected Sarsa algorithms.
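As a rough preview of how the three algorithms differ, the sketch below shows only the update target each one uses for a non-terminal transition. It is an illustration under stated assumptions, not the course's implementation: the function names are hypothetical, `q_next` is assumed to be a list of action-value estimates for the next state, and Expected Sarsa is assumed to follow an epsilon-greedy policy.

```python
def sarsa_target(reward, gamma, q_next, next_action):
    # Sarsa (on-policy): bootstrap from the action actually taken next.
    return reward + gamma * q_next[next_action]


def q_learning_target(reward, gamma, q_next):
    # Q-Learning (off-policy): bootstrap from the best next action.
    return reward + gamma * max(q_next)


def expected_sarsa_target(reward, gamma, q_next, epsilon):
    # Expected Sarsa: bootstrap from the expected next value under an
    # epsilon-greedy policy (assumed here for illustration).
    n = len(q_next)
    greedy = max(range(n), key=lambda a: q_next[a])
    probs = [epsilon / n] * n
    probs[greedy] += 1.0 - epsilon
    return reward + gamma * sum(p * q for p, q in zip(probs, q_next))


def td_update(q_old, target, alpha):
    # Shared update rule: move the old estimate toward the target.
    return q_old + alpha * (target - q_old)


# Same transition, three different targets.
q_next = [1.0, 3.0]
targets = (
    sarsa_target(1.0, 0.5, q_next, next_action=0),
    q_learning_target(1.0, 0.5, q_next),
    expected_sarsa_target(1.0, 0.5, q_next, epsilon=0.5),
)
```

All three share the same update rule and differ only in how the next state's value is estimated. Terminal transitions, where the target is just the reward, are left out of this sketch.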
Solve OpenAI Gym’s Taxi-v2 Task
- Design your own algorithm to solve a classical problem from the research community.
RL In Continuous Spaces
- Learn how to adapt traditional algorithms to work with continuous spaces.
Course 2: Value-Based Methods
Deep Learning in PyTorch
- Learn how to build and train neural networks and convolutional neural networks in PyTorch.
Deep Q-Learning
- Extend value-based reinforcement learning methods to complex problems using deep neural networks.
- Learn how to implement a Deep Q-Network (DQN), along with Double-DQN, Dueling-DQN, and Prioritized Replay.
Deep RL for Robotics
- Learn from experts at NVIDIA how to use value-based methods in real-world robotics.
Project: Navigation
Train an agent to learn intelligent behaviours from sensory data using neural networks.
Course 3: Policy-Based Methods
Introduction to Policy-Based Methods
- Learn the theory behind evolutionary algorithms, stochastic policy search, and the REINFORCE algorithm.
- Learn how to apply the algorithms to solve a classical control problem.
Improving Policy Gradient Methods
- Learn about techniques such as Generalized Advantage Estimation (GAE) for lowering the variance of policy gradient methods.
- Explore policy optimization methods such as Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO).
Actor-Critic Methods
- Study cutting-edge algorithms such as Deep Deterministic Policy Gradients (DDPG).
Deep RL for Financial Trading
- Learn from experts at NVIDIA how to use actor-critic methods to generate optimal financial trading strategies.
Project: Continuous Control
Train a robotic arm to reach specific targets, or a four-legged virtual creature to walk.
Course 4: Multi-Agent Reinforcement Learning
Introduction to Multi-Agent RL
- Learn how to define Markov games to specify a reinforcement learning task with multiple agents.
- Explore how to train agents in collaborative and competitive settings.
Case Study: AlphaZero
- Master the skills behind DeepMind’s AlphaZero.
Project: Collaboration and Competition
Train a system of agents to collaborate or compete on a complex task.