Reinforcement Learning
Policy gradient, Q-learning, and actor-critic methods built from the Bellman equation up. Train agents in MuJoCo and Atari.
Intermediate · 14 weeks · 14 lessons