| Topic | Tags |
| --- | --- |
| ALGORITHMS Overview | CS 234 |
| OVERVIEW of methods and papers | CS 285 |
| Imitation Learning | CS 224R, CS 285 |
| RL Introduction | CS 285 |
| RL Basic Theory | CS 234, Reviewed 2024 |
| The Key Distributions | CS 234, CS 285, Reviewed 2024 |
| MDP and Bellman Theory | CS 234, Tabular, Value |
| Model-Based Tabular Control | CS 234, Tabular, Value |
| Model-Free Policy Evaluation | CS 234, Tabular, Value |
| Model-Free Control | CS 234, Tabular, Value |
| Policy Search (policy gradient) | CS 234, CS 285, Policy, Reviewed 2024 |
| Advanced Policy Search | CS 234, CS 285, Policy, Reviewed 2024 |
| Actor Critic Algorithms | CS 285, Reviewed 2024 |
| Value Function Methods | CS 234, CS 285, Reviewed 2024, Value |
| Linear Analysis of Value Function Methods | CS 234, CS 285, Value |
| Optimal Control & Planning | CS 285, Model-Based, Reviewed 2024 |
| Model-Based RL: Learning the Model | CS 224R, CS 285, Model-Based, Reviewed 2024 |
| Model-Based RL: Improving Policies | CS 224R, CS 285, Model-Based, Reviewed 2024 |
| Bandits | CS 234, Exploration, Reviewed 2024 |
| MDP Exploration | CS 234, Exploration |
| Deep Exploration | CS 285, Exploration, Reviewed 2024 |
| Exploration without Rewards | CS 285, Exploration, Reviewed 2024 |
| Offline Reinforcement Learning | CS 224R, CS 285, Reviewed 2024 |
| Control as Inference | CS 285, Reviewed 2024 |
| Reward Learning (overview) | CS 224R |
| Inverse Reinforcement Learning | CS 234, CS 285, Reviewed 2024 |
| Multi-Task Learning | CS 224R, Extra |
| Coordinated Exploration | CS 234, Exploration, Extra |
| Deep RL on Real Robots | CS 224R |
| Meta-RL | CS 224R |
| Reset-Free RL | CS 224R |
| Skill Discovery | CS 224R |
| TODO | |