Lecture 1
Resources: https://web.mit.edu/dimitrib/www/RLbook.html
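In future research on automatic control, reinforcement learning and control-system decision making will complement each other in driving the field's technical development. At the core of both reinforcement learning and optimal control is decision making: a technological “miracle” couched in sequential decision making methodology.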
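Artificial intelligence (reinforcement learning in particular) and dynamic programming are the two main theoretical foundations of this research direction: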
- AI/RL = artificial intelligence / reinforcement learning: Learning through data/experience, simulation, model-free methods, feature-based representations
- Decision/Control/DP = Dynamic programming: Principle of Optimality; Markov decision problem; POMDP; policy iteration/value iteration
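To make "value iteration" from the list above concrete, here is a minimal sketch in Python. It is not taken from the lecture: the two-state, two-action problem, the array layout (P[a, s, s'] for transition probabilities, R[a, s] for expected rewards), the discount factor 0.9, and the function name value_iteration are all assumptions chosen for illustration.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Repeatedly apply the Bellman optimality update until V stops changing.

    P[a, s, s2] is the (assumed) probability of moving from s to s2 under action a.
    R[a, s] is the (assumed) expected one-step reward for taking action a in state s.
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iters):
        Q = R + gamma * (P @ V)      # Q[a, s]: reward now plus discounted future value
        V_new = Q.max(axis=0)        # best action in each state
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = R + gamma * (P @ V)
    return V, Q.argmax(axis=0)       # value estimate and a greedy policy

# Hypothetical toy problem: action 0 = stay, action 1 = switch state.
# Only "stay in state 1" earns a reward of 1.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],    # stay
              [[0.0, 1.0], [1.0, 0.0]]])   # switch
R = np.array([[0.0, 1.0],
              [0.0, 0.0]])

V, policy = value_iteration(P, R)
print(V, policy)
```

In this toy problem the greedy policy switches out of state 0 and stays in state 1, and the values approach 9 and 10 (that is, 0.9 * 10 and 1 / (1 - 0.9)). Policy iteration would instead alternate between evaluating a fixed policy and improving it greedily.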
Historical highlights of the field:
- Optimal control (Bellman, Shannon, and others, 1950s)
- AI/RL and decision/control/DP ideas meet (late 80s-early 90s)
- First success, backgammon program (Tesauro, 1992, 1996)
- Algorithmic progress, analysis, applications (mid 90s)
- Machine learning, big data, robotics,