How to learn reinforcement learning (answered by Sergio Valcarcel Macua on Quora)

link:

https://www.quora.com/What-are-the-best-books-about-reinforcement-learning

 

The main RL problems are related to:
- Information representation: from POMDPs to predictive state representations to deep learning to TD networks
- Inverse RL: how to learn the reward?
- Algorithms
  + Off-policy
  + Large scale: linear and nonlinear approximations of the value function
  + Policy search vs. Q-learning based
- Beyond MDP
  + Policy search for black-box optimization with global performance guarantees
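
To make the "off-policy" and "Q-learning" items in the list above a bit more concrete, here is a minimal sketch of tabular Q-learning. The toy chain environment, the function name `q_learning`, and all hyperparameters are assumptions chosen for illustration; none of them come from the original answer. The agent behaves uniformly at random, yet the update bootstraps from the greedy action in the next state, which is what makes the method off-policy.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning on a hypothetical chain MDP.

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Entering the rightmost state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Behavior policy: uniformly random actions (pure exploration).
            a = rng.randrange(2)
            s_next = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0

            # Off-policy target: bootstrap from the greedy action in s_next,
            # regardless of what the behavior policy will actually do there.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


q = q_learning()
# Greedy policy extracted from the learned values (non-terminal states only).
greedy_policy = [max((0, 1), key=lambda a: q[s][a]) for s in range(len(q) - 1)]
print(greedy_policy)
```

A policy-search method would instead adjust the parameters of a policy directly, without learning action values first; the sketch above only covers the value-based side of that comparison.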

 

Recommended books:

* Algorithms for Reinforcement Learning: Csaba Szepesvari. A nice compendium of ready-to-implement algorithms.

* Reinforcement Learning and Dynamic Programming using Function Approximators. Busoniu, Lucian; Robert Babuska; Bart De Schutter; Damien Ernst (2010). This is a very practical book that explains some state-of-the-art algorithms (i.e., useful for real-world problems) like fitted Q-iteration and its variations.

* Reinforcement Learning: State-of-the-Art. Vol. 12 of Adaptation, Learning, and Optimization. Wiering, M., van Otterlo, M. (Eds.), 2012. Springer, Berlin. In Sutton's words: "This book is a valuable resource for students wanting to go beyond the older textbooks and for researchers wanting to easily catch up with recent developments."

* Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles: Draguna Vrabie, Kyriakos G. Vamvoudakis, Frank L. Lewis. I am not familiar with this one, but I have seen it recommended.

* Markov Decision Processes in Artificial Intelligence. Sigaud, O. & Buffet, O. (Eds.), ISTE Ltd., Wiley and Sons Inc, 2010.

There are also several good specialized monographs and surveys on the topic. Some of these are:

+ "From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning" by Remi Munos (Foundations and Trends in Machine Learning). This monograph covers important nonconvex optimistic optimization methods that can be applied to policy search.

+ "Reinforcement Learning in Robotics: A Survey" by J. Kober, J. A. Bagnell and J. Peters. 

+ "A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning" by A. Geramifard, T. J. Walsh, S. Tellex, G. Chowdhary, N. Roy and J. P. How (Foundations and Trends in Machine Learning).

+ "A Survey on Policy Search for Robotics" by M. P. Deisenroth, G. Neumann and J. Peters (Foundations and Trends in Robotics).

Reposted from: https://www.cnblogs.com/cxxszz/p/6959594.html
