Reinforcement Learning
monkey_rose
Reinforcement Learning - An Introduction memo
1. MDP (Markov Decision Processes)
finite MDP: finite state space and finite action space
transition probabilities: $p(s' \mid s, a) = \Pr\{S_{t+1} = s' \mid S_t = s, A_t = a\}$
expected rewards: $r(s, a, s') = \mathbb{E}[R_{t+1} \mid S_t = s, A_t = a, S_{t+1} = s']$
Original · 2018-03-26 21:31:01 · 249 views · 0 comments
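To make the two definitions above concrete, here is a minimal sketch of a finite MDP stored as plain dictionaries. The two-state example, the names `P`, `R`, `is_valid_mdp`, and `expected_reward`, and all numeric values are hypothetical and chosen only for illustration. The helper marginalizes over the next state, using $r(s, a) = \sum_{s'} p(s' \mid s, a)\, r(s, a, s')$.

```python
# Hypothetical two-state, two-action finite MDP (values are illustrative only).
# P[(s, a)] maps each next state s' to the transition probability p(s' | s, a).
P = {
    ("s0", "stay"): {"s0": 0.9, "s1": 0.1},
    ("s0", "go"): {"s0": 0.2, "s1": 0.8},
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "go"): {"s0": 0.5, "s1": 0.5},
}

# R[(s, a, s')] holds r(s, a, s'); triples that are not listed have reward 0.
R = {
    ("s0", "go", "s1"): 1.0,
    ("s1", "go", "s0"): -1.0,
}


def is_valid_mdp(transitions, tol=1e-9):
    """Check that p(. | s, a) sums to 1 for every state-action pair."""
    return all(abs(sum(dist.values()) - 1.0) <= tol for dist in transitions.values())


def expected_reward(s, a):
    """Compute r(s, a) by averaging r(s, a, s') over p(s' | s, a)."""
    return sum(p * R.get((s, a, s_next), 0.0) for s_next, p in P[(s, a)].items())
```

Calling `expected_reward("s0", "go")` weights the reward of each possible next state by its transition probability, so the only nonzero term comes from the transition to `s1`.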
Playing Atari with deep reinforcement learning
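Introduction
Traditional RL relies on hand-crafted feature extraction and selection, whereas DL (CNNs) can abstract high-level features from raw sensory data. Difficulties of DRL:
1. DL training uses large amounts of labeled data, while RL has little data, and there is a delay between actions and their rewards.
2. DL samples are independent, while RL samples are correlated.
3. The distribution of RL samples changes as learning proceeds, while DL assumes a fixed distribution.
The paper's approach to difficulties 2 and 3: experience...
Original · 2018-04-10 14:35:53 · 924 views · 0 comments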
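The excerpt above is cut off at "experience...". Assuming it refers to experience replay, the sketch below shows one common way such a buffer is written: transitions are stored in a bounded queue and minibatches are drawn uniformly at random, which breaks up the correlation between consecutive samples and smooths the sample distribution over many past behaviors. The class name `ReplayBuffer`, its methods, and the transition tuple layout are assumptions for illustration, not the post's or the paper's code.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        """Append one transition (s, a, r, s', done)."""
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw batch_size distinct transitions uniformly at random."""
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

In a training loop, the agent would call `push` after every environment step and call `sample` once the buffer holds at least `batch_size` transitions.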