Important learning resources
Related blog: http://blog.csdn.net/dark_scope/article/details/8252969
Column: http://blog.csdn.net/column/details/deeprl.html
Reinforcement learning course by David Silver (with videos and slides): http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Teaching.html
The best reinforcement learning textbook:
Reinforcement Learning: An Introduction: https://webdocs.cs.ualberta.ca/~sutton/book/the-book.html
Deep learning course (with videos, slides, and assignments): Machine Learning
Deep reinforcement learning lectures, all by David Silver:
ICLR 2015 part 1 https://www.youtube.com/watch?v=EX1CIVVkWdE
ICLR 2015 part 2 https://www.youtube.com/watch?v=zXa6UFLQCtg
UAI 2015 https://www.youtube.com/watch?v=qLaDWKd61Ig
RLDM 2015 Deep Reinforcement Learning - VideoLectures.NET
Other courses:
Reinforcement learning
Michael Littman: https://www.udacity.com/course/reinforcement-learning--ud600
AI (includes reinforcement learning, with Pacman experiments)
Pieter Abbeel: ColumbiaX: Artificial Intelligence (AI) | edX
Deep Reinforcement Learning:
Pieter Abbeel http://rll.berkeley.edu/deeprlcourse/
Advanced Robotics:
Pieter Abbeel: http://www.cs.berkeley.edu/~pabbeel/cs287-fa15/
Deep learning courses:
Convolutional neural networks for visual recognition: CS231n Convolutional Neural Networks for Visual Recognition
Machine Learning
Andrew Ng:
Supervised Machine Learning: Regression and Classification | Coursera
Neural Networks for Machine Learning (from 2012)
Hinton: https://www.coursera.org/course/neuralnets
Latest robotics specialization from Penn (launched in 2016): Robotics Specialization [6 courses] (Penn) | Coursera
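2 Papers
The collections gathered by these two people cover most of the current deep reinforcement learning papers.
3 Leading researchers and groups:
DeepMind: http://www.deepmind.com/publications.html
Pieter Abbeel's group: http://www.eecs.berkeley.edu/~pabbeel/
Satinder Singh: Home page for Satinder Singh (Baveja) and Reinforcement Learning
CMU progress: Lerrel Pinto
Preferred Networks (Japanese startup)
Deep Reinforcement Learning Workshop NIPS 2015: Deep Reinforcement Learning Workshop
Deep learning research summary: reinforcement learning technology trends and analysis (classic papers)
I collected the ICLR 2017 papers related to deep reinforcement learning. There are 30 in total (some may be missing), and most come from DeepMind and OpenAI, which shows that DRL is still dominated by these two organizations.
2 Analysis of DeepMind papers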
[1] LEARNING TO COMPOSE WORDS INTO SENTENCES WITH REINFORCEMENT LEARNING
[2] LEARNING TO NAVIGATE IN COMPLEX ENVIRONMENTS
[3] LEARNING TO PERFORM PHYSICS EXPERIMENTS VIA DEEP REINFORCEMENT LEARNING
[4] PGQ: COMBINING POLICY GRADIENT AND Q-LEARNING
[5] Q-PROP: SAMPLE-EFFICIENT POLICY GRADIENT WITH AN OFF-POLICY CRITIC
[6] REINFORCEMENT LEARNING WITH UNSUPERVISED AUXILIARY TASKS
[7] SAMPLE EFFICIENT ACTOR-CRITIC WITH EXPERIENCE REPLAY
[8] THE PREDICTRON: END-TO-END LEARNING AND PLANNING
3 Analysis of OpenAI papers (including Sergey Levine's papers)
[9] #EXPLORATION: A STUDY OF COUNT-BASED EXPLORATION FOR DEEP REINFORCEMENT LEARNING
[10] GENERALIZING SKILLS WITH SEMI-SUPERVISED REINFORCEMENT LEARNING
[11] LEARNING INVARIANT FEATURE SPACES TO TRANSFER SKILLS WITH REINFORCEMENT LEARNING
[12] LEARNING VISUAL SERVOING WITH DEEP FEATURES AND TRUST REGION FITTED Q-ITERATION
[13] MODULAR MULTITASK REINFORCEMENT LEARNING WITH POLICY SKETCHES
[14] STOCHASTIC NEURAL NETWORKS FOR HIERARCHICAL REINFORCEMENT LEARNING
[15] THIRD PERSON IMITATION LEARNING
[16] UNSUPERVISED PERCEPTUAL REWARDS FOR IMITATION LEARNING
[17] EPOPT: LEARNING ROBUST NEURAL NETWORK POLICIES USING MODEL ENSEMBLES
[18] RL^2: FAST REINFORCEMENT LEARNING VIA SLOW REINFORCEMENT LEARNING
4 Other papers
[19] COMBATING DEEP REINFORCEMENT LEARNING’S SISYPHEAN CURSE WITH INTRINSIC FEAR
[20] COMMUNICATING HIERARCHICAL NEURAL CONTROLLERS FOR LEARNING ZERO-SHOT TASK GENERALIZATION
[21] DESIGNING NEURAL NETWORK ARCHITECTURES USING REINFORCEMENT LEARNING
[22] LEARNING TO PLAY IN A DAY: FASTER DEEP REINFORCEMENT LEARNING BY OPTIMALITY TIGHTENING
[23] LEARNING TO REPEAT: FINE GRAINED ACTION REPETITION FOR DEEP REINFORCEMENT LEARNING
[24] MULTI-TASK LEARNING WITH DEEP MODEL BASED REINFORCEMENT LEARNING
[25] NEURAL ARCHITECTURE SEARCH WITH REINFORCEMENT LEARNING
[26] OPTIONS DISCOVERY WITH BUDGETED REINFORCEMENT LEARNING
[27] REINFORCEMENT LEARNING THROUGH ASYNCHRONOUS ADVANTAGE ACTOR-CRITIC ON A GPU
[28] SPATIO-TEMPORAL ABSTRACTIONS IN REINFORCEMENT LEARNING THROUGH NEURAL ENCODING
[29] SURPRISE-BASED INTRINSIC MOTIVATION FOR DEEP REINFORCEMENT LEARNING
[30] TUNING RECURRENT NEURAL NETWORKS WITH REINFORCEMENT LEARNING