- Blog (5)
- Resources (8)
- Q&A (2)
Original: [CS294-112] model-based RL
Control and Planning. Open-loop trajectory optimization methods. Assumptions: a (learned) dynamics model in hand. Objective: find the optimal action sequence that maximizes the expected return of the tra...
2018-12-28 18:19:54 1332 1
Original: [cs294-112 notes] lecture 6 actor-critic
p4: recapping policy gradients. The gradient is computed on a sample-based estimate of the original objective. The estimate is averaged across n trajectories, each of T time steps. ‘Reward to go’ is the su...
2018-12-12 16:09:32 179
Original: Learning to Adapt: Meta-Learning for Model-Based Control
Sudden changes in the environment cause failure. If similar perturbations were encountered in past experience, one can in principle learn to adapt. Study model-based online adaptation. More sample efficient than model-free. Alleviate a challeng...
2018-12-11 01:02:25 663
Original: One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning
Introduction. The goal is to enable a robot to learn from one raw video of a human demonstration on a new task, with the help of prior knowledge from some old tasks, where both human demonstrations ...
2018-12-11 00:43:25 473
Original: [ipython] install ipykernel in multiple environments (installing and using ipython across multiple conda environments)
Refer to this link. I have multiple environments in Anaconda, but ipython breaks and often throws a module-not-found exception. I had thought this was solved in this, but clearly it is not. python -m ipyk...
2018-12-03 09:56:17 681 1
[Complete] Learning from Data + e-chapters: textbook for Machine Learning Foundations/Techniques (Hsuan-Tien Lin)
2017-08-26
ippicv_linux_20151201.tgz
2017-08-17
Android Programming: The Big Nerd Ranch Guide, PDF (non-scanned, undistorted) + source code
2016-07-31
Android Programming: The Big Nerd Ranch Guide, 2nd edition (android programming the big nerd ranch edition 2), source code
2016-07-30
Spring MVC: how to redirect to an external website page with parameters
2017-03-03
New to C... asking for help with a question about the significant digits of float
2015-08-29