December 2018_Hazekiah

Original [CS294-112] model-based RL

Control and Planning. Open-loop trajectory optimization methods. Assumptions: a (learned) dynamics model in hand. Objective: find the optimal action sequence that maximizes the expected return of the tra...

2018-12-28 18:19:54 1332 1

Original [cs294-112 notes] lecture 6 actor-critic

p4: recapping policy gradients. The gradient is computed on a sample-based estimate of the original objective. The estimate is averaged over n trajectories, each with T time steps. 'Reward to go' is the su...

2018-12-12 16:09:32 179

Original Learning to Adapt: Meta-Learning for Model-Based Control

Sudden changes in the environment cause failure. If perturbations were encountered in past experience, one can in principle learn to adapt. Study model-based online adaptation. More sample efficient than model-free. Alleviate a challeng...

2018-12-11 01:02:25 663

Original One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning

Introduction. The goal is to enable a robot to learn from one raw video of human demonstrations on a new task, with the help of prior knowledge of some old tasks, where both human demonstrations ...

2018-12-11 00:43:25 473

Original [ipython] install ipykernel in multiple environments (installing and using ipython across multiple conda environments)

Refer to this link. I have multiple environments in Anaconda, but ipython breaks and often throws a module-not-found exception. I had thought this was solved in this, but clearly it is not. python -m ipyk...

2018-12-03 09:56:17 681 1