qiusuoxiaozi's blog

Be a researcher who thinks independently and dares to experiment!

  • Blog (4)
  • Resources (13)
  • Favorites
  • Following

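Original: A thorough understanding of on-policy and off-policy in Reinforcement learning

To tell on-policy from off-policy in one sentence: just check whether the behaviour policy and the current policy are the same one! In this post I mainly want to use the process of understanding on-policy and off-policy to deepen my understanding of other RL algorithms. Since everything is interconnected, working out for myself why certain algorithms are on-policy or off-policy also gave me a deeper understanding of their essence.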

2018-01-24 19:57:31 658
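As a rough illustration of the one-sentence rule in the post above, here is a minimal sketch contrasting two textbook tabular updates. SARSA and Q-learning are chosen here as standard examples; the post summary does not say which algorithms it covers, and the function names and the toy Q-table are assumptions made for this example.

```python
def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy target: bootstrap from the action the behaviour policy
    actually takes in the next state, so the policy being evaluated is
    the same one that generates the data."""
    return reward + gamma * q[next_state][next_action]


def q_learning_target(q, reward, next_state, gamma):
    """Off-policy target: bootstrap from the greedy action in the next
    state, regardless of what the behaviour policy will actually do."""
    return reward + gamma * max(q[next_state])


if __name__ == "__main__":
    # Toy Q-table (hypothetical values): 2 states x 2 actions.
    q = [[0.0, 0.0], [1.0, 3.0]]
    # Suppose the behaviour policy explores and picks action 0 in state 1.
    print(sarsa_target(q, reward=1.0, next_state=1, next_action=0, gamma=0.5))
    print(q_learning_target(q, reward=1.0, next_state=1, gamma=0.5))
```

The two targets differ only when the behaviour policy picks a non-greedy next action, which is exactly the case where the behaviour policy and the policy being learned are not the same.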

Original: The awkward Bellman optimality equation in RL

Starting from a page of the book Reinforcement Learning: An Introduction (Sutton 1998), shown as a screenshot in the blog post "2017 Fall CS294 Lecture 6: Actor-critic introduction", regarding $V^\pi(s)$: the state-value function for policy $\pi$. $Q^\pi(s,a)$...

2018-01-21 14:29:15 1213

Original: MADDPG translation

Full paper title: Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments. Project page: https://blog.openai.com/learning-to-cooperate-compete-and-communicate/ This post is a translation of MADDPG, huanghe. Abstract; I. Introduction; II. Related work; III. ...

2018-01-19 10:49:28 29834 21

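Original: The "Attempted relative import in non-package" problem in Python

While helping a friend debug recently, I found that directly running a .py file inside a package fails with ValueError: Attempted relative import in non-package. It turns out this happens when the .py file being run sits inside a package's folder and the file itself contains relative imports such as: from . import, from .. import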

2018-01-15 10:41:30 21732 4
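The behaviour described in the post above can be reproduced with a small sketch. It builds a throwaway package in a temporary directory and runs one of its modules in two ways. The package and file names (`demo_pkg`, `helper.py`, `main.py`) are hypothetical, and running with `-m` is one common workaround rather than necessarily the fix the post proposes. Note that the exact error text depends on the Python version: the `ValueError` quoted in the post comes from Python 2, while Python 3 raises an `ImportError` with different wording.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways():
    """Create a tiny package and run its main.py directly and via -m."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "helper.py").write_text("VALUE = 42\n")
        (pkg / "main.py").write_text(
            "from . import helper\nprint(helper.VALUE)\n"
        )

        # 1) Run the file directly: it is loaded as "__main__" with no
        #    parent package, so the relative import cannot be resolved.
        direct = subprocess.run(
            [sys.executable, str(pkg / "main.py")],
            capture_output=True, text=True,
        )

        # 2) Run it as a module from the package's parent directory:
        #    the interpreter knows the parent package, so the import works.
        as_module = subprocess.run(
            [sys.executable, "-m", "demo_pkg.main"],
            cwd=tmp, capture_output=True, text=True,
        )
        return direct, as_module


if __name__ == "__main__":
    direct, as_module = run_both_ways()
    print("direct run exit code:", direct.returncode)
    print("direct run stderr (last line):", direct.stderr.strip().splitlines()[-1])
    print("-m run stdout:", as_module.stdout.strip())
```

The key design point is that the second invocation starts from the directory that contains the package, not from inside the package itself.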

PRML: Pattern Recognition and Machine Learning (English edition)

PRML: Pattern Recognition and Machine Learning

2017-08-01

Python numpy

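Download and double-click to install numpy, with no need to worry about compiling. Similar installers exist for scipy (which is also troublesome to install); see my blog for details.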

2016-04-24

Tips for speeding up MATLAB code

Tips for speeding up your MATLAB code, downloaded from the UFLDL website

2016-04-14

UFLDL exercise9 Convolution and Pooling

In this exercise you will use the features you learned on 8x8 patches sampled from images from the STL-10 dataset in the earlier exercise on linear decoders for classifying images from a reduced STL-10 dataset applying convolution and pooling. The reduced STL-10 dataset comprises 64x64 images from 4 classes (airplane, car, cat, dog).

2016-04-13

UFLDL exercise8 Linear Decoder

In this exercise, you will implement a linear decoder (a sparse autoencoder whose output layer uses a linear activation function). You will then apply it to learn features on color images from the STL-10 dataset. These features will be used in a later exercise on convolution and pooling for classifying STL-10 images.

2016-04-12

UFLDL exercise7 Stacked Autoencoder

In this exercise, you will use a stacked autoencoder for digit classification.

2016-04-11

UFLDL exercise5 Softmax Regression

Softmax regression, which will be useful in later exercises.

2016-04-11

UFLDL exercise6 Self-Taught Learning

Uses the self-taught learning paradigm with the sparse autoencoder and softmax classifier to build a classifier for handwritten digits.

2016-04-09

UFLDL exercise3&4 PCA and Whitening

Contains two exercises: "PCA, PCA whitening and ZCA whitening in 2D" and "PCA and Whitening on natural images"

2016-04-09

UFLDL exercise1 Sparse Autoencoder

Already vectorized

2016-04-09

UFLDL exercise2 Learn features for handwritten digits

You can refer to the log.txt file included in the download

2016-04-09

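Slides (PPT) on graph theory

Some graph theory material taught by the instructor during the summer 2014 mathematical modeling training; it left a deep impression.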

2016-03-21

Stanford CS229 lecture notes

CS229 lecture notes on Markov decision processes, from the course taught by Andrew Ng

2016-03-08
