【深度学习】逐层贪婪预训练 (greedy layer-wise pre-training)

UFLDL栈式自编码器摘来的话:

每次只训练网络中的一层,即我们首先训练一个只含一个隐藏层的网络,仅当这层网络训练结束之后才开始训练一个有两个隐藏层的网络,以此类推。
这里写图片描述

在每一步中,我们把已经训练好的前k-1层固定,然后增加第k层(也就是将我们已经训练好的前k-1的输出作为输入)。每一层的训练可以是有监督的(例如,将每一步的分类误差作为目标函数),但更通常使用无监督方法(例如自动编码器)。
这里写图片描述

这些各层单独训练所得到的权重被用来初始化最终(或者说全部)的深度网络的权重,然后对整个网络进行“微调”(即把所有层放在一起来优化有标签训练集上的训练误差)。

Note

  • 什么Autoencoder啦、RBM啦,现在都已经 没人用了
  • 现在所常说的 pre-training (预训练) ,其实 专指 migration learning (迁移学习),那是一种无比强大又省事儿的trick。
  • 9
    点赞
  • 29
    收藏
    觉得还不错? 一键收藏
  • 1
    评论
Complexity theory of circuits strongly suggests that deep architectures can be much more efcient sometimes exponentially than shallow architectures in terms of computational elements required to represent some functions Deep multi layer neural networks have many levels of non linearities allowing them to compactly represent highly non linear and highly varying functions However until recently it was not clear how to train such deep networks since gradient based optimization starting from random initialization appears to often get stuck in poor solutions Hinton et al recently introduced a greedy layer wise unsupervised learning algorithm for Deep Belief Networks DBN a generative model with many layers of hidden causal variables In the context of the above optimization problem we study this algorithm empirically and explore variants to better understand its success and extend it to cases where the inputs are continuous or where the structure of the input distribution is not revealing enough about the variable to be predicted in a supervised task Our experiments also conrm the hypothesis that the greedy layer wise unsupervised training strategy mostly helps the optimization by initializing weights in a region near a good local minimum giving rise to internal distributed representations that are high level abstractions of the input bringing better generalization ">Complexity theory of circuits strongly suggests that deep architectures can be much more efcient sometimes exponentially than shallow architectures in terms of computational elements required to represent some functions Deep multi layer neural networks have many levels of non linearities allowin [更多]

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值