[Computer Science] [2019.06] Greedy Layerwise Training of Convolutional Neural Networks


This is a master's thesis from the Massachusetts Institute of Technology (author: Loc Quang Trinh), 63 pages in total.

Layerwise training presents an alternative approach to end-to-end back-propagation for training deep convolutional neural networks. Although previous work was unsuccessful in demonstrating the viability of layerwise training, especially on large-scale datasets such as ImageNet, recent work has shown that layerwise training on specific architectures can yield highly competitive performance. On ImageNet, layerwise-trained networks can perform comparably to many state-of-the-art end-to-end trained networks. In this thesis, we compare the performance gap between the two training procedures across a wide range of network architectures and further analyze the possible limitations of layerwise training. Our results show that layerwise training quickly saturates after a certain critical layer, due to the overfitting of early layers within the networks. We discuss several approaches we took to address this issue and help layerwise training improve across multiple architectures. From a fundamental standpoint, this study emphasizes the need to open the black box that is modern deep neural networks and investigate the layerwise interactions between intermediate hidden layers within deep networks, all through the lens of layerwise training.
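To make the procedure concrete, the sketch below illustrates the general greedy layerwise scheme the abstract describes: each convolutional block is trained through a temporary auxiliary classifier head while all previously trained blocks stay frozen, then frozen itself before the next block is stacked on top. This is a minimal PyTorch illustration, not the thesis code; the toy architecture, channel widths, hyperparameters, and the `loader` of (image, label) batches are all assumptions.

```python
# Minimal sketch of greedy layerwise training (illustrative only, not the
# thesis implementation). Assumes `loader` yields (image, label) batches.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    """One trainable 'layer': conv -> batchnorm -> ReLU -> downsample."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

def train_layerwise(loader, channels=(3, 64, 128, 256),
                    num_classes=10, epochs_per_layer=5):
    blocks = [conv_block(channels[i], channels[i + 1])
              for i in range(len(channels) - 1)]
    loss_fn = nn.CrossEntropyLoss()
    for k, block in enumerate(blocks):
        frozen = nn.Sequential(*blocks[:k]).eval()  # already-trained prefix
        # Temporary auxiliary head: pooled features -> linear classifier.
        head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                             nn.Linear(channels[k + 1], num_classes))
        opt = torch.optim.SGD(
            list(block.parameters()) + list(head.parameters()),
            lr=0.01, momentum=0.9)
        for _ in range(epochs_per_layer):
            for x, y in loader:
                with torch.no_grad():       # no gradients reach earlier blocks
                    x = frozen(x)
                loss = loss_fn(head(block(x)), y)
                opt.zero_grad()
                loss.backward()             # updates only this block + its head
                opt.step()
        for p in block.parameters():        # freeze before stacking the next
            p.requires_grad_(False)
    # Final classifier: the frozen feature stack plus the last auxiliary head.
    return nn.Sequential(*blocks, head)
```

A practical side effect visible in the sketch: back-propagation never extends past the block currently being trained, so each training step only ever computes one block's worth of gradients, regardless of how deep the stacked network has grown.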

1. Introduction
2. Layerwise Training
3. Understanding Learned Representations
4. Limitations of Layerwise Training
5. Slowing Down Training
6. Conclusion
