DSD: Dense-Sparse-Dense Training for Deep Neural Networks

weixin_30788619

Original link: http://www.cnblogs.com/mengmengmiaomiao/p/7652779.html

An ICLR 2017 conference paper.

Abstract:

Neural networks have many parameters, which makes them hard to train.

Modern deep neural networks have a large number of parameters, making them very hard to train.

So the parameters are trained in stages.

We propose DSD, a dense-sparse-dense training flow, for regularizing deep neural networks and achieving better optimization performance. In the first D (Dense) step, we train a dense network to learn connection weights and importance. In the S (Sparse) step, we regularize the network by pruning the unimportant connections with small weights and retraining the network given the sparsity constraint. In the final D (re-Dense) step, we increase the model capacity by removing the sparsity constraint, re-initialize the pruned parameters from zero and retrain the whole dense network.
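
The three steps in the abstract can be sketched on a toy linear-regression model. This is only a minimal illustration under stated assumptions, not the authors' implementation: the model, the helper names (`make_toy_data`, `train`, `magnitude_mask`, `dsd_train`), the 50% sparsity, the learning rate and the epoch count are all made up for the example. Only the overall flow (train dense, prune small weights and retrain under the mask, then drop the mask and retrain with the pruned weights starting from zero) comes from the abstract.

```python
import random


def predict(weights, x):
    """Linear model: dot product of weights and inputs."""
    return sum(w * xi for w, xi in zip(weights, x))


def make_toy_data(true_w, n_samples=50, seed=1):
    """Hypothetical helper: noise-free samples (x, y) with y = true_w . x."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        x = [rng.uniform(-1.0, 1.0) for _ in true_w]
        data.append((x, predict(true_w, x)))
    return data


def train(weights, data, mask, lr=0.1, epochs=200):
    """Full-batch gradient descent on mean squared error.

    Weights whose mask entry is 0 are held at zero (the sparsity constraint).
    """
    weights = [w * m for w, m in zip(weights, mask)]
    n = len(data)
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for x, y in data:
            err = predict(weights, x) - y
            for j, xj in enumerate(x):
                grads[j] += 2.0 * err * xj / n
        weights = [(w - lr * g) * m for w, g, m in zip(weights, grads, mask)]
    return weights


def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask dropping the `sparsity` fraction of smallest-|w| weights."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda j: abs(weights[j]))
    pruned = set(order[:n_prune])
    return [0.0 if j in pruned else 1.0 for j in range(len(weights))]


def dsd_train(data, n_features, sparsity=0.5, seed=0):
    """Run the dense, sparse and re-dense phases in sequence."""
    rng = random.Random(seed)
    dense_mask = [1.0] * n_features

    # D: train the dense model to learn weights and their importance.
    init = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
    w_dense = train(init, data, dense_mask)

    # S: prune small-magnitude weights, then retrain under the mask.
    mask = magnitude_mask(w_dense, sparsity)
    w_sparse = train(w_dense, data, mask)

    # D: remove the mask; pruned weights restart from zero and
    # the whole dense model is retrained.
    w_redense = train(w_sparse, data, dense_mask)
    return w_dense, mask, w_sparse, w_redense


if __name__ == "__main__":
    toy = make_toy_data([2.0, -3.0, 0.0, 0.0])
    w_dense, mask, w_sparse, w_redense = dsd_train(toy, n_features=4)
    print("mask:", mask)
    print("re-dense weights:", w_redense)
```

In a real network the mask would be applied per layer to weight tensors, and the pruning ratio would be a tuned hyperparameter; the flat weight list here is just the smallest setting in which the three phases are visible.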

The experiments show good results.

Experiments show that DSD training can improve the performance for a wide range of CNNs, RNNs and LSTMs on the tasks of image classification, caption generation and speech recognition.

Reposted from: https://www.cnblogs.com/mengmengmiaomiao/p/7652779.html
