Training a CNN is hard because
• The large number of parameters requires heavy computation.
• The learning objective is non-convex and has many poor local minima.
• Deep networks suffer from the vanishing/exploding gradients problem (see the sketch after this list).
• A large amount of training data is needed.
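To see why gradients vanish or explode, note that backpropagation through a deep stack of layers multiplies the gradient by one weight matrix per layer, so its norm shrinks or grows roughly exponentially with depth. The following NumPy sketch illustrates this for a purely linear stack; the depth, width, and weight scales are assumed values chosen only for illustration.

```python
import numpy as np

# Minimal sketch (hypothetical depth, width, and weight scales): the gradient is
# multiplied by one weight matrix per layer during backpropagation, so its norm
# shrinks or blows up exponentially with depth depending on the weight scale.
rng = np.random.default_rng(0)
depth, width = 50, 100
grad = rng.standard_normal(width)

for scale in [0.05, 0.1, 0.2]:            # std of the weight entries (assumed values)
    g = grad.copy()
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * scale
        g = W.T @ g                        # one backpropagation step through a linear layer
    print(f"weight std {scale:.2f}: gradient norm after {depth} layers = {np.linalg.norm(g):.3e}")
```

With a standard Gaussian matrix scaled by std σ, each step multiplies the gradient norm by roughly σ·√width, so σ = 0.05 makes the gradient vanish, σ = 0.2 makes it explode, and σ = 0.1 keeps it roughly stable for width 100.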
To handle vanishing/exploding gradients, the main methods include
• Carefully setting the learning rate.
• Designing better CNN architectures, activation functions, etc.
• Carefully initializing the weights.
• Tuning the data distribution.
In this section, we will focus on the last two of these methods for handling the vanishing/exploding gradients problem.
Weight initialization is very important in deep learning. I think one of the reasons early networks did not work as well is that people did not pay much attention to it.
Initializing all the weights to 0 is a bad idea, since all the neurons then compute the same output, receive the same gradient, and learn the same features.
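To make this symmetry problem concrete, here is a minimal NumPy sketch; the two-layer network, toy data, and layer sizes are all hypothetical. With all-zero weights, one gradient step leaves every hidden unit identical, while a small random initialization breaks the symmetry so the units can learn different features.

```python
import numpy as np

# Minimal sketch (hypothetical 2-layer network and toy data): with all-zero weights,
# every hidden neuron computes the same output and receives the same gradient, so the
# columns of W1 stay identical after the update; small random init breaks this symmetry.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))           # 8 toy samples, 4 input features
y = rng.standard_normal((8, 1))           # toy regression targets

def one_gradient_step(W1, W2, lr=0.1):
    h = np.tanh(x @ W1)                   # hidden activations
    pred = h @ W2
    err = pred - y
    dW2 = h.T @ err
    dW1 = x.T @ ((err @ W2.T) * (1 - h**2))   # backprop through tanh
    return W1 - lr * dW1, W2 - lr * dW2

# Zero initialization: all hidden units remain identical (symmetry is never broken).
W1, W2 = one_gradient_step(np.zeros((4, 3)), np.zeros((3, 1)))
print("zero init, distinct hidden weight columns:", np.unique(W1, axis=1).shape[1])

# Small random initialization: hidden units differ and can learn different features.
W1, W2 = one_gradient_step(0.01 * rng.standard_normal((4, 3)),
                           0.01 * rng.standard_normal((3, 1)))
print("random init, distinct hidden weight columns:", np.unique(W1, axis=1).shape[1])
```

After the zero-initialized step, every column of W1 is still identical, whereas the randomly initialized network already has three distinct hidden units.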