20230117 -
When I first started building neural networks with Keras, beyond choosing the number of neurons, the number of layers, and the activation functions, I rarely touched the network internals, so most parameters were left at their defaults. For the datasets I usually worked with, this was enough to get by; the networks were rarely very deep, though I did occasionally run into exploding or vanishing gradients.
Recently, however, I came across a dataset where the defaults performed acceptably but I wanted to do better, so I decided to approach the problem from the angle of weight initialization. Keras's default weight initialization [1] is described as follows:
Each layer has its own default value for initializing the weights. For most of the layers, such as Dense, convolution and RNN layers, the default kernel initializer is ‘glorot_uniform’ and the default bias initializer is ‘zeros’ (you can find this by going to the related section for each layer in the documentation; for example here is the Dense layer doc). You can find the definition of glorot_uniform initializer here in the Keras documentation.
This is also confirmed by the official Keras documentation.
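It is also easy to check directly in code. Below is a minimal sketch using the tf.keras API (the layer size of 64 is an arbitrary choice):

```python
from tensorflow import keras

# Build a Dense layer without specifying any initializer and inspect
# the defaults: glorot_uniform for the kernel, zeros for the bias.
layer = keras.layers.Dense(units=64)
print(type(layer.kernel_initializer).__name__)  # GlorotUniform
print(type(layer.bias_initializer).__name__)    # Zeros
```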
Under that Q&A, another answer mentioned an article [2] that works through two initialization schemes, Xavier and Kaiming, in detail. Summarizing its conclusion:
I think this article is very interesting and it shows roughly that for “tanh” activations you should use ‘glorot_uniform’ and for “relu” layers you should use “he_uniform”
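In Keras terms, that advice amounts to picking the kernel initializer to match each layer's activation. The following is a sketch of that rule of thumb; the layer sizes and input shape are placeholders, not something taken from the article:

```python
from tensorflow import keras

# Rule of thumb from [2]: He initialization for relu layers,
# Glorot (Xavier) initialization for tanh layers.
model = keras.Sequential([
    keras.layers.Dense(128, activation="relu",
                       kernel_initializer="he_uniform",
                       input_shape=(32,)),
    keras.layers.Dense(64, activation="tanh",
                       kernel_initializer="glorot_uniform"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.summary()
```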
The article's theoretical analysis does look convincing, although I have not verified it experimentally myself. Separately, there is a Q&A [3] comparing the Keras and PyTorch implementations of he_normal: Keras's he_normal samples from a truncated normal distribution, while PyTorch's kaiming_normal_ does not truncate.
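That difference can be observed by sampling from both (a rough sketch assuming both tensorflow and torch are installed; fan_in = 512 and fan_out = 256 are arbitrary choices):

```python
import numpy as np
import tensorflow as tf
import torch

fan_in, fan_out = 512, 256

# Keras: he_normal draws from a *truncated* normal, so samples are bounded.
k = tf.keras.initializers.HeNormal()(shape=(fan_in, fan_out)).numpy()

# PyTorch: kaiming_normal_ draws from a full normal, so tail values occur.
t = torch.empty(fan_out, fan_in)
torch.nn.init.kaiming_normal_(t, mode="fan_in", nonlinearity="relu")

print(np.sqrt(2.0 / fan_in))                 # target He stddev, ~0.0625 here
print(k.std(), np.abs(k).max())              # bounded by the truncation point
print(t.std().item(), t.abs().max().item())  # max can fall well outside it
```

The most visible difference is in the tails: the Keras draw is bounded, while the PyTorch draw occasionally produces values several standard deviations out.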
References
[1] Where to find a documentation about default weight initializer in Keras?
[2] Weight Initialization in Neural Networks: A Journey From the Basics to Kaiming
[3] he_normal (Keras) is truncated when kaiming_normal_ (pytorch) is not