- How the mean and standard deviation of each layer's activation values in a sigmoid network evolve during training (see the first sketch after this list)
- Reason:
  Our proposed explanation rests on the hypothesis that the transformation that the lower layers of the randomly initialized network compute initially is not useful to the classification task, unlike the transformation obtained from unsupervised pre-training. The logistic layer output softmax(b + W h) might initially rely more on its biases b (which are learned very quickly) than on the top hidden activations h derived from the input image (because h would vary in ways that are not predictive of y, maybe correlated mostly with other and possibly more dominant variations of x). Thus the error gradient would tend to push W h towards 0, which can be achieved by pushing h towards 0. In the case of symmetric activation functions like the hyperbolic tangent and the softsign, sitting around 0 is good because it allows gradients to flow backwards. However, pushing the sigmoid outputs to 0 would bring them into a saturation regime which would prevent gradients from flowing backward and prevent the lower layers from learning useful features. Eventually but slowly, the lower layers move toward more useful features and the top hidden layer then moves out of the saturation regime. Note however that, even after this, the network moves into a solution that is of poorer quality (also in terms of generalization) than those found with symmetric activation functions, as can be seen in figure 11.
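A minimal sketch of the monitoring described in the first bullet, assuming PyTorch, a synthetic dataset, and illustrative layer sizes (none of these come from the paper's actual setup): forward hooks record the mean and standard deviation of each sigmoid layer's activations as training proceeds.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Deep sigmoid MLP; the layer sizes here are illustrative only.
sizes = [100, 64, 64, 64, 64, 10]
layers = []
for i in range(len(sizes) - 2):
    layers += [nn.Linear(sizes[i], sizes[i + 1]), nn.Sigmoid()]
layers.append(nn.Linear(sizes[-2], sizes[-1]))  # softmax applied inside CrossEntropyLoss
model = nn.Sequential(*layers)

# Forward hooks capture each sigmoid layer's activations on every pass.
acts = {}
def make_hook(name):
    def hook(module, inputs, output):
        acts[name] = output.detach()
    return hook

for idx, mod in enumerate(model):
    if isinstance(mod, nn.Sigmoid):
        mod.register_forward_hook(make_hook(f"layer_{idx}"))

# Synthetic stand-in data (the paper trains on image datasets).
X = torch.randn(512, 100)
y = torch.randint(0, 10, (512,))

opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(100):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
    if epoch % 20 == 0:
        # Per-layer activation mean and std, the quantities the notes refer to.
        stats = {k: f"{v.mean().item():.3f}±{v.std().item():.3f}"
                 for k, v in acts.items()}
        print(f"epoch {epoch:3d}  loss {loss.item():.3f}  {stats}")
```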
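And a small numeric check of the saturation argument in the quoted paragraph (plain PyTorch autograd; the specific pre-activation values are arbitrary): driving a sigmoid output toward 0 requires a strongly negative pre-activation z, where the local gradient sigmoid'(z) = sigmoid(z)(1 - sigmoid(z)) vanishes, whereas a tanh output of 0 occurs at z = 0, the point of maximal gradient.

```python
import torch

# Sigmoid: an output near 0 requires z << 0, where the gradient vanishes.
z = torch.tensor([-8.0, 0.0], requires_grad=True)
torch.sigmoid(z).sum().backward()
print(torch.sigmoid(z).detach())  # outputs: ~0.0003 and 0.5
print(z.grad)                     # gradients: ~0.0003 (saturated) and 0.25

# Tanh: output 0 occurs at z = 0, where the gradient is maximal (1.0),
# so gradients still flow backwards through a "silent" unit.
z2 = torch.tensor([0.0], requires_grad=True)
torch.tanh(z2).sum().backward()
print(z2.grad)                    # tensor([1.])
```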