softmax单元,为什么只在输出层而不是隐藏层中使用softmax?

Most examples of neural networks for classification tasks I've seen use the a softmax layer as output activation function. Normally, the other hidden units use a sigmoid, tanh, or ReLu function as activation function. Using the softmax function here would - as far as I know - work out mathematically too.

What are the theoretical justifications for not using the softmax function as hidden layer activation functions?

Are there any publications about this, something to quote?

解决方案

I haven't found any publications about why using softmax as an activation in a hidden layer is not the best idea (except Quora question which you probably have already read) but I will try to explain why it is not the best idea to use it in this case :

1. Variables independence : a lot of regularization and effort is put to keep your variables independent, uncorrelated and quite sparse. If you use softmax layer as a hidden layer - then you will keep all your nodes (hidden variables) linearly dependent which may result in many problems and poor generalization.

2. Training issues : try to imagine that to make your network working better you have to make a part of activations from your hidden layer a little bit lower. Then - automaticaly you are making rest of them to have mean activation on a higher level which might in fact increase the error and harm your training phase.

3. Mathematical issues : by creating constrains on activations of your model you decrease the expressive power of your model without any logical explaination. The strive for having all activations the same is not worth it in my opinion.

4. Batch normalization does it better : one may consider the fact that constant mean output from a network may be useful for training. But on the other hand a technique called Batch Normalization has been already proven to work better, whereas it was reported that setting softmax as activation function in hidden layer may decrease the accuracy and the speed of learning.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值