plot函数_神经网络激活函数

最新推荐文章于 2023-12-02 21:15:06 发布

weixin_39695701

最新推荐文章于 2023-12-02 21:15:06 发布

阅读量77

点赞数

文章标签： plot函数 sigmoid函数求导

学了深度学习也快近一年了，在学Deep Learning的时候什么教程都去看，起初学起来也特别的杂乱，前面的基础没弄懂还一直往下学，自然导致我学到后面的时候有点崩溃什么都看不懂，并且因为自己还是一名在校生，平常课程也非常多，没有一个连续的学习时间也导致我的学习是断断续续，在学习的路上走过许多坑，什么都去学，也导致自己浪费了许多的时间去做了一些无用功，因此也有许多自己的理解与学习过程中的一些心得，今天决定开始写博文，一方面能巩固自己的基础，另一方面也能让我对Deep Learning不一样的见解，毕竟"温故而知新"所以我将从一些神经网络中比较基础的部分来谈谈自己的认识。

首先，从神经网络的激活函数来看，神经网络最常见的并且常用的激活函数莫过于一下几种：Sigmoid，Tanh，ReLU，leaky ReLU,本文主要从前三种进行分析。
然后在进行神经网络单元激活的同时常常会有这么一行表达式：

想必大家都不陌生

我在学习的过程中，视频也好博文也好，更多是可能是理论知识的理解，代码却比较少，所以我也将结合代码进行分析。

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

x = np.arange(-10, 10,1)
W = 1
b = 1
y = W * x + b

print (y)

plt.plot(x,y)
plt.show()

这段程序的输出是这样的：

1. Sigmoid

定义函数并且打印坐标与图像（Ps:定义求导函数是为了理解为什么ReLU是用的比较多的激活函数，下同）

def  sigmoid(x):
    '''
    #定义sigmoid
    '''
    return 1. / (1. + np.exp(-x))
def Derivation_sigmoid(y_sigmoid):
    '''
    #sigmoid求导
    '''
    return y_sigmoid * (1 - y_sigmoid)

y_sigmoid = sigmoid(x)
y_Der_sig = Derivation_sigmoid(y_sigmoid)

print('y_sigmoid:' + str(y_sigmoid))
print('y_Der_sig:' + str(y_Der_sig))
plt.plot(x, y_sigmoid, color = "blue")
plt.plot(x, y_Der_sig, color = "red")
plt.show()

函数的原图与求导后的输出结果与图像如下：

可以看出sigmoid函数在求导后两侧导数无限趋近于0，导致了神经元向更深层的网络传递的梯度变得非常小。网络参数很难得到有效训练。这种现象被称为梯度消失或者梯度弥散。

2. Tanh

def tanh(x):
    '''
    #定义tanh
    '''
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

def Derivation_tanh(y_tanh):
    '''
    #tanh求导
    '''
    return 1 - y_tanh ** 2
y_tanh = tanh(y)
y_Der_tanh = Derivation_tanh(y_tanh)

print ('y_tanh:' + str(y_tanh))
print ('y_Der_tanh:' + str(y_Der_tanh))
plt.plot(y, y_tanh, color = 'blue')
plt.plot(y, y_Der_tanh, color = 'red')
plt.show()

函数的原函数与求导后的输出结果与图像如下：

同上可以看出tanh函数在求导后两侧导数无限趋近于0，也同样梯度消失或者梯度弥散的问题。

3. ReLU

def ReLU(x):
    '''
    #修正线性单元（Rectified linear unit)
    # max(0, x)
    '''
    return np.array([0 * item  if item <= 0 else item for item in x ])

def Derivation_ReLU(x):
    '''
    #ReLU导数(分段)： 
    #x <= 0时，导数为0 
    #x > 0时，导数为1 
    '''
    return np.array([0 * item if item <= 0 else 1 for item in x ])

y_relu = ReLU(x)

print('y_relu:' + str(y_relu))

plt.plot(x, y_relu,color = "blue")
plt.show()
y_Der_relu = Derivation_ReLU(y_relu)

print('y_Der_relu:' + str(y_Der_relu))
plt.plot(x, y_Der_relu,color = "red")
plt.show()

函数的输出结果与图像如下：

而ReLU在
- x <= 0时，导数为0
- x > 0 时，导数为 1
因此ReLU能够在x>0部分保持梯度不会衰减，但是也存在一个问题，会出现神经元坏死的情况，因为有的神经元可能永远不会被激活

那么谈了这么多，数激活函数到底是拿来干嘛用的呢？用一句话概括就是：激活函数可以引入非线性因素，解决线性模型所不能解决的问题

weixin_39695701

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
复制链接

分享到 QQ

分享到新浪微博

扫一扫