The conclusion first
The sigmoid function:
$$
f(x)=\frac{1}{1+e^{-x}}
$$
The derivative of the sigmoid function:
$$
f'(x)=f(x)(1-f(x))
$$
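As a quick sanity check of this identity, the minimal Python sketch below compares $f(x)(1-f(x))$ with a central finite-difference estimate of the slope. The function names `sigmoid`, `sigmoid_derivative`, and `central_difference`, the step size `h`, and the sample points are illustrative assumptions, not part of the derivation itself.

```python
import math


def sigmoid(x: float) -> float:
    """Sigmoid function f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x: float) -> float:
    """Derivative computed from the identity f'(x) = f(x) * (1 - f(x))."""
    fx = sigmoid(x)
    return fx * (1.0 - fx)


def central_difference(func, x: float, h: float = 1e-5) -> float:
    """Numerical slope estimate (func(x + h) - func(x - h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


# Compare the closed-form identity with the numerical estimate.
for x in (-2.0, 0.0, 0.5, 3.0):
    exact = sigmoid_derivative(x)
    approx = central_difference(sigmoid, x)
    print(f"x={x:+.1f}  identity={exact:.8f}  finite difference={approx:.8f}")
```

The two columns should agree to several decimal places; a small gap is expected because the finite difference is only an approximation.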
Background
Before the derivation, let's review the differentiation rules it relies on:
- If $f(x)=\frac{1}{x}$, then $f'(x)=-\frac{1}{x^2}$
- If $f(x)=e^x$, then $f'(x)=e^x$
For more, see: Common Differentiation Rules
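As a small worked example, the two rules can be combined through the chain rule (which the derivation also uses) to differentiate $\frac{1}{e^{x}}=e^{-x}$. This is only an illustrative exercise, chosen because the same term shows up inside the sigmoid:

$$
\left(\frac{1}{e^{x}}\right)' = -\frac{1}{(e^{x})^{2}}\cdot (e^{x})' = -\frac{e^{x}}{e^{2x}} = -e^{-x}
$$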
Derivation
Method 1
First, rewrite $f(x)$:
$$
\begin{aligned}
f(x)&= \frac{1}{1+e^{-x}} \\
&= \frac{1}{1+\frac{1}{e^x}} \\
&=\left(1+\frac{1}{e^x}\right)^{-1} \\
&=\left(\frac{e^x}{e^x}+\frac{1}{e^x}\right)^{-1} \\
&=\left(\frac{e^{x}+1}{e^x}\right)^{-1} \\
&=\frac{e^x}{e^{x}+1} \\
&=\frac{(e^{x}+1)-1}{e^{x}+1} \\
&=\frac{e^{x}+1}{e^{x}+1}-\frac{1}{e^{x}+1} \\
&=1-\frac{1}{e^{x}+1} \\
&=1-(e^{x}+1)^{-1}
\end{aligned}
$$
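To make the chain of rewrites easier to trust, here is a minimal Python sketch that evaluates three of the forms above at a few points and prints them side by side. The helper name `sigmoid_forms` and the sample points are assumptions made for illustration.

```python
import math


def sigmoid_forms(x: float) -> tuple[float, float, float]:
    """Evaluate three algebraically equivalent forms of the sigmoid at x."""
    ex = math.exp(x)
    original = 1.0 / (1.0 + math.exp(-x))  # 1 / (1 + e^(-x))
    ratio = ex / (ex + 1.0)                # e^x / (e^x + 1)
    shifted = 1.0 - (ex + 1.0) ** -1       # 1 - (e^x + 1)^(-1)
    return original, ratio, shifted


# Each row should show the same value three times, up to rounding.
for x in (-3.0, -0.5, 0.0, 1.0, 4.0):
    original, ratio, shifted = sigmoid_forms(x)
    print(f"x={x:+.1f}  {original:.12f}  {ratio:.12f}  {shifted:.12f}")
```

The last form, $1-(e^{x}+1)^{-1}$, is the one that is convenient to differentiate, since the constant drops out and only a single power remains.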
Differentiate:
Note the rules used:
- $(u(x) \pm v(x))'=u'(x) \pm v'(x)$
- and the chain rule: $(f(g(x)))'=f'(g(x))\,g'(x)$
$$
\begin{aligned}
f'(x)&=\left(1-(e^{x}+1)^{-1}\right)' \\
&=(-1)(-1)(e^{x}+1)^{-2} e^{x}\\
&=(e^{x}+1)^{-2} e^{x}\\
&=(e^{x}+1)^{-1}(e^{x}+1)^{-1} e^{x}
\end{aligned}
$$
From the rewriting of $f(x)$ above, we know:
$$
f(x)=\frac{1}{1+e^{-x}} =(1+e^{-x})^{-1}=\frac{e^{x}}{e^{x}+1}=e^{x}(e^{x}+1)^{-1}
$$
Therefore:
$$
\begin{aligned}
f'(x)&=(e^{x}+1)^{-1} \cdot (e^{x}+1)^{-1} e^{x} \\
&= (e^{x}+1)^{-1} \cdot e^{x}(e^{x}+1)^{-1} \\
&=(e^{x}+1)^{-1} \cdot (1+e^{-x})^{-1} \\
&=\frac{1}{e^{x}+1} \cdot \frac{1}{1+e^{-x}} \\
&=\frac{(e^{x}+1)-e^{x}}{e^{x}+1} \cdot \frac{1}{1+e^{-x}} \\
&=\left(\frac{e^{x}+1}{e^{x}+1}-\frac{e^{x}}{e^{x}+1}\right) \cdot \frac{1}{1+e^{-x}} \\
&=\left(1-\frac{e^{x}}{e^{x}+1}\right) \cdot \frac{1}{1+e^{-x}} \\
&=\left(1-\frac{1}{1+e^{-x}}\right) \cdot \frac{1}{1+e^{-x}} \\
&=(1-f(x)) \cdot f(x) \\
&=f(x)(1-f(x))
\end{aligned}
$$
Method 2
The sigmoid function is defined as:
$$
\sigma(x) = \frac{1}{1 + e^{-x}}
$$
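To differentiate it, we use the chain rule: the derivative of the outer function multiplied by the derivative of the inner function. The calculation goes as follows.
First, let the inner function be: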
$$
u = 1 + e^{-x}
$$
The sigmoid function can then be written as the outer function:
$$
\sigma(x) = u^{-1}
$$
Next, compute the derivatives of these two functions:
- Inner derivative (the derivative of $u = 1 + e^{-x}$ with respect to $x$):
$\frac{du}{dx} = -e^{-x}$
Let $v=-x$. Differentiating $1+e^{-x}$ term by term gives: $\frac{d}{dx}(1)+\frac{d}{dx}(e^{-x})=0+\frac{d}{dx}(e^{v})$
Applying the chain rule:
$\frac{d}{dx}(e^{v})=\frac{d}{dv}(e^{v})\cdot\frac{dv}{dx}=e^{v}\cdot\frac{d}{dx}(-x)=e^{-x}\times(-1)=-e^{-x}$
- Outer derivative (the derivative of $\sigma(x) = u^{-1}$ with respect to $u$):
$\frac{d\sigma}{du} = -u^{-2} = -\frac{1}{u^2}$
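The inner and outer derivatives can be mirrored one-to-one in code and then multiplied, as the chain rule prescribes. The following is a minimal Python sketch under that reading; the function names `sigmoid` and `sigmoid_derivative_chain_rule` and the sample points are illustrative assumptions.

```python
import math


def sigmoid(x: float) -> float:
    """Sigmoid function sigma(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative_chain_rule(x: float) -> float:
    """Differentiate sigma(x) = u^(-1), with u = 1 + e^(-x), via the chain rule."""
    u = 1.0 + math.exp(-x)      # inner function u(x)
    du_dx = -math.exp(-x)       # inner derivative du/dx = -e^(-x)
    dsigma_du = -1.0 / (u * u)  # outer derivative d(sigma)/du = -u^(-2)
    return dsigma_du * du_dx    # chain rule: d(sigma)/dx


# Compare the chain-rule product with sigma(x) * (1 - sigma(x)).
for x in (-1.5, 0.0, 2.0):
    by_chain_rule = sigmoid_derivative_chain_rule(x)
    by_identity = sigmoid(x) * (1.0 - sigmoid(x))
    print(f"x={x:+.1f}  chain rule={by_chain_rule:.10f}  identity={by_identity:.10f}")
```

Keeping `du_dx` and `dsigma_du` as separate variables makes the two minus signs visible; they cancel in the product, which is why the derivative is positive everywhere.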
Now apply the chain rule to obtain the derivative of the sigmoid function with respect to $x$:
$$
\frac{d\sigma}{dx} = \frac{d\sigma}{du} \cdot \frac{du}{dx}
$$

$$
\frac{d\sigma}{dx} = -\frac{1}{u^2} \cdot (-e^{-x})
$$

$$
\frac{d\sigma}{dx} = \frac{e^{-x}}{(1 + e^{-x})^2}
$$
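Using the original definition of the sigmoid function, we can rewrite this result in terms of the sigmoid itself. Split the fraction as $\frac{e^{-x}}{(1+e^{-x})^2}=\frac{1}{1+e^{-x}}\cdot\frac{e^{-x}}{1+e^{-x}}$ and note that $\frac{e^{-x}}{1+e^{-x}}=\frac{(1+e^{-x})-1}{1+e^{-x}}=1-\frac{1}{1+e^{-x}}$, which gives: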
$$
\frac{d\sigma}{dx} = \frac{1}{1 + e^{-x}} \left(1 - \frac{1}{1 + e^{-x}}\right)
$$

$$
\frac{d\sigma}{dx} = \sigma(x)(1 - \sigma(x))
$$
So the derivative of the sigmoid function with respect to $x$ is:

$$
\frac{d\sigma}{dx} = \sigma(x)(1 - \sigma(x))
$$
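As a worked check at a single point (the choice $x=0$ is only for convenience), the intermediate form from the chain rule and the final identity give the same value:

$$
\sigma(0)=\frac{1}{1+e^{0}}=\frac{1}{2},\qquad
\left.\frac{d\sigma}{dx}\right|_{x=0}=\frac{e^{0}}{(1+e^{0})^{2}}=\frac{1}{4}=\frac{1}{2}\left(1-\frac{1}{2}\right)=\sigma(0)\,(1-\sigma(0))
$$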