Notes on Andrew Ng's Deep Learning Course (1): Key Functions
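I. Cost function
The cost function measures how well the model performs over the entire training set: it is the average of the per-example losses over all m training examples.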
$$
J = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(a^{(i)}, y^{(i)})\tag{6}
$$
Substituting the cross-entropy loss (defined in the next section) gives:
$$
J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(a^{(i)})+(1-y^{(i)})\log(1-a^{(i)})\right]
$$
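As a rough illustration of the averaged cost, here is a minimal pure-Python sketch. The function name `cross_entropy_cost` and the sample numbers are made up for this example, and it assumes every activation lies strictly between 0 and 1 so that both logarithms are finite.

```python
import math

def cross_entropy_cost(activations, labels):
    """Average binary cross-entropy over m examples (hypothetical helper)."""
    m = len(labels)
    total = 0.0
    for a, y in zip(activations, labels):
        # Per-example loss; assumes 0 < a < 1 so both logs are finite.
        total += -(y * math.log(a) + (1 - y) * math.log(1 - a))
    return total / m

# Three made-up predictions and their true labels.
cost = cross_entropy_cost([0.9, 0.2, 0.7], [1, 0, 1])  # about 0.228
```

Note that the minus sign applies to the whole bracketed sum, which is why the loop negates both terms together.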
II. Loss function
The loss function measures how well the algorithm performs on a single training example.
$$
\mathcal{L}(a^{(i)}, y^{(i)}) = - y^{(i)} \log(a^{(i)}) - (1-y^{(i)} ) \log(1-a^{(i)})\tag{3}
$$
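To get a feel for equation (3), the sketch below evaluates the loss for a confident correct prediction and a confident wrong one. The name `log_loss` and the `eps` clipping are assumptions added here for numerical safety; they are not part of the formula itself.

```python
import math

def log_loss(a, y, eps=1e-12):
    """Cross-entropy loss for one example (hypothetical helper)."""
    # Clip a away from 0 and 1 so math.log never receives 0.
    # eps is an assumed safeguard, not part of equation (3).
    a = min(max(a, eps), 1.0 - eps)
    return -y * math.log(a) - (1 - y) * math.log(1 - a)

confident_right = log_loss(0.9, 1)  # about 0.105
confident_wrong = log_loss(0.1, 1)  # about 2.303
```

The loss grows quickly as the predicted probability moves away from the true label, which is the behavior we want from a classification loss.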
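III. Sigmoid function
The sigmoid function is commonly used as an activation (threshold) function in neural networks. It maps any real input into the interval (0, 1).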
$$
\mathrm{sigmoid}(x) = \frac{1}{1+e^{-x}}
$$
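A direct translation of the formula works for moderate inputs, but `math.exp(-x)` raises `OverflowError` for very negative x. The sketch below is one common way around that, assuming scalar inputs: it branches on the sign so the exponential only ever sees a non-positive argument.

```python
import math

def sigmoid(x):
    """Numerically stable scalar sigmoid (a sketch, not the only option)."""
    if x >= 0:
        # exp(-x) <= 1 here, so no overflow is possible.
        return 1.0 / (1.0 + math.exp(-x))
    # For x < 0 use the equivalent form e^x / (1 + e^x).
    e = math.exp(x)
    return e / (1.0 + e)

mid_point = sigmoid(0.0)  # 0.5
```

Both branches compute the same mathematical function; only the floating-point behavior differs.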
IV. y hat
The predicted probability that the input belongs to the class y = 1.
$$
\hat{y}^{(i)} = a^{(i)} = \mathrm{sigmoid}(z^{(i)})\tag{2}
$$
$$
z^{(i)} = w^T x^{(i)} + b \tag{1}
$$
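Equations (1) and (2) together give the forward pass for one example. The sketch below uses plain Python lists; the name `predict_proba`, the sample weights, and the 0.5 decision threshold are assumptions made for illustration.

```python
import math

def predict_proba(w, x, b):
    """Return y_hat = sigmoid(w^T x + b) for one example (hypothetical helper)."""
    # Equation (1): linear score z.
    z = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    # Equation (2): squash z into a probability.
    return 1.0 / (1.0 + math.exp(-z))

# Made-up weights and features chosen so that z = 0.
y_hat = predict_proba(w=[0.5, -0.25], x=[2.0, 4.0], b=0.0)  # 0.5
# Assumed decision rule: predict class 1 only when y_hat exceeds 0.5.
label = 1 if y_hat > 0.5 else 0
```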
V. Parameter update rule
$$
\theta = \theta - \alpha \, d\theta
$$
Here α is the learning rate and dθ is the gradient of the cost with respect to θ.
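The update rule is easiest to see on a toy problem. The sketch below applies it to the made-up cost J(θ) = (θ − 3)², whose gradient is 2(θ − 3). The function name `gradient_descent` and the step count are illustrative choices.

```python
def gradient_descent(grad, theta, alpha, steps):
    """Repeat theta = theta - alpha * d_theta for a fixed number of steps."""
    for _ in range(steps):
        theta = theta - alpha * grad(theta)
    return theta

# Toy cost J(theta) = (theta - 3)^2 with gradient 2 * (theta - 3).
theta = gradient_descent(lambda t: 2 * (t - 3), theta=0.0, alpha=0.1, steps=100)
```

With α = 0.1 the distance to the minimum at θ = 3 shrinks by a factor of 0.8 per step. A learning rate above 1.0 would make this particular problem diverge.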
VI. Derivatives with respect to w and b
$$
\frac{\partial J}{\partial w} = \frac{1}{m}X(A-Y)^T\tag{7}
$$
$$
\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (a^{(i)}-y^{(i)})\tag{8}
$$
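For reference, equations (7) and (8) follow from the chain rule applied to one example. The derivation below drops the superscript (i) for readability. It assumes X stacks the examples as columns and that A and Y are 1 × m row vectors, which is the layout the product X(A−Y)^T implies.

$$
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial a} &= -\frac{y}{a} + \frac{1-y}{1-a}, \qquad \frac{da}{dz} = a(1-a) \\
\frac{\partial \mathcal{L}}{\partial z} &= \frac{\partial \mathcal{L}}{\partial a}\,\frac{da}{dz} = -y(1-a) + (1-y)\,a = a - y \\
\frac{\partial \mathcal{L}}{\partial w} &= (a - y)\,x, \qquad \frac{\partial \mathcal{L}}{\partial b} = a - y
\end{aligned}
$$

Averaging these per-example gradients over the m examples gives (8) directly, and gives (7) once the sum over x^{(i)}(a^{(i)} − y^{(i)}) is written as a matrix product.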
VII. Vectorized logistic regression
$$
A = \sigma(w^T X + b) = (a^{(1)}, a^{(2)}, \ldots, a^{(m)})
$$
$$
J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(a^{(i)})+(1-y^{(i)})\log(1-a^{(i)})\right]
$$
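The sketch below puts the vectorized forward pass, the cost, and the gradients from equations (7) and (8) into one function. It assumes NumPy is available. The name `forward_backward`, the shapes (w is n × 1, X is n × m, Y is 1 × m), and the sample data are assumptions made for this example.

```python
import numpy as np

def forward_backward(w, b, X, Y):
    """Vectorized cost and gradients for logistic regression (a sketch).

    Assumed shapes: w (n, 1), b scalar, X (n, m), Y (1, m).
    """
    m = X.shape[1]
    # Forward pass: A = sigma(w^T X + b), shape (1, m).
    A = 1.0 / (1.0 + np.exp(-(w.T @ X + b)))
    # Cost: average cross-entropy over the m columns.
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    # Gradients, equations (7) and (8).
    dw = X @ (A - Y).T / m   # shape (n, 1)
    db = np.sum(A - Y) / m   # scalar
    return cost, dw, db

# Made-up data: n = 2 features, m = 3 examples, parameters start at zero.
w = np.zeros((2, 1))
b = 0.0
X = np.array([[1.0, 2.0, -1.0],
              [3.0, 0.5, -2.0]])
Y = np.array([[1, 0, 1]])
cost, dw, db = forward_backward(w, b, X, Y)
```

With all parameters at zero every activation is 0.5, so the cost equals log 2 regardless of the data. That makes a handy sanity check for an implementation.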
VIII. Activation functions
1. Sigmoid function
$$
\mathrm{sigmoid}(x) = \frac{1}{1+e^{-x}}
$$
2. tanh function
$$
\tanh(x) = \frac{e^x-e^{-x}}{e^x+e^{-x}}
$$
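The two activation functions are closely related: tanh is a rescaled sigmoid with outputs centered on zero, since tanh(x) = 2·sigmoid(2x) − 1. The short sketch below checks that identity numerically. The helper name `tanh_from_sigmoid` is made up for this example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh_from_sigmoid(x):
    """tanh expressed through sigmoid: 2 * sigmoid(2x) - 1."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

# Largest disagreement with math.tanh over a few sample points.
max_gap = max(abs(tanh_from_sigmoid(x) - math.tanh(x))
              for x in (-2.0, -0.5, 0.0, 0.5, 2.0))
```

So tanh takes values in (−1, 1) while sigmoid takes values in (0, 1).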