Cost function
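Suppose the task is binary classification. The cost function is then the same as the logistic regression cost function, averaged over the $m$ training examples:
$$J = \frac{1}{m}\sum_{i = 1}^{m} L\left(y^{(i)}, \widehat{y}^{(i)}\right)$$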
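As a minimal sketch of this cost, assuming $L$ is the usual logistic regression (cross-entropy) loss $L(y, \widehat{y}) = -\left(y \log \widehat{y} + (1 - y)\log(1 - \widehat{y})\right)$: the function name `binary_cross_entropy_cost`, the `(1, m)` array shapes, and the `eps` clipping guard are illustrative assumptions, not part of the original notes.

```python
import numpy as np


def binary_cross_entropy_cost(Y_hat, Y, eps=1e-12):
    """Average cross-entropy loss over m examples.

    Y_hat: predicted probabilities, shape (1, m)  (assumed layout)
    Y:     true labels in {0, 1},   shape (1, m)
    eps:   hypothetical guard that keeps log() away from 0
    """
    m = Y.shape[1]
    Y_hat = np.clip(Y_hat, eps, 1.0 - eps)
    # Per-example loss L(y, y_hat), then the mean over the m examples.
    losses = -(Y * np.log(Y_hat) + (1.0 - Y) * np.log(1.0 - Y_hat))
    return float(np.sum(losses) / m)
```

For instance, predicting 0.5 for every example gives a cost of $\ln 2 \approx 0.693$ regardless of the labels.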
Forward propagation steps
$$z^{[l]} = W^{[l]} a^{[l - 1]} + b^{[l]}$$
$$a^{[l]} = g(z^{[l]})$$
The vectorized implementation is as follows:
$$Z^{[l]} = W^{[l]} A^{[l - 1]} + b^{[l]}$$
$$A^{[l]} = g(Z^{[l]})$$
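The vectorized forward step for one layer can be sketched in NumPy as below. This is an illustration under stated assumptions rather than a reference implementation: the helper names (`layer_forward`, `relu`, `sigmoid`), the column-per-example layout, and the `cache` tuple are all hypothetical choices made for the example.

```python
import numpy as np


def sigmoid(Z):
    """Logistic activation, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-Z))


def relu(Z):
    """Rectified linear activation, applied elementwise."""
    return np.maximum(0.0, Z)


def layer_forward(A_prev, W, b, g):
    """One layer of vectorized forward propagation.

    A_prev: activations of layer l-1, shape (n_prev, m), one column per example
    W:      weights of layer l,       shape (n, n_prev)
    b:      biases of layer l,        shape (n, 1)
    g:      activation function of layer l
    """
    Z = W @ A_prev + b  # b is broadcast across the m columns
    A = g(Z)
    cache = (A_prev, W, Z)  # values that the backward pass needs later
    return A, cache
```

Keeping `A_prev`, `W`, and `Z` in a cache avoids recomputing them when the gradients are evaluated.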
Backward propagation steps
The input is $da^{[l]}$, and the outputs are $da^{[l - 1]}$, $dW^{[l]}$, and $db^{[l]}$.
$$dz^{[l]} = da^{[l]} \odot g^{[l]\prime}(z^{[l]}) \tag{1}$$
Here $\odot$ denotes the elementwise product and $g^{[l]\prime}$ is the derivative of the activation function of layer $l$.
$$dW^{[l]} = dz^{[l]} \, a^{[l - 1]T} \tag{2}$$
$$db^{[l]} = dz^{[l]} \tag{3}$$
$$da^{[l - 1]} = W^{[l]T} dz^{[l]} \tag{4}$$
$$dz^{[l]} = \left(W^{[l + 1]T} dz^{[l + 1]}\right) \odot g^{[l]\prime}(z^{[l]}) \tag{5}$$
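Equation (5) is obtained by substitution among the first four equations: equation (4), written for layer $l + 1$, is substituted into equation (1). Together, these five equations implement backward propagation.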
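The substitution can be written out explicitly. Shifting the layer index in equation (4) from $l$ to $l + 1$ gives an expression for $da^{[l]}$, which is then inserted into equation (1); the product with the activation derivative is taken elementwise:

$$da^{[l]} = W^{[l + 1]T} dz^{[l + 1]}$$

$$dz^{[l]} = da^{[l]} \odot g^{[l]\prime}(z^{[l]}) = \left(W^{[l + 1]T} dz^{[l + 1]}\right) \odot g^{[l]\prime}(z^{[l]})$$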
The vectorized implementation can be written as:
$$dZ^{[l]} = dA^{[l]} \odot g^{[l]\prime}(Z^{[l]}) \tag{6}$$
$$dW^{[l]} = \frac{1}{m} dZ^{[l]} A^{[l - 1]T} \tag{7}$$
$$db^{[l]} = \frac{1}{m}\,\text{np.sum}(dZ^{[l]},\ \text{axis}=1,\ \text{keepdims}=\text{True}) \tag{8}$$
$$dA^{[l - 1]} = W^{[l]T} dZ^{[l]} \tag{9}$$
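The vectorized backward step for one layer, equations (6) to (9), might look like the following in NumPy. It is a sketch under assumptions, not a definitive implementation: the names `layer_backward` and `relu_prime`, the `(A_prev, W, Z)` cache tuple, and the choice of ReLU as the example activation are all hypothetical.

```python
import numpy as np


def relu_prime(Z):
    """Derivative of ReLU, elementwise: 1 where Z > 0, else 0."""
    return (Z > 0).astype(float)


def layer_backward(dA, cache, g_prime):
    """One layer of vectorized backward propagation.

    dA:      gradient of the loss w.r.t. A of layer l, shape (n, m)
    cache:   (A_prev, W, Z) saved during the forward pass (assumed layout)
    g_prime: derivative of the activation function of layer l
    """
    A_prev, W, Z = cache
    m = A_prev.shape[1]  # number of training examples

    dZ = dA * g_prime(Z)                        # (6): elementwise product
    dW = (dZ @ A_prev.T) / m                    # (7): note the transpose
    db = np.sum(dZ, axis=1, keepdims=True) / m  # (8): sum over examples
    dA_prev = W.T @ dZ                          # (9)
    return dA_prev, dW, db
```

With `keepdims=True`, `db` keeps the shape `(n, 1)`, so it matches the shape of `b` and can be used directly in a parameter update.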