Forward and Backward Propagation in Neural Networks
Forward propagation
Input: a^{[L-1]}
Output: a^{[L]}, z^{[L]}
Propagation:
Z^{[L]} = W^{[L]} A^{[L-1]} + b^{[L]}
A^{[L]} = g^{[L]}(Z^{[L]})
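As a minimal NumPy sketch of the forward step above (the function name, layer sizes, and the ReLU activation are illustrative assumptions, not part of the notes):

```python
import numpy as np

def linear_activation_forward(A_prev, W, b, g):
    """One forward step: Z = W A_prev + b, then A = g(Z)."""
    Z = W @ A_prev + b   # linear part: Z^[L] = W^[L] A^[L-1] + b^[L]
    A = g(Z)             # activation:  A^[L] = g^[L](Z^[L])
    return A, Z          # Z is cached for the backward pass

relu = lambda Z: np.maximum(0, Z)  # example choice of g^[L]

# toy shapes: 2 units in layer L-1, 3 units in layer L, batch of m = 4
rng = np.random.default_rng(0)
A_prev = rng.standard_normal((2, 4))
W = rng.standard_normal((3, 2))
b = np.zeros((3, 1))
A, Z = linear_activation_forward(A_prev, W, b, relu)
print(A.shape)  # (3, 4)
```

Note that `b` has shape `(3, 1)` and is broadcast across the m columns of `Z`.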
Backward propagation
Input: da^{[L]}
Output: da^{[L-1]}, dW^{[L]}, db^{[L]}
Propagation (single example):
dz^{[L]} = da^{[L]} * g^{[L]'}(z^{[L]})
dw^{[L]} = dz^{[L]} a^{[L-1]T}
db^{[L]} = dz^{[L]}
da^{[L-1]} = w^{[L]T} dz^{[L]}
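The single-example formulas can be sanity-checked numerically by comparing the chain-rule gradient against finite differences (the squared-error loss and tanh activation here are assumptions chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
a_prev = rng.standard_normal((2, 1))  # a^[L-1], one example
w = rng.standard_normal((3, 2))       # w^[L]
b = np.zeros((3, 1))
y = rng.standard_normal((3, 1))       # target, for an illustrative loss

def loss(w):
    z = w @ a_prev + b
    a = np.tanh(z)                    # g^[L] = tanh (illustrative)
    return 0.5 * np.sum((a - y) ** 2)

# analytic gradient via the backward-propagation formulas
z = w @ a_prev + b
a = np.tanh(z)
da = a - y                            # dL/da^[L] for squared error
dz = da * (1 - np.tanh(z) ** 2)       # dz^[L] = da^[L] * g^[L]'(z^[L])
dw = dz @ a_prev.T                    # dw^[L] = dz^[L] a^{[L-1]T}

# numerical gradient via central differences
eps = 1e-6
dw_num = np.zeros_like(w)
for i in range(w.shape[0]):
    for j in range(w.shape[1]):
        wp, wm = w.copy(), w.copy()
        wp[i, j] += eps
        wm[i, j] -= eps
        dw_num[i, j] = (loss(wp) - loss(wm)) / (2 * eps)

print(np.max(np.abs(dw - dw_num)))    # should be close to zero
```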
Vectorized implementation (m examples):
dZ^{[L]} = dA^{[L]} * g^{[L]'}(Z^{[L]})
dW^{[L]} = \frac{1}{m} dZ^{[L]} A^{[L-1]T}
db^{[L]} = \frac{1}{m} np.sum(dZ^{[L]}, axis=1, keepdims=True)
dA^{[L-1]} = W^{[L]T} dZ^{[L]}
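The vectorized backward step maps directly onto NumPy (the function name and the ReLU derivative are assumptions for illustration):

```python
import numpy as np

def linear_activation_backward(dA, Z, A_prev, W, g_prime):
    """One vectorized backward step over a batch of m examples."""
    m = A_prev.shape[1]
    dZ = dA * g_prime(Z)                                # dZ^[L] = dA^[L] * g^[L]'(Z^[L])
    dW = (1.0 / m) * dZ @ A_prev.T                      # dW^[L] = 1/m dZ^[L] A^{[L-1]T}
    db = (1.0 / m) * np.sum(dZ, axis=1, keepdims=True)  # db^[L]
    dA_prev = W.T @ dZ                                  # dA^[L-1] = W^{[L]T} dZ^[L]
    return dA_prev, dW, db

relu_prime = lambda Z: (Z > 0).astype(float)  # derivative of the example ReLU

rng = np.random.default_rng(1)
A_prev = rng.standard_normal((2, 4))
W = rng.standard_normal((3, 2))
b = np.zeros((3, 1))
Z = W @ A_prev + b
dA = rng.standard_normal((3, 4))
dA_prev, dW, db = linear_activation_backward(dA, Z, A_prev, W, relu_prime)
print(dA_prev.shape, dW.shape, db.shape)  # (2, 4) (3, 2) (3, 1)
```

Note that `keepdims=True` keeps `db` as a column vector of shape `(n^{[L]}, 1)` so it broadcasts correctly in the update step; `axis=1` sums over the m examples.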