Hung-yi Lee Backpropagation Notes
Tags: Notes DeepLearning Backpropagation
Introduction to Deep Learning
1. Given a network structure, we have defined a function set.
Defining the structure of a neural network defines a set of candidate functions; our job is then to find the best function in that set.
What does a neural network do?
$$y = f(x) = \sigma(W^L \cdots \sigma(W^2\,\sigma(W^1 x + b^1) + b^2) \cdots + b^L)$$
Each layer is a matrix operation followed by a nonlinearity, so we can use parallel computing techniques to speed up the matrix operations; in practice, we use the GPU's acceleration capability for them.
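The layered computation above can be sketched in a few lines. This is a minimal pure-Python illustration; the weights `W1`, `b1`, `W2`, `b2` and the input `x` are made-up numbers, not values from the lecture.

```python
import math

def sigmoid(v):
    # Apply the logistic sigmoid elementwise to a vector.
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def affine(W, x, b):
    # Compute W @ x + b, with W given as a list of rows.
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

# Hypothetical 2-layer network: all weights and biases are made up.
W1, b1 = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]
W2, b2 = [[1.0, 1.0]], [0.0]

x  = [1.0, 2.0]
a1 = sigmoid(affine(W1, x, b1))   # hidden layer: sigma(W1 x + b1)
y  = sigmoid(affine(W2, a1, b2))  # output:       sigma(W2 a1 + b2)
```

On real hardware, each `affine` call is a matrix-vector product, which is exactly the kind of operation a GPU parallelizes well.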
2. Define the goodness (or badness) of a function. We call this the loss function.
3. Pick the function that minimizes the loss function.
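As a concrete sketch of step 2: the total loss is the sum of per-example losses. Squared error is used here only to keep the example simple (cross-entropy is the usual choice for classification); the prediction and label values are made up.

```python
def loss_per_example(y_pred, y_true):
    # Squared error for a single training example.
    return (y_pred - y_true) ** 2

preds  = [0.9, 0.2, 0.7]   # hypothetical network outputs
labels = [1.0, 0.0, 1.0]   # hypothetical targets

# Total loss: sum of the per-example losses.
L = sum(loss_per_example(p, t) for p, t in zip(preds, labels))
```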
Backpropagation
$$L(\theta) = \sum_{n=1}^{N} l^n(\theta)$$
$$\frac{\partial L(\theta)}{\partial w} = \sum_{n=1}^{N} \frac{\partial l^n(\theta)}{\partial w}$$
So we just need to compute $\partial l^n(\theta)/\partial w$ for one training example at a time.
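This sum-of-per-example-gradients fact can be checked numerically. The sketch below uses a toy one-weight model $y = wx$ with squared error; the data values are made up.

```python
# Toy model y = w * x with squared error; data values are made up.
xs = [1.0, 2.0, 3.0]
ts = [2.0, 3.9, 6.1]
w  = 1.5

def total_loss(w):
    # L(theta) = sum over examples of l^n(theta)
    return sum((w * x - t) ** 2 for x, t in zip(xs, ts))

# Per-example gradient: d l^n / dw = 2 (w x_n - t_n) x_n
per_example = [2 * (w * x - t) * x for x, t in zip(xs, ts)]
grad = sum(per_example)

# Check against a central finite difference of the total loss.
eps = 1e-6
numeric = (total_loss(w + eps) - total_loss(w - eps)) / (2 * eps)
```

The sum of the per-example gradients matches the numerical gradient of the total loss, as the equation above says it must.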
$$\frac{\partial l^n(\theta)}{\partial w} = \frac{\partial l^n(\theta)}{\partial z} \frac{\partial z}{\partial w}$$
So we have a Forward pass: compute $\partial z/\partial w$,
and a Backward pass: compute $\partial l^n(\theta)/\partial z$.
- Forward pass:
Since $z = \sum_i w_i x_i + b$, the derivative of $z$ with respect to each weight is simply the input attached to that weight:
$\partial z/\partial w_1 = x_1$
$\partial z/\partial w_2 = x_2$
$\ldots$
$\partial z/\partial w_i = x_i$
- Backward pass:
Let $a = \sigma(z)$. Then
$$\frac{\partial l}{\partial z} = \frac{\partial a}{\partial z} \frac{\partial l}{\partial a}$$
$$\frac{\partial a}{\partial z} = \sigma'(z)$$
$$\frac{\partial l}{\partial a} = \frac{\partial z'}{\partial a} \frac{\partial l}{\partial z'} + \frac{\partial z''}{\partial a} \frac{\partial l}{\partial z''}$$
where $z'$ and $z''$ are the weighted inputs of the next-layer neurons that $a$ feeds into. So this is the Chain Rule.
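The backward-pass chain rule can be verified numerically. In this sketch, one neuron's activation $a = \sigma(z)$ feeds two next-layer values $z' = w_3 a$ and $z'' = w_4 a$; the weights $w_3, w_4$, the value of $z$, and the toy loss $l = z'^2 + z''^2$ are all made-up choices for illustration.

```python
import math

sig  = lambda z: 1.0 / (1.0 + math.exp(-z))
dsig = lambda z: sig(z) * (1.0 - sig(z))   # sigma'(z)

# Hypothetical next-layer weights and a toy loss l = z'^2 + z''^2.
w3, w4 = 0.7, -1.2

def loss_from_z(z):
    a = sig(z)                 # a = sigma(z)
    zp, zpp = w3 * a, w4 * a   # z' and z'' of the next layer
    return zp ** 2 + zpp ** 2

z = 0.5
a = sig(z)
zp, zpp = w3 * a, w4 * a

# Backward pass via the chain rule:
# dl/dz = sigma'(z) * (dz'/da * dl/dz' + dz''/da * dl/dz'')
#       = sigma'(z) * (w3 * 2 z'   +   w4 * 2 z'')
dl_dz = dsig(z) * (w3 * 2 * zp + w4 * 2 * zpp)

# Check against a central finite difference.
eps = 1e-6
numeric = (loss_from_z(z + eps) - loss_from_z(z - eps)) / (2 * eps)
```

The analytic chain-rule value agrees with the finite-difference estimate, which is exactly what the backward pass computes layer by layer.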