Machine Learning by Andrew Ng
💡 Study notes for Week 5 of Andrew Ng's Machine Learning course
🐠 Collected edition of all my study notes for this course
✓ Course page: Stanford Machine Learning
🍭 Reference resources
Outline
Cost Function and Back-propagation
Notation
- $L$ = total number of layers in the network
- $S_l$ = number of units (not counting the bias unit) in layer $l$
- $K$ = number of output units/classes
We denote $h_\Theta(x)_k$ as the hypothesis that results in the $k$-th output: given an input $x$, $h_\Theta(x)_k$ is the $k$-th element of the output vector.
1.Cost Function
Recall the cost function for regularized logistic regression:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$$

For a neural network, the cost function generalizes this over all $K$ output units, and the regularization term sums over every non-bias weight in every layer:

$$J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(i)} \log\left(h_\Theta(x^{(i)})\right)_k + (1 - y_k^{(i)}) \log\left(1 - \left(h_\Theta(x^{(i)})\right)_k\right) \right] + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{S_l} \sum_{j=1}^{S_{l+1}} \left(\Theta_{j,i}^{(l)}\right)^2$$
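As a concrete illustration, here is a minimal NumPy sketch of this cost for a three-layer network (input, hidden, output). The function name, the sigmoid activation, and the one-hot label matrix `Y` are assumptions made for the example; the course exercises themselves use Octave.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nn_cost(Theta1, Theta2, X, Y, lam):
    """Regularized cost for a 3-layer network.

    X   : (m, n) inputs, one example per row
    Y   : (m, K) one-hot labels
    lam : regularization strength lambda
    """
    m = X.shape[0]

    # Forward propagation (prepend bias units).
    A1 = np.hstack([np.ones((m, 1)), X])
    A2 = np.hstack([np.ones((m, 1)), sigmoid(A1 @ Theta1.T)])
    H = sigmoid(A2 @ Theta2.T)            # (m, K): h_Theta(x) for every example

    # Unregularized cross-entropy cost, summed over all K output units.
    J = -np.sum(Y * np.log(H) + (1 - Y) * np.log(1 - H)) / m

    # Regularization: skip the bias column (first column) of each Theta.
    J += lam / (2 * m) * (np.sum(Theta1[:, 1:] ** 2) + np.sum(Theta2[:, 1:] ** 2))
    return J
```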
Here, we define $\delta_j^{(l)}$ as the "error" of node $a_j^{(l)}$, the activation of unit $j$ in layer $l$.
2.Back-propagation Algorithm
As with other machine learning algorithms, our goal is to minimize the cost function. Therefore, we need to compute:
- $J(\Theta)$
- $\frac{\partial}{\partial \Theta_{i,j}^{(l)}} J(\Theta)$
Back-propagation is the algorithm we use to compute $\frac{\partial}{\partial \Theta_{i,j}^{(l)}} J(\Theta)$. The whole algorithm proceeds as follows:
Steps 1 and 2: run forward propagation to compute the activations $a^{(l)}$ for every layer (with $a^{(1)} = x$).
Step 3: compute the output-layer error $\delta^{(L)} = a^{(L)} - y$.
Step 4: compute $\delta^{(L-1)}, \delta^{(L-2)}, \dots, \delta^{(2)}$ by propagating the error backwards, $\delta^{(l)} = \left(\Theta^{(l)}\right)^T \delta^{(l+1)} \mathbin{.*} a^{(l)} \mathbin{.*} \left(1 - a^{(l)}\right)$ (there is no $\delta^{(1)}$, since the input has no error).
Step 5: accumulate the gradient, $\Delta_{i,j}^{(l)} := \Delta_{i,j}^{(l)} + a_j^{(l)} \delta_i^{(l+1)}$, over all $m$ examples, then average (and regularize) to obtain $\frac{\partial}{\partial \Theta_{i,j}^{(l)}} J(\Theta)$.
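For reference, here is a vectorized NumPy sketch of these five steps for a three-layer network. The function name, the sigmoid activation, and the one-hot labels `Y` are assumptions made for this example, not notation from the course.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop(Theta1, Theta2, X, Y, lam):
    """One pass of back-propagation over all m examples (vectorized).

    Returns the gradients of the regularized cost w.r.t. Theta1 and Theta2.
    """
    m = X.shape[0]

    # Steps 1-2: forward propagation, keeping every activation.
    A1 = np.hstack([np.ones((m, 1)), X])              # (m, n+1)
    Z2 = A1 @ Theta1.T
    A2 = np.hstack([np.ones((m, 1)), sigmoid(Z2)])    # (m, S2+1)
    A3 = sigmoid(A2 @ Theta2.T)                       # (m, K), output layer

    # Step 3: error of the output layer.
    D3 = A3 - Y                                       # (m, K)

    # Step 4: propagate the error back through the hidden layer, dropping
    # the bias column; g'(z) = g(z)(1 - g(z)) for the sigmoid.
    D2 = (D3 @ Theta2)[:, 1:] * sigmoid(Z2) * (1 - sigmoid(Z2))  # (m, S2)

    # Step 5: accumulate and average the gradients, then add the
    # regularization term (bias columns are not regularized).
    Grad1 = D2.T @ A1 / m
    Grad2 = D3.T @ A2 / m
    Grad1[:, 1:] += lam / m * Theta1[:, 1:]
    Grad2[:, 1:] += lam / m * Theta2[:, 1:]
    return Grad1, Grad2
```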
3.Back-propagation Intuition
Back-propagation in Practice
1.Implementation Note: Unrolling Parameters
Optimization routines expect the parameters as a single flat vector, so we "unroll" the weight matrices $\Theta^{(1)}, \Theta^{(2)}, \dots$ into one long vector and reshape them back inside the cost function (see the sketch below). Use gradient checking to make sure everything still works correctly.
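A minimal sketch of unrolling and re-rolling, assuming the 400-25-10 architecture from the course exercises; the variable names are illustrative:

```python
import numpy as np

Theta1 = np.random.rand(25, 401)   # hidden layer: 25 units, 400 inputs + bias
Theta2 = np.random.rand(10, 26)    # output layer: 10 classes, 25 units + bias

# Unroll both matrices into one long vector for the optimizer.
params = np.concatenate([Theta1.ravel(), Theta2.ravel()])

# Recover the matrices inside the cost function.
Theta1_back = params[:25 * 401].reshape(25, 401)
Theta2_back = params[25 * 401:].reshape(10, 26)
assert np.array_equal(Theta1, Theta1_back) and np.array_equal(Theta2, Theta2_back)
```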
2.Gradient Checking
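Gradient checking numerically approximates each partial derivative with a two-sided difference, $\frac{\partial}{\partial \theta_i} J(\theta) \approx \frac{J(\theta + \epsilon e_i) - J(\theta - \epsilon e_i)}{2\epsilon}$, and compares the result against the back-propagation gradient. A minimal sketch follows; the function names are assumptions, and $\epsilon = 10^{-4}$ is the value suggested in the course.

```python
import numpy as np

def numerical_gradient(cost, params, eps=1e-4):
    """Two-sided difference approximation of dJ/d(params[i]) for each i.

    cost : function mapping a flat parameter vector to the scalar J
    """
    grad = np.zeros_like(params)
    for i in range(params.size):
        bump = np.zeros_like(params)
        bump[i] = eps
        grad[i] = (cost(params + bump) - cost(params - bump)) / (2 * eps)
    return grad

# Usage sketch: compare against the back-propagation gradient once, then
# disable checking before training (it is far too slow to run every step).
# assert np.allclose(numerical_gradient(cost, params), backprop_grad, atol=1e-7)
```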
3.Random Initialization
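Initializing all weights to zero fails for neural networks: every hidden unit then computes the same function and receives the same update (symmetry). Instead, each $\Theta_{i,j}^{(l)}$ is initialized to a random value in $[-\epsilon_{init}, \epsilon_{init}]$. A minimal sketch, where the function name and $\epsilon_{init} = 0.12$ are illustrative choices:

```python
import numpy as np

def rand_init(l_in, l_out, eps_init=0.12):
    """Random weights for a layer with l_in inputs (+1 bias), drawn
    uniformly from [-eps_init, eps_init] to break symmetry."""
    return np.random.rand(l_out, l_in + 1) * 2 * eps_init - eps_init

Theta1 = rand_init(400, 25)   # example layer sizes; any architecture works
Theta2 = rand_init(25, 10)
```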
4.Putting It Together
The 6 steps to train a network:
1. Randomly initialize the weights.
2. Implement forward propagation to get $h_\Theta(x^{(i)})$ for any $x^{(i)}$.
3. Implement the cost function $J(\Theta)$.
4. Implement back-propagation to compute the partial derivatives.
5. Use gradient checking to confirm the back-propagation gradients, then disable it.
6. Use gradient descent or a built-in optimization function to minimize $J(\Theta)$.
Ideally, we want $h_\Theta(x^{(i)}) \approx y^{(i)}$. But remember that $J(\Theta)$ is not a convex function, so we can end up in a local minimum instead of the global one.
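To make the pipeline concrete, here is a tiny end-to-end sketch wiring the six steps together with plain batch gradient descent on synthetic data. The architecture, data, learning rate, and iteration count are all illustrative choices, not values from the course.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.random((100, 4))                   # 100 examples, 4 features
Y = np.eye(3)[rng.integers(0, 3, 100)]     # one-hot labels, 3 classes
lam, alpha, m = 1.0, 1.0, X.shape[0]

# Step 1: randomly initialize the weights (symmetry breaking).
Theta1 = rng.uniform(-0.12, 0.12, (5, 5))  # 4 inputs + bias -> 5 hidden units
Theta2 = rng.uniform(-0.12, 0.12, (3, 6))  # 5 hidden + bias -> 3 outputs

for it in range(500):
    # Step 2: forward propagation to get h_Theta(x).
    A1 = np.hstack([np.ones((m, 1)), X])
    Z2 = A1 @ Theta1.T
    A2 = np.hstack([np.ones((m, 1)), sigmoid(Z2)])
    H = sigmoid(A2 @ Theta2.T)

    # Step 3: compute the regularized cost J(Theta).
    J = (-np.sum(Y * np.log(H) + (1 - Y) * np.log(1 - H)) / m
         + lam / (2 * m) * (np.sum(Theta1[:, 1:]**2) + np.sum(Theta2[:, 1:]**2)))

    # Step 4: back-propagation for the partial derivatives.
    D3 = H - Y
    D2 = (D3 @ Theta2)[:, 1:] * sigmoid(Z2) * (1 - sigmoid(Z2))
    Grad1 = D2.T @ A1 / m
    Grad2 = D3.T @ A2 / m
    Grad1[:, 1:] += lam / m * Theta1[:, 1:]
    Grad2[:, 1:] += lam / m * Theta2[:, 1:]

    # Step 5 (gradient checking) would be run once here, then disabled.
    # Step 6: gradient descent on J(Theta).
    Theta1 -= alpha * Grad1
    Theta2 -= alpha * Grad2
```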
Application of Neural Networks
1.Autonomous Driving
skip.
Review
skip. Extra reading.