Review of the gradient descent procedure
#1 Initialize $\theta$
#2 Compute the gradient $g$
#3 Update the parameters: $\theta^{t+1}=\theta^{t}-\alpha \cdot g$, where $\alpha$ is the learning rate
#4 Repeat until $g$ converges
Steps 1, 3, and 4 are straightforward. The difficulty lies in step 2, where $g$ is not easy to obtain.
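The four steps can be sketched as a short loop. This is only a minimal illustration, assuming the gradient is supplied by the caller: the names `gradient_descent`, `grad_fn`, `tol`, and `max_iters` are hypothetical, and the toy objective $J(\theta)=(\theta_0-3)^2$ is chosen only because its gradient is known in closed form.

```python
def gradient_descent(grad_fn, theta, alpha=0.1, tol=1e-8, max_iters=10_000):
    """Run the four-step loop; grad_fn(theta) returns a list of partial derivatives."""
    theta = list(theta)                       # step 1: initial parameters
    for _ in range(max_iters):
        g = grad_fn(theta)                    # step 2: compute the gradient
        # step 3: theta^{t+1} = theta^t - alpha * g
        theta = [t - alpha * gj for t, gj in zip(theta, g)]
        # step 4: stop once every component of g is close to zero
        if max(abs(gj) for gj in g) < tol:
            break
    return theta


# Toy objective (an assumption for illustration): J(theta) = (theta_0 - 3)^2,
# whose gradient is 2 * (theta_0 - 3). The minimum is at theta_0 = 3.
theta_min = gradient_descent(lambda th: [2.0 * (th[0] - 3.0)], [0.0])
print(theta_min)
```

Passing the gradient in as a function keeps the loop independent of how $g$ is obtained, which is the part that needs a derivation.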
Derivation

$$g=\frac{\partial}{\partial\theta_{j}}J(\theta)=\frac{\partial}{\partial\theta_{j}}\frac{1}{2}(h_{\theta}(x)-y)^{2}$$
# Evaluate the partial derivative $\frac{\partial}{\partial\theta_{j}}J(\theta)$

$$
\begin{aligned}
\frac{\partial}{\partial\theta_{j}}J(\theta)
&=2\cdot\frac{1}{2}(h_{\theta}(x)-y)\frac{\partial}{\partial\theta_{j}}(h_{\theta}(x)-y)\\
&=(h_{\theta}(x)-y)\frac{\partial}{\partial\theta_{j}}\Big(\sum_{i=1}^{m}\theta_{i}x_{i}-y\Big)\\
&=(h_{\theta}(x)-y)x_{j}
\end{aligned}
$$

The last step holds because only the term $\theta_{j}x_{j}$ in the sum depends on $\theta_{j}$, so the remaining derivative is $x_{j}$.
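One way to sanity-check the result $(h_{\theta}(x)-y)x_{j}$ is to compare it against a finite-difference estimate of the same partial derivative. The sketch below assumes a linear hypothesis and a single sample; the function names and the sample values are made up for illustration.

```python
def h(theta, x):
    """Linear hypothesis: sum of theta_i * x_i."""
    return sum(t * xi for t, xi in zip(theta, x))


def J(theta, x, y):
    """Single-sample squared-error cost: 1/2 * (h_theta(x) - y)^2."""
    return 0.5 * (h(theta, x) - y) ** 2


def analytic_grad(theta, x, y):
    """Derived result: dJ/dtheta_j = (h_theta(x) - y) * x_j."""
    err = h(theta, x) - y
    return [err * xj for xj in x]


def numeric_grad(theta, x, y, eps=1e-6):
    """Central finite differences, perturbing one parameter at a time."""
    grads = []
    for j in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[j] += eps
        down[j] -= eps
        grads.append((J(up, x, y) - J(down, x, y)) / (2 * eps))
    return grads


# Hypothetical sample: h_theta(x) = 0.5 - 2.0 + 6.0 = 4.5, so the error is 0.5.
theta = [0.5, -1.0, 2.0]
x = [1.0, 2.0, 3.0]
y = 4.0

ga = analytic_grad(theta, x, y)
gn = numeric_grad(theta, x, y)
print(ga)
print(gn)
```

The two lists should agree up to floating-point rounding, since $J$ is quadratic in $\theta$ and central differences are exact for quadratics apart from rounding error.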
Writing $h_{\theta}(x)$ as the product $x\theta$, the gradient with respect to $\theta_{0},\dots,\theta_{m}$ is

$$g_{(\theta_{0}\ldots\theta_{m})}=x_{j}(x\theta-y)$$
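With the gradient in hand, step 2 of the procedure becomes computable. The sketch below plugs $x_{j}(x\theta-y)$ into the update loop for a small linear regression. Several things here are assumptions rather than part of the derivation: the gradient is averaged over all samples, a constant feature of 1 is prepended so that $\theta_{0}$ acts as the intercept, and the data (generated from $y=1+2x$) and the names `predict`, `gradient`, and `fit` are hypothetical.

```python
def predict(theta, x):
    """x * theta for one sample (a dot product)."""
    return sum(t * xi for t, xi in zip(theta, x))


def gradient(theta, xs, ys):
    """Average of x_j * (x*theta - y) over all samples (averaging is an assumption)."""
    g = [0.0] * len(theta)
    for x, y in zip(xs, ys):
        err = predict(theta, x) - y
        for j, xj in enumerate(x):
            g[j] += xj * err
    return [gj / len(xs) for gj in g]


def fit(xs, ys, alpha=0.1, tol=1e-10, max_iters=100_000):
    theta = [0.0] * len(xs[0])                 # step 1: initialize theta
    for _ in range(max_iters):
        g = gradient(theta, xs, ys)            # step 2: compute g
        if max(abs(gj) for gj in g) < tol:     # step 4: stop when g has converged
            break
        theta = [t - alpha * gj for t, gj in zip(theta, g)]  # step 3: update
    return theta


# Hypothetical data from y = 1 + 2x; the leading 1.0 in each row is the constant feature.
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
ys = [1.0, 3.0, 5.0, 7.0]

theta = fit(xs, ys)
print(theta)
```

On this data the fitted parameters should land close to an intercept of 1 and a slope of 2. The learning rate of 0.1 happens to be small enough for this data set; other data may need a different value.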