1. Linear regression
Objective function:
J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2
After regularization:
J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2+\frac{\lambda}{2n}\sum_{j=1}^{n}\theta_j^2
Fitting function:
h_\theta(x)=\theta_0+\theta_1 x+\theta_2 x^2+\cdots
The goal is to minimize the cost function J(\theta).
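The regularized cost above can be sketched in NumPy (a minimal illustration; the function and variable names are my own, and the \frac{\lambda}{2n} scaling follows the formula above, with the bias \theta_0 left out of the penalty as the sum over j=1..n implies):

```python
import numpy as np

def linreg_cost(theta, X, y, lam=0.0):
    """Regularized squared-error cost J(theta) for linear regression.

    X: (m, n) design matrix (column 0 = bias of ones),
    y: (m,) targets, theta: (n,) parameters.
    """
    m, n = X.shape
    residuals = X @ theta - y                     # h_theta(x^(i)) - y^(i)
    data_term = (residuals @ residuals) / (2 * m)
    reg_term = lam / (2 * n) * np.sum(theta[1:] ** 2)  # skip theta_0
    return data_term + reg_term
```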
2. Gradient descent:
\theta_j := \theta_j-\alpha\frac{\partial}{\partial\theta_j}J(\theta)
3. Computing the partial derivative:
\frac{\partial}{\partial\theta_j}J(\theta)=\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}+\frac{\lambda}{n}\theta_j
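Putting steps 2 and 3 together, a minimal gradient-descent loop might look like this (a sketch under the same conventions as above: made-up names, column 0 of X is the bias, and the \frac{\lambda}{n}\theta_j term matches the derivative just derived):

```python
import numpy as np

def linreg_gradient(theta, X, y, lam=0.0):
    """Gradient of the regularized cost: (1/m) X^T (X theta - y) + (lam/n) theta."""
    m, n = X.shape
    grad = X.T @ (X @ theta - y) / m
    reg = (lam / n) * theta
    reg[0] = 0.0                      # don't penalize the bias term
    return grad + reg

def gradient_descent(X, y, alpha=0.1, lam=0.0, steps=1000):
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        # theta_j := theta_j - alpha * dJ/dtheta_j
        theta -= alpha * linreg_gradient(theta, X, y, lam)
    return theta
```

On a toy dataset generated from y = 2x, the loop recovers theta close to [0, 2].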
4. Logistic regression
Loss function:
\mathrm{cost} = \begin{cases} -\log h_\theta(x), & \text{if } y=1 \\ -\log(1-h_\theta(x)), & \text{if } y=0 \end{cases}
Explanation: h_\theta(x)=\mathrm{sigmoid}(\theta x), whose value lies in (0,1). When y=1, the larger h_\theta(x) is, the smaller the corresponding loss; likewise, when y=0, the smaller h_\theta(x) is, the smaller the corresponding loss.
Objective function:
J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\log h_\theta(x^{(i)})+(1-y^{(i)})\log\left(1-h_\theta(x^{(i)})\right)\right)
Similarly, after regularization:
J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\log h_\theta(x^{(i)})+(1-y^{(i)})\log\left(1-h_\theta(x^{(i)})\right)\right)+\frac{\lambda}{2n}\sum_{j=1}^{n}\theta_j^2
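The cross-entropy objective above translates directly into NumPy (a sketch; the small epsilon clip that keeps the logs finite when h approaches 0 or 1 is my own addition, not part of the formula):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logreg_cost(theta, X, y, lam=0.0, eps=1e-12):
    """Regularized cross-entropy cost for logistic regression."""
    m, n = X.shape
    h = sigmoid(X @ theta)
    h = np.clip(h, eps, 1 - eps)      # keep log() finite
    data_term = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
    reg_term = lam / (2 * n) * np.sum(theta[1:] ** 2)  # skip theta_0
    return data_term + reg_term
```

As a sanity check, with theta = 0 every prediction is 0.5, so the cost is \log 2 regardless of the labels.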
Fitting function:
h_\theta(x)=\mathrm{sigmoid}(\theta x)=\frac{1}{1+e^{-(\theta x+b)}}
A major advantage of the sigmoid function is that its derivative takes the simple form
g'(z)=g(z)\left(1-g(z)\right)
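The identity g'(z)=g(z)(1-g(z)) is easy to verify numerically against a central finite difference (a quick sanity check I'm adding, not part of the original notes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    g = sigmoid(z)
    return g * (1.0 - g)              # g'(z) = g(z) * (1 - g(z))

# Central finite-difference derivative at a few sample points.
z = np.array([-2.0, 0.0, 1.5])
h = 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
# numeric and sigmoid_grad(z) should agree closely
```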
Optimization again proceeds by gradient descent.