Classification: linear regression is not suitable for classification problems
Logistic Regression: $0 \leq h_\theta(x) \leq 1$
(a classification algorithm)
$h_\theta(x) = g(\theta^T x)$
$g(z) = \frac{1}{1 + e^{-z}}$
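The sigmoid hypothesis can be sketched in a few lines of Python (a minimal sketch assuming NumPy; the function names `sigmoid` and `h` are my own):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^{-z}); maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def h(theta, x):
    """Hypothesis h_theta(x) = g(theta^T x)."""
    return sigmoid(np.dot(theta, x))
```

Note that `sigmoid(0)` is exactly 0.5, which is why 0.5 is the natural prediction threshold below.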
$h_\theta(x) = P(y=1 \mid x; \theta)$
i.e., the probability that $y = 1$ given $x$, parameterized by $\theta$.
Suppose predict “y=1” if $h_\theta(x) \geq 0.5$, and predict “y=0” if $h_\theta(x) < 0.5$.
Decision Boundary
$h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2)$
$\theta = \begin{bmatrix} -3 \\ 1 \\ 1 \end{bmatrix}$
Predict “y=1” if $-3 + x_1 + x_2 \geq 0$
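The decision boundary for this example can be checked numerically (a sketch assuming NumPy; the `predict` helper is my own):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# theta = [theta_0, theta_1, theta_2] from the example above
theta = np.array([-3.0, 1.0, 1.0])

def predict(x1, x2):
    # h_theta(x) >= 0.5 exactly when theta^T x >= 0, i.e. -3 + x1 + x2 >= 0,
    # so the line x1 + x2 = 3 is the decision boundary.
    x = np.array([1.0, x1, x2])  # prepend the intercept term x_0 = 1
    return 1 if sigmoid(theta @ x) >= 0.5 else 0
```

For instance, `predict(2, 2)` falls on the $y=1$ side of the boundary, while `predict(1, 1)` falls on the $y=0$ side.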
Non-linear decision boundaries
Logistic regression cost function
$\mathrm{cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1 - h_\theta(x)) & \text{if } y = 0 \end{cases}$
$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{cost}(h_\theta(x^{(i)}), y^{(i)})$
The two cases combine into a single expression:

$\mathrm{cost}(h_\theta(x), y) = -y\log(h_\theta(x)) - (1-y)\log(1 - h_\theta(x))$

$J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m} y^{(i)}\log h_\theta(x^{(i)}) + (1 - y^{(i)})\log(1 - h_\theta(x^{(i)}))\right]$
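The combined cost $J(\theta)$ vectorizes naturally (a minimal sketch assuming NumPy; the `cost` helper is my own, with one training example per row of `X`):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """J(theta) = -(1/m) * sum[ y*log(h) + (1-y)*log(1-h) ]."""
    m = len(y)
    h = sigmoid(X @ theta)  # predictions for all m examples at once
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))
```

A quick sanity check: with $\theta = 0$ every prediction is 0.5, so the cost is $\log 2$ regardless of the labels.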
Want to minimize $J(\theta)$:
Repeat {

$\theta_j := \theta_j - \alpha \frac{\partial}{\partial\theta_j} J(\theta)$

} (simultaneously update all $\theta_j$)
$\frac{\partial}{\partial\theta_j} J(\theta) = \frac{1}{m}\sum_{i=1}^{m}(h_\theta(x^{(i)}) - y^{(i)})x_j^{(i)}$
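One simultaneous gradient-descent update of all $\theta_j$ can be sketched as follows (assuming NumPy; `gradient_step` is my own name):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_step(theta, X, y, alpha):
    """theta_j := theta_j - alpha * (1/m) * sum_i (h(x^(i)) - y^(i)) * x_j^(i),
    computed for all j at once so the update is simultaneous."""
    m = len(y)
    grad = (1.0 / m) * (X.T @ (sigmoid(X @ theta) - y))
    return theta - alpha * grad
```

Because the whole gradient vector is computed before `theta` is touched, every component is updated from the same old $\theta$, as the algorithm requires.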
Optimization algorithms (not studied in detail)
Multi-class classification
Train a logistic regression classifier $h_\theta^{(i)}(x)$ for each class $i$ to predict the probability that $y = i$.
On a new input $x$, to make a prediction, pick the class $i$ that maximizes $h_\theta^{(i)}(x)$: $\max\limits_{i} h_\theta^{(i)}(x)$
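The one-vs-all prediction rule can be sketched like this (assuming NumPy; `Theta` holding one row of parameters per class and `predict_one_vs_all` are my own names):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_one_vs_all(Theta, x):
    """Theta has one row of parameters per class; return the class i
    whose classifier h_theta^(i)(x) gives the highest probability."""
    scores = sigmoid(Theta @ x)  # one probability estimate per class
    return int(np.argmax(scores))
```

Since the sigmoid is monotonic, taking the argmax of `Theta @ x` directly would give the same answer.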
One last note: writing blog posts really does take a lot of time (just kidding).
Hope you have a nice May Day holiday!