Deep Learning (5): Logistic Regression
What kind of algorithm is it?
Logistic Regression is a classic supervised-learning classification method, used to solve binary classification problems.
Steps
Step 1: Function Set
$\hat y=\sigma(w^Tx+b),\quad \sigma(z)=\frac{1}{1+e^{-z}}$, where $\hat y$ denotes the predicted value.
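As a minimal sketch of this function set (numpy-based; the weight and input values below are hypothetical, just for illustration):

```python
import numpy as np

def sigmoid(z):
    """sigma(z) = 1 / (1 + e^{-z}): squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, x):
    """Function set: y_hat = sigma(w^T x + b)."""
    return sigmoid(np.dot(w, x) + b)

# Hypothetical parameters and input
w = np.array([0.5, -0.3])
b = 0.1
x = np.array([1.0, 2.0])
print(predict(w, b, x))  # ≈ 0.5, since w^T x + b ≈ 0 here
```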
Step 2: Goodness of a Function
Suppose $x^1,x^2$ belong to class $C_1$ and $x^3$ belongs to class $C_2$. Then
$L(w,b)=f_{w,b}(x^1)f_{w,b}(x^2)(1-f_{w,b}(x^3))\cdots f_{w,b}(x^n)$, where $f_{w,b}(x)=\sigma(w^Tx+b)$. That is, $L(w,b)$ is set to the product of the probabilities of classifying each training sample correctly.
We look for the optimal parameters $w^*,b^*$ satisfying $w^*,b^*=\arg\underset{w,b}{\max}\,L(w,b)$.
Moreover, $w^*,b^*=\arg\underset{w,b}{\max}\,L(w,b)=\arg\underset{w,b}{\min}\,-\ln L(w,b)$, so we only need to focus on $-\ln L(w,b)$.
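The equivalence of maximizing $L(w,b)$ and minimizing $-\ln L(w,b)$ can be checked numerically. A small sketch, with hypothetical 1-D samples $x^1,x^2\in C_1$ and $x^3\in C_2$ and hypothetical parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical 1-D samples: x1, x2 in class C1 (y=1), x3 in class C2 (y=0)
X = np.array([2.0, 1.5, -1.0])
y = np.array([1, 1, 0])
w, b = 0.8, 0.0  # hypothetical parameters

p = sigmoid(w * X + b)                   # f_{w,b}(x^i) for each sample
L = np.prod(np.where(y == 1, p, 1 - p))  # likelihood: product of correct-class probabilities
neg_ln_L = -np.sum(np.where(y == 1, np.log(p), np.log(1 - p)))

# -ln of the product equals the sum of the per-sample -ln terms
print(-np.log(L), neg_ln_L)
```

Since $-\ln$ turns the product into a sum and is monotonically decreasing, the two optimization problems share the same optimum.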
The loss function that measures how close a single sample's predicted value $\hat y$ is to its true value $y$ is therefore $L(\hat y,y)=-\big(y\log\hat y+(1-y)\log(1-\hat y)\big)$. This is the cross-entropy loss function.
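A direct translation of the cross-entropy loss into code (a minimal numpy sketch; the probability values used below are illustrative):

```python
import numpy as np

def cross_entropy(y_hat, y):
    """L(y_hat, y) = -(y*log(y_hat) + (1-y)*log(1-y_hat))."""
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# The loss is small when the prediction agrees with the label,
# and large when it contradicts the label:
print(cross_entropy(0.9, 1))  # confident and correct -> small loss
print(cross_entropy(0.9, 0))  # confident and wrong   -> large loss
```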
If $y=1$: $L(\hat y,y)=-\log\hat y$, so we want $\hat y$ to be as large as possible, close to 1.
If $y=0$: $L(\hat y,y)=-\log(1-\hat y)$, so we want $\log(1-\hat y)$ to be as large as possible, i.e. $\hat y$ as small as possible, close to 0.
The cost function that measures the whole training set is $J(w,b)=\frac{1}{m}\sum^m_{i=1}L(\hat y^i,y^i)$.
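A vectorized sketch of $J(w,b)$ over a hypothetical toy dataset (with all-zero parameters, $\hat y=0.5$ for every sample, so the cost equals $\ln 2$):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b, X, y):
    """J(w,b) = (1/m) * sum of per-sample cross-entropy losses."""
    y_hat = sigmoid(X @ w + b)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# Hypothetical toy data: rows of X are samples, y holds the 0/1 labels
X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -1.5]])
y = np.array([1, 1, 0])
print(cost(np.zeros(2), 0.0, X, y))  # ln 2 ≈ 0.6931 at w=0, b=0
```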
Step 3: Find Best Function
Use gradient descent to minimize $J(w,b)$.
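A gradient-descent sketch for Step 3. For the cross-entropy cost with a sigmoid output, the gradients simplify to $\partial J/\partial w=\frac{1}{m}X^T(\hat y-y)$ and $\partial J/\partial b=\frac{1}{m}\sum_i(\hat y^i-y^i)$; the toy data and hyperparameters below are hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, lr=0.1, steps=1000):
    """Gradient descent on J(w,b) for logistic regression."""
    m, n = X.shape
    w, b = np.zeros(n), 0.0
    for _ in range(steps):
        y_hat = sigmoid(X @ w + b)
        dw = X.T @ (y_hat - y) / m   # dJ/dw
        db = np.mean(y_hat - y)      # dJ/db
        w -= lr * dw                 # step against the gradient
        b -= lr * db
    return w, b

# Hypothetical linearly separable toy data
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.0], [-2.0, -0.5]])
y = np.array([1, 1, 0, 0])
w, b = train(X, y)
preds = (sigmoid(X @ w + b) >= 0.5).astype(int)
print(preds)  # recovers the labels on this separable data
```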
Comparison with Linear Regression
Output: Logistic Regression outputs values between 0 and 1, while Linear Regression can output any real value.
The loss functions also differ: cross-entropy here, versus mean squared error for Linear Regression.