Linear Regression
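Linear regression is a statistical analysis method that uses regression analysis from mathematical statistics to determine the quantitative dependence between two or more variables. It takes the form:
$$f(x) = w^T x + b$$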
Given an input $x$, the model predicts a numeric value $y$.
To eliminate the constant term $b$, let $x' = [1, x]^T$ and $w' = [b, w]^T$. The equation then simplifies to
$$f(x') = {w'}^T x'$$
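As a quick check of this substitution, the following sketch compares the two forms on one sample. The function names and the numeric values are illustrative assumptions, not part of the original text.

```python
def predict(w, b, x):
    """Plain form: f(x) = w^T x + b for a single sample x."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def augment(x):
    """Build x' = [1, x] by prepending a constant 1."""
    return [1.0] + list(x)


def predict_augmented(w_aug, x_aug):
    """Augmented form: f(x') = w'^T x' with w' = [b, w]."""
    return sum(wj * xj for wj, xj in zip(w_aug, x_aug))


# Illustrative values (assumed for this example only)
w, b = [2.0, -1.0], 0.5
x = [3.0, 4.0]

w_aug = [b] + w  # w' = [b, w]
y_plain = predict(w, b, x)
y_aug = predict_augmented(w_aug, augment(x))
print(y_plain, y_aug)  # both forms give the same prediction
```

Folding $b$ into the weight vector means later derivations only need to handle a single parameter vector.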
Logistic Regression
The linear regression output above ranges over $(-\infty, +\infty)$, but we want a value in $(0, 1)$. We therefore apply the sigmoid function $\frac{1}{1+e^{-x}}$, whose graph is shown below:
Logistic Regression Loss Function
Define
$$P_{y=1} = \frac{1}{1+e^{-w^T x}} = p$$
$$P_{y=0} = 1 - p$$
Since $y_i$ is either 0 or 1, the two cases can be written as a single expression (with $p$ evaluated at $x_i$):
$$p(y_i \mid x_i) = p^{y_i}(1-p)^{1-y_i}$$
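The definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming the bias is already folded into `w` as its first component; the helper names and the sample values are hypothetical.

```python
import math


def sigmoid(z):
    """Naive sigmoid 1 / (1 + e^{-z}); fine for moderate |z|."""
    return 1.0 / (1.0 + math.exp(-z))


def prob_positive(w, x):
    """P(y = 1 | x) = sigmoid(w^T x)."""
    z = sum(wj * xj for wj, xj in zip(w, x))
    return sigmoid(z)


def bernoulli_likelihood(y, p):
    """p^y * (1 - p)^(1 - y) for a label y in {0, 1}."""
    return p ** y * (1.0 - p) ** (1 - y)


# Illustrative values (assumed): x already carries the leading 1
w = [0.5, -0.25]
x = [1.0, 2.0]

p = prob_positive(w, x)  # w^T x = 0 here, so p is 0.5
print(p, bernoulli_likelihood(1, p), bernoulli_likelihood(0, p))
```

The combined expression simply selects $p$ when $y_i = 1$ and $1 - p$ when $y_i = 0$.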
Therefore, for the samples $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$, the log-likelihood function is
$$F = \ln\left(\prod_{i=1}^n p^{y_i}(1-p)^{1-y_i}\right) = \sum_{i=1}^n \left[ y_i \ln(p) + (1-y_i)\ln(1-p) \right]$$
where $p = \frac{1}{1+e^{-w^T x}}$ is evaluated at $x = x_i$ in the $i$-th term.
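To make the sum concrete, here is a small sketch that evaluates $F$ on a toy dataset. The data and the two weight vectors are assumptions chosen for illustration; each $x_i$ carries a leading 1 for the bias.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(w, xs, ys):
    """F = sum_i [ y_i ln(p_i) + (1 - y_i) ln(1 - p_i) ], p_i = sigmoid(w^T x_i)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


# Toy data (assumed): three samples with labels 0, 0, 1
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
ys = [0, 0, 1]

F_zero = log_likelihood([0.0, 0.0], xs, ys)   # every p is 0.5, so F = 3 ln(0.5)
F_better = log_likelihood([-3.0, 2.0], xs, ys)  # weights that fit the labels better
print(F_zero, F_better)
```

A weight vector that assigns higher probability to the observed labels yields a larger (less negative) $F$, which is why training maximizes $F$ or, equivalently, minimizes $-F$.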
Next, compute $\nabla_w F$.
First, recall the following matrix derivative formulas (see Matrix Derivatives for details):
$$\frac{\partial (AX)}{\partial X} = A^T$$
$$\frac{\partial (X^T A)}{\partial X} = A$$
Therefore
$$\frac{\partial w^T x}{\partial w} = x$$
Now differentiate $p$ with respect to $w$ (writing $p' = \frac{\partial p}{\partial w}$). Note the minus sign that comes from differentiating the reciprocal:
$$\begin{aligned}
p' &= \left(\frac{1}{1+e^{-w^T x}}\right)' \\
&= -\frac{(1+e^{-w^T x})'}{(1+e^{-w^T x})^2} \\
&= -\frac{e^{-w^T x}\,(-w^T x)'}{(1+e^{-w^T x})^2} \\
&= -\frac{e^{-w^T x}\,(-x)}{(1+e^{-w^T x})^2} \\
&= \frac{1}{1+e^{-w^T x}} \cdot \frac{e^{-w^T x}}{1+e^{-w^T x}} \, x \\
&= p(1-p)x
\end{aligned}$$
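One way to sanity-check the result $p' = p(1-p)x$ is to compare it with a central finite difference. The sketch below does this for one arbitrary point; the values of `w`, `x`, and the step size `eps` are assumptions.

```python
import math


def p_of(w, x):
    """p = 1 / (1 + e^{-w^T x})."""
    z = sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def grad_p(w, x):
    """Analytic gradient from the derivation: p (1 - p) x."""
    p = p_of(w, x)
    return [p * (1.0 - p) * xj for xj in x]


def numeric_grad_p(w, x, eps=1e-6):
    """Central finite difference of p with respect to each w_j."""
    grads = []
    for j in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[j] += eps
        w_minus[j] -= eps
        grads.append((p_of(w_plus, x) - p_of(w_minus, x)) / (2.0 * eps))
    return grads


# Arbitrary test point (assumed)
w = [0.3, -0.7]
x = [1.0, 2.0]

analytic = grad_p(w, x)
numeric = numeric_grad_p(w, x)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
print(analytic, numeric, max_diff)
```

The two gradients should agree up to finite-difference error, which supports the sign and the form of the closed-form expression.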
Finally, with $p$ and $p'$ evaluated at $x_i$ in the $i$-th term, we obtain:
$$\begin{aligned}
\nabla_w F &= \sum_{i=1}^n \left( y_i \ln(p) + (1-y_i)\ln(1-p) \right)' \\
&= \sum_{i=1}^n \left[ y_i \left(\ln p\right)' + (1-y_i)\left(\ln(1-p)\right)' \right] \\
&= \sum_{i=1}^n \left[ y_i \frac{p'}{p} + (1-y_i)\frac{-p'}{1-p} \right] \\
&= \sum_{i=1}^n \left[ y_i \frac{p(1-p)x_i}{p} + (1-y_i)\frac{-p(1-p)x_i}{1-p} \right] \\
&= \sum_{i=1}^n \left[ y_i (1-p) x_i - (1-y_i)\, p\, x_i \right] \\
&= \sum_{i=1}^n (y_i - p)\, x_i
\end{aligned}$$
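The gradient formula can be used directly in gradient ascent on $F$. The following is a minimal sketch, not a production implementation: the toy dataset, the learning rate `lr`, and the step count are assumptions, and each $x_i$ carries a leading 1 for the bias.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def log_likelihood(w, xs, ys):
    """F = sum_i [ y_i ln(p_i) + (1 - y_i) ln(1 - p_i) ]."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(dot(w, x))
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def gradient(w, xs, ys):
    """grad_w F = sum_i (y_i - p_i) x_i."""
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        p = sigmoid(dot(w, x))
        for j, xj in enumerate(x):
            grad[j] += (y - p) * xj
    return grad


def fit(xs, ys, lr=0.1, steps=1000):
    """Plain gradient ascent on F, starting from w = 0."""
    w = [0.0] * len(xs[0])
    for _ in range(steps):
        g = gradient(w, xs, ys)
        w = [wj + lr * gj for wj, gj in zip(w, g)]
    return w


# Toy, non-separable data (assumed)
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
ys = [0, 1, 0, 1]

w0 = [0.0, 0.0]
g0 = gradient(w0, xs, ys)
w_fit = fit(xs, ys)
F_zero = log_likelihood(w0, xs, ys)
F_fit = log_likelihood(w_fit, xs, ys)
g_fit = gradient(w_fit, xs, ys)
print(g0, w_fit, F_zero, F_fit)
```

Because $F$ is a log-likelihood, the update adds the gradient; minimizing the loss $-F$ with gradient descent gives the same steps.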