Logistic Regression
Logistic regression is an algorithm for classification problems. It is most commonly used for binary classification, though it can also be extended to multi-class problems. Its core idea builds on linear regression; in essence, it is a generalized linear model.
The most essential part of the logistic regression model is the introduction of the sigmoid function, an S-shaped curve.
The sigmoid function maps any real-valued input into the interval $(0, 1)$; for a binary classification problem, this output can be interpreted as a probability.
The mathematical derivation of logistic regression is given below:
$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$
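The hypothesis above can be sketched as a small NumPy function (function and variable names here are illustrative, not from the original):

```python
import numpy as np

def sigmoid(z):
    """Sigmoid (logistic) function: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def h(theta, x):
    """Hypothesis h_theta(x) = g(theta^T x)."""
    return sigmoid(np.dot(theta, x))

# A score of 0 maps to exactly 0.5; large positive scores approach 1,
# large negative scores approach 0.
print(sigmoid(0.0), sigmoid(10.0), sigmoid(-10.0))
```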
Differentiating the sigmoid function:
$$\begin{aligned}
g'(x) &= \left( \frac{1}{1 + e^{-x}} \right)' = \frac{e^{-x}}{\left( 1 + e^{-x} \right)^2} \\
&= \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}} = \frac{1}{1 + e^{-x}} \left( 1 - \frac{1}{1 + e^{-x}} \right) \\
&= g(x)\left( 1 - g(x) \right)
\end{aligned}$$
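The identity $g'(x) = g(x)(1 - g(x))$ can be verified numerically with a central finite difference (a quick sanity check, not part of the original derivation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Compare the analytic derivative g(x)(1 - g(x)) against a
# central finite-difference approximation of g'(x).
x = np.linspace(-5.0, 5.0, 11)
eps = 1e-6
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2.0 * eps)
analytic = sigmoid(x) * (1.0 - sigmoid(x))
print(np.max(np.abs(numeric - analytic)))  # close to zero
```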
Parameter estimation for logistic regression.
Assume:
$$\begin{aligned}
P(y = 1 \mid x; \theta) &= h_\theta(x) \\
P(y = 0 \mid x; \theta) &= 1 - h_\theta(x)
\end{aligned}$$
These two cases can be combined into a single expression:
$$p(y \mid x; \theta) = \left( h_\theta(x) \right)^y \left( 1 - h_\theta(x) \right)^{1 - y}$$
The likelihood function over $m$ independent samples:
$$\begin{aligned}
L(\theta) &= p(y \mid x; \theta) \\
&= \prod_{i=1}^{m} p\left( y^{(i)} \mid x^{(i)}; \theta \right) \\
&= \prod_{i=1}^{m} \left( h_\theta(x^{(i)}) \right)^{y^{(i)}} \left( 1 - h_\theta(x^{(i)}) \right)^{1 - y^{(i)}}
\end{aligned}$$
Taking the logarithm gives the log-likelihood:
$$\begin{aligned}
l(\theta) &= \log L(\theta) \\
&= \sum_{i=1}^{m} y^{(i)} \log h\left( x^{(i)} \right) + \left( 1 - y^{(i)} \right) \log \left( 1 - h\left( x^{(i)} \right) \right)
\end{aligned}$$
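The log-likelihood can be computed directly in vectorized form (the toy data below is hypothetical, purely for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(theta, X, y):
    """l(theta) = sum_i [ y_i log h(x_i) + (1 - y_i) log(1 - h(x_i)) ]."""
    hx = sigmoid(X @ theta)
    return np.sum(y * np.log(hx) + (1.0 - y) * np.log(1.0 - hx))

# Hypothetical toy data: 4 samples, 2 features (first column = intercept).
X = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5], [1.0, -2.0]])
y = np.array([1.0, 0.0, 1.0, 0.0])
theta = np.zeros(2)
# With theta = 0, every h(x) = 0.5, so l(theta) = 4 * log(0.5).
print(log_likelihood(theta, X, y))
```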
Taking the partial derivative with respect to $\theta_j$ (shown for a single sample):
$$\begin{aligned}
\frac{\partial}{\partial \theta_j} l(\theta) &= \left( y \frac{1}{g(\theta^T x)} - (1 - y) \frac{1}{1 - g(\theta^T x)} \right) \frac{\partial}{\partial \theta_j} g(\theta^T x) \\
&= \left( y \frac{1}{g(\theta^T x)} - (1 - y) \frac{1}{1 - g(\theta^T x)} \right) g(\theta^T x) \left( 1 - g(\theta^T x) \right) \frac{\partial}{\partial \theta_j} \theta^T x \\
&= \left( y \left( 1 - g(\theta^T x) \right) - (1 - y) g(\theta^T x) \right) x_j \\
&= \left( y - h_\theta(x) \right) x_j
\end{aligned}$$
Parameter update rule. Since we are maximizing the log-likelihood, gradient ascent gives:
$$\theta_j := \theta_j + \alpha \left( y^{(i)} - h_\theta(x^{(i)}) \right) x_j^{(i)}$$
where $\alpha$ is the learning rate.
This completes the derivation of logistic regression. With the parameter update formula in hand, the model can be solved by gradient ascent on the log-likelihood (equivalently, gradient descent on the negative log-likelihood).
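The full training loop can be sketched as batch gradient ascent using the update rule derived above. This is a minimal sketch under assumed hyperparameters (learning rate, iteration count) and synthetic data; function names are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.01, n_iters=500):
    """Batch gradient ascent on the log-likelihood:
    theta_j := theta_j + lr * sum_i (y_i - h_theta(x_i)) * x_ij
    """
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        hx = sigmoid(X @ theta)
        theta += lr * (X.T @ (y - hx))  # gradient of l(theta)
    return theta

# Hypothetical 1-D toy problem with an intercept column: label is 1 when x > 0.
rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=100)
X = np.column_stack([np.ones_like(x), x])
y = (x > 0).astype(float)

theta = fit_logistic(X, y)
preds = (sigmoid(X @ theta) >= 0.5).astype(float)
print(np.mean(preds == y))  # training accuracy
```

In practice, stochastic or mini-batch updates and a regularization term are commonly added, but the core update is exactly the formula derived above.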