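I. Logistic Regression
Logistic regression is a binary classification model. Its decision boundary is linear (a hyperplane), but it uses a nonlinear activation function, the sigmoid, to model the posterior probability.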
II. The Sigmoid Function
1. Formula
$$f(x)=\frac{1}{1+e^{-x}}$$
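The sigmoid function and its inverse are both strictly monotonically increasing. It is commonly used as a threshold function that maps a real-valued variable into the interval (0,1).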
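As a rough illustration of the formula and its inverse, the sketch below evaluates both in plain Python. The names `sigmoid` and `logit` are chosen here for illustration and do not come from the original text; the sign split is one common way to avoid overflow in `math.exp`, not the only one.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid f(x) = 1 / (1 + e^(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, rewrite as e^x / (1 + e^x) so exp never overflows.
    z = math.exp(x)
    return z / (1.0 + z)

def logit(p: float) -> float:
    """Inverse of the sigmoid, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Both functions are increasing: larger inputs give larger outputs.
values = [sigmoid(x) for x in (-4.0, -1.0, 0.0, 1.0, 4.0)]
```

In exact arithmetic the output lies strictly inside (0,1); in floating point, inputs of very large magnitude can round to exactly 0.0 or 1.0.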
2. Derivative
The derivative can be expressed in terms of the function value itself, $f'(x)=F(f(x))$:
$$\begin{aligned} f'(x)&=\left(\frac{1}{1+e^{-x}}\right)'=\frac{0-(-e^{-x})}{(1+e^{-x})^{2}}\\ &=\frac{e^{-x}}{(1+e^{-x})^{2}}\\ &=\frac{1}{1+e^{-x}} \cdot \left(1-\frac{1}{1+e^{-x}} \right) \\ &=f(x)(1-f(x)) \end{aligned}$$
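The identity $f'(x)=f(x)(1-f(x))$ can be sanity-checked numerically. The sketch below compares it with a central finite difference at a few arbitrarily chosen points; the helper names and the step size `h` are assumptions made for this example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Closed form derived above: f'(x) = f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def central_difference(f, x, h=1e-5):
    """Numerical derivative (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Largest gap between the closed form and the numerical estimate.
points = (-3.0, -1.0, 0.0, 0.5, 2.0)
max_error = max(
    abs(sigmoid_grad(x) - central_difference(sigmoid, x)) for x in points
)
```

The derivative peaks at $x=0$, where $f(0)=0.5$ and therefore $f'(0)=0.25$.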
III. The Bernoulli Distribution
$$\left\{ \begin{aligned} P(x=1) &= p, \quad 0< p < 1\\ P(x=0) &= 1-p \end{aligned} \right.$$
The probability function of the random variable $x$ is:
$$f(x|p)=\begin{cases} p^{x}(1-p)^{1-x}, & x=0,1\\ 0, & x \neq 0,1 \end{cases}$$
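The single expression $p^{x}(1-p)^{1-x}$ selects $p$ when $x=1$ and $1-p$ when $x=0$. A minimal sketch of the probability function, with the hypothetical name `bernoulli_pmf`:

```python
def bernoulli_pmf(x: int, p: float) -> float:
    """f(x | p) = p^x * (1 - p)^(1 - x) for x in {0, 1}, else 0."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must satisfy 0 < p < 1")
    if x in (0, 1):
        return p ** x * (1.0 - p) ** (1 - x)
    return 0.0

p = 0.3  # arbitrary example value
prob_one = bernoulli_pmf(1, p)    # picks out p
prob_zero = bernoulli_pmf(0, p)   # picks out 1 - p
prob_other = bernoulli_pmf(2, p)  # outside the support
```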
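IV. The Logistic Regression Hypothesis Function
Linear regression is a regression algorithm, whereas logistic regression is a classification algorithm: its training labels are discrete, and the model's target values are the discrete values {0,1} rather than continuous values. The sigmoid function connects the two by mapping the linear output $\theta^{T}x$ into (0,1).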
$$h_{\theta}(x)=\sigma(\theta^{T}x)=\frac{1}{1+e^{-\theta^{T}x}}$$
where $\sigma$ denotes the sigmoid function defined above.
Logistic regression makes one assumption: each label follows a Bernoulli (0-1) distribution, so that
$$\begin{aligned} P(y=1|x;\theta)&=h_{\theta}(x)=\frac{1}{1+e^{-\theta^{T}x}}\\ P(y=0|x;\theta)&=1-h_{\theta}(x) \end{aligned}$$
These two cases can be written compactly as:
$$P(y|x;\theta)=[h_{\theta}(x)]^{y}[1-h_{\theta}(x)]^{(1-y)}$$
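To make the hypothesis and the compact probability concrete, here is a small sketch in plain Python. The function names, the parameter vector `theta`, and the input `x` are all hypothetical values chosen for illustration; the leading 1.0 in `x` is an assumed convention for absorbing a bias term into $\theta^{T}x$.

```python
import math

def hypothesis(theta, x):
    """h_theta(x) = 1 / (1 + e^(-theta^T x)) for equal-length sequences."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

def label_probability(y, theta, x):
    """P(y | x; theta) = h^y * (1 - h)^(1 - y), with y in {0, 1}."""
    h = hypothesis(theta, x)
    return h ** y * (1.0 - h) ** (1 - y)

theta = [0.5, -1.0, 2.0]  # hypothetical parameters
x = [1.0, 2.0, 0.5]       # hypothetical input, leading 1.0 for the bias

p1 = label_probability(1, theta, x)  # reduces to h_theta(x)
p0 = label_probability(0, theta, x)  # reduces to 1 - h_theta(x)
```

Setting $y=1$ or $y=0$ turns one of the two factors into 1, which is why the compact form reproduces both cases and the two probabilities sum to 1.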
V. The Likelihood Function
$$\begin{aligned} L(\theta)&=\prod_{i=1}^{N}P(y^{(i)}|x^{(i)};\theta)\\ &=\prod_{i=1}^{N}[h_{\theta}(x^{(i)})]^{y^{(i)}}[1-h_{\theta}(x^{(i)})]^{(1-y^{(i)})}\\ &=\prod_{i=1}^{N}\left(\frac{1}{1+e^{-\theta^{T}x^{(i)}}}\right)^{y^{(i)}}\left(1-\frac{1}{1+e^{-\theta^{T}x^{(i)}}}\right)^{1-y^{(i)}} \end{aligned}$$
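The product can be evaluated directly on a tiny dataset to see how it ranks different parameter choices. The data and the two parameter vectors below are made up for this sketch, and each input carries a leading 1.0 as an assumed bias convention.

```python
import math

def hypothesis(theta, x):
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

def likelihood(theta, xs, ys):
    """L(theta) = prod_i h(x_i)^y_i * (1 - h(x_i))^(1 - y_i)."""
    result = 1.0
    for x, y in zip(xs, ys):
        h = hypothesis(theta, x)
        result *= h ** y * (1.0 - h) ** (1 - y)
    return result

# Hypothetical toy data: negative inputs labelled 0, positive labelled 1.
xs = [[1.0, -2.0], [1.0, -0.5], [1.0, 0.5], [1.0, 2.0]]
ys = [0, 0, 1, 1]

good = likelihood([0.0, 2.0], xs, ys)   # slope agrees with the labels
bad = likelihood([0.0, -2.0], xs, ys)   # slope points the wrong way
```

Parameters that assign high probability to the observed labels give a larger likelihood, which is the quantity maximum likelihood estimation tries to increase.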
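The objective function of logistic regression is the log-likelihood. Because the logarithm is monotonically increasing, maximizing $L(\theta)$ is equivalent to maximizing $\ln L(\theta)$, so the parameters can be found by maximum likelihood estimation: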
$$\max_{\theta} L(\theta) \Leftrightarrow \max_{\theta}\sum_{i=1}^{N}\left[y^{(i)}\ln h_{\theta}(x^{(i)})+(1-y^{(i)})\ln\left(1-h_{\theta}(x^{(i)})\right)\right]$$
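One way to carry out this maximization is plain gradient ascent. The sketch below is an illustration under stated assumptions, not a method taken from the text: the gradient $\sum_{i}(y^{(i)}-h_{\theta}(x^{(i)}))\,x^{(i)}$ is the standard result of differentiating the log-likelihood, while the learning rate, step count, and toy data are arbitrary choices. The data is deliberately not linearly separable so that the maximizer is finite.

```python
import math

def sigmoid(z):
    # Split by sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def log_likelihood(theta, xs, ys):
    """Sum of y*ln(h) + (1 - y)*ln(1 - h) over all samples."""
    total = 0.0
    for x, y in zip(xs, ys):
        h = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
        total += y * math.log(h) + (1 - y) * math.log(1.0 - h)
    return total

def gradient(theta, xs, ys):
    """Gradient of the log-likelihood: sum_i (y_i - h_i) * x_i."""
    grad = [0.0] * len(theta)
    for x, y in zip(xs, ys):
        h = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
        for j, xj in enumerate(x):
            grad[j] += (y - h) * xj
    return grad

def fit(xs, ys, learning_rate=0.1, steps=500):
    """Gradient ascent from theta = 0 (hyperparameters are assumptions)."""
    theta = [0.0] * len(xs[0])
    for _ in range(steps):
        g = gradient(theta, xs, ys)
        theta = [t + learning_rate * gj for t, gj in zip(theta, g)]
    return theta

# Hypothetical data with overlapping classes; leading 1.0 is the bias input.
xs = [[1.0, -2.0], [1.0, -1.0], [1.0, -0.5],
      [1.0, 0.5], [1.0, 1.0], [1.0, 2.0]]
ys = [0, 0, 1, 0, 1, 1]

theta_hat = fit(xs, ys)
start_ll = log_likelihood([0.0, 0.0], xs, ys)
final_ll = log_likelihood(theta_hat, xs, ys)
```

On perfectly separable data the log-likelihood has no finite maximizer and the parameters keep growing, so a fixed step count or a regularization term is typically used in that case.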