Model
Given a sequence of inputs $x_i$ with corresponding labels $y_i\in\{0,1\}$, we build the following classification model:
$$P(y_i=1\mid x_i)=\dfrac{\exp(\omega x_i+b)}{1+\exp(\omega x_i+b)}$$
Clearly, when $P(y_i=1\mid x_i)>0.5$ (equivalently, when $\omega x_i+b>0$), the prediction is $\hat{y}_i=1$; otherwise $\hat{y}_i=0$.
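The decision rule can be illustrated with a short Python sketch. The function names `predict_proba` and `predict_label` are hypothetical and chosen only for illustration, and a scalar input is assumed, as in the formula.

```python
import math

def predict_proba(x, omega, b):
    """Return P(y=1 | x) = exp(omega*x + b) / (1 + exp(omega*x + b))."""
    z = omega * x + b
    # Two algebraically equivalent branches, chosen to avoid overflow in exp().
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_label(x, omega, b):
    """Predict 1 when P(y=1 | x) > 0.5, otherwise 0."""
    return 1 if predict_proba(x, omega, b) > 0.5 else 0
```

Since the probability exceeds 0.5 exactly when $\omega x+b>0$, the label could equally be read off the sign of the linear term.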
Parameter Estimation (MLE)
Suppose the observed data set is $\{(x_1,y_1),\ldots,(x_n,y_n)\}$. Let $\theta=(\omega,b)^\prime$ and $\tilde{x}_i=(x_i,1)^\prime$, so that $\tilde{x}_i^\prime\theta=\omega x_i+b$. The likelihood function is
$$\begin{align*} L(\theta)&=\prod\limits_{i=1}^n[P(y_i=1\mid x_i)]^{y_i}[1-P(y_i=1\mid x_i)]^{1-y_i} \\ &=\prod\limits_{i=1}^n\left[\dfrac{\exp(\tilde{x}_i^\prime\theta)}{1+\exp(\tilde{x}_i^\prime\theta)}\right]^{y_i}\left[\dfrac{1}{1+\exp(\tilde{x}_i^\prime\theta)}\right]^{1-y_i} \\ &=\prod\limits_{i=1}^n\dfrac{\exp(y_i\tilde{x}_i^\prime\theta)}{1+\exp(\tilde{x}_i^\prime\theta)} \end{align*}$$
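As a worked step, taking the negative logarithm of the last product turns it into a sum. The shorthand $z_i=\omega x_i+b$ and $p_i=P(y_i=1\mid x_i)$ is introduced here only for readability:

$$\begin{align*} -\ln L(\theta)&=\sum\limits_{i=1}^n\left[\ln\left(1+\exp(z_i)\right)-y_iz_i\right] \\ \dfrac{\partial(-\ln L)}{\partial\omega}&=\sum\limits_{i=1}^n(p_i-y_i)x_i \\ \dfrac{\partial(-\ln L)}{\partial b}&=\sum\limits_{i=1}^n(p_i-y_i) \end{align*}$$

The gradient follows from $\dfrac{d}{dz}\ln(1+\exp(z))=\dfrac{\exp(z)}{1+\exp(z)}$, which is exactly $p_i$ when evaluated at $z_i$.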
Minimizing $-\ln L(\theta)$ by gradient descent yields the parameter estimate.
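A minimal sketch of this procedure in plain Python follows. The function names, the learning rate `lr`, the step count `steps`, the zero initialization and the toy data are all assumptions made for illustration and are not taken from the text. The gradient is divided by $n$, which rescales the step size without changing the minimizer.

```python
import math

def sigmoid(z):
    """exp(z) / (1 + exp(z)), evaluated without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def neg_log_likelihood(xs, ys, omega, b):
    """-ln L(theta) = sum_i [ln(1 + exp(z_i)) - y_i * z_i], z_i = omega*x_i + b."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = omega * x + b
        # ln(1 + exp(z)) rewritten as max(z, 0) + log1p(exp(-|z|)) for stability.
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - y * z
    return total

def fit(xs, ys, lr=0.1, steps=2000):
    """Plain gradient descent on the averaged negative log-likelihood."""
    omega, b = 0.0, 0.0  # assumed starting point
    n = len(xs)
    for _ in range(steps):
        # Residuals p_i - y_i under the current parameters.
        residuals = [sigmoid(omega * x + b) - y for x, y in zip(xs, ys)]
        grad_omega = sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = sum(residuals) / n
        omega -= lr * grad_omega
        b -= lr * grad_b
    return omega, b

# Toy data with labels in {0, 1}; the classes overlap, so a finite MLE exists.
toy_xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
toy_ys = [0, 0, 1, 0, 1, 1]
omega_hat, b_hat = fit(toy_xs, toy_ys)
print(omega_hat, b_hat, neg_log_likelihood(toy_xs, toy_ys, omega_hat, b_hat))
```

If the two classes were perfectly separable, $-\ln L(\theta)$ would have no finite minimizer and the iterates would keep growing, so in that case a fixed step count acts only as a stopping rule.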