First, a quick review of the Bayes and logistic regression decision rules.
Bayes
$$\prod_{i=1}^n p(x_i\mid y=1)\,p(y=1) \;-\; \prod_{i=1}^n p(x_i\mid y=0)\,p(y=0) \;>\; 0$$
A sample $x \in R^n$ satisfying the above expression is classified as positive; otherwise it is negative.
Logistic regression
$$\sum_{i=1}^{n} w_i x_i - w_0 > 0$$
A sample satisfying the above expression is classified as positive; otherwise it is negative.
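The two decision rules above can be contrasted in a minimal Python sketch; every numeric value below (the means, variance, prior $\phi$, and the weights `w1`, `w0`) is an illustrative assumption, not something given in the text:

```python
import math

def gauss_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def bayes_positive(x, mu0, mu1, sigma, phi):
    """Generative rule: p(x|y=1)p(y=1) - p(x|y=0)p(y=0) > 0."""
    return gauss_pdf(x, mu1, sigma) * phi - gauss_pdf(x, mu0, sigma) * (1 - phi) > 0

def logistic_positive(x, w1, w0):
    """Discriminative rule: w1 * x - w0 > 0."""
    return w1 * x - w0 > 0

# Illustrative setting: mu0=0, mu1=2, sigma=1, uniform prior phi=0.5.
# The Bayes boundary is x = 1, which the linear rule reproduces with
# w1 = (mu1 - mu0)/sigma^2 = 2 and w0 = (mu1^2 - mu0^2)/(2 sigma^2) = 2.
print(bayes_positive(1.5, 0.0, 2.0, 1.0, 0.5))  # True
print(logistic_positive(1.5, 2.0, 2.0))         # True
```

With a uniform prior and equal variances, both rules place the decision boundary at the midpoint of the two means.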
The questions below share the following assumptions:
$$y \sim \mathrm{Bernoulli}(\phi)$$
$$x\mid y=0 \sim N(\mu_0,\sigma_0^2)$$
$$x\mid y=1 \sim N(\mu_1,\sigma_1^2)$$
question 1
$$\sigma_0 = \sigma_1$$
The positive-sample condition can be written as:
$$\sum_{i=1}^n\big[\ln p(x_i\mid y=1) - \ln p(x_i\mid y=0)\big] > \ln p(y=0) - \ln p(y=1)$$
=>
$$\sum_{i=1}^n\Big[-\frac{(x_i-\mu_1)^2}{2\sigma_1^2} - \ln\big(\sqrt{2\pi}\,\sigma_1\big) + \frac{(x_i-\mu_0)^2}{2\sigma_0^2} + \ln\big(\sqrt{2\pi}\,\sigma_0\big)\Big] > \ln p(y=0) - \ln p(y=1)$$
=>
$$\sum_{i=1}^n\Big(\frac{\mu_1}{\sigma_1^2}-\frac{\mu_0}{\sigma_0^2}\Big)x_i > \ln p(y=0) - \ln p(y=1) + \sum_{i=1}^n\Big(\frac{\mu_1^2}{2\sigma_1^2}-\frac{\mu_0^2}{2\sigma_0^2}\Big)$$

(The $\ln(\sqrt{2\pi}\,\sigma)$ terms cancel, and the $x_i^2$ terms drop out, because $\sigma_0=\sigma_1$.)
Under the condition of question 1, the naive Bayes discriminant for continuous features reduces to a linear weighted combination of the feature values, so it is essentially a logistic regression classifier.
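Question 1's claim can be checked numerically: with $\sigma_0=\sigma_1=\sigma$, the exact naive-Bayes log-odds equals an affine function $\sum_i w x_i - w_0$. The sketch below verifies the identity; all parameter values are illustrative assumptions:

```python
import math

def log_odds(xs, mu0, mu1, sigma, phi):
    """Exact naive-Bayes log-odds: prior term plus per-feature Gaussian log-ratios."""
    total = math.log(phi) - math.log(1 - phi)
    for x in xs:
        total += (-(x - mu1) ** 2 + (x - mu0) ** 2) / (2 * sigma ** 2)
    return total

def linear_form(xs, mu0, mu1, sigma, phi):
    """The same quantity rewritten as sum_i w*x_i - w0 (a logistic-style rule)."""
    w = (mu1 - mu0) / sigma ** 2
    w0 = len(xs) * (mu1 ** 2 - mu0 ** 2) / (2 * sigma ** 2) - math.log(phi / (1 - phi))
    return sum(w * x for x in xs) - w0

xs = [0.3, -1.2, 2.5]  # arbitrary illustrative feature values
a = log_odds(xs, mu0=0.0, mu1=1.0, sigma=0.7, phi=0.4)
b = linear_form(xs, mu0=0.0, mu1=1.0, sigma=0.7, phi=0.4)
print(abs(a - b) < 1e-12)  # True: the two forms coincide
```

Expanding the squares shows the $x_i^2$ terms cancel, leaving exactly the weight $w=(\mu_1-\mu_0)/\sigma^2$ and bias used above.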
question 2
The positive-sample condition can be written as:
$$\sum_{i=1}^n\big[\ln p(x_i\mid y=1) - \ln p(x_i\mid y=0)\big] > \ln p(y=0) - \ln p(y=1)$$
=>
$$\sum_{i=1}^n\Big[-\frac{(x_i-\mu_1)^2}{2\sigma_1^2} - \ln\big(\sqrt{2\pi}\,\sigma_1\big) + \frac{(x_i-\mu_0)^2}{2\sigma_0^2} + \ln\big(\sqrt{2\pi}\,\sigma_0\big)\Big] > \ln p(y=0) - \ln p(y=1)$$
=>
$$\sum_{i=1}^n\Big[\Big(\frac{1}{2\sigma_0^2}-\frac{1}{2\sigma_1^2}\Big)x_i^2 + \Big(\frac{\mu_1}{\sigma_1^2}-\frac{\mu_0}{\sigma_0^2}\Big)x_i\Big] > \ln p(y=0) - \ln p(y=1) + \sum_{i=1}^n\Big[\ln\big(\sqrt{2\pi}\,\sigma_1\big) - \ln\big(\sqrt{2\pi}\,\sigma_0\big) + \frac{\mu_1^2}{2\sigma_1^2} - \frac{\mu_0^2}{2\sigma_0^2}\Big]$$
This shows that when $\sigma_0 \neq \sigma_1$, the naive Bayes discriminant for continuous features cannot be written as a linear weighted combination of the feature values (it contains $x_i^2$ terms), so it cannot be called a logistic regression.
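The surviving $x_i^2$ term can be made visible numerically: on an evenly spaced grid, the second difference of a function vanishes iff the function is affine. The sketch below probes the per-feature log-ratio this way; all parameter values are illustrative assumptions:

```python
import math

def log_ratio(x, mu0, sigma0, mu1, sigma1):
    """ln p(x|y=1) - ln p(x|y=0) for one Gaussian feature."""
    return (-(x - mu1) ** 2 / (2 * sigma1 ** 2) - math.log(sigma1)
            + (x - mu0) ** 2 / (2 * sigma0 ** 2) + math.log(sigma0))

def second_difference(f, x, h=1.0):
    """f(x+h) - 2 f(x) + f(x-h): zero for affine f, 2*a*h^2 for f = a x^2 + ..."""
    return f(x + h) - 2 * f(x) + f(x - h)

equal = second_difference(lambda x: log_ratio(x, 0.0, 1.0, 1.0, 1.0), 0.5)
unequal = second_difference(lambda x: log_ratio(x, 0.0, 1.0, 1.0, 2.0), 0.5)
print(abs(equal) < 1e-12)   # True: linear in x when the variances match
print(abs(unequal) > 1e-3)  # True: curvature from the x^2 term otherwise
```

With $\sigma_0=1,\sigma_1=2$ the quadratic coefficient is $1/(2\sigma_0^2)-1/(2\sigma_1^2)=0.375$, so the second difference is $0.75$, matching the formula above.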
question 3
The negative-sample condition can be written as:
$$\prod_{i=1,\,j=2,\,i\neq j}^n p(x_i,x_j\mid y=0)\,p(y=0) \;-\; \prod_{i=1,\,j=2,\,i\neq j}^n p(x_i,x_j\mid y=1)\,p(y=1) \;>\; 0$$
=>
$$\sum_{i=1,\,j=2,\,i\neq j}^n \ln p(x_i,x_j\mid y=0) - \sum_{i=1,\,j=2,\,i\neq j}^n \ln p(x_i,x_j\mid y=1) > \ln p(y=1) - \ln p(y=0)$$
=>
$$\sum_{i=1,\,j=2,\,i\neq j}^n \frac{\sigma_1^2\big[\mu_{20}^2-\mu_{21}^2+2x_2(\mu_{21}-\mu_{20})\big] + \sigma_2^2\big[\mu_{10}^2-\mu_{11}^2+2x_1(\mu_{11}-\mu_{10})\big] + 2\rho\sigma_1\sigma_2\big[(\mu_{20}-\mu_{21})x_1+(\mu_{10}-\mu_{11})x_2+\mu_{11}\mu_{21}-\mu_{10}\mu_{20}\big]}{2(1-\rho^2)\sigma_1^2\sigma_2^2} > \ln p(y=1) - \ln p(y=0)$$
From the expression above, every surviving term is linear in $x_1$ and $x_2$ (the quadratic terms cancel because both classes share the same covariance), so this non-naive Bayes classifier can likewise be regarded as a logistic regression.
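The linearity claim for the shared-covariance bivariate case can also be checked with second differences: if the log-ratio of the two class densities is affine in $(x_1, x_2)$, its pure second differences along each axis must vanish. All parameter values below are illustrative assumptions:

```python
import math

def bivar_logpdf(x1, x2, m1, m2, s1, s2, rho):
    """Log-density of a bivariate normal with means (m1, m2), stds (s1, s2), correlation rho."""
    z = ((x1 - m1) ** 2 / s1 ** 2
         - 2 * rho * (x1 - m1) * (x2 - m2) / (s1 * s2)
         + (x2 - m2) ** 2 / s2 ** 2)
    return -z / (2 * (1 - rho ** 2)) - math.log(
        2 * math.pi * s1 * s2 * math.sqrt(1 - rho ** 2))

def log_ratio(x1, x2):
    # Class means differ; s1, s2, rho are shared across both classes.
    return (bivar_logpdf(x1, x2, 1.0, 2.0, 1.0, 1.5, 0.3)
            - bivar_logpdf(x1, x2, 0.0, 0.0, 1.0, 1.5, 0.3))

# Affine check: pure second differences along x1 and x2 must both vanish.
h = 0.7
d11 = log_ratio(1 + h, 1) - 2 * log_ratio(1, 1) + log_ratio(1 - h, 1)
d22 = log_ratio(1, 1 + h) - 2 * log_ratio(1, 1) + log_ratio(1, 1 - h)
print(abs(d11) < 1e-9 and abs(d22) < 1e-9)  # True: no quadratic terms survive
```

If the two classes used different covariances, the quadratic forms would no longer cancel and these second differences would be nonzero, mirroring question 2.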
Conclusion
When deriving logistic regression we never assumed that the samples within each class follow a Gaussian distribution, so GDA is only a special case of logistic regression, built on stronger assumptions. Comparing the two:
a. Logistic regression is derived from weaker assumptions, so its performance is more stable and it applies to a wider range of data.
b. When the data really follow a Gaussian distribution, GDA performs better.
c. When the training set is very large, by the central-limit-theorem argument the data tend toward a Gaussian distribution, and GDA can then perform very well.