(I) Naive Bayes: Maximum Likelihood Estimation
Algorithm (naive Bayes algorithm)
Input: training data $T=\{(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)\}$, where $x_i=(x_i^{(1)},x_i^{(2)},\ldots,x_i^{(n)})^T$ is the $i$-th sample and $x_i^{(j)}$ is the $j$-th feature of that sample; $x_i^{(j)}\in\{a_{j1},a_{j2},\ldots,a_{jS_j}\}$, where $a_{jl}$ is the $l$-th value that the $j$-th feature can take, $j=1,2,\ldots,n$, $l=1,2,\ldots,S_j$; $y_i\in\{c_1,c_2,\ldots,c_K\}$; and an instance $x$.
Output: the class of instance $x$.
(1) Compute the prior probabilities and the conditional probabilities, where $I(\cdot)$ is the indicator function:

$$P(Y=c_k)=\frac{\sum_{i=1}^{N} I(y_i=c_k)}{N},\quad k=1,2,\ldots,K$$

$$P(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N} I(x_i^{(j)}=a_{jl},\,y_i=c_k)}{\sum_{i=1}^{N} I(y_i=c_k)},\quad j=1,2,\ldots,n;\ l=1,2,\ldots,S_j;\ k=1,2,\ldots,K$$
(2) For the given instance $x=(x^{(1)},x^{(2)},\ldots,x^{(n)})^T$, compute for each class $c_k$:

$$P(Y=c_k)\prod_{j=1}^{n}P(X^{(j)}=x^{(j)}\mid Y=c_k),\quad k=1,2,\ldots,K$$
(3) Determine the class of the instance:

$$y=\arg\max_{c_k}P(Y=c_k)\prod_{j=1}^{n}P(X^{(j)}=x^{(j)}\mid Y=c_k)$$
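The three steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `fit_mle` and `predict` and the tuple-based data layout are assumptions made for the sketch, and exact fractions are used so the results can be compared with hand calculations.

```python
from collections import Counter
from fractions import Fraction


def fit_mle(samples, labels):
    """Step (1): estimate P(Y=c_k) and P(X^(j)=a_jl | Y=c_k) by counting."""
    n_samples = len(labels)
    class_counts = Counter(labels)
    prior = {c: Fraction(cnt, n_samples) for c, cnt in class_counts.items()}

    # joint_counts[(j, value, c)] = number of samples with x^(j) == value and y == c
    joint_counts = Counter()
    for x, y in zip(samples, labels):
        for j, value in enumerate(x):
            joint_counts[(j, value, y)] += 1

    def conditional(j, value, c):
        # A (feature, value, class) combination never seen in training gets probability 0.
        return Fraction(joint_counts[(j, value, c)], class_counts[c])

    return prior, conditional


def predict(prior, conditional, x):
    """Steps (2) and (3): score every class, then return the arg max and all scores."""
    scores = {}
    for c, p in prior.items():
        score = p
        for j, value in enumerate(x):
            score *= conditional(j, value, c)
        scores[c] = score
    return max(scores, key=scores.get), scores
```

Calling `fit_mle` once and then `predict` for each new instance mirrors the split between learning (step 1) and classification (steps 2 and 3).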
Example: learn a naive Bayes classifier from the table below and determine the class label $y$ of $x=(2,S)^T$.
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $X^{(1)}$ | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 |
| $X^{(2)}$ | S | M | M | S | S | S | M | M | L | L | L | M | M | L | L |
| $Y$ | -1 | -1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 |
Solution:

$$P(Y=1)=\frac{9}{15},\quad P(Y=-1)=\frac{6}{15}$$
$$P(X^{(1)}=1\mid Y=1)=\frac{2}{9},\quad P(X^{(1)}=2\mid Y=1)=\frac{3}{9},\quad P(X^{(1)}=3\mid Y=1)=\frac{4}{9}$$
$$P(X^{(2)}=S\mid Y=1)=\frac{1}{9},\quad P(X^{(2)}=M\mid Y=1)=\frac{4}{9},\quad P(X^{(2)}=L\mid Y=1)=\frac{4}{9}$$
$$P(X^{(1)}=1\mid Y=-1)=\frac{3}{6},\quad P(X^{(1)}=2\mid Y=-1)=\frac{2}{6},\quad P(X^{(1)}=3\mid Y=-1)=\frac{1}{6}$$
$$P(X^{(2)}=S\mid Y=-1)=\frac{3}{6},\quad P(X^{(2)}=M\mid Y=-1)=\frac{2}{6},\quad P(X^{(2)}=L\mid Y=-1)=\frac{1}{6}$$
For the given $x=(2,S)^T$, compute:
$$P(Y=1)\,P(X^{(1)}=2\mid Y=1)\,P(X^{(2)}=S\mid Y=1)=\frac{9}{15}\cdot\frac{3}{9}\cdot\frac{1}{9}=\frac{1}{45}$$
$$P(Y=-1)\,P(X^{(1)}=2\mid Y=-1)\,P(X^{(2)}=S\mid Y=-1)=\frac{6}{15}\cdot\frac{2}{6}\cdot\frac{3}{6}=\frac{1}{15}$$
Since $P(Y=-1)\,P(X^{(1)}=2\mid Y=-1)\,P(X^{(2)}=S\mid Y=-1)$ is the larger of the two values ($\frac{1}{15}>\frac{1}{45}$), $y=-1$.
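The arithmetic in this example can be checked with a short script. The sketch below hard-codes the 15-row training table and counts directly; the variable names (`X1`, `X2`, `Y`) and the helper `score` are illustrative choices, not part of the original notes.

```python
from fractions import Fraction

# Training table from the example: one entry per column 1..15.
X1 = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
X2 = list("SMMSSSMMLLLMMLL")
Y = [-1, -1, 1, 1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, -1]


def score(x1, x2, c):
    """P(Y=c) * P(X1=x1 | Y=c) * P(X2=x2 | Y=c), each estimated by counting."""
    n_c = sum(1 for y in Y if y == c)
    n_x1 = sum(1 for a, y in zip(X1, Y) if a == x1 and y == c)
    n_x2 = sum(1 for b, y in zip(X2, Y) if b == x2 and y == c)
    return Fraction(n_c, len(Y)) * Fraction(n_x1, n_c) * Fraction(n_x2, n_c)


# Score both classes for x = (2, S) and pick the larger one.
scores = {c: score(2, "S", c) for c in (1, -1)}
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

The two scores come out as the exact fractions 1/45 and 1/15, so the predicted label is -1, matching the hand calculation.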
(II) Naive Bayes: Bayesian Estimation
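Maximum likelihood estimation can yield a probability estimate of 0, which makes the whole product for that class 0. To prevent this, a positive number $\lambda$ is introduced into the estimates.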
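The notes introduce $\lambda$ without writing out the estimators. The standard smoothed forms are sketched here as an assumption; they do reproduce the fractions in the example that follows, for instance $P(Y=1)=\frac{9+1}{15+2\cdot 1}=\frac{10}{17}$. As before, $K$ is the number of classes and $S_j$ is the number of values feature $j$ can take:

```latex
P_\lambda(Y=c_k)=\frac{\sum_{i=1}^{N} I(y_i=c_k)+\lambda}{N+K\lambda}

P_\lambda(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N} I(x_i^{(j)}=a_{jl},\,y_i=c_k)+\lambda}{\sum_{i=1}^{N} I(y_i=c_k)+S_j\lambda}
```

Setting $\lambda=0$ recovers the maximum likelihood estimates, and $\lambda=1$ is Laplace smoothing.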
Example: same problem as above, but estimate the probabilities with Laplace smoothing, taking $\lambda=1$.
Solution:

$$P(Y=1)=\frac{10}{17},\quad P(Y=-1)=\frac{7}{17}$$
$$P(X^{(1)}=1\mid Y=1)=\frac{3}{12},\quad P(X^{(1)}=2\mid Y=1)=\frac{4}{12},\quad P(X^{(1)}=3\mid Y=1)=\frac{5}{12}$$
$$P(X^{(2)}=S\mid Y=1)=\frac{2}{12},\quad P(X^{(2)}=M\mid Y=1)=\frac{5}{12},\quad P(X^{(2)}=L\mid Y=1)=\frac{5}{12}$$
$$P(X^{(1)}=1\mid Y=-1)=\frac{4}{9},\quad P(X^{(1)}=2\mid Y=-1)=\frac{3}{9},\quad P(X^{(1)}=3\mid Y=-1)=\frac{2}{9}$$
$$P(X^{(2)}=S\mid Y=-1)=\frac{4}{9},\quad P(X^{(2)}=M\mid Y=-1)=\frac{3}{9},\quad P(X^{(2)}=L\mid Y=-1)=\frac{2}{9}$$
For the given $x=(2,S)^T$, compute:
$$P(Y=1)\,P(X^{(1)}=2\mid Y=1)\,P(X^{(2)}=S\mid Y=1)=\frac{10}{17}\cdot\frac{4}{12}\cdot\frac{2}{12}=\frac{5}{153}\approx 0.0327$$
$$P(Y=-1)\,P(X^{(1)}=2\mid Y=-1)\,P(X^{(2)}=S\mid Y=-1)=\frac{7}{17}\cdot\frac{3}{9}\cdot\frac{4}{9}=\frac{28}{459}\approx 0.0610$$
Since $P(Y=-1)\,P(X^{(1)}=2\mid Y=-1)\,P(X^{(2)}=S\mid Y=-1)$ is the larger of the two values ($0.0610>0.0327$), $y=-1$.
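The smoothed example can be verified the same way. The sketch below assumes the usual smoothed estimators (add $\lambda$ to each count, and add $\lambda$ times the number of possible values to each denominator); the names `smoothed_score`, `X1_VALUES`, `X2_VALUES`, and `CLASSES` are illustrative, not from the original notes.

```python
from fractions import Fraction

# Training table from the example: one entry per column 1..15.
X1 = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
X2 = list("SMMSSSMMLLLMMLL")
Y = [-1, -1, 1, 1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, -1]

X1_VALUES = (1, 2, 3)        # S_1 = 3 possible values of X^(1)
X2_VALUES = ("S", "M", "L")  # S_2 = 3 possible values of X^(2)
CLASSES = (1, -1)            # K = 2 classes


def smoothed_score(x1, x2, c, lam=1):
    """Smoothed P(Y=c) * P(X1=x1 | Y=c) * P(X2=x2 | Y=c).

    lam must be an int or Fraction so the arithmetic stays exact.
    """
    n_c = sum(1 for y in Y if y == c)
    n_x1 = sum(1 for a, y in zip(X1, Y) if a == x1 and y == c)
    n_x2 = sum(1 for b, y in zip(X2, Y) if b == x2 and y == c)
    prior = Fraction(n_c + lam, len(Y) + len(CLASSES) * lam)
    p_x1 = Fraction(n_x1 + lam, n_c + len(X1_VALUES) * lam)
    p_x2 = Fraction(n_x2 + lam, n_c + len(X2_VALUES) * lam)
    return prior * p_x1 * p_x2


# Laplace smoothing (lam = 1) for x = (2, S).
smoothed = {c: smoothed_score(2, "S", c) for c in CLASSES}
smoothed_prediction = max(smoothed, key=smoothed.get)
print({c: float(s) for c, s in smoothed.items()}, smoothed_prediction)
```

With `lam=1` the exact scores are 5/153 and 28/459, which round to 0.0327 and 0.0610; passing `lam=0` reduces the function to the maximum likelihood estimates of part (I).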