Naive Bayes Algorithm: A Worked Example
Problem statement
-
The unknown sample to classify is
X=(age="<=30",income="M",student="Y",credit_rating="fair")
Determine the value of this sample's buys_computer attribute (Y or N).
-
The training data is given in the table below:
No. | age | income | student | credit_rating | buys_computer |
---|---|---|---|---|---|
1 | <=30 | H | N | fair | N |
2 | <=30 | H | N | excellent | N |
3 | 31~40 | H | N | fair | Y |
4 | >40 | M | N | fair | Y |
5 | >40 | L | Y | fair | Y |
6 | >40 | L | Y | excellent | N |
7 | 31~40 | L | Y | excellent | Y |
8 | <=30 | M | N | fair | N |
9 | <=30 | L | Y | fair | Y |
10 | >40 | M | Y | fair | Y |
11 | <=30 | M | Y | excellent | Y |
12 | 31~40 | M | N | excellent | Y |
13 | 31~40 | H | Y | fair | Y |
14 | >40 | M | N | excellent | N |
Solution:
-
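Since the goal is to predict buys_computer, first compute the prior probability of each of its two values from the 14 rows (9 are Y, 5 are N):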
P(buys_computer="Y")=9/14=0.643
P(buys_computer="N")=5/14=0.357
-
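The two priors can be checked with a short Python sketch. The `labels` list is simply the buys_computer column of the table copied in row order, and `class_priors` is a helper name chosen here for illustration.

```python
from collections import Counter

# buys_computer column of the table, rows 1 to 14
labels = list("NNYYYNYNYYYYYN")

def class_priors(labels):
    """Return the relative frequency of each class label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}

priors = class_priors(labels)
for label in ("Y", "N"):
    print(label, round(priors[label], 3))
```
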
Next, for each attribute value of the sample X=(age="<=30",income="M",student="Y",credit_rating="fair"),
compute its conditional probability given buys_computer="Y" and given buys_computer="N":
-
P(age="<=30"|buys_computer="Y")=2/9=0.222
-
P(age="<=30"|buys_computer="N")=3/5=0.6
-
P(income="M"|buys_computer="Y")=4/9=0.444
-
P(income="M"|buys_computer="N")=2/5=0.4
-
P(student="Y"|buys_computer="Y")=6/9=0.667
-
P(student="Y"|buys_computer="N")=1/5=0.2
-
P(credit_rating="fair"|buys_computer="Y")=6/9=0.667
-
P(credit_rating="fair"|buys_computer="N")=2/5=0.4
-
-
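The eight conditional probabilities above are plain counts over the table. A minimal sketch that recomputes them follows; `ROWS`, `FEATURES`, and `conditional` are names introduced here for illustration, and the rows are copied from the table.

```python
from fractions import Fraction

FEATURES = ("age", "income", "student", "credit_rating")

# (age, income, student, credit_rating, buys_computer), rows 1 to 14
ROWS = [
    ("<=30", "H", "N", "fair", "N"),
    ("<=30", "H", "N", "excellent", "N"),
    ("31~40", "H", "N", "fair", "Y"),
    (">40", "M", "N", "fair", "Y"),
    (">40", "L", "Y", "fair", "Y"),
    (">40", "L", "Y", "excellent", "N"),
    ("31~40", "L", "Y", "excellent", "Y"),
    ("<=30", "M", "N", "fair", "N"),
    ("<=30", "L", "Y", "fair", "Y"),
    (">40", "M", "Y", "fair", "Y"),
    ("<=30", "M", "Y", "excellent", "Y"),
    ("31~40", "M", "N", "excellent", "Y"),
    ("31~40", "H", "Y", "fair", "Y"),
    (">40", "M", "N", "excellent", "N"),
]

def conditional(rows, feature, value, label):
    """P(feature = value | buys_computer = label), by counting rows."""
    idx = FEATURES.index(feature)
    in_class = [row for row in rows if row[-1] == label]
    hits = sum(1 for row in in_class if row[idx] == value)
    return Fraction(hits, len(in_class))

X = {"age": "<=30", "income": "M", "student": "Y", "credit_rating": "fair"}
for feature, value in X.items():
    for label in ("Y", "N"):
        p = conditional(ROWS, feature, value, label)
        print(f'P({feature}="{value}"|buys_computer="{label}") = {p}')
```

Note that `Fraction` prints values in lowest terms, so 6/9 appears as 2/3.
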
Assuming the attributes are conditionally independent given the class, these probabilities combine as follows:
-
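In formula form, the rule applied in the remaining steps can be written as below. The symbols $c$ (a class value), $x_i$ (the $i$-th attribute value of X), and $\hat{y}$ (the predicted class) are notation introduced here, not part of the original problem.

$$
P(X \mid c) = \prod_{i=1}^{4} P(x_i \mid c),
\qquad
\hat{y} = \arg\max_{c \in \{\mathrm{Y},\, \mathrm{N}\}} P(X \mid c)\, P(c)
$$

The denominator $P(X)$ of Bayes' theorem is the same for both classes, so it can be dropped when only the comparison matters.
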
Multiply together all the conditional probabilities given buys_computer="Y":
P(X|buys_computer="Y")=0.222 * 0.444 * 0.667 * 0.667 = 0.044  (①)
-
Multiply together all the conditional probabilities given buys_computer="N":
P(X|buys_computer="N")=0.6 * 0.4 * 0.2 * 0.4 = 0.019  (②)
-
Then multiply result ① by the prior probability of buys_computer="Y":
P(X|buys_computer="Y") * P(buys_computer="Y") = 0.044 * 0.643 = 0.028  (③)
-
Likewise, multiply result ② by the prior probability of buys_computer="N":
P(X|buys_computer="N") * P(buys_computer="N") = 0.019 * 0.357 = 0.007  (④)
-
-
Since ③ > ④, the sample X=(age="<=30",income="M",student="Y",credit_rating="fair") is classified as
buys_computer="Y".
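
As a cross-check on the rounded arithmetic, the sketch below redoes the final comparison with exact fractions taken from the worked solution. The names `priors`, `likelihoods`, and `score` are illustrative choices, not part of the original problem.

```python
from fractions import Fraction as F

# Priors and per-attribute conditional probabilities from the solution
priors = {"Y": F(9, 14), "N": F(5, 14)}
likelihoods = {
    "Y": [F(2, 9), F(4, 9), F(6, 9), F(6, 9)],  # age, income, student, credit_rating
    "N": [F(3, 5), F(2, 5), F(1, 5), F(2, 5)],
}

def score(label):
    """P(X | label) * P(label) under the conditional independence assumption."""
    result = priors[label]
    for p in likelihoods[label]:
        result *= p
    return result

scores = {label: score(label) for label in priors}
prediction = max(scores, key=scores.get)

for label, value in scores.items():
    print(label, round(float(value), 3))
print("prediction:", prediction)
```

Working with exact fractions avoids accumulating rounding error from the intermediate three-decimal values; the rounded results still agree with ③ and ④.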