•Bayes Rule: P(h|d) = P(d|h) P(h) / P(d)
•Maximum a posteriori (MAP) hypothesis: hMAP = argmax over h in H of P(h|d) = argmax P(d|h) P(h) / P(d) = argmax P(d|h) P(h)
Note that the evidence P(d) does not depend on h, hence it can be ignored when maximizing over h.
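As a concrete illustration, here is a minimal Python sketch of Bayes rule and the MAP choice; the two hypotheses and their priors/likelihoods are assumed toy numbers, not from the slides:

```python
# Toy numbers (assumed): two hypotheses h1, h2 with priors P(h)
# and likelihoods P(d|h) for one observed datum d.
prior = {"h1": 0.3, "h2": 0.7}
likelihood = {"h1": 0.9, "h2": 0.2}   # P(d | h)

# Evidence P(d) = sum over h of P(d|h) * P(h); same for every h,
# so it only normalizes the posterior.
p_d = sum(likelihood[h] * prior[h] for h in prior)

# Bayes rule: P(h|d) = P(d|h) * P(h) / P(d)
posterior = {h: likelihood[h] * prior[h] / p_d for h in prior}

# MAP hypothesis: argmax of P(d|h) * P(h); dividing by P(d)
# does not change which h wins, so it can be dropped.
h_map = max(prior, key=lambda h: likelihood[h] * prior[h])
```

Note that even though h2 has the larger prior, the much larger likelihood of d under h1 makes h1 the MAP hypothesis here.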
•Assuming that each hypothesis in H is equally probable, i.e., P(hi) = P(hj) for all i and j, we can drop P(h) in the MAP expression. P(d|h) is often called the likelihood of the data d given h, and any hypothesis that maximizes P(d|h) is called the maximum likelihood (ML) hypothesis: hML = argmax over h in H of P(d|h)
(If the class priors are unknown, we assume P(hi) = P(hj) for all i and j; in that case P(h) can be left out, and the objective simplifies as above.)
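A short Python sketch of this reduction, with assumed toy likelihood values: under a uniform prior, multiplying by P(h) is the same constant for every hypothesis, so MAP and ML pick the same h.

```python
# Toy likelihoods P(d | h) (assumed numbers, not from the slides).
likelihood = {"h1": 0.9, "h2": 0.2, "h3": 0.5}

# Uniform prior: P(hi) = P(hj) for all i, j.
uniform_prior = 1 / len(likelihood)

# MAP with a uniform prior...
h_map = max(likelihood, key=lambda h: likelihood[h] * uniform_prior)
# ...equals the maximum likelihood hypothesis: the constant prior
# scales every score equally and cannot change the argmax.
h_ml = max(likelihood, key=lambda h: likelihood[h])
```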
•The Bayesian approach to classifying a new instance x is to assign it the most probable target value (the MAP classifier).
(d denotes the class label, and x1, ..., x4 are the values of the sample's individual attributes)
(If the attribute values were not mutually independent, we would need not only a large amount of computation but also a huge training set; naive Bayes assumes that a sample's attributes are mutually independent.)
The naive Bayes classifier is based on the simplifying assumption that the attribute values are conditionally independent given the target value.
This means we have P(x1, ..., xn | cj) = product over i of P(xi | cj), so the classifier outputs cNB = argmax over cj of P(cj) * product over i of P(xi | cj).
(the objective function of naive Bayes)
e.g.
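The naive Bayes objective above can be sketched in a few lines of Python; the categorical training rows below are an assumed toy dataset, and the probabilities are estimated by simple counting (no smoothing):

```python
from collections import Counter, defaultdict

# Assumed toy data: each row is (attribute tuple x, class label d).
data = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rain", "mild"), "yes"),
    (("rain", "cool"), "yes"),
    (("overcast", "hot"), "yes"),
]

# Class counts, for the prior P(c).
classes = Counter(d for _, d in data)

# counts[c][i] counts the values of attribute i within class c,
# for the conditional P(xi | c).
counts = defaultdict(lambda: defaultdict(Counter))
for x, d in data:
    for i, v in enumerate(x):
        counts[d][i][v] += 1

def classify(x):
    """Return cNB = argmax over c of P(c) * product of P(xi | c)."""
    best, best_score = None, -1.0
    for c, n_c in classes.items():
        score = n_c / len(data)            # prior P(c)
        for i, v in enumerate(x):
            score *= counts[c][i][v] / n_c # likelihood P(xi | c)
        if score > best_score:
            best, best_score = c, score
    return best
```

With raw counts, an attribute value never seen with a class zeroes out that class's whole product; in practice Laplace smoothing is applied to the P(xi | c) estimates to avoid this.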