Classifying with probability theory: naive Bayes
This chapter covers:
Using probability distributions for classification
Learning the naive Bayes classifier
Using naive Bayes to solve some real-world problems
(1)Naive Bayes and Bayesian decision theory
Naive Bayes is a special case of Bayesian decision theory.
Bayesian decision theory:
For example, suppose we have an equation for the probability that a piece of data belongs to Class 1, p1(x,y), and an equation for the probability that it belongs to Class 2, p2(x,y). To classify a new measurement with features (x,y), we use the following rules:
If p1(x,y)>p2(x,y), then the class is 1.
If p1(x,y)<p2(x,y), then the class is 2.
Put simply, we choose the class with the higher probability. That's Bayesian decision theory in a nutshell: choosing the decision with the highest probability.
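The decision rule above can be sketched in a few lines of Python. The density functions p1 and p2 are hypothetical placeholders here; in practice they would come from a learned model.

```python
def classify(x, y, p1, p2):
    """Bayesian decision rule: pick the class whose
    probability function gives the larger value at (x, y)."""
    if p1(x, y) > p2(x, y):
        return 1
    return 2

# Toy example with made-up probability functions:
p1 = lambda x, y: x + y
p2 = lambda x, y: x * y
print(classify(3, 0.5, p1, p2))  # p1 = 3.5 > p2 = 1.5, so class 1
```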
(2)Bayes’ rule and the Bayesian classification rule
Bayes' rule: P(ci|x,y) = P(x,y|ci)P(ci) / P(x,y)
Use Bayes' rule to calculate the conditional probability of each class, then use that conditional probability as the basis for classification.
The Bayesian classification rule:
If P(c1|x,y) > P(c2|x,y), the class is c1.
If P(c1|x,y) < P(c2|x,y), the class is c2.
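Note that the denominator P(x,y) is the same for every class, so comparing the numerators P(x,y|ci)P(ci) is enough to pick the larger posterior. A minimal sketch (the function name and list-based inputs are illustrative):

```python
def bayes_classify(likelihoods, priors):
    """likelihoods[i] = P(x,y | ci), priors[i] = P(ci).
    P(x,y) is identical across classes, so we compare only
    the numerators P(x,y|ci) * P(ci) of Bayes' rule."""
    scores = [lik * pri for lik, pri in zip(likelihoods, priors)]
    return scores.index(max(scores))  # index of the winning class

# Class 0: likelihood 0.2, prior 0.7 -> score 0.14
# Class 1: likelihood 0.5, prior 0.3 -> score 0.15, so class 1 wins
print(bayes_classify([0.2, 0.5], [0.7, 0.3]))
```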
(3)Naive Bayes
Naive Bayes is an extension of the Bayesian classifier.
One application of naive Bayes is document classification. We can represent documents by the words used in them, treating the presence or absence of each word as a feature. So what is meant by "naive" in the naive Bayes classifier? There are two assumptions in naive Bayes. The first is independence among the features: one feature or word is just as likely by itself as it is next to other words. The second is that every feature is equally important. Despite the flaws in these two assumptions, naive Bayes works well in practice.
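The presence/absence feature representation described above (sometimes called a set-of-words model) can be sketched as follows; the function names are illustrative, and this shows only the feature extraction step, not the full classifier.

```python
def create_vocab(docs):
    """Collect every distinct word across all documents."""
    vocab = set()
    for doc in docs:
        vocab |= set(doc)
    return sorted(vocab)

def doc_to_vector(vocab, doc):
    """Mark each vocabulary word as 1 if present in the
    document, 0 if absent (presence/absence features)."""
    words = set(doc)
    return [1 if w in words else 0 for w in vocab]

docs = [["hello", "world"], ["hello", "spam"]]
vocab = create_vocab(docs)
print(vocab)                              # ['hello', 'spam', 'world']
print(doc_to_vector(vocab, ["hello"]))    # [1, 0, 0]
```

Each document becomes a fixed-length 0/1 vector over the vocabulary, which is exactly the form the naive independence assumption operates on.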