LAB 4
Multinomial Naive Bayes classifier to predict spam SMS. We implement a unigram model here.
1. We need to calculate the prior P(c_i) = N_{c_i} / N_total, which is the probability that class c_i occurs among all documents. In this case we have two classes: one is spam and the other is ham.
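A minimal sketch of the prior calculation, assuming a toy list of labels (the labels below are made up for illustration, not the lab's real dataset):

```python
from collections import Counter

# Hypothetical labels: one entry per SMS in the training set.
labels = ["spam", "ham", "ham", "ham", "spam"]

class_counts = Counter(labels)
n_docs = len(labels)

# P(c) = N_c / N_total for each class
priors = {c: class_counts[c] / n_docs for c in class_counts}
print(priors)  # {'spam': 0.4, 'ham': 0.6}
```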
2. The likelihood of a word given a class is P(w|c) = count(w, c) / count(c), where w represents a word and c represents a class; count(w, c) is the number of times that word occurs in that class, and count(c) is the total number of words in that class.
We also use add-1 smoothing in this case. The purpose is to avoid a probability of 0 for any single word: we add 1 to the numerator and add |V|, the size of the vocabulary (the set of all distinct words), to the denominator, giving P(w|c) = (count(w, c) + 1) / (count(c) + |V|). Even if a word does not occur in a class, its probability will not be assigned 0.
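The smoothed likelihood can be sketched as follows; the token lists here are invented toy data, not the lab's corpus:

```python
from collections import Counter

# Hypothetical tokenized training messages, grouped by class.
spam_tokens = ["win", "free", "prize", "free"]
ham_tokens = ["see", "you", "at", "lunch"]

vocab = set(spam_tokens) | set(ham_tokens)
V = len(vocab)  # total number of distinct words across both classes

def likelihood(word, class_tokens):
    # Add-1 smoothing: P(w|c) = (count(w, c) + 1) / (count(c) + |V|)
    counts = Counter(class_tokens)
    return (counts[word] + 1) / (len(class_tokens) + V)

# "free" occurs twice among the 4 spam tokens, and V = 7,
# so P("free"|spam) = (2 + 1) / (4 + 7) = 3/11
print(likelihood("free", spam_tokens))
# An unseen word still gets a nonzero probability: 1/11
print(likelihood("pizza", spam_tokens))
```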
3. Compute the priors defined in step 1.
Calculate the Conditional Probabilities
By implementing the equation in step 2, calculate P(word|class) for every word in the SMS, both in spam and in ham. After that we are able to calculate P(class ham|sms) and P(class spam|sms), which is P(c|sms) proportional to P(c) * product over i of P(w_i|c), where i ranges over every word in the SMS. Using the same equation for spam, calculate the ratio of the two posteriors and we are all good to go (:
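The whole pipeline above can be sketched end to end. The training pairs below are hypothetical toy data; logs are used instead of raw products to avoid numerical underflow when an SMS has many words:

```python
import math
from collections import Counter

# Hypothetical toy training data: (tokens, label) pairs.
train = [
    (["win", "free", "prize"], "spam"),
    (["free", "entry", "now"], "spam"),
    (["see", "you", "at", "lunch"], "ham"),
    (["are", "you", "coming", "home"], "ham"),
]

classes = {label for _, label in train}
vocab = {w for tokens, _ in train for w in tokens}
V = len(vocab)

# Priors: P(c) = N_c / N_total
priors = {c: sum(1 for _, l in train if l == c) / len(train) for c in classes}
# Per-class word counts and per-class total word counts
word_counts = {c: Counter(w for tokens, l in train if l == c for w in tokens)
               for c in classes}
total_words = {c: sum(word_counts[c].values()) for c in classes}

def predict(sms_tokens):
    # log P(c|sms) = log P(c) + sum_i log P(w_i|c), up to a shared constant
    scores = {}
    for c in classes:
        score = math.log(priors[c])
        for w in sms_tokens:
            # Add-1 smoothed likelihood from step 2
            score += math.log((word_counts[c][w] + 1) / (total_words[c] + V))
        scores[c] = score
    return max(scores, key=scores.get)

print(predict(["free", "prize", "now"]))  # expected: 'spam'
```

Comparing log scores is equivalent to taking the ratio of the two posteriors, since the normalizing constant P(sms) cancels.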