This article has been revised.
The original article was published on October 8, 2020. Due to an oversight on our part, it contained a serious error. To uphold academic rigor and avoid misleading readers, we withdrew the original article immediately. The corrected article is republished below; we apologize for the mistake.
Introducing the Naive Bayes Classifier
import numpy as np
from fractions import Fraction

data = """1 0 1
0 0 0
0 1 1
0 0 1
1 1 0
1 0 1
0 1 0
0 1 1"""
data = np.array([[int(x) for x in line.split()] for line in data.split('\n')], dtype='int')
X = data[:, :-1]
Y = data[:, -1]
1. Probabilities necessary for an NB classifier
To classify future data points, we train the model by storing the likelihood and prior probability values; these are later combined to compute the posterior probability of a new data point at inference time.
1. Compute the likelihood values using:
def pxiy(i, xv, yv):
    ''' calculates P(xi=xv|y=yv) as (number of rows with xi=xv and y=yv) / (number of rows with y=yv) '''
    return Fraction(sum(X[Y == yv][:, i] == xv), sum(Y == yv))
2. Compute the prior probabilities using:
def py(yv):
    ''' calculates P(y=yv) as (number of rows with y=yv) / (total number of rows) '''
    return Fraction(sum(Y == yv), len(Y))
P(xi|y): the complementary probabilities not shown below can be obtained from the fact that P(xi=0|y) + P(xi=1|y) = 1:
for jy in [0, 1]:
    for ix in [0, 1]:
        print(f"P(x{ix+1}=0|y={jy}) = {pxiy(ix, 0, jy)}")
# Output:
# P(x1=0|y=0) = 2/3
# P(x2=0|y=0) = 1/3
# P(x1=0|y=1) = 3/5
# P(x2=0|y=1) = 3/5
P(y): the probability not shown, P(y=1), follows from P(y=0) + P(y=1) = 1:
print(f"P(y=0) = {py(0)}")
# Output:
# P(y=0) = 3/8
Naive Bayes solves the problem of the combinatorial explosion in the number of parameters the model would otherwise need to store. By assuming that the input features are conditionally independent given the target variable y, the number of parameters required by the prediction model is greatly reduced. These are the parameters of the naive Bayes classifier; in real-world applications these values are saved and reused to compute predictions.
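To make the parameter savings concrete, here is a small arithmetic sketch of my own (the function names are illustrative, not from the article's code): with d binary features, a full joint likelihood table needs 2^d − 1 free parameters per class, while naive Bayes needs only d per class.

```python
def full_joint_params(d):
    # A full joint table over d binary features has 2**d cells;
    # they must sum to 1, so 2**d - 1 free parameters per class.
    return 2**d - 1

def naive_bayes_params(d):
    # Under conditional independence, we only store P(xi=1|y) for
    # each feature: d free parameters per class.
    return d

for d in [2, 10, 30]:
    print(d, full_joint_params(d), naive_bayes_params(d))
# d=2:  3 vs 2
# d=10: 1023 vs 10
# d=30: 1073741823 vs 30
```

For our two-feature example the difference is tiny, but it grows exponentially with the number of features.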
Using your NB model, what value of y is predicted given the observation (x1, x2) = (1, 1)?
Naive Bayes assumes conditional independence between the input features x given the target y. For our example data:

P(x1, x2 | y) → P(x1 | y) · P(x2 | y)

It is important to use the right arrow rather than an equals sign, since the left and right sides may not be equal; they are equal only if the naive Bayes assumption holds. When it does not, the value calculated under the naive Bayes assumption is still a useful approximation of the true likelihood.
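To see how good (or bad) the approximation is on this data set, here is a quick check of my own comparing the empirical joint likelihood P(x1=1, x2=1 | y=1) with its naive Bayes factorization (the two do not match here, so the assumption does not hold exactly):

```python
import numpy as np
from fractions import Fraction

# Same toy data set as above: last column is y.
data = np.array([[1,0,1],[0,0,0],[0,1,1],[0,0,1],
                 [1,1,0],[1,0,1],[0,1,0],[0,1,1]])
X, Y = data[:, :-1], data[:, -1]

def pxiy(i, xv, yv):
    # P(xi=xv | y=yv) estimated by counting
    return Fraction(int(sum(X[Y == yv][:, i] == xv)), int(sum(Y == yv)))

# Empirical joint: fraction of y=1 rows with x1=1 AND x2=1.
rows = X[Y == 1]
joint = Fraction(int(sum((rows[:, 0] == 1) & (rows[:, 1] == 1))), len(rows))

# Naive Bayes factorization of the same quantity.
nb = pxiy(0, 1, 1) * pxiy(1, 1, 1)

print(joint)  # 0
print(nb)     # 4/25
```

No y=1 row has both features equal to 1, so the true joint likelihood is 0, while the factorized estimate is 4/25; the product is only an approximation.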
def naiveBayes(xv, yv):
    ''' calculates P(x1,x2|y=yv) using the naive Bayes assumption '''
    return np.prod([pxiy(i, xv[i], yv) for i in range(len(xv))])
To calculate the posterior probability, we can use Bayes' rule:

P(y=yv | x=xv) = P(x=xv | y=yv) · P(y=yv) / P(x=xv)

where P(x=xv) can be calculated as

P(x=xv) = P(x=xv | y=0) · P(y=0) + P(x=xv | y=1) · P(y=1)

using the law of total probability, with each likelihood factored via the naive Bayes assumption.
def pygivenx(xv, yv):
    ''' calculates P(y=yv|x1,x2) '''
    return (naiveBayes(xv, yv) * py(yv)
            / (naiveBayes(xv, 0) * py(0) + naiveBayes(xv, 1) * py(1)))
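As a consistency check of my own: because pygivenx divides by the same evidence term P(x=xv) for every class, the posteriors over y must sum to exactly 1 for any observation. With Fraction arithmetic this holds exactly, not just up to rounding:

```python
import numpy as np
from fractions import Fraction

# Same toy data set as above: last column is y.
data = np.array([[1,0,1],[0,0,0],[0,1,1],[0,0,1],
                 [1,1,0],[1,0,1],[0,1,0],[0,1,1]])
X, Y = data[:, :-1], data[:, -1]

def pxiy(i, xv, yv):
    return Fraction(int(sum(X[Y == yv][:, i] == xv)), int(sum(Y == yv)))

def py(yv):
    return Fraction(int(sum(Y == yv)), len(Y))

def naiveBayes(xv, yv):
    # product of per-feature likelihoods, kept as an exact Fraction
    p = Fraction(1)
    for i in range(len(xv)):
        p *= pxiy(i, xv[i], yv)
    return p

def pygivenx(xv, yv):
    evidence = naiveBayes(xv, 0) * py(0) + naiveBayes(xv, 1) * py(1)
    return naiveBayes(xv, yv) * py(yv) / evidence

for xv in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(xv, pygivenx(xv, 0) + pygivenx(xv, 1))  # prints 1 for every xv
```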
Since the denominator P(x=xv) is the same for every value of y, it does not affect which y maximizes the posterior. We can therefore compute only the numerator P(x=xv|y=yv) · P(y=yv) and compare it across the possible y values. In this case, since y takes only two possible values, predict can be computed by finding the y that maximizes P(y=yv|x=xv):
def predict(xv):
    ''' calculate probability using the naive Bayes assumption,
        comparing only P(x|y=yv) * P(y=yv) '''
    return np.argmax([naiveBayes(xv, 0) * py(0), naiveBayes(xv, 1) * py(1)])
predictSoft computes the probability of each class using P(y=yv|x=xv):
def predictSoft(xv):
    ''' calculate the class probabilities using the naive Bayes assumption '''
    return [float(round(pygivenx(xv, 0), 2)),
            float(round(pygivenx(xv, 1), 2))]  # [ P(y=0|x) , P(y=1|x) ]

predictSoft([1,1])
# Output: [0.45, 0.55]
predict([1,1])
# Output: 1
The answer is y = 1, since the classifier predicts probability 0.55 that y=1 and 0.45 that y=0.
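As a small sketch of my own, evaluating predict on all four possible observations gives the classifier's full decision table (here the argmax is written with plain Python comparisons so the exact Fractions are compared directly):

```python
import numpy as np
from fractions import Fraction

# Same toy data set as above: last column is y.
data = np.array([[1,0,1],[0,0,0],[0,1,1],[0,0,1],
                 [1,1,0],[1,0,1],[0,1,0],[0,1,1]])
X, Y = data[:, :-1], data[:, -1]

def pxiy(i, xv, yv):
    return Fraction(int(sum(X[Y == yv][:, i] == xv)), int(sum(Y == yv)))

def py(yv):
    return Fraction(int(sum(Y == yv)), len(Y))

def naiveBayes(xv, yv):
    p = Fraction(1)
    for i in range(len(xv)):
        p *= pxiy(i, xv[i], yv)
    return p

def predict(xv):
    # compare the unnormalized posteriors P(x|y) * P(y)
    scores = [naiveBayes(xv, 0) * py(0), naiveBayes(xv, 1) * py(1)]
    return scores.index(max(scores))

for xv in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(xv, '->', predict(xv))
# [0, 0] -> 1
# [0, 1] -> 0
# [1, 0] -> 1
# [1, 1] -> 1
```

Only the observation (x1, x2) = (0, 1) is mapped to class 0; every other input is classified as y = 1.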
Using your NB model, what is the probability p(y = 0|x1 = 1, x2 = 0)?
float(pygivenx(xv=[1,0], yv=0))
# Output: 0.21739130434782608
The answer is P(y=0|x1=1, x2=0) ≈ 0.217
Using your NB model, what is the probability p(y = 1|x2 = 0)?
Here x2 is at index 1 of X, therefore i=1:
float(pxiy(i=1, xv=0, yv=1) * py(yv=1)
      / (pxiy(i=1, xv=0, yv=0) * py(yv=0) + pxiy(i=1, xv=0, yv=1) * py(yv=1)))
# Output: 0.75
The answer is P(y=1|x2=0)=0.75
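When only x2 is observed, the same answer can be recovered by marginalizing the naive Bayes joint over the unobserved feature x1 (a cross-check of my own; helper names follow the article's code):

```python
import numpy as np
from fractions import Fraction

# Same toy data set as above: last column is y.
data = np.array([[1,0,1],[0,0,0],[0,1,1],[0,0,1],
                 [1,1,0],[1,0,1],[0,1,0],[0,1,1]])
X, Y = data[:, :-1], data[:, -1]

def pxiy(i, xv, yv):
    return Fraction(int(sum(X[Y == yv][:, i] == xv)), int(sum(Y == yv)))

def py(yv):
    return Fraction(int(sum(Y == yv)), len(Y))

def naiveBayes(xv, yv):
    p = Fraction(1)
    for i in range(len(xv)):
        p *= pxiy(i, xv[i], yv)
    return p

# Sum the joint P(x1, x2=0, y) over the unobserved x1.
num = sum(naiveBayes([x1, 0], 1) * py(1) for x1 in (0, 1))
den = sum(naiveBayes([x1, 0], yv) * py(yv) for x1 in (0, 1) for yv in (0, 1))
print(num / den)  # 3/4
```

Because P(x1=0|y) + P(x1=1|y) = 1, the sum over x1 collapses to P(x2=0|y) · P(y), which is exactly the direct computation above, giving 3/4 = 0.75 again.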
Written by / CUCS Academic Department 梁力天
Edited by / CUCS Publicity Department
© CUCS 2020.10