Naive Bayes
Naive Bayes is a probabilistic machine learning algorithm based on Bayes' Theorem, used in a wide variety of classification tasks.
Bayes' Theorem
Bayes’ Theorem is a simple mathematical formula used for calculating conditional probabilities.
The formula is:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
It tells us how often A happens given that B happens, written P(A|B), when we know how often B happens given that A happens, written P(B|A), and how likely A and B are on their own. Naive Bayes calculates these probabilities for every factor and then selects the outcome with the highest probability.
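As a quick illustration, here is the formula applied to made-up numbers (a minimal sketch; the values of P(B|A), P(A), and P(B) below are arbitrary assumptions):

```python
# Bayes' Theorem on illustrative, made-up probabilities.
p_b_given_a = 0.9   # P(B|A): how often B happens given that A happens
p_a = 0.01          # P(A): how likely A is on its own
p_b = 0.05          # P(B): how likely B is on its own

# P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(f"P(A|B) = {p_a_given_b:.2f}")  # P(A|B) = 0.18
```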
The fundamental Naive Bayes assumption is that each feature makes an independent and equal contribution to the outcome.
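Formally, for an outcome $C$ and features $x_1, x_2, \ldots, x_n$, this independence assumption means the joint likelihood factorizes into per-feature terms:

$$P(x_1, x_2, \ldots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)$$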
If we have a certain event $E$ and test factors $x_1, x_2, x_3, \ldots$, we first calculate $P(x_1 \mid E), P(x_2 \mid E), \ldots$ (read as the probability of $x_1$ given that event $E$ happened) and then select the test factor $x$ with the maximum probability value, as in the sketch below.
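Here is a minimal from-scratch sketch of that select-the-maximum procedure for categorical features; the tiny weather dataset, feature names, and labels below are invented purely for illustration:

```python
from collections import Counter, defaultdict

# Toy training data: (features, label) pairs, invented for illustration.
data = [
    ({"outlook": "sunny", "windy": "no"},  "play"),
    ({"outlook": "sunny", "windy": "yes"}, "no_play"),
    ({"outlook": "rainy", "windy": "yes"}, "no_play"),
    ({"outlook": "rainy", "windy": "no"},  "play"),
    ({"outlook": "sunny", "windy": "no"},  "play"),
]

# Count class frequencies and per-class feature-value frequencies.
class_counts = Counter(label for _, label in data)
feature_counts = defaultdict(Counter)  # (label, feature name) -> value counts
for features, label in data:
    for name, value in features.items():
        feature_counts[(label, name)][value] += 1

def predict(features):
    """Score each class by P(class) * product of P(value | class) and
    return the class with the highest (unnormalized) posterior."""
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / len(data)  # prior P(class)
        for name, value in features.items():
            # Likelihood P(feature=value | class); 0 if never seen.
            score *= feature_counts[(label, name)][value] / count
        if score > best_score:
            best_label, best_score = label, score
    return best_label

print(predict({"outlook": "sunny", "windy": "no"}))  # -> play
```

Note that a feature value never seen for a class makes that class's whole product zero; this is the zero-frequency problem discussed under Disadvantages below.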
It is a powerful algorithm used for:
- Real-time prediction
- Text classification / spam filtering (see the sketch after this list)
- Recommendation systems
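For text classification and spam filtering, a common setup is bag-of-words counts fed into a multinomial Naive Bayes model. A hedged sketch using scikit-learn (the toy messages and labels are invented for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy corpus, invented for illustration.
messages = [
    "win a free prize now",
    "limited offer click now",
    "meeting rescheduled to monday",
    "lunch tomorrow with the team",
]
labels = ["spam", "spam", "ham", "ham"]

# Turn each message into word-count features, then fit the classifier.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)
model = MultinomialNB()
model.fit(X, labels)

test = vectorizer.transform(["free prize offer"])
print(model.predict(test))  # likely ['spam'] on this toy data
```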
Advantages
- It is not only a simple approach but also a fast and accurate method for prediction.
- Naive Bayes has very low computation cost.
- It can efficiently work on a large dataset.
- It performs better with discrete response variables than with continuous ones.
- It can be used for multi-class prediction problems.
- It also performs well on text analytics problems.
- When the assumption of independence holds, a Naive Bayes classifier performs better than other models such as logistic regression.
Disadvantages
- It assumes independent features. In practice, it is almost impossible that the model will see a set of predictors that are entirely independent.
- If there is no training tuple for a particular class and feature value, this causes a zero posterior probability, and the model is unable to make a prediction. This is known as the Zero Probability/Frequency Problem.
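The usual remedy is Laplace (add-one) smoothing: pretend every feature value was seen one extra time in every class, so no conditional probability is ever exactly zero. A minimal sketch with made-up counts (scikit-learn's MultinomialNB applies the same idea through its alpha parameter):

```python
def smoothed_likelihood(value_count, class_count, n_values, alpha=1):
    """P(feature=value | class) with additive (Laplace) smoothing."""
    return (value_count + alpha) / (class_count + alpha * n_values)

# Unsmoothed: a value never seen in this class (0 out of 2 examples)
# gets probability 0 and zeroes out the whole posterior product.
print(0 / 2)                                   # 0.0
# Smoothed, with 2 possible values for the feature: small but non-zero.
print(smoothed_likelihood(0, 2, n_values=2))   # 0.25
```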