SpamBayes

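  1. SpamBayes is a Bayesian spam filter written in Python that uses the techniques Paul Graham laid out in his essay "A Plan for Spam". It was subsequently improved by Gary Robinson and Tim Peters, among others.
  2. The most notable difference between a conventional Bayesian filter and the one used by SpamBayes is that there are three classifications rather than two: spam, non-spam (called ham in SpamBayes), and unsure.
  3. The user trains messages as either ham or spam. When filtering a message, the filter generates one score for ham and another for spam. If the spam score is high and the ham score is low, the message is classified as spam.
  4. If the spam score is low and the ham score is high, the message is classified as ham. If the scores are both high or both low, the message is classified as unsure.
  5. The unsure category keeps the number of false positives and false negatives low, but it can leave a number of unsure messages that need a human decision.
  6. From Wikipedia; the original text follows: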
    SpamBayes is a Bayesian spam filter written in Python which uses techniques laid out by Paul Graham in his essay “A Plan for Spam”. It has subsequently been improved by Gary Robinson and Tim Peters, among others.
    The most notable difference between a conventional Bayesian filter and the filter used by SpamBayes is that there are three classifications rather than two: spam, non-spam (called ham in SpamBayes), and unsure. The user trains a message as being either ham or spam; when filtering a message, the spam filters generate one score for ham and another for spam.
    If the spam score is high and the ham score is low, the message will be classified as spam.
    If the spam score is low and the ham score is high, the message will be classified as ham.
    If the scores are both high or both low, the message will be classified as unsure.
    This approach leads to a low number of false positives and false negatives, but it may result in a number of unsures which need a human decision.
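
The three-way decision described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not SpamBayes's own code or API: the function name `classify` and the `high` and `low` thresholds are assumptions made up for this example, and both scores are assumed to lie between 0 and 1.

```python
def classify(spam_score: float, ham_score: float,
             high: float = 0.8, low: float = 0.2) -> str:
    """Return "spam", "ham", or "unsure" from two scores in [0, 1].

    The thresholds `high` and `low` are illustrative assumptions,
    not values taken from SpamBayes.
    """
    # Spam: strong evidence for spam and weak evidence for ham.
    if spam_score >= high and ham_score <= low:
        return "spam"
    # Ham: weak evidence for spam and strong evidence for ham.
    if spam_score <= low and ham_score >= high:
        return "ham"
    # Everything else (both high, both low, or in between) is left to a human.
    return "unsure"


if __name__ == "__main__":
    for scores in [(0.95, 0.05), (0.05, 0.95), (0.9, 0.9), (0.1, 0.1)]:
        print(scores, classify(*scores))
```

Widening the gap between `low` and `high` should send more messages to "unsure", which is the trade-off the text mentions: fewer wrong automatic decisions in exchange for more messages that need a human to decide.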