The Fisher Method

Latest recommended article published 2024-07-04 10:26:47

Running_you

Categories: algorithms, python

Article link: https://blog.csdn.net/sinat_29508201/article/details/47616527

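The naive Bayes classifier cannot accurately estimate classification probabilities, whereas the Fisher method offers a way to directly estimate the probability that a feature belongs to a particular category. Using a basic code example, this post looks at how the Fisher method can improve classification, and notes that more related material will follow.

Summary generated by CSDN using automated tools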

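Naive Bayes does not give a usable estimate of the classification probability itself; it can only indicate which category an item most probably belongs to. The Fisher method makes up for this shortcoming: it directly estimates the probability that a feature belongs to a given category. The basic code is listed below for reference; more content will be added later.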
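
Before the listing, a minimal sketch of the idea may help. It is not the author's implementation: the names `tokenize`, `FisherSketch`, `cprob`, `fisherprob` and `invchi2` are hypothetical, and the smoothing constants (a prior of 0.5 with weight 1.0) are assumptions chosen only so that unseen features never produce a probability of zero. The sketch estimates P(category | feature) for each feature, combines the estimates as -2 times the sum of their logarithms, and passes the result through an inverse chi-square function, which is the usual formulation of Fisher's combination method for text classification.

```python
import math
import re


def tokenize(doc):
    # Lowercased words of 3 to 19 characters, de-duplicated per document.
    return {w.lower() for w in re.split(r'\W+', doc) if 2 < len(w) < 20}


def invchi2(chi, df):
    # Inverse chi-square function for an even number of degrees of freedom.
    m = chi / 2.0
    total = term = math.exp(-m)
    for i in range(1, df // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


class FisherSketch:
    def __init__(self, prior=0.5, weight=1.0):
        self.fc = {}          # feature -> {category: document count}
        self.cc = {}          # category -> number of training documents
        self.prior = prior    # assumed probability for unseen features
        self.weight = weight  # assumed strength of that prior

    def train(self, doc, cat):
        for f in tokenize(doc):
            counts = self.fc.setdefault(f, {})
            counts[cat] = counts.get(cat, 0) + 1
        self.cc[cat] = self.cc.get(cat, 0) + 1

    def fprob(self, f, cat):
        # P(feature | category): share of the category's documents containing f.
        n_docs = self.cc.get(cat, 0)
        if n_docs == 0:
            return 0.0
        return self.fc.get(f, {}).get(cat, 0) / n_docs

    def cprob(self, f, cat):
        # P(category | feature), normalised over all categories and then
        # smoothed toward the assumed prior.
        total = sum(self.fprob(f, c) for c in self.cc)
        basic = self.fprob(f, cat) / total if total else 0.0
        seen = sum(self.fc.get(f, {}).values())
        return (self.weight * self.prior + seen * basic) / (self.weight + seen)

    def fisherprob(self, doc, cat):
        # Combine the per-feature probabilities with Fisher's method.
        feats = tokenize(doc)
        score = -2.0 * sum(math.log(self.cprob(f, cat)) for f in feats)
        return invchi2(score, 2 * len(feats))


# Same three training sentences as the sample data in the listing.
sketch = FisherSketch()
sketch.train('hello,everybody,welcome to suning', 'good')
sketch.train('hello,everybody,nice to meet you', 'good')
sketch.train('hello,everybody,IT is bad to you', 'bad')
print(sketch.fisherprob('welcome suning', 'good'))
print(sketch.fisherprob('welcome suning', 'bad'))
```

Unlike a plain argmax over categories, each call returns a score between 0 and 1 for one category, so a per-category threshold can be applied afterwards. The author's own listing follows.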

# -*- coding: utf-8 -*-
import re
import math

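def getwords(doc):
    # Split on runs of non-word characters. The original pattern '\W*' can
    # match the empty string, which on Python 3.7+ splits between every character.
    splitter=re.compile(r'\W+')

    # Keep lowercased words that are 3 to 19 characters long
    words=[s.lower() for s in splitter.split(doc) if len(s)>2 and len(s)<20]
    # Return only the set of distinct words
    return dict([(w,1) for w in words])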

def sampletrain(cl):
    cl.train('hello,everybody,welcome to suning','good')
    cl.train('hello,everybody,nice to meet you','good')
    cl.train('hello,everybody,IT is bad to you','bad')

class classifier:
    def __init__(self,getfeatures,filename=None):
        self.fc={}
        self.cc={}
        self.getfeatures=getfeatures

    # Parameter types: word, document category (tezh