Machine Learning Algorithms and Programming (Zheng Jie): C45D (C4.5) algorithm, changes for a Python 3 implementation

Pinned post by 9527----到

Category: pyer. Tags: Machine Learning Algorithms and Programming, python3

Article link: https://blog.csdn.net/jingtaoqian8521/article/details/79723562

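Besides the loadDataSet function covered in the previous post, the other place that needs attention in this algorithm is the information gain ratio calculation in the textbook's getBestFeat() function, which divides one array by another. The gain ratio has to be computed separately for each feature, so that division must stay element-wise. Replacing it with dot(A, B.T) is not an equivalent fix: for two 1-D arrays, dot returns a single scalar (their inner product), so the per-feature ratios are lost and argsort has nothing left to rank. With NumPy arrays, infoGainArray/array(splitInfo) divides element by element under Python 3 as well. The one case to guard is a split information of 0 (a feature that takes only one value), which would otherwise produce a division-by-zero warning and a nan or inf ratio.

The code:

def getBestFeat(self,dataSet):
    # ones, zeros, array and argsort are NumPy functions and must be imported from numpy
    Num_Feats=len(dataSet[0][:-1])    # the last column is the class label
    totality=len(dataSet)
    BaseEntropy=self.computeEntropy(dataSet)
    ConditionEntropy=[]
    splitInfo=[]
    allFeatVList=[]
    for f in range(Num_Feats):
        featList=[example[f] for example in dataSet]
        [splitI,featureValueList]=self.computeSplitInfo(featList)
        allFeatVList.append(featureValueList)
        splitInfo.append(splitI)
        resultGain=0.0    # accumulates the conditional entropy of feature f
        for value in featureValueList:
            subSet=self.splitDataSet(dataSet,f,value)
            appearNum=float(len(subSet))
            subEntropy=self.computeEntropy(subSet)
            resultGain+=(appearNum/totality)*subEntropy
        ConditionEntropy.append(resultGain)
    infoGainArray=BaseEntropy*ones(Num_Feats)-array(ConditionEntropy)
    splitInfoArray=array(splitInfo,dtype=float)
    # gain ratio = information gain / split information, element by element;
    # features whose split information is 0 keep a ratio of 0
    infoGainRatio=zeros(Num_Feats)
    nonZero=splitInfoArray!=0
    infoGainRatio[nonZero]=infoGainArray[nonZero]/splitInfoArray[nonZero]
    bestFeatureIndex=argsort(-infoGainRatio)[0]
    return bestFeatureIndex,allFeatVList[bestFeatureIndex]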
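
To make the quantity that getBestFeat ranks more concrete, here is a minimal standard-library sketch of the per-feature gain ratio on a toy data set. It is not the textbook's code: the names entropy, gain_ratios and toy are hypothetical, and it assumes that split information is the entropy of a feature's own value distribution, which is the usual C4.5 definition. The point it illustrates is that the result is one ratio per feature, which is what makes ranking possible.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratios(data_set):
    """Return one C4.5 gain ratio per feature column.

    Each row is [feature_0, ..., feature_k, class_label].
    """
    labels = [row[-1] for row in data_set]
    base_entropy = entropy(labels)
    ratios = []
    for f in range(len(data_set[0]) - 1):
        column = [row[f] for row in data_set]
        # Conditional entropy: entropy of each subset, weighted by its size.
        cond_entropy = 0.0
        for value in set(column):
            subset = [row[-1] for row in data_set if row[f] == value]
            cond_entropy += len(subset) / len(data_set) * entropy(subset)
        info_gain = base_entropy - cond_entropy
        # Assumption: split information = entropy of the feature's values.
        split_info = entropy(column)
        # A single-valued feature has split_info == 0; give it a ratio of 0.
        ratios.append(info_gain / split_info if split_info > 0 else 0.0)
    return ratios


# Hypothetical toy data: feature 0 determines the label, feature 1 does not.
toy = [
    ["sunny", "hot", "no"],
    ["sunny", "cool", "no"],
    ["rainy", "hot", "yes"],
    ["rainy", "cool", "yes"],
]
ratios = gain_ratios(toy)
best = max(range(len(ratios)), key=ratios.__getitem__)
print(ratios, best)
```

Because gain_ratios returns a list with one entry per feature, picking the best feature is a plain argmax over that list. Collapsing the gains and split information into a single number, as an inner product does, would discard exactly the information the selection step needs.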
