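Statistical Learning Methods, Chapter 5 exercises: code for ID3/C4.5 classification decision trees and a squared-error binary regression tree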

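This post shows how to build a classification decision tree with the ID3 and C4.5 algorithms and discusses the implementation of a squared-error binary regression tree, using Python to demonstrate decision tree algorithms in machine learning.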
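To make the ID3 selection criterion concrete, here is a minimal sketch, separate from the implementation in this post, of empirical entropy H(D) and information gain g(D, A) = H(D) - H(D|A). The function names `entropy` and `information_gain` and the toy data are illustrative assumptions, not part of the original code.

```python
import math
from collections import Counter


def entropy(labels):
    """Empirical entropy H(D), in bits, of a sequence of class labels."""
    total = len(labels)
    return -sum(n / total * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """g(D, A) = H(D) - H(D|A) for one discrete feature A."""
    total = len(labels)
    conditional = 0.0
    for v in set(feature_values):
        # Labels of the samples whose feature value equals v.
        subset = [y for x, y in zip(feature_values, labels) if x == v]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional


# Hypothetical toy data: one feature that separates the classes perfectly
# and one that carries no information about them.
labels = ["yes", "yes", "no", "no"]
perfect = ["a", "a", "b", "b"]
useless = ["a", "b", "a", "b"]

print(entropy(labels))                    # 1.0
print(information_gain(perfect, labels))  # 1.0
print(information_gain(useless, labels))  # 0.0
```

ID3 picks the feature with the largest gain at each node, so in this toy case it would split on `perfect`.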

ID3/C4.5 classification decision tree

import numpy as np
import math
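class Node:
    # One tree node: the feature it concerns, an associated feature value,
    # its child nodes, and a class label.
    def __init__(self,feature_index=None,value=None,label=None):
        self.feature_index=feature_index
        self.value=value
        self.child=[]
        self.label=label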


class C4_5:
    def __init__(self,X,Y,c=0.1,way='ID3'):
        self.c = c
        self.root=Node()
        self.X = X
        self.Y = Y
        self.feature_num = len(X[0])
        self.label_num = len(Y)
        self.feature_set = list(range(self.feature_num))
        self.getac()
        self.way = way

    def getac(self):
        # Collect the set of distinct values taken by each feature and by the labels.
        self.dict_x = {}
        self.dict_y = set(self.Y)
        for i in range(self.feature_num):
            self.dict_x[i] = set([x[i] for x in self.X])

    @staticmethod
    def get_label(list_):
        return max(list_, key=list_.count)

    @staticmethod
    def count_Y(Y):
        # Count how many times each label occurs in Y.
        dict_y = {}
        for i in Y:
            if i in dict_y:
                dict_y[i]+=1
            else:
                dict_y[i] = 1
        return dict_y

    def experience_entropy(self,Y):
        # Empirical entropy H(D) = -sum_k (|C_k|/|D|) * log2(|C_k|/|D|).
        dict_y = self.count_Y(Y)
        D = len(Y)
        set_y = set(Y)
        return -sum([dict_y[x]/D*math.log(dict_y[x]/D,2) for x in set_y])

    def get_feature(self,X,Y,rest_x):
        # Score every remaining feature in rest_x; entropy_ collects one score per feature.
        HD = self.experience_entropy(Y)
        Y = np.array(Y)
        X = np.array(X)
        entropy_ = []

        if self.way == 'ID3':
            for i in rest_x:
                sum_ = 0  # accumulates the conditional entropy H(D|A_i)
                list_x = np.array([x[i] for x in X])
                for j in self.dict_x[i]:
                    sum__ = 0  # sum of p*log2(p) over classes within the subset A_i == j
                    Di = sum(list_x == j)
                    if Di != 0:
                        for m in self.dict_y:
                            Dik = sum(Y[list_x == j]==m)
                            if Dik != 0:
                                sum__ += Dik/Di*math.log(Dik/Di,2)
                    sum_ -= Di/len(list_x)*sum__
                add_entropy = HD - sum_  # information gain g(D, A_i)
                entropy_.append(add_entropy)

        if self.way == 'C45':
            for i in rest_x:
                sum_ = 0
                list_x = np.array([x[i] for x in X])
                for j in self.dict_x[i]:
                    sum__ = 0
                    HAD = 0
                    Di = sum(list_x == j)
                    if Di != 0:
                        for m in self.dict_y:
                            Dik = sum(Y[list_x == j]==m)
                            if Dik != 0
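
The listing above stops partway through the C4.5 branch, so the rest of the author's gain-ratio code is not shown. As a rough, independent sketch (not the author's implementation), the C4.5 criterion divides the information gain by the entropy of the feature itself, g_R(D, A) = g(D, A) / H_A(D); the `HAD` variable in the listing presumably accumulates H_A(D). The names `entropy` and `gain_ratio`, the toy data, and the choice to return 0.0 when H_A(D) is zero are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Entropy in bits of the empirical distribution of `values`."""
    total = len(values)
    return -sum(n / total * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """g_R(D, A) = g(D, A) / H_A(D); returns 0.0 when H_A(D) is 0 (assumed convention)."""
    total = len(labels)
    conditional = 0.0
    for v, count in Counter(feature_values).items():
        # Labels of the samples whose feature value equals v.
        subset = [y for x, y in zip(feature_values, labels) if x == v]
        conditional += count / total * entropy(subset)
    gain = entropy(labels) - conditional
    split_info = entropy(feature_values)  # H_A(D): entropy of the feature's own values
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: both features separate the classes perfectly,
# but the ID-like feature does so by taking a different value per sample.
labels = ["yes", "yes", "no", "no"]
binary = ["a", "a", "b", "b"]
id_like = ["1", "2", "3", "4"]

print(gain_ratio(binary, labels))   # 1.0
print(gain_ratio(id_like, labels))  # 0.5
```

Both features have an information gain of 1 bit here, but the gain ratio penalizes the many-valued `id_like` feature, which is the motivation for using it in place of plain information gain.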