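❤❤❤ID3 Algorithm
:white_check_mark::white_check_mark:The idea behind decision trees:
Given a set of samples, each described by several attributes, a decision tree is built greedily: at each node it picks the attribute that best splits the current samples, then repeats the process on the resulting subsets. Common decision tree algorithms include ID3, C4.5, and CART.
:zzz::zzz::zzz:ID3 algorithm: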
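ID3 scores each attribute by its information gain: the entropy of the dataset minus the conditional entropy after splitting on that attribute. The key lines of the computation (for attribute index i, accumulated over each of its values val):
baseEntropy = self.calcShannonEnt(dataset)  # base entropy
num = len(dataset)  # total number of samples
# probability of the subset
subDataSet = self.splitDataSet(dataset, i, val)
prob = len(subDataSet) / float(num)
# conditional entropy
newEntropy += prob * self.calcShannonEnt(subDataSet)
# information gain
infoGain = baseEntropy - newEntropy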
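
The fragment above leaves out the surrounding loop and the helper methods. The following is a minimal, self-contained sketch of the same computation, not the author's class: the function names `shannon_entropy`, `split_dataset`, `info_gain`, and `best_feature`, as well as the toy data, are assumptions made for illustration. It assumes each sample is a list whose last element is the class label.

```python
from collections import Counter
from math import log2


def shannon_entropy(dataset):
    """Entropy of the class labels (last column), in bits."""
    counts = Counter(row[-1] for row in dataset)
    total = len(dataset)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def split_dataset(dataset, axis, value):
    """Rows whose feature `axis` equals `value`, with that column removed."""
    return [row[:axis] + row[axis + 1:] for row in dataset if row[axis] == value]


def info_gain(dataset, axis):
    """Base entropy minus the conditional entropy after splitting on `axis`."""
    base_entropy = shannon_entropy(dataset)
    new_entropy = 0.0
    for value in set(row[axis] for row in dataset):
        subset = split_dataset(dataset, axis, value)
        prob = len(subset) / len(dataset)
        new_entropy += prob * shannon_entropy(subset)
    return base_entropy - new_entropy


def best_feature(dataset):
    """Greedy step: index of the feature with the largest information gain."""
    num_features = len(dataset[0]) - 1
    return max(range(num_features), key=lambda i: info_gain(dataset, i))


# Hypothetical toy data: two binary features followed by a class label.
toy = [
    [1, 1, "yes"],
    [1, 1, "yes"],
    [1, 0, "no"],
    [0, 1, "no"],
    [0, 1, "no"],
]
print(best_feature(toy))
```

On this toy data the first feature separates the labels better than the second, so the greedy step selects index 0. A full ID3 implementation would call this step recursively on each subset until the labels are pure or no attributes remain.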
:100:1. First, write the tree-plotting functions.
import matplotlib.pyplot as plt

# Box and arrow styles for decision nodes, leaf nodes, and connecting arrows.
decisionNode = dict(boxstyle="sawtooth", fc="0.8")
leafNode = dict(boxstyle="round4", fc="0.8")
arrow_args = dict(arrowstyle="<-")


def plotNode(nodeTxt, centerPt, parentPt, nodeType):
    # Draw a boxed node label at centerPt, connected by an arrow to parentPt.
    # createPlot.ax1 is the axes object set up by createPlot (not shown in this excerpt).
    createPlot.ax1.annotate(nodeTxt, xy=parentPt, xycoords='axes fraction',
                            xytext=centerPt, textcoords='axes fraction',
                            va="center", ha="center", bbox=nodeType, arrowprops=arrow_args)


def getNumLeafs(myTree):
    # Count the leaf nodes of a tree stored as nested dicts.
    numLeafs = 0
    firstStr = list(myTree.keys())[0]
    secondDict = myTree[firstStr]
    for key in secondDict.keys():
        if type(secondDict[key]).__name__ == 'dict':
            numLeafs += getNumLeafs(secondDict[key])
        else:
            numLeafs += 1
    return numLeafs


def getTreeDepth(myTree):
    # Count the decision levels along the longest path from the root to a leaf.
    maxDepth = 0
    firstStr = list(myTree.keys())[0]
    secondDict = myTree[firstStr]
    for key in secondDict.keys():
        if type(secondDict[key]).__name__ == 'dict':
            thisDepth = getTreeDepth(secondDict[key]) + 1
        else:
            thisDepth = 1
        if thisDepth > maxDepth:
            maxDepth = thisDepth
    return maxDepth


def plotMidText(cntrPt, parentPt, txtString):
    # Write the branch label halfway between a node and its parent.
    xMid = (parentPt[0] - cntrPt[0]) / 2.0 + cntrPt[0]
    yMid = (parentPt[1] - cntrPt[1]) / 2.0 + cntrPt[1]
    createPlot.ax1.text(xMid, yMid, txtString)


def plotTree(myTree, parentPt, nodeTxt):
    numLeafs = getNumLeafs(myTree)
    depth = getTreeDepth(myTree)
    firstStr = list(myTree.keys())[0]
    # Center this node horizontally over the leaves of its subtree.
    cntrPt = (plotTree.xOff + (1.0 + float(numLeafs)) / 2.0 / plotTree.totalw, plotTree.yOff)
    plotMidText(cntrPt, parentPt, nodeTxt)
    plotNode(firstStr, cntrPt, parentPt, decisionNode)
    secondDict = myTree[firstStr]
    # Move down one level before drawing the children.
    plotTree.yOff = plotTree.yOff - 1.0 / plotTree.totalD
    for key in secondDict.keys():
        if type(secondDict[key]).__name__ =
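
The helpers getNumLeafs and getTreeDepth assume the tree is stored as nested dicts of the form `{attribute: {value: subtree_or_label}}`. The sketch below shows what the two functions compute on such a structure. It restates them compactly under the hypothetical names `count_leaves` and `tree_depth` so that it runs on its own, and the sample tree (attribute names and labels) is made up for illustration.

```python
def count_leaves(tree):
    """Number of leaf labels in a nested-dict tree."""
    root = next(iter(tree))
    return sum(
        count_leaves(child) if isinstance(child, dict) else 1
        for child in tree[root].values()
    )


def tree_depth(tree):
    """Number of decision levels on the longest root-to-leaf path."""
    root = next(iter(tree))
    return max(
        tree_depth(child) + 1 if isinstance(child, dict) else 1
        for child in tree[root].values()
    )


# Hypothetical tree: the root tests "no surfacing"; one branch tests "flippers".
sample_tree = {"no surfacing": {0: "no", 1: {"flippers": {0: "no", 1: "yes"}}}}

print(count_leaves(sample_tree))  # 3
print(tree_depth(sample_tree))    # 2
```

The plotting code uses these two numbers to scale the layout: the leaf count determines the horizontal spacing, and the depth determines the vertical spacing.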

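This article introduces the ID3 algorithm, a decision tree learning method that selects the best attribute at each step with a greedy strategy. It covers a Python implementation, including the tree-plotting functions and an ID3 decision tree class, and provides the dataset and label set.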