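I won't go over the theory behind decision trees here; this post is mostly about how to implement one (that is, call the library), as shown below: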
from sklearn.tree import DecisionTreeClassifier
dtc = DecisionTreeClassifier(criterion='gini',max_depth=100)
dtc.fit(x_new,y_train)
y_predict = dtc.predict(x_new2)
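Older releases of scikit-learn cannot post-prune a decision tree, so the only options are to cap its depth with the max_depth parameter and to stop subtrees from splitting further with min_samples_split. Since version 0.22, the ccp_alpha parameter adds minimal cost-complexity pruning as a post-pruning option.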
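To make the effect of these two limits concrete, here is a minimal sketch. It assumes the iris dataset as a stand-in for the post's own data (x_new and y_train are not shown), and the values max_depth=3 and min_samples_split=10 are arbitrary picks for illustration, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Iris stands in for the post's data (an assumption made for this example).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Unrestricted tree: keeps splitting until the leaves are pure.
full = DecisionTreeClassifier(criterion="gini", random_state=0)
full.fit(X_train, y_train)

# Restricted tree: at most 3 levels, and a node needs at least
# 10 samples before it may be split again.
limited = DecisionTreeClassifier(
    criterion="gini", max_depth=3, min_samples_split=10, random_state=0
)
limited.fit(X_train, y_train)

# Compare size and held-out accuracy of the two trees.
for name, model in [("full", full), ("limited", limited)]:
    print(name, model.get_depth(), model.get_n_leaves(),
          model.score(X_test, y_test))
```

Comparing the printed depth and leaf count shows how much the limits shrink the tree; whether accuracy improves depends on the data.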
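Next, visualizing the decision tree. First we need to install a few things, starting with GraphViz. Here is the link: https://graphviz.gitlab.io/_pages/Download/Download_windows.html
I picked Graphviz-2.38.msi. Download it, run the installer, then add its bin directory to the system PATH environment variable, for example D:\Program Files\Graphviz2.38\bin. Second, run pip install graphviz and pip install pydotplus, and that's it.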
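Before moving on, it can help to confirm that the installation actually took effect. The following is a small sketch using only the Python standard library; the helper name find_dot is made up for this example.

```python
import importlib.util
import shutil


def find_dot():
    """Return the full path of the Graphviz `dot` executable, or None if it is not on PATH."""
    return shutil.which("dot")


dot_path = find_dot()
if dot_path is None:
    print("Graphviz 'dot' not found; add its bin directory to PATH")
else:
    print("Graphviz found at", dot_path)

# Check whether the pydotplus package can be imported, without importing it.
has_pydotplus = importlib.util.find_spec("pydotplus") is not None
print("pydotplus installed:", has_pydotplus)
```

If dot is reported as missing right after editing PATH, reopening the terminal or IDE usually picks up the new value.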
There are two ways to produce the visualization; both are shown here.
Method 1
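import os
import pandas as pd
import pydotplus
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
from IPython.display import Image
dtc = DecisionTreeClassifier()
dtc.fit(x_new, y_train)
data_feature_name = list(x.columns)
# Class names must follow the order of dtc.classes_ (the labels in this post are floats)
data_target_name = [str(c) for c in dtc.classes_]
# Make sure the Graphviz executables can be found
os.environ["PATH"] += os.pathsep + 'D:/Program Files/Graphviz2.38/bin/'
# out_file=None returns the DOT source as a string instead of writing a file
dot_tree = tree.export_graphviz(dtc, out_file=None, feature_names=data_feature_name, class_names=data_target_name, filled=True, rounded=True, special_characters=True)
graph = pydotplus.graph_from_dot_data(dot_tree)
img = Image(graph.create_png())  # shows inline in a Jupyter notebook
graph.write_png("D:\\out.png")  # also save the image to disk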
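Since x_new, y_train and x come from earlier parts of the workflow that are not shown here, a self-contained version of the export step may be easier to try out. The sketch below assumes the iris dataset as stand-in data and stops at the DOT text, so it runs even without Graphviz installed; max_depth=3 is an arbitrary choice to keep the output short.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Iris stands in for the post's own data (an assumption for this example).
iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(iris.data, iris.target)

# With out_file=None the DOT source is returned as a string.
dot_text = export_graphviz(
    clf,
    out_file=None,
    feature_names=list(iris.feature_names),
    class_names=list(iris.target_names),
    filled=True,
    rounded=True,
    special_characters=True,
)

# Print the beginning of the DOT source to see what Graphviz will receive.
print(dot_text[:300])
```

The resulting dot_text can be passed to pydotplus.graph_from_dot_data exactly as in the code above.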
Method 2
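from sklearn import tree
from sklearn.datasets import load_iris
iris = load_iris()
clf = tree.DecisionTreeClassifier()
clf = clf.fit(iris.data, iris.target)
# Write the tree to a DOT file that GVEdit can open
tree.export_graphviz(clf, out_file='D:/tree.dot')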
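1. In the bin directory of GraphViz, find the file gvedit.exe.
2. Open it; the interface looks like the figure below:
3. Now click File -> Open to open the dot file.
4. Then click the button shown in the figure below to render the output and save it as a png/pdf file.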
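The same conversion can also be done without the GUI, using the dot command-line tool that ships in the same bin directory. This is only a sketch: the helper build_dot_command is a made-up name for this example, the paths reuse the D:/tree.dot file from above, and the command is run only when dot is on PATH and the file exists.

```python
import os
import shutil
import subprocess


def build_dot_command(dot_file, out_file, fmt="png"):
    """Build the Graphviz command line: dot -T<fmt> <dot_file> -o <out_file>."""
    return ["dot", f"-T{fmt}", dot_file, "-o", out_file]


cmd = build_dot_command("D:/tree.dot", "D:/tree.png")
print(" ".join(cmd))

# Only run the conversion when Graphviz is on PATH and the DOT file exists.
if shutil.which("dot") is not None and os.path.exists("D:/tree.dot"):
    subprocess.run(cmd, check=False)
```

Passing fmt="pdf" with a matching output name produces a PDF instead of a PNG.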