Graphviz data visualization and interaction with Python

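Copyright notice: this is an original article by the blogger, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Article link: https://blog.csdn.net/zn505119020/article/details/73957277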

Graphviz

1. Download

Download link: http://www.graphviz.org/Download_windows.php

Choose graphviz-2.38.msi


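2. Install

Run the installer and click Next through each step.

3. Environment configuration

Add the bin directory under the Graphviz installation path to the system PATH environment variable.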







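4. Verify the installation

Open a cmd command prompt.

Enter dot -version (dot -V prints only the version string and exits).

If version information appears, the configuration succeeded.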
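The same check can be done from Python, which is handy because the later steps run from a Python project anyway. This is only a minimal sketch: the helper names find_dot and dot_version are made up for illustration, and it assumes the Graphviz bin directory was added to PATH as described above.

```python
import shutil
import subprocess


def find_dot():
    """Return the full path of the dot executable, or None if it is not on PATH."""
    return shutil.which("dot")


def dot_version(executable):
    """Run `dot -V` and return the version banner as a string."""
    # dot -V writes its banner to stderr, so both streams are captured.
    result = subprocess.run([executable, "-V"], capture_output=True, text=True)
    return (result.stdout + result.stderr).strip()


if __name__ == "__main__":
    path = find_dot()
    if path is None:
        print("dot not found; check that the Graphviz bin directory is on PATH")
    else:
        print(path)
        print(dot_version(path))
```

If the script reports that dot was not found, reopen the command prompt (or IDE) after editing PATH, since already running processes keep the old environment.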


Python

Generating tree.dot from Python


1. Put 1.txt in the Python project directory

Contents of 1.txt (two numeric columns followed by a label, separated by spaces):


1.5 50 thin  
1.5 60 fat  
1.6 40 thin  
1.6 60 fat  
1.7 60 thin  
1.7 80 fat  
1.8 60 thin  
1.8 90 fat  
1.9 70 thin  
1.9 80 fat  
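To avoid copy-and-paste mistakes, the file can also be created from a script. The sketch below writes the ten rows shown above and reads them back in the same "features first, label last" layout; write_sample and read_sample are illustrative helper names, not part of any library.

```python
from pathlib import Path

# The ten rows listed above, one sample per line.
ROWS = """\
1.5 50 thin
1.5 60 fat
1.6 40 thin
1.6 60 fat
1.7 60 thin
1.7 80 fat
1.8 60 thin
1.8 90 fat
1.9 70 thin
1.9 80 fat
"""


def write_sample(path="1.txt"):
    """Write the sample rows to `path` and return it as a Path."""
    target = Path(path)
    target.write_text(ROWS, encoding="utf-8")
    return target


def read_sample(path):
    """Parse the file into (features, labels); blank lines are skipped."""
    data, labels = [], []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        tokens = line.split()
        if not tokens:
            continue
        data.append([float(tk) for tk in tokens[:-1]])
        labels.append(tokens[-1])
    return data, labels


if __name__ == "__main__":
    data, labels = read_sample(write_sample())
    print(len(data), "rows;", labels.count("fat"), "labelled fat")
```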



2. Decision tree code, which generates a new file named tree.dot


# -*- coding: utf-8 -*-
import numpy as np
from sklearn import tree
from sklearn.metrics import precision_recall_curve
from sklearn.metrics import classification_report
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import train_test_split

# Read the data: every column except the last is a feature, the last is the label
data = []
labels = []
with open("1.txt") as ifile:
    for line in ifile:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        data.append([float(tk) for tk in tokens[:-1]])
        labels.append(tokens[-1])
x = np.array(data)
labels = np.array(labels)
y = np.zeros(labels.shape)

# Convert the labels to 0/1 (thin = 0, fat = 1)
y[labels == 'fat'] = 1

# Split into training data and test data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

# Train the decision tree using information entropy as the split criterion
clf = tree.DecisionTreeClassifier(criterion='entropy')
print(clf)
clf.fit(x_train, y_train)

# Write the decision tree structure to a file
with open("tree.dot", 'w') as f:
    tree.export_graphviz(clf, out_file=f)

# Feature importances: the larger the value, the more the feature contributes to classification
print(clf.feature_importances_)

# Print the predictions on the training data
answer = clf.predict(x_train)
print(x_train)
print(answer)
print(y_train)
print(np.mean(answer == y_train))

# Precision and recall
# precision_recall_curve expects scores, so pass the probability of class 1
precision, recall, thresholds = precision_recall_curve(y_train, clf.predict_proba(x_train)[:, 1])
# classification_report expects predicted class labels, not probabilities
answer = clf.predict(x)
print(classification_report(y, answer, target_names=['thin', 'fat']))
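By default export_graphviz labels the nodes with X[0] and X[1], which is hard to read. The sketch below shows one way to get named features and classes using the feature_names, class_names and filled parameters of export_graphviz. Several things here are assumptions, not part of the original code: the column names "height" and "weight" (the source never names the two columns), the output file name tree_named.dot, fitting on all ten rows instead of a random split, and random_state=0 to make the result repeatable.

```python
import numpy as np
from sklearn import tree

# The same ten samples as 1.txt, embedded so the sketch is self-contained.
x = np.array([[1.5, 50], [1.5, 60], [1.6, 40], [1.6, 60], [1.7, 60],
              [1.7, 80], [1.8, 60], [1.8, 90], [1.9, 70], [1.9, 80]])
labels = np.array(["thin", "fat"] * 5)  # rows alternate thin, fat
y = (labels == "fat").astype(int)       # thin = 0, fat = 1

clf = tree.DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(x, y)

# out_file=None makes export_graphviz return the DOT source as a string.
dot_text = tree.export_graphviz(
    clf,
    out_file=None,
    feature_names=["height", "weight"],  # assumed column names
    class_names=["thin", "fat"],         # ordered like clf.classes_ (0, 1)
    filled=True,                         # color nodes by majority class
)

with open("tree_named.dot", "w") as f:   # hypothetical output name
    f.write(dot_text)

print(dot_text.splitlines()[0])
```

The resulting file is rendered with dot in exactly the same way as tree.dot.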

tree.dot

digraph Tree {
node [shape=box] ;
0 [label="X[1] <= 75.0\nentropy = 1.0\nsamples = 8\nvalue = [4, 4]"] ;
1 [label="X[0] <= 1.65\nentropy = 0.9183\nsamples = 6\nvalue = [4, 2]"] ;
0 -> 1 [labeldistance=2.5, labelangle=45, headlabel="True"] ;
2 [label="X[1] <= 50.0\nentropy = 0.9183\nsamples = 3\nvalue = [1, 2]"] ;
1 -> 2 ;
3 [label="entropy = 0.0\nsamples = 1\nvalue = [1, 0]"] ;
2 -> 3 ;
4 [label="entropy = 0.0\nsamples = 2\nvalue = [0, 2]"] ;
2 -> 4 ;
5 [label="entropy = 0.0\nsamples = 3\nvalue = [3, 0]"] ;
1 -> 5 ;
6 [label="entropy = 0.0\nsamples = 2\nvalue = [0, 2]"] ;
0 -> 6 [labeldistance=2.5, labelangle=-45, headlabel="False"] ;
}
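The entropy figures in the node labels can be checked by hand from the value = [...] class counts. A minimal sketch, assuming the usual Shannon entropy in bits (base-2 logarithm), which is what the 'entropy' criterion reports; the function name entropy is just for illustration:

```python
import math


def entropy(counts):
    """Shannon entropy in bits for a list of class counts."""
    total = sum(counts)
    # Sum p * log2(1/p) over the classes that actually occur.
    return sum((c / total) * math.log2(total / c) for c in counts if c > 0)


print(round(entropy([4, 4]), 4))  # root node, value = [4, 4]
print(round(entropy([4, 2]), 4))  # node 1, value = [4, 2]
print(round(entropy([3, 0]), 4))  # a pure leaf, value = [3, 0]
```

An evenly split node gives 1.0, a 4-to-2 split gives 0.9183, and a pure leaf gives 0.0, matching the labels in tree.dot.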


3. In the command prompt, cd to the directory that contains tree.dot


Run dot -Tpdf tree.dot -o tree.pdf to generate a PDF file, or dot -Tpng tree.dot -o tree.png to generate a PNG image.
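The rendering step can also be run from Python so that the whole workflow stays in one script. This is a rough sketch that simply shells out to the same dot command; build_dot_command and render are hypothetical helper names, and it assumes dot is on PATH and tree.dot is in the current directory.

```python
import shutil
import subprocess


def build_dot_command(fmt, src="tree.dot", out=None):
    """Build the argument list for: dot -T<fmt> <src> -o <out>."""
    out = out or "tree." + fmt
    return ["dot", "-T" + fmt, src, "-o", out]


def render(fmt, src="tree.dot", out=None):
    """Render `src` with dot and return the output file name."""
    if shutil.which("dot") is None:
        raise RuntimeError("dot not found; add the Graphviz bin directory to PATH")
    cmd = build_dot_command(fmt, src, out)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if dot fails
    return cmd[-1]


if __name__ == "__main__":
    print(" ".join(build_dot_command("pdf")))
    print(" ".join(build_dot_command("png")))
```

Calling render("pdf") or render("png") in the directory that holds tree.dot produces the same files as the two commands above.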


4. The tree.pdf file is generated




