Machine Learning: Visualizing a Decision Tree
Install the libraries scikit-learn (imported as sklearn), pandas, and graphviz from cmd:
pip install scikit-learn
pip install pandas
pip install graphviz
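Download Graphviz and set the environment variables
Download page (the download can be slow): https://graphviz.gitlab.io/_pages/Download/Download_windows.html
Setting the environment variables:
Add the location of the Graphviz installation (the folder that contains dot.exe) to your user environment variables,
then add the same location to the system environment variables.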
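A quick way to confirm that the environment variable took effect is to ask Python whether it can locate the dot executable. The following is a minimal sketch using only the standard library; the helper name find_dot is made up for illustration. Open a new cmd window before running it, since windows that were already open do not pick up the changed variables.

```python
import shutil


def find_dot():
    """Return the full path of the Graphviz `dot` executable, or None if it is not on PATH."""
    # find_dot is a hypothetical helper name; shutil.which searches PATH the way cmd does
    return shutil.which("dot")


dot_path = find_dot()
if dot_path is None:
    print("dot was not found on PATH; re-check the environment variable settings")
else:
    print("dot found at:", dot_path)
```

If the path is printed, typing dot -V in cmd should also report the installed Graphviz version.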
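A simple code listing
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_graphviz
from sklearn.metrics import classification_report
import graphviz
data = pd.read_csv("titanic_data.csv")  # load the CSV file
data.drop("PassengerId", axis=1, inplace=True)  # drop PassengerId; inplace=True modifies data directly
data['Sex'] = data['Sex'].map({'male': 1, 'female': 0})  # encode male as 1 and female as 0
data.fillna(data['Age'].mean(), inplace=True)  # fill missing values with the mean of Age
Dtr = DecisionTreeClassifier(max_depth=5, random_state=8)  # build the decision tree model
# data.iloc[:, 1:] assumes Survived is the first remaining column, followed by the features
Dtr.fit(data.iloc[:, 1:], data['Survived'])  # train the model: features, then target values
pre = Dtr.predict(data.iloc[:, 1:])  # predict on the same rows
print((pre == data['Survived']).mean())  # fraction of predictions that match the actual values
print(classification_report(data['Survived'], pre))  # classification report
# class_names must be a list in ascending label order (0, 1), not a single string
dot_data = export_graphviz(Dtr, feature_names=['Pclass', 'Sex', 'Age'], class_names=['Not survived', 'Survived'])
graph = graphviz.Source(dot_data)
print(dot_data)  # DOT source of the tree; in a Jupyter cell, end with a bare `graph` line to render it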
Output:
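The comparison pre == data['Survived'] in the listing yields one True/False value per passenger, and the share of True values is the accuracy. Because the model predicts on the same rows it was trained on, this is training accuracy and tends to be optimistic. The sketch below shows the arithmetic in plain Python; the function name accuracy and the label lists are made up for illustration and do not come from titanic_data.csv.

```python
def accuracy(predicted, actual):
    """Return the fraction of positions where the prediction equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("cannot compute accuracy of empty sequences")
    # True counts as 1 when summed, so this counts the matching positions
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Hypothetical labels for illustration only (1 = survived, 0 = did not survive)
actual = [0, 1, 1, 0, 1]
predicted = [0, 1, 0, 0, 1]
print(accuracy(predicted, actual))  # 4 of the 5 predictions match -> 0.8
```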
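Rendering the visualization
Method 1: install and launch Jupyter Notebook
1. Open cmd.
2. Check which packages are already installed: python -m pip list
3. Install Jupyter Notebook: pip install jupyter
4. Launch it: type jupyter notebook in cmd, and the browser opens Jupyter Notebook automatically.
5. Navigate to the folder that holds your code, then click Python.
6. Paste the code in and run it to see the result (mine was slow to appear, so you may need to wait a moment).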
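The check in step 2 can also be done from Python, which is handy inside a notebook. This is a minimal sketch using the standard library's importlib.metadata (Python 3.8 or later); installed_version is a hypothetical helper name. Note that the distribution name behind the sklearn import is scikit-learn.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as used by pip (scikit-learn is what `import sklearn` comes from)
for name in ("scikit-learn", "pandas", "graphviz", "jupyter"):
    version = installed_version(name)
    print(name, version if version is not None else "not installed")
```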
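Output (the image is fairly large):
Method 2: add code directly to the end of the script
1. Add the code below, save the file, and run it to generate treeone.dot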
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_graphviz
from sklearn.metrics import classification_report
import graphviz
data = pd.read_csv("titanic_data.csv")  # load the CSV file
data.drop("PassengerId", axis=1, inplace=True)  # drop PassengerId; inplace=True modifies data directly
data['Sex'] = data['Sex'].map({'male': 1, 'female': 0})  # encode male as 1 and female as 0
data.fillna(data['Age'].mean(), inplace=True)  # fill missing values with the mean of Age
Dtr = DecisionTreeClassifier(max_depth=5, random_state=8)  # build the decision tree model
# data.iloc[:, 1:] assumes Survived is the first remaining column, followed by the features
Dtr.fit(data.iloc[:, 1:], data['Survived'])  # train the model: features, then target values
pre = Dtr.predict(data.iloc[:, 1:])  # predict on the same rows
print((pre == data['Survived']).mean())  # fraction of predictions that match the actual values
print(classification_report(data['Survived'], pre))  # classification report
# class_names must be a list in ascending label order (0, 1), not a single string
dot_data = export_graphviz(Dtr, feature_names=['Pclass', 'Sex', 'Age'], class_names=['Not survived', 'Survived'])
graph = graphviz.Source(dot_data)
with open('treeone.dot', 'w') as f:  # write the DOT source to treeone.dot
    f.write(dot_data)
2. Open cmd, change to the folder that contains treeone.dot, and run dot -Tpdf treeone.dot -o treeone.pdf to generate the PDF.
3. Open treeone.pdf to see the result:
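Step 2 can also be run from Python instead of cmd, so the script produces the PDF in one go. The sketch below uses only the standard library; build_dot_command and render_pdf are hypothetical helper names, and the default file names simply mirror the ones used above.

```python
import shutil
import subprocess


def build_dot_command(dot_file, pdf_file):
    """Build the argument list equivalent to: dot -Tpdf <dot_file> -o <pdf_file>."""
    return ["dot", "-Tpdf", dot_file, "-o", pdf_file]


def render_pdf(dot_file="treeone.dot", pdf_file="treeone.pdf"):
    """Run Graphviz on dot_file; return False if dot is not on PATH, True on success."""
    if shutil.which("dot") is None:
        return False
    # check=True raises CalledProcessError if dot exits with an error (e.g. missing input file)
    subprocess.run(build_dot_command(dot_file, pdf_file), check=True)
    return True


# Show the command that would be executed; nothing is run at this point
print(build_dot_command("treeone.dot", "treeone.pdf"))
```

Calling render_pdf() after treeone.dot has been written should produce treeone.pdf in the same folder, provided the Graphviz environment variable is set.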
Reference:
https://blog.csdn.net/phyllisyuell/article/details/79903785?utm_source=blogxgwz5