Getting Started with Decision Trees: Easy Even for Complete Beginners!

3Ghz

Last modified: 2023-04-26 15:59:48

Tags: decision tree, cars, python, pandas, machine learning

First published: 2023-04-24 21:54:08

Article link: https://blog.csdn.net/m0_63825760/article/details/130352737

1. Inspect the table and import the libraries

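# pandas for data handling, scikit-learn for the model, matplotlib for plotting
import pandas as pd
from sklearn import tree
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt

# '文件名字.xlsx' is a placeholder ("file name"); replace it with the path to your workbook
data=pd.read_excel('文件名字.xlsx')
data.head()   # preview the first five rows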
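
Before training, it can help to check the table for missing values and text columns, since scikit-learn's decision tree expects numeric features (and, depending on the version, may not accept missing values). The snippet below is only a sketch on a made-up three-row table; the column names `age`, `gender`, and `target` are hypothetical and are not taken from the spreadsheet used in this post.

```python
import pandas as pd

# Hypothetical toy table standing in for the real spreadsheet
demo = pd.DataFrame({
    "age": [25, 40, None],       # one missing value on purpose
    "gender": ["M", "F", "F"],   # a text column that needs encoding
    "target": [1, 0, 1],
})

# Count missing values per column
missing = demo.isna().sum()
print(missing)

# List the columns that are not numeric
non_numeric = demo.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)

# One-hot encode the text columns so every feature is numeric
encoded = pd.get_dummies(demo, columns=non_numeric)
print(encoded.columns.tolist())
```

If the real table is already fully numeric and complete, this check simply confirms it and nothing else needs to change.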

2. Split the data into a training set and a test set at a 9:1 ratio

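from sklearn.model_selection import train_test_split
# Features: columns 1 to 12 (column 0 is skipped)
feature_vector=data.iloc[:,1:13]
# Target: the '购买意愿' (purchase intention) column
object_variable=data['购买意愿']
# test_size=0.1 holds out 10% of the rows for testing; random_state makes the split reproducible
x_train,x_test,y_train,y_test=train_test_split(feature_vector,object_variable,test_size=0.1,random_state=3)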
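
To see what a 9:1 split actually produces, here is a small sketch on synthetic data. The table, its `feature_i` column names, and the `target` column are all made up for illustration. It also passes `stratify=y`, which the code above does not use; that option is an addition here and keeps the class proportions similar in both parts.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the spreadsheet: 200 rows, 12 features, binary target
rng = np.random.default_rng(3)
demo = pd.DataFrame(
    rng.integers(0, 5, size=(200, 12)),
    columns=[f"feature_{i}" for i in range(12)],
)
demo["target"] = rng.integers(0, 2, size=200)

X = demo.drop(columns="target")
y = demo["target"]

# 10% of the rows go to the test set, the remaining 90% to the training set
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.1, random_state=3, stratify=y
)

print(X_tr.shape, X_te.shape)   # 180 training rows, 20 test rows
print(y_tr.mean(), y_te.mean()) # share of class 1 in each part
```

With only 10% held out, the test set is small, so the score measured on it can move noticeably if `random_state` changes.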

3. Build, train, and evaluate the model

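# Entropy-based splits, tree depth capped at 7 to limit overfitting
clf=tree.DecisionTreeClassifier(criterion='entropy',max_depth=7,random_state=3)
clf=clf.fit(x_train,y_train)
# score() returns the mean accuracy on the given data
clf.score(x_test,y_test)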

The result is 0.875, i.e. the accuracy on the test set.
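
The model above is built with `criterion='entropy'`, which means splits are chosen by how much they reduce entropy (information gain). The sketch below works through that arithmetic by hand on a made-up set of eight labels. The helper names `entropy` and `information_gain` are my own, and this is an illustration of the idea rather than scikit-learn's internal code.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(parent) - weighted

# Hypothetical node with 4 positive and 4 negative samples
parent = [1, 1, 1, 1, 0, 0, 0, 0]
# A candidate split that sends 3 positives + 1 negative left, the rest right
left = [1, 1, 1, 0]
right = [1, 0, 0, 0]

print(entropy(parent))                                  # 1.0 for a 50/50 node
print(round(information_gain(parent, left, right), 3))  # about 0.189
```

A split that separated the two classes perfectly would have a gain of 1.0 here, so the tree would prefer it over the candidate shown.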

4. Evaluate the model with a confusion matrix

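import warnings
from sklearn.metrics import ConfusionMatrixDisplay
warnings.filterwarnings('ignore')   # silence library warnings in the output
# Predict on the test set and compare with the true labels
y_pred=clf.predict(x_test)
cm=confusion_matrix(y_test,y_pred)   # rows: true class, columns: predicted class
ConfusionMatrixDisplay(confusion_matrix=cm).plot()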

Judging by the confusion matrix, the model can serve as a basis for decision-making.
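
A confusion matrix is easier to judge once it is boiled down to a few numbers. The sketch below derives accuracy, precision, and recall from a 2x2 matrix. The eight labels are made up for illustration and are not the results of the model in this post.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

# Hypothetical true labels and predictions (not the post's data)
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0])
y_hat = np.array([0, 0, 1, 0, 1, 0, 1, 0])

cm_demo = confusion_matrix(y_true, y_hat)
# For binary labels 0/1, ravel() yields TN, FP, FN, TP in that order
tn, fp, fn, tp = cm_demo.ravel()

accuracy = (tp + tn) / cm_demo.sum()   # share of all predictions that are right
precision = tp / (tp + fp)             # of predicted 1s, how many are really 1
recall = tp / (tp + fn)                # of real 1s, how many were found

print(cm_demo)
print(accuracy, precision, recall)

# The hand-computed accuracy matches scikit-learn's own function
print(accuracy == accuracy_score(y_true, y_hat))
```

The accuracy computed this way is the same quantity that `clf.score` reports, so the matrix mainly adds the breakdown of which kind of mistake the model makes.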

5. Plot the decision tree

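import matplotlib
# Use a font with Chinese glyphs so the Chinese labels render correctly
matplotlib.rcParams['font.sans-serif'] = ['SimHei']
matplotlib.rcParams['axes.unicode_minus'] = False
# Feature names must match the columns the model was trained on (columns 1 to 12)
feature_names=data.columns[1:13].tolist()
# class_names are matched to clf.classes_ in ascending order; check that this order fits your label encoding
class_names=['愿意','不愿意']   # 'willing', 'unwilling'
plt.figure(dpi=80,figsize=(25,9))
tree.plot_tree(clf,feature_names=feature_names,class_names=class_names,impurity=False,fontsize=10)
plt.show()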
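
When the plotted tree is too wide to read, the same rules can be printed as plain text with `tree.export_text`. The sketch below uses scikit-learn's built-in iris dataset as a stand-in, because the spreadsheet from this post is not available here; the name `demo_clf` and the depth of 2 are just choices for the example.

```python
from sklearn import tree
from sklearn.datasets import load_iris

# Stand-in data: the iris dataset that ships with scikit-learn
iris = load_iris()

# A shallow tree keeps the printed rules short
demo_clf = tree.DecisionTreeClassifier(criterion="entropy", max_depth=2, random_state=3)
demo_clf.fit(iris.data, iris.target)

# One line per split or leaf, indented by depth
rules = tree.export_text(demo_clf, feature_names=list(iris.feature_names))
print(rules)
```

For the model in this post, the equivalent call would take `clf` and the list of training column names instead.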

6. Predict

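# Load the sheet '预测客户数据' ("customer data to predict") from the same placeholder workbook
forecast=pd.read_excel('文件名字.xlsx',sheet_name='预测客户数据')
# Drop the first column; the remaining columns must match the training features in number and order
clf.predict(forecast.iloc[:,1:])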

Result: array([0, 0, 1, 0, 0], dtype=int64)
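
The raw output is just an array of class codes. A small sketch of how to make it more readable: map the codes to labels and look at `predict_proba` to see how sure the tree is. Everything here is synthetic; the data, the name `demo_clf`, and the `label_map` wording are assumptions for illustration, since the post does not say which code stands for which answer.

```python
import numpy as np
from sklearn import tree

# Synthetic training data: the class is 1 exactly when the first feature is above 2
rng = np.random.default_rng(3)
X_demo = rng.integers(0, 5, size=(100, 3))
y_demo = (X_demo[:, 0] > 2).astype(int)

demo_clf = tree.DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=3)
demo_clf.fit(X_demo, y_demo)

# Two new rows to classify
new_rows = np.array([[4, 0, 0], [0, 4, 4]])
pred = demo_clf.predict(new_rows)
proba = demo_clf.predict_proba(new_rows)   # one column per class, in clf.classes_ order

# Hypothetical mapping from class code to a readable label
label_map = {0: "class 0", 1: "class 1"}
readable = [label_map[int(code)] for code in pred]

print(pred)
print(readable)
print(proba)
```

For the real model, the mapping should follow however the purchase-intention column was encoded in the spreadsheet.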

--------------------------------------------------------------------------------------------------------------------------------

Folks, writing these up takes effort, so remember to like and follow!
