Learning AI: Training Steps in Python, the DecisionTreeClassifier Model, and Previewing .dot Files with Graphviz Interactive Preview

Steps for training a model in Python:

  1. Import the data.
  2. Clean the data.
  3. Split the data into training and test sets.
  4. Create a model.
  5. Train the model.
  6. Make predictions.
  7. Score and evaluate the predictions.
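
The listings in this post start from a CSV file that is already clean, so step 2 never appears in code. As a minimal sketch of what cleaning might involve, the snippet below removes rows with missing values and exact duplicates. The small DataFrame is a hypothetical stand-in, not the real music.csv, and real data may need different handling.

```python
import pandas as pd

# Hypothetical raw data standing in for music.csv; the real file may differ.
raw = pd.DataFrame({
    "age": [20, 23, 23, 26, None, 31],
    "gender": [1, 1, 1, 0, 0, 0],
    "genre": ["HipHop", "HipHop", "HipHop", "Acoustic", "Acoustic", None],
})

# Drop rows that contain a missing value, then drop exact duplicate rows.
clean = raw.dropna().drop_duplicates().reset_index(drop=True)
print(clean)
```

Dropping rows is the simplest option; with more data, filling in missing values could be preferable.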

Using DecisionTreeClassifier as an example:

Feed all of the data straight into the model, then predict:

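import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Load the data; every column except 'genre' is an input feature.
music_data = pd.read_csv('music.csv')
x = music_data.drop(columns=['genre'])
y = music_data['genre']

# Train on the full data set.
model = DecisionTreeClassifier()
model.fit(x, y)

# Predict the genre for two new samples, each given as [age, gender].
predictions = model.predict([[21, 1], [22, 0]])
print(predictions)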
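
Because the model above is fitted on a DataFrame, recent scikit-learn versions warn about missing feature names when predict receives a plain nested list. One way to avoid the warning is to pass a DataFrame with the same column names. The sketch below assumes the columns are age and gender (the names used later in this post for export_graphviz) and uses a hypothetical inline data set in place of music.csv.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for music.csv (age, gender, genre); the real file may differ.
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 31, 35] * 2,
    "gender": [1] * 8 + [0] * 8,
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Classical"] * 2
             + ["Dance"] * 3 + ["Acoustic"] * 3 + ["Classical"] * 2,
})
x = music_data.drop(columns=["genre"])
y = music_data["genre"]

model = DecisionTreeClassifier(random_state=0)
model.fit(x, y)

# Build the new samples with the same column names used for training.
new_listeners = pd.DataFrame({"age": [21, 22], "gender": [1, 0]})
predictions = model.predict(new_listeners)
print(predictions)
```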

Split the data into a training set and a test set, train on the training set, then score the predictions on the test set.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

music_data = pd.read_csv('music.csv')
x = music_data.drop(columns=['genre'])
y = music_data['genre']

# Hold out 10% of the rows for testing; the split is random on every run.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1)

# Train on the training rows only.
model = DecisionTreeClassifier()
model.fit(x_train, y_train)

# Predict on the held-out rows and compare with the true labels.
predictions = model.predict(x_test)
score_results = accuracy_score(y_test, predictions)
print(score_results)
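
With test_size=0.1 the test set is small, and because train_test_split shuffles the rows randomly, the printed accuracy can change from run to run. The sketch below shows two common ways to get steadier numbers: fixing random_state so that a split is reproducible, and averaging over several folds with cross_val_score. The inline data set is a hypothetical stand-in for music.csv, and the seed 42, test_size=0.2, and cv=3 are arbitrary choices for illustration.

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for music.csv (age, gender, genre); the real file may differ.
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 31, 35] * 2,
    "gender": [1] * 8 + [0] * 8,
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Classical"] * 2
             + ["Dance"] * 3 + ["Acoustic"] * 3 + ["Classical"] * 2,
})
x = music_data.drop(columns=["genre"])
y = music_data["genre"]

# A fixed random_state makes the split identical on every run.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42
)
model = DecisionTreeClassifier(random_state=42)
model.fit(x_train, y_train)
single_score = accuracy_score(y_test, model.predict(x_test))
print("single split:", single_score)

# Cross-validation trains and scores on several different splits.
cv_scores = cross_val_score(DecisionTreeClassifier(random_state=42), x, y, cv=3)
print("3-fold scores:", cv_scores, "mean:", cv_scores.mean())
```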

After training, save the trained model to a file so that it can be loaded later and used for prediction directly.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
import joblib

music_data = pd.read_csv('music.csv')
x = music_data.drop(columns=['genre'])
y = music_data['genre']
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1)

model = DecisionTreeClassifier()
model.fit(x_train, y_train)

# Write the trained model to a file.
joblib.dump(model, 'music-rec.joblib')

# Load the trained model from the file and use it to predict.
model = joblib.load('music-rec.joblib')
predictions = model.predict([[21, 1], [22, 0]])
print(predictions)
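
A quick way to gain confidence in the saved file is to check that the reloaded model predicts exactly what the original did. The sketch below does this round trip in a temporary directory, using a hypothetical inline data set in place of music.csv. Note that joblib files are pickle-based, so they should only be loaded from a trusted source.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for music.csv (age, gender, genre); the real file may differ.
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 31, 35] * 2,
    "gender": [1] * 8 + [0] * 8,
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Classical"] * 2
             + ["Dance"] * 3 + ["Acoustic"] * 3 + ["Classical"] * 2,
})
x = music_data.drop(columns=["genre"])
y = music_data["genre"]

model = DecisionTreeClassifier(random_state=0)
model.fit(x, y)

# Save to a temporary directory, then load the model back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "music-rec.joblib")
    joblib.dump(model, path)
    restored = joblib.load(path)

# The restored model should reproduce the original predictions.
same_predictions = bool((restored.predict(x) == model.predict(x)).all())
print(same_predictions)
```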

Generate a .dot file and preview it in VS Code to inspect the decision logic the model learned during training.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn import tree

music_data = pd.read_csv('music.csv')
x = music_data.drop(columns=['genre'])
y = music_data['genre']
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1)

model = DecisionTreeClassifier()
model.fit(x_train, y_train)

# Export the trained tree in Graphviz .dot format.
# class_names must be in ascending order to match the model's classes.
tree.export_graphviz(model,
                     out_file='music-plot.dot',
                     feature_names=['age', 'gender'],
                     class_names=sorted(y_train.unique()),
                     label='all',
                     rounded=True,
                     filled=True
                     )

In VS Code, install the Graphviz Interactive Preview extension, then open a preview of the .dot file to inspect the logic.
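
If installing an extension is not convenient, the same decision rules can be printed as plain text with scikit-learn's export_text. This is an alternative to the Graphviz preview, not the method used in this post. The sketch assumes the feature columns are age and gender and uses a hypothetical inline data set in place of music.csv.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical stand-in for music.csv (age, gender, genre); the real file may differ.
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 31, 35] * 2,
    "gender": [1] * 8 + [0] * 8,
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Classical"] * 2
             + ["Dance"] * 3 + ["Acoustic"] * 3 + ["Classical"] * 2,
})
x = music_data.drop(columns=["genre"])
y = music_data["genre"]

model = DecisionTreeClassifier(random_state=0)
model.fit(x, y)

# Render the tree as indented if/else style rules, one line per node.
rules = export_text(model, feature_names=["age", "gender"])
print(rules)
```

Each indented line is a threshold test on one feature, and each leaf line names the predicted class.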
