Decision Tree
import pandas as pd
import numpy as np
import graphviz
from sklearn.tree import DecisionTreeRegressor
from sklearn import tree
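# `data` is assumed to be a pandas DataFrame with columns 'C', 'E' and 'NOx', loaded beforehand
X = np.array(data[['C', 'E']]) # Feature matrix: column 0 is C, column 1 is E
y = np.array(data['NOx']) # Target vector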
regt = DecisionTreeRegressor(max_depth=4)
regt = regt.fit(X, y) # Build a decision tree regressor from the training set (X, y)
dot_data = tree.export_graphviz(regt, out_file=None) # Export a decision tree in DOT format
graph = graphviz.Source(dot_data)
graph.render("tree") # Save the source to file
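[Note]
Node attributes:
X[1]: the E column of X = np.array(data[['C', 'E']]), i.e. the feature the node splits on
samples: the number of samples in the node
mse: the mean squared error (MSE), a measure of the difference between an estimator and the quantity being estimated
value: the mean of the target values in the node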
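The same node attributes can also be read programmatically from the fitted estimator's tree_ object. The following is a minimal sketch rather than part of the original analysis: X_demo, y_demo and demo_tree are hypothetical names, and the data is synthetic because the original dataset is not shown here. Recent scikit-learn releases appear to label the impurity as squared_error instead of mse in the rendered graph; the quantity is the same.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the (C, E) -> NOx data; NOT the original dataset.
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(200, 2))
y_demo = np.sin(6 * X_demo[:, 1]) + 0.1 * rng.normal(size=200)

demo_tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X_demo, y_demo)
inner = demo_tree.tree_  # low-level arrays, one entry per node

for node in range(inner.node_count):
    is_leaf = inner.children_left[node] == -1  # -1 marks a leaf
    if is_leaf:
        split = "leaf"
    else:
        split = f"X[{inner.feature[node]}] <= {inner.threshold[node]:.3f}"
    print(
        f"node {node}: {split}, "
        f"samples={inner.n_node_samples[node]}, "
        f"mse={inner.impurity[node]:.4f}, "
        f"value={inner.value[node][0][0]:.4f}"
    )
```

For the root node, samples is the size of the training set, value is the mean of y_demo, and mse is the variance of y_demo around that mean.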
print(regt.score(X, y))
------------------------------------
0.949306568162
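For a regressor, score returns the coefficient of determination R^2, here computed on the same data the tree was fitted on, so it measures goodness of fit rather than predictive accuracy. The sketch below recomputes it by hand on synthetic data; X_demo, y_demo and demo_tree are hypothetical names, and the numbers it prints are unrelated to the value above.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data; NOT the original dataset.
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(200, 2))
y_demo = np.sin(6 * X_demo[:, 1]) + 0.1 * rng.normal(size=200)

demo_tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_demo, y_demo)
pred = demo_tree.predict(X_demo)

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum((y_demo - pred) ** 2)
ss_tot = np.sum((y_demo - y_demo.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

# All three values should agree up to floating-point rounding.
print(demo_tree.score(X_demo, y_demo), r2_manual, r2_score(y_demo, pred))
```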
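regt1 = DecisionTreeRegressor(max_depth=4) # Use a separate estimator so that regt is not overwritten
regt1 = regt1.fit(X[:, 1].reshape(-1, 1), y) # reshape(-1, 1) turns the 1-D array into a single column with many rows
dot_data = tree.export_graphviz(regt1, out_file=None)
graph = graphviz.Source(dot_data)
graph.render("tree1")
regt1.score(X[:, 1].reshape(-1, 1), y)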
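Comparing the two, tree and tree1 turn out to be identical, which means the tree fitted on both C and E only ever splits on E.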
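The comparison can also be done in code instead of by eye. This is a sketch on synthetic data, constructed so that the target depends on E alone and every (C, E) combination occurs exactly once; the names both, only_e, X_demo and y_demo are hypothetical. It checks which features the two-feature tree actually splits on and whether the two trees share thresholds and predictions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic full grid of (C, E) values; the target depends on E only.
C_grid, E_grid = np.meshgrid(np.arange(5.0), np.arange(8.0))
X_demo = np.column_stack([C_grid.ravel(), E_grid.ravel()])  # columns: C, E
y_demo = X_demo[:, 1] ** 2

both = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_demo, y_demo)
only_e = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_demo[:, [1]], y_demo)

# Feature indices used by internal (non-leaf) nodes of the two-feature tree.
internal = both.tree_.children_left != -1
used = set(both.tree_.feature[internal].tolist())

same_thresholds = np.allclose(both.tree_.threshold, only_e.tree_.threshold)
same_predictions = np.allclose(both.predict(X_demo), only_e.predict(X_demo[:, [1]]))
print(used, same_thresholds, same_predictions)
```

The feature index differs between the two trees (1 in the two-feature tree, 0 in the single-feature one) because it refers to the column position in whatever array was passed to fit.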
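u = np.unique(X[:, 1]) # Sorted unique values of E (np.unique already returns them sorted)
t = np.diff(u)/2+u[:-1] # Midpoints between adjacent unique values; diff() subtracts each element from the next one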
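The midpoints in t are the candidate split thresholds for E, assuming (as scikit-learn's default splitter appears to do) that a threshold is placed halfway between two adjacent distinct feature values. The sketch below checks this on synthetic data; e_demo, y_demo and demo_tree are hypothetical names, not part of the original analysis.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic single feature with 8 distinct values, 5 samples each.
e_demo = np.repeat(np.arange(8.0), 5)
y_demo = e_demo ** 2

u = np.unique(e_demo)      # sorted distinct values: 0, 1, ..., 7
t = np.diff(u) / 2 + u[:-1]  # midpoints: 0.5, 1.5, ..., 6.5

demo_tree = DecisionTreeRegressor(max_depth=4, random_state=0)
demo_tree.fit(e_demo.reshape(-1, 1), y_demo)
inner = demo_tree.tree_

# Thresholds of internal nodes only (leaves carry a placeholder value).
used_thresholds = inner.threshold[inner.children_left != -1]

# Every threshold the tree chose should be one of the midpoints in t.
all_midpoints = all(np.isclose(t, thr).any() for thr in used_thresholds)
print(t)
print(np.sort(used_thresholds), all_midpoints)
```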