1) Import packages and data (from sklearn)
# Jupyter notebook only; omit the next line otherwise
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn import tree
import pydotplus
from IPython.display import Image
from sklearn.model_selection import train_test_split
2) View the dataset description
housing = fetch_california_housing()
print(housing.DESCR)
3) Inspect the data structure
housing.data.shape
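4) Build the decision tree
- Here the decision tree is used for regression (the split criterion is MSE); it can also be used for classification by calling tree.DecisionTreeClassifier instead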
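To make the MSE criterion concrete, here is a minimal pure-Python sketch of how a regression tree could pick a single split on one feature: it tries each midpoint between neighbouring feature values and keeps the threshold with the lowest weighted MSE across the two sides. The helper names `mse` and `best_split` and the toy data are hypothetical, made up for illustration; scikit-learn's own implementation handles many features and is far more optimized.

```python
def mse(values):
    """Mean squared error of the values around their own mean (0.0 if empty)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split(xs, ys):
    """Return (threshold, weighted_mse) of the best split on a single feature.

    Candidate thresholds are midpoints between consecutive distinct x values.
    Returns (None, inf) when no split is possible.
    """
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best_threshold, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical feature values cannot be separated
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        # Weight each side's MSE by the share of samples it contains
        score = (len(left) * mse(left) + len(right) * mse(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Toy data (illustrative only): two clearly separated groups of targets
xs = [1, 2, 3, 10, 11, 12]
ys = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
threshold, score = best_split(xs, ys)
print(threshold, score)
```

On this toy data the chosen threshold falls between the two groups, because splitting there leaves each side with targets close to its own mean.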
dtr =</