Import the required modules
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor
Load the data
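scoring='neg_root_mean_squared_error'  # scikit-learn scorer name: negative RMSE
X=pd.read_excel('boston.xlsx')
del X['Unnamed: 0']  # drop the unnamed leading column (likely a saved row index)
Y=X['target'].copy()  # regression target
del X['target']  # X now holds only the feature columns
names=X.columns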
Feature selection with a decision tree model
Tree = DecisionTreeRegressor()
Tree.fit(X,Y)
DecisionTreeRegressor()
print('feature'+'\t'+'importances')
for i in range(len(names)):
    print('%s\t%.5f'%(names[i],Tree.feature_importances_[i]))
feature importances
CRIM 0.03738
ZN 0.00054
INDUS 0.00571
CHAS 0.00088
NOX 0.04893
RM 0.59258
AGE 0.00993
DIS 0.07267
RAD 0.00071
TAX 0.01198
PTRATIO 0.00837
B 0.01498
LSTAT 0.19533
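The importances printed above are normalized so that they sum to 1, and because the tree is built without a fixed random_state, the exact values can shift slightly between runs. A minimal sketch of both points follows. It uses synthetic stand-in data rather than boston.xlsx, and the names X_demo, y_demo and tree_demo are hypothetical, introduced only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (an assumption; not the boston.xlsx file used above)
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.normal(size=(200, 4)), columns=['a', 'b', 'c', 'd'])
# The target depends strongly on 'a', weakly on 'b', and not at all on 'c' or 'd'
y_demo = 5.0 * X_demo['a'] + 0.5 * X_demo['b'] + rng.normal(scale=0.1, size=200)

# Fixing random_state makes the fitted tree, and so its importances, repeatable
tree_demo = DecisionTreeRegressor(random_state=0)
tree_demo.fit(X_demo, y_demo)

# Label each importance with its column name
importances = pd.Series(tree_demo.feature_importances_, index=X_demo.columns)
print(importances)
print('sum = %.5f' % importances.sum())
```

Passing random_state=0 to the Tree model above would make its printed values repeatable in the same way, although they would not necessarily match the numbers recorded here.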
Features sorted by importance
sortindex=np.argsort(Tree.feature_importances_)[::-1]
for i in sortindex:
    print('%s\t%.5f'%(names[i],Tree.feature_importances_[i]))
RM 0.59258
LSTAT 0.19533
DIS 0.07267
NOX 0.04893
CRIM 0.03738
B 0.01498
TAX 0.01198
AGE 0.00993
PTRATIO 0.00837
INDUS 0.00571
CHAS 0.00088
RAD 0.00071
ZN 0.00054
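One way to turn a sorted ranking like this into an actual feature selection is to keep only the k most important columns. The sketch below is one possible approach under stated assumptions, not necessarily the step this notebook takes next. The helper top_k_features, the synthetic data and the choice k=2 are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

def top_k_features(model, columns, k):
    """Return the names of the k features with the highest importance."""
    # argsort is ascending, so reverse it and keep the first k indices
    order = np.argsort(model.feature_importances_)[::-1][:k]
    return [columns[i] for i in order]

# Synthetic stand-in data (an assumption; not the boston.xlsx file used above)
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.normal(size=(300, 5)),
                      columns=['f0', 'f1', 'f2', 'f3', 'f4'])
# Only f0 and f1 carry signal; f2..f4 are pure noise
y_demo = 4.0 * X_demo['f0'] + 2.0 * X_demo['f1'] + rng.normal(scale=0.1, size=300)

model = DecisionTreeRegressor(random_state=0).fit(X_demo, y_demo)

selected = top_k_features(model, X_demo.columns, k=2)
X_selected = X_demo[selected]  # reduced feature matrix with only the selected columns
print(selected)
print(X_selected.shape)
```

With the fitted Tree and names from above, the same helper would be called as top_k_features(Tree, names, k), where k is left to the reader to choose.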