Tree models in sklearn


五道口纳什

Category: numpy-scipy-pandas-sklearn-xgb

Article link: https://blog.csdn.net/lanchunhui/article/details/79968363

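Tree models naturally rank features by importance, because building each branch requires choosing the feature that best splits the dataset.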
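A minimal sketch of this idea with a single decision tree. The iris dataset and the fixed `random_state` are assumptions chosen for illustration and do not come from the original post; `feature_importances_` is the fitted attribute scikit-learn exposes for this ranking.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Illustrative dataset (an assumption, not from the original post)
iris = load_iris()
X, y = iris.data, iris.target

# Fit one tree; random_state only makes the run repeatable
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X, y)

# Pair each feature name with its importance and sort in descending order
ranking = sorted(
    zip(iris.feature_names, tree.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

The importances are normalized so that they sum to 1, which makes the values comparable as shares of the total impurity reduction.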

0. Decision tree model

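Model parameters:
- criterion: "gini" or "entropy" (default="gini"). The measure used to pick the best split at a node: Gini impurity ("gini") or entropy-based information gain ("entropy").
- splitter: "best" or "random" (default="best"). "best" chooses the best split (the one with the largest impurity decrease); "random" chooses the best random split. The default is recommended.
- max_features: the maximum number of features considered when searching for the best split. An integer is used directly as the number of features; a float is treated as a fraction, giving int(max_features * n_features) features.
  - If "auto", then max_features=sqrt(n_features) (classifiers only; this value is accepted only by older scikit-learn releases).
  - If "sqrt", then max_features=sqrt(n_features).
  - If "log2", then max_features=log2(n_features).
  - If None, then max_features=n_features.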
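The parameters above can be tried out as follows. This is a sketch under stated assumptions: the breast cancer dataset (30 features) and the particular settings are picked for illustration only. After fitting, the `max_features_` attribute reports how many features the tree actually considers per split.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

# Illustrative dataset with 30 features (an assumption, not from the original post)
X, y = load_breast_cancer(return_X_y=True)

resolved = {}
for setting in (5, 0.5, "sqrt", "log2", None):
    tree = DecisionTreeClassifier(
        criterion="entropy",   # use information gain instead of the default "gini"
        splitter="best",       # the default, shown explicitly
        max_features=setting,
        random_state=0,
    )
    tree.fit(X, y)
    # max_features_ is the concrete feature count derived from the setting
    resolved[setting] = tree.max_features_

for setting, value in resolved.items():
    print(f"max_features={setting!r} -> {value} features considered per split")
```

The string "auto" is left out on purpose, since recent scikit-learn releases reject it.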

1. Using Random Forest

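import operator

# Note: load_boston was removed in scikit-learn 1.2, so this import needs an older release
from sklearn.datasets import load_boston
from sklearn.ensemble import RandomForestRegressor


boston_data = load_boston()
X = boston_data['data']
y = boston_data['target']
# dir(boston_data) lists the available attributes: ['DESCR', 'data', 'feature_names', 'target']
rf = RandomForestRegressor()
rf.fit(X, y)

# Pair each feature name with its importance (rounded to 4 decimals), sorted in descending order
print(sorted(zip(boston_data['feature_names'], map(lambda x: round(x, 4),
                                                   rf.feature_importances_)),
             key=operator.itemgetter(1), reverse=True))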
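Because `load_boston` was removed in scikit-learn 1.2, the same ranking can be sketched on a dataset that still ships with the library. The swap to `load_diabetes`, as well as `n_estimators=100` and `random_state=0`, are assumptions made for illustration and are not part of the original post.

```python
import operator

from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Substitute dataset (an assumption): bundled with scikit-learn, 10 features
diabetes_data = load_diabetes()
X = diabetes_data['data']
y = diabetes_data['target']

# random_state only makes the run repeatable
rf = RandomForestRegressor(n_estimators=100, random_state=0)
rf.fit(X, y)

# Pair each feature name with its rounded importance, sorted in descending order
ranking = sorted(
    zip(diabetes_data['feature_names'],
        (round(float(v), 4) for v in rf.feature_importances_)),
    key=operator.itemgetter(1),
    reverse=True,
)
print(ranking)
```

A forest's importances are averaged over its trees, so they tend to be more stable than those of a single decision tree.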
