Feature importance
feature_importances_
Return the feature importances (the higher, the more important the feature).
Returns: feature_importances_ : array, shape = [n_features]. The values of this array sum to 1, unless all trees are single node trees consisting of only the root node, in which case it will be an array of zeros.
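The feature importances returned here form an array with the same length as the number of input features. The original feature order is preserved: the order of the features in the training set the model was fit on matches the order of the importance values, so the two correspond one to one.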
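import pandas as pd
from sklearn.ensemble import RandomForestClassifier
# train is a DataFrame of features; train_label holds the target labels
features = list(train.columns)
random_forest = RandomForestClassifier()
random_forest.fit(train, train_label)
# The importances come back in the same order as train.columns
feature_importance_values = random_forest.feature_importances_
feature_importance = pd.DataFrame({'Feature': features, 'Importance': feature_importance_values})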
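To see the one-to-one correspondence in practice, here is a minimal self-contained sketch of the same steps. The synthetic data and the column names `noise_a`, `signal`, and `noise_b` are made up for illustration, as are the `n_estimators` and `random_state` settings. The label depends only on `signal`, so that column should rank first after sorting.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption for illustration): only "signal" drives the label.
rng = np.random.default_rng(0)
n_samples = 500
train = pd.DataFrame({
    "noise_a": rng.normal(size=n_samples),
    "signal": rng.normal(size=n_samples),
    "noise_b": rng.normal(size=n_samples),
})
train_label = (train["signal"] > 0).astype(int)

random_forest = RandomForestClassifier(n_estimators=50, random_state=0)
random_forest.fit(train, train_label)

# feature_importances_ follows the column order of the training data,
# so it can be paired directly with train.columns.
feature_importance = pd.DataFrame({
    "Feature": list(train.columns),
    "Importance": random_forest.feature_importances_,
})

# Sort only after pairing, so each name stays attached to its own value.
ranked = feature_importance.sort_values("Importance", ascending=False).reset_index(drop=True)
print(ranked)
```

Pairing the names with the values before sorting is the important part: sorting the raw importance array on its own would lose the link back to the column names.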