When it comes to decision trees, feature importance is not a black box. From the DecisionTreeRegressor documentation:

> The importance of a feature is computed as the (normalized) total
> reduction of the criterion brought by that feature. It is also known
> as the Gini importance.
For a forest, it is simply averaged over the individual trees in your forest. See the source code:

```python
def feature_importances_(self):
    """Return the feature importances (the higher, the more important the
       feature).

    Returns
    -------
    feature_importances_ : array, shape = [n_features]
    """
    if self.estimators_ is None or len(self.estimators_) == 0:
        raise NotFittedError("Estimator not fitted, "
                             "call `fit` before `feature_importances_`.")

    all_importances = Parallel(n_jobs=self.n_jobs,
                               backend="threading")(
        delayed(getattr)(tree, 'feature_importances_')
        for tree in self.estimators_)

    return sum(all_importances) / len(self.estimators_)
```
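You can verify this yourself: averaging the per-tree `feature_importances_` arrays reproduces the forest's `feature_importances_`. A minimal sketch, assuming scikit-learn is installed (the dataset and hyperparameters here are arbitrary, chosen only for illustration):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Toy regression data, purely for demonstration.
X, y = make_regression(n_samples=200, n_features=4, random_state=0)
forest = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)

# Average the importances of the individual trees, exactly as the
# source code above does.
manual = np.mean(
    [tree.feature_importances_ for tree in forest.estimators_], axis=0
)

# Matches the forest's own attribute.
assert np.allclose(manual, forest.feature_importances_)
print(forest.feature_importances_)
```

Note that because each tree's importances are normalized to sum to 1, the averaged forest importances sum to 1 as well.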