SHAP (SHapley Additive exPlanations) is a powerful and comparatively rigorous model-interpretation method.
It borrows the Shapley value from cooperative game theory: each feature's contribution is its marginal contribution, i.e. how the prediction changes when that feature is removed from a coalition, averaged over all possible coalitions.
Besides ranking feature importance, SHAP also shows whether each feature pushes the prediction up or down.
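The "marginal contribution" idea can be made concrete with a tiny brute-force computation. This is a toy sketch of the exact Shapley formula (not the shap library's optimized TreeExplainer algorithm), using made-up feature names and effects:

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values by enumerating every coalition S not containing i."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for r in range(n):
            for S in combinations(others, r):
                # Shapley weight |S|! * (n - |S| - 1)! / n!
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                # marginal contribution of i to coalition S
                total += weight * (v(S + (i,)) - v(S))
        phi[i] = total
    return phi

# Toy additive "model": a coalition's value is the sum of fixed feature effects.
effects = {"age": 2.0, "chol": 1.0, "bp": 0.5}
v = lambda S: sum(effects[p] for p in S)
phi = shapley_values(list(effects), v)
# For a purely additive game, each feature's Shapley value equals its own effect,
# and the values sum to v(all features) - v(empty set).
```

For an additive model the result is trivial by design; the formula's value shows when features interact, which is exactly the case tree ensembles produce.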
The shap library's TreeExplainer supports many tree ensembles (GBDT, XGBoost, CatBoost, etc.), but not AdaBoost.
So when using SHAP to explain an AdaBoost-based coronary-heart-disease prediction model (dataset at the end of the post), it fails with:
shap.utils._exceptions.InvalidModelError: Model type not yet supported by TreeExplainer: <class 'sklearn.ensemble._weight_boosting.AdaBoostClassifier'>
Workaround: add AdaBoost support directly to the installed shap package, in the file
.conda\envs\<env-name>\Lib\site-packages\shap\explainers\_tree.py
I first added the code from a Tencent Cloud answer, but it still raised an error; following wangxiancao's blog, I then changed Tree to SingleTree on line 709, and it ran successfully.
The working code to add (note the module path must be `_weight_boosting`, matching the error message above, not `_weighted_boosting`):

```python
# added begin: AdaBoostClassifier support
elif safe_isinstance(model, ["sklearn.ensemble.AdaBoostClassifier",
                             "sklearn.ensemble._weight_boosting.AdaBoostClassifier"]):
    assert hasattr(model, "estimators_"), "Model has no `estimators_`! Have you called `model.fit`?"
    self.internal_dtype = model.estimators_[0].tree_.value.dtype.type
    self.input_dtype = np.float32
    scaling = 1.0 / len(model.estimators_)  # output is average of trees
    self.trees = [SingleTree(e.tree_, normalize=True, scaling=scaling)
                  for e in model.estimators_]
    # look up the base tree's split criterion (e.g. gini) to map it to an objective
    self.objective = objective_name_map.get(model.base_estimator_.criterion, None)
    self.tree_output = "probability"
# added end
```
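The patch treats the ensemble output as a plain average of each tree's leaf-probability vector: `normalize=True` turns leaf class counts into probabilities and `scaling = 1/len(estimators_)` averages them. (AdaBoost's own predict_proba uses weighted staged estimates, so this is an approximation.) A minimal sketch with invented leaf counts:

```python
# Hypothetical per-tree leaf values: class counts at the leaf a sample lands in.
leaf_counts = [
    [30.0, 10.0],   # tree 1: 30 class-0 samples, 10 class-1 samples at this leaf
    [5.0, 15.0],    # tree 2
    [20.0, 20.0],   # tree 3
]

n_trees = len(leaf_counts)
scaling = 1.0 / n_trees  # mirrors `scaling = 1.0 / len(model.estimators_)`

# normalize=True: each leaf's counts become class probabilities.
per_tree = [[c / sum(counts) for c in counts] for counts in leaf_counts]

# scaling spreads the ensemble output as an unweighted average over trees.
ensemble = [scaling * sum(tree[k] for tree in per_tree) for k in range(2)]
# ensemble is a valid probability vector (sums to 1), hence tree_output = "probability".
```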
Summary Plot: a global feature-importance plot.

```python
# Method 2: SHAP visual explanation
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(x_train)

# Summary Plot: global importance of each feature
# shap_values[0] gives the single-class (blue) bar plot;
# shap_values gives the two-class (blue + pink) bar plot
shap.summary_plot(shap_values, x_train, plot_type="bar", max_display=20)
shap.summary_plot(shap_values[0], x_train, plot_type="bar", max_display=20)
```

The two summary_plot calls produce the two output figures, one per call.
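The bar variant of the summary plot ranks features by their mean absolute SHAP value across samples. A stdlib sketch with invented SHAP values and feature names:

```python
# Made-up SHAP values: rows = samples, columns = features.
shap_vals = [
    [ 0.40, -0.10, 0.05],
    [-0.20,  0.30, 0.00],
    [ 0.10, -0.50, 0.05],
]
features = ["age", "chol", "bp"]

# Global importance of feature j = mean of |shap value| over all samples.
importance = {
    f: sum(abs(row[j]) for row in shap_vals) / len(shap_vals)
    for j, f in enumerate(features)
}
# Bar length in the plot corresponds to this value; features are sorted by it.
ranked = sorted(importance, key=importance.get, reverse=True)
```

Taking the absolute value matters: a feature that strongly pushes predictions down still counts as important, which the signed beeswarm variant then visualizes in detail.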
Dataset: Z-Alizadeh Sani, usable for binary coronary-heart-disease classification.
The extended dataset contains records for 303 subjects (216 patients + 87 healthy), each with 56 features.
Dataset page: Z-Alizadeh Sani - UCI Machine Learning Repository