[Datawhale Interpretable Machine Learning Notes] LIME

JeffDingAI

Last modified: 2022-12-22 08:37:57

First published: 2022-12-22 08:34:19

Article link: https://blog.csdn.net/yichao_ding/article/details/128404002

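This post introduces LIME, a general-purpose method for explaining the predictions of any classifier, particularly suited to text and image models. It highlights LIME's interpretability and local fidelity; although LIME is slow, being model-agnostic is a significant advantage. An example shows how to use LIME to explain and visualize a single prediction of a GBDT model.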

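Introduction

LIME (Local Interpretable Model-agnostic Explanations) is a local interpretability algorithm introduced by Marco Tulio Ribeiro et al. in the 2016 paper "Why Should I Trust You?": Explaining the Predictions of Any Classifier. It is mainly used with text and image models, and the lime package also supports tabular data, which is what the code example in this post uses.

Paper
"Why Should I Trust You?": Explaining the Predictions of Any Classifier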

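Key properties

Interpretable: the explanation is expressed in terms a person can understand, such as words or features with weights.
Local fidelity: the explanation approximates the model faithfully near the instance being explained, not globally.
Model-agnostic: the model is treated as a black box; only its predictions are needed.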
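The three properties above can be illustrated with a minimal from-scratch sketch of the idea behind LIME. This is not the lime package's implementation: the `black_box` function, the Gaussian perturbation, the kernel width, and the plain weighted least-squares fit are all assumptions chosen to keep the example short (the real library samples differently and fits a regularized linear model).

```python
import numpy as np

def black_box(X):
    """Hypothetical stand-in for any model's probability output (nonlinear in two features)."""
    return 1.0 / (1.0 + np.exp(-(X[:, 0] ** 2 - 2.0 * X[:, 1])))

def lime_sketch(predict_fn, instance, num_samples=2000, scale=0.5, kernel_width=0.75, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Perturb: sample points around the instance being explained.
    Z = instance + rng.normal(0.0, scale, size=(num_samples, instance.size))
    # 2. Query the black box (model-agnostic: only its outputs are used).
    target = predict_fn(Z)
    # 3. Weight samples by proximity to the instance (local fidelity).
    dist = np.linalg.norm(Z - instance, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Fit a weighted linear surrogate (interpretable: one slope per feature).
    A = np.hstack([Z - instance, np.ones((num_samples, 1))])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], target * sw, rcond=None)
    return coef[:-1], coef[-1]  # per-feature local slopes, local intercept

instance = np.array([1.0, 0.5])
slopes, intercept = lime_sketch(black_box, instance)
print("local slopes:", slopes, "intercept:", intercept)
```

The slopes play the role of the per-feature contributions that LIME reports: here the first feature pushes the prediction up near the instance and the second pushes it down.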

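Strengths and weaknesses

LIME is highly general (it works with any classifier) and gives good results in practice.
LIME is slow: each explanation samples many perturbed inputs and queries the model on all of them.
Directions for extending LIME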
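The speed issue comes from a trade-off: fewer perturbed samples make an explanation cheaper but less stable. The sketch below, which uses a hypothetical `black_box` and a simplified weighted least-squares surrogate rather than the lime package, repeats the same explanation with different random seeds and compares the spread of the fitted slopes for a small and a large sample count.

```python
import numpy as np

def black_box(X):
    """Hypothetical stand-in for an expensive model."""
    return 1.0 / (1.0 + np.exp(-(X[:, 0] ** 2 - 2.0 * X[:, 1])))

def local_slopes(predict_fn, instance, num_samples, seed):
    rng = np.random.default_rng(seed)
    Z = instance + rng.normal(0.0, 0.5, size=(num_samples, instance.size))
    y = predict_fn(Z)  # num_samples model evaluations per explanation
    w = np.exp(-np.sum((Z - instance) ** 2, axis=1) / 0.75 ** 2)
    A = np.hstack([Z - instance, np.ones((num_samples, 1))])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[:-1]

instance = np.array([1.0, 0.5])
spread = {}
for n in (50, 5000):
    runs = np.array([local_slopes(black_box, instance, n, seed) for seed in range(20)])
    spread[n] = runs.std(axis=0)  # variation of each slope across seeds
    print(n, "samples -> slope std across seeds:", spread[n])
```

With the larger sample count the slopes vary much less between runs, at the cost of proportionally more model evaluations per explanation.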

Code example

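import lime
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_20newsgroups
from sklearn.ensemble import GradientBoostingClassifier
from lime.lime_tabular import LimeTabularExplainer

# Load the data (two categories of 20 Newsgroups).
# Note: this text data is not used by the tabular GBDT example below.
categories = ['alt.atheism', 'soc.religion.christian']
newsgroups_train = fetch_20newsgroups(subset='train', categories=categories)
newsgroups_test = fetch_20newsgroups(subset='test', categories=categories)
newsgroup_class_names = ['atheism', 'christian']

# Use a GBDT classifier to predict whether a sample defaults.
# Assumption: `data` is a pandas DataFrame (the post does not show how it is loaded)
# whose first 8 columns are features and whose 9th column is the label.
x = data.iloc[:, :8].to_numpy()  # as_matrix() was removed in pandas 1.0
y = data.iloc[:, 8].to_numpy()
feature_names = data.columns[:8].tolist()
# Assumption: label 0 = no default, 1 = default (the newsgroup names above do not apply here).
class_names = ['no default', 'default']

gbdt = GradientBoostingClassifier()
gbdt = gbdt.fit(x, y)
# Score on the training data itself (training accuracy, not a held-out estimate)
train_accuracy = gbdt.score(x, y)

# Font for rendering Chinese labels (requires SimHei to be installed)
plt.rc('font', family='SimHei', size=13)
# Build the explainer
explainer = LimeTabularExplainer(x, feature_names=feature_names, class_names=class_names)
# Explain the sample at index 81
exp = explainer.explain_instance(x[81], gbdt.predict_proba)
# Plot the feature contributions
fig = exp.as_pyplot_figure()

# Render the full explanation (inside a Jupyter notebook)
exp.show_in_notebook(show_table=True, show_all=False)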