You need to install the transformers, shap, matplotlib, and ipython libraries, among others.
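Before running anything, it can help to check which of these packages are actually importable. The snippet below is a minimal sketch; the `REQUIRED` mapping and the `missing_packages` helper are names made up for this illustration, and the mapping only covers the four libraries listed above.

```python
from importlib.util import find_spec

# pip package name -> module name used in an import statement
# (ipython is installed as "ipython" but imported as "IPython")
REQUIRED = {
    "transformers": "transformers",
    "shap": "shap",
    "matplotlib": "matplotlib",
    "ipython": "IPython",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing, install with: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```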
When run in PyCharm, the text plot is printed only as
<IPython.core.display.HTML object>
and the plot itself renders properly only in a Jupyter notebook.
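If you have to stay outside Jupyter, one possible workaround is to ask SHAP for the HTML as a string and write it to a file that a browser can open. This is a sketch rather than part of the original workflow: `save_html` and the file name `shap_text.html` are hypothetical, and the commented usage relies on the `display=False` option of `shap.plots.text`, which returns the HTML instead of displaying it.

```python
from pathlib import Path


def save_html(html: str, path: str = "shap_text.html") -> Path:
    """Write an HTML fragment to a standalone file that a browser can open."""
    out = Path(path)
    # Wrap the fragment in a minimal page and declare UTF-8
    # so that Chinese tokens render correctly.
    page = (
        "<!DOCTYPE html><html><head><meta charset='utf-8'></head>"
        f"<body>{html}</body></html>"
    )
    out.write_text(page, encoding="utf-8")
    return out


# Hypothetical usage, once shap_values has been computed:
#   import webbrowser
#   html = shap.plots.text(shap_values, display=False)  # HTML as a string
#   webbrowser.open(save_html(html).resolve().as_uri())
```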
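import shap
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the fine-tuned two-class model and its tokenizer from a local directory
model_dir = "roberta/trained_model"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(model_dir, num_labels=2)

# top_k=None makes the pipeline return a score for every label;
# with top_k=1 only the top label's score is returned, so the other class gets no attribution
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer, top_k=None)

# Wrap the pipeline in a SHAP explainer and compute per-token attributions
explainer = shap.Explainer(classifier)
text = ["应用软件(Application software)是指专门用于满足用户特定需求的软件。"]
shap_values = explainer(text)

# Render the token-level explanation as HTML (displays in Jupyter)
shap.plots.text(shap_values)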
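from matplotlib import pyplot as plt

# Use a font that contains Chinese glyphs so the token labels on the plot render
plt.rcParams['font.sans-serif'] = 'AR PL UKai CN'
plt.rcParams['axes.unicode_minus'] = False  # keep the minus sign rendering correctly
# Bar plot of each token's contribution to class 1 for the first sample, in descending order
shap.plots.bar(shap_values[0, :, 1], order=shap.Explanation.argsort.flip)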
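Here, the line plt.rcParams['font.sans-serif'] = 'AR PL UKai CN'
tells matplotlib to use a font with Chinese glyphs, so that Chinese text displays correctly in the plot in Jupyter. The font must be installed on your system. The following code lists all the fonts that matplotlib can find on the system.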
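from matplotlib.font_manager import FontManager

# Build a font manager and collect the name of every font it finds
fm = FontManager()
my_fonts = set(f.name for f in fm.ttflist)
my_fonts  # in a notebook, the last expression is displayed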
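Since 'AR PL UKai CN' may not be installed on every machine, the font listing can also be used to choose a Chinese-capable font automatically. The following is a rough sketch under stated assumptions: `pick_font`, `configure_chinese_font`, and the `CJK_CANDIDATES` list are hypothetical names introduced here, and the candidate fonts are only common examples that may or may not exist on your system.

```python
def pick_font(available, candidates):
    """Return the first candidate font name present in `available`, or None."""
    available = set(available)
    for name in candidates:
        if name in available:
            return name
    return None


# Assumed candidate list; which of these exist depends on the machine.
CJK_CANDIDATES = [
    "AR PL UKai CN",
    "Noto Sans CJK SC",
    "SimHei",
    "Microsoft YaHei",
    "PingFang SC",
]


def configure_chinese_font(candidates=CJK_CANDIDATES):
    """Point matplotlib at the first installed candidate font, if any."""
    # Imported lazily so pick_font stays usable without matplotlib.
    from matplotlib import font_manager, pyplot as plt

    installed = {f.name for f in font_manager.fontManager.ttflist}
    name = pick_font(installed, candidates)
    if name is not None:
        plt.rcParams["font.sans-serif"] = [name]
        plt.rcParams["axes.unicode_minus"] = False
    return name
```

Calling `configure_chinese_font()` before plotting returns the chosen font name, or `None` if no candidate is installed, in which case you would need to install a Chinese font or extend the list.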