- These are notes on Hung-yi Lee's (李宏毅) 2021 ML course.
Explainable AI: Why Does the Model Make This Prediction
Why Do We Need Explainable ML?
- Loan issuers are required by law to explain their models.
- A medical diagnosis model is responsible for human lives. Can it be a black box?
- If a model is used at the court, we must make sure the model behaves in a nondiscriminatory manner.
- If a self-driving car suddenly acts abnormally, we need to explain why.
- With explainable ML, we can improve the model based on the explanation.
Interpretable vs. Powerful
- Some models are intrinsically interpretable, but not very powerful.
- For example, a linear model: from the weights, you can tell the importance of each feature.
- Deep networks are difficult to interpret. They are black boxes … but more powerful than a linear model.
Let's make deep networks explainable.
Decision Tree
- Are there models that are interpretable and powerful at the same time? How about a decision tree?
- Decision tree is all you need!?
- A tree can still be terrible!
- In practice we usually use a random forest of many trees. But how do we explain a whole forest?
Goal of Explainable ML
- Do we need to completely know how an ML model works? We do not completely know how brains work either, yet we trust the decisions of humans!
- Make people (your customers, your boss, yourself) comfortable…
Explainable ML has two categories:
- Local Explanation: why do you think this image is a cat? (explain one decision)
- Global Explanation: what does a "cat" look like to the model? (explain the whole model)
Local Explanation: Explain the Decision
Question: Why do you think this image is a cat?
Which component is critical for making the decision?
- Removing or modifying the components
- Large decision change $\Rightarrow$ important component
- In the figure below, a square patch is used to cover part of the image. The heatmap shows the probability that the model outputs the correct label when the patch is at each position: red means high probability, blue means low, so blue regions mark the components the decision depends on.
- In the figure below, we compute the partial derivative of the loss with respect to each pixel of the input; the resulting map of gradient magnitudes is the Saliency Map.
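Concretely, this computation can be sketched in PyTorch as follows. This is a minimal illustration, not the lecture's code; `model`, `image`, and `label` are placeholders for your own classifier and data:

```python
import torch
import torch.nn.functional as F

def saliency_map(model, image, label):
    """|d loss / d pixel| for one image; image: (1, C, H, W), label: (1,)."""
    model.eval()
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Absolute gradient, max over color channels -> (H, W) heatmap.
    return image.grad.abs().max(dim=1).values.squeeze(0)
```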
Case Study: Pokémon vs. Digimon
- Task: binary classification of Pokémon vs. Digimon images
- Experimental Results: Training Accuracy: 98.9%; Testing Accuracy: 98.4% – Amazing!!!
- But what about the Saliency Map? As the figure below shows, the bright spots concentrate in the four corners of each image, not on the Digimon or Pokémon themselves!
- What happened? All the images of Pokémon are PNG, while most images of Digimon are JPEG. PNG's transparent background is loaded as black, so the machine discriminates Pokémon from Digimon based on the background colors, not the creatures.
More Examples …
- PASCAL VOC 2007 data set (the machine is actually attending to the website's watermark…) (Correct answers $\neq$ intelligent)
Limitation: Noisy Gradient
- Directly drawing a Saliency Map can produce a lot of noise; SmoothGrad helps in this case.
- SmoothGrad: Randomly add noises to the input image, get saliency maps of the noisy images, and average them.
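A short sketch of SmoothGrad, reusing the `saliency_map` function from the sketch above; the sample count and noise scale are illustrative choices, not values from the lecture:

```python
import torch

def smoothgrad(model, image, label, n_samples=50, sigma=0.1):
    """Average the saliency maps of noisy copies of the input."""
    maps = [saliency_map(model, image + sigma * torch.randn_like(image), label)
            for _ in range(n_samples)]
    return torch.stack(maps).mean(dim=0)
```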
Limitation: Gradient Saturation
- Gradient cannot always reflect importance. For example, past a certain length, making an elephant's trunk longer barely changes the "elephant" score, so the gradient is near zero even though trunk length clearly matters.
- Alternative: Integrated gradient (IG)
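For reference, a rough sketch of the standard Riemann-sum approximation of Integrated Gradients; the all-black baseline and step count are common choices assumed here, not prescribed by the lecture:

```python
import torch

def integrated_gradients(model, image, target, baseline=None, steps=50):
    """Approximate IG along a straight path from baseline to image."""
    if baseline is None:
        baseline = torch.zeros_like(image)  # common baseline: all-black image
    total_grad = torch.zeros_like(image)
    for k in range(1, steps + 1):
        # Point at fraction k/steps along the path.
        x = (baseline + (k / steps) * (image - baseline)).detach().requires_grad_(True)
        model(x)[0, target].backward()  # gradient of the target-class score
        total_grad += x.grad
    return (image - baseline) * total_grad / steps
```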
How does a network process the input data?
Visualization
Speech processing
- It was found that when different speakers say the same sentence, their features at the 8th hidden layer are very close to each other.
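One plausible way to reproduce this kind of visualization is to project hidden-layer features to 2-D with t-SNE and color points by speaker; the arrays below are random placeholders for real layer-8 activations:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Placeholders: (N, D) hidden activations and the speaker of each utterance.
hidden_feats = np.random.randn(200, 512)
speaker_ids = np.random.randint(0, 5, size=200)

points = TSNE(n_components=2, init="pca").fit_transform(hidden_feats)
plt.scatter(points[:, 0], points[:, 1], c=speaker_ids, cmap="tab10")
plt.title("Hidden-layer features, colored by speaker")
plt.show()
```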
Attention
Probing
- Probe: a classifier (take the embeddings from an intermediate layer of the model, attach a classifier on top, and see how well it performs)
- Probing is not limited to classifiers, either. For example, in the figure below, the model being trained converts speech signals to text, so it should strip away speaker information. We can attach a TTS model to a hidden layer: if the reconstructed speech says the same words as the original but without the original speaker's voice characteristics, then the model we are training is doing its job.
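A minimal probing sketch under the "probe = classifier" definition above: fit a logistic-regression probe on hidden-layer embeddings and check its accuracy. The arrays are random placeholders for real activations and labels:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholders: (N, D) hidden-layer embeddings and a property we suspect is
# encoded there (e.g., phoneme class or speaker identity).
embeddings = np.random.randn(1000, 256)
labels = np.random.randint(0, 10, size=1000)

X_tr, X_te, y_tr, y_te = train_test_split(embeddings, labels, test_size=0.2)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("probe accuracy:", probe.score(X_te, y_te))
# High accuracy => that information is linearly decodable from this layer.
```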
Global Explanation: Explain the Whole Model
Question: What does a “cat” look like?
What does a filter detect?
- Given an image $X$: if the elements of the feature map produced when $X$ passes through a filter are large, then $X$ matches the pattern that filter detects. Using this, we can directly construct, by gradient ascent, the image $X^*$ that best matches the filter's pattern:

  $$X^* = \arg\max_X \sum_i \sum_j a_{ij}$$

  where $a_{ij}$ are the elements of the filter's feature map; this $X^*$ contains the patterns the filter can detect.
- E.g., digit classifier: $X^*$ for each filter
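A PyTorch sketch of this gradient ascent; `feature_extractor` is an assumed helper that runs the model up to the convolutional layer of interest and returns its feature maps:

```python
import torch

def visualize_filter(feature_extractor, k, shape=(1, 1, 28, 28), steps=100, lr=0.1):
    """Gradient ascent on the input to maximize sum_ij a_ij of filter k.

    feature_extractor(x) -> feature maps of shape (B, n_filters, H', W').
    """
    x = torch.zeros(shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        activation = feature_extractor(x)[0, k].sum()  # sum_i sum_j a_ij
        (-activation).backward()  # ascend by descending the negative
        opt.step()
    return x.detach()
```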
What does a digit look like to a CNN?
- Similar to the filter-pattern visualization, we can also directly construct an image $X^*$ that maximizes the model's output $y_i$ for some class $i$; this $X^*$ might then show what a class-$i$ image looks like in the model's mind:

  $$X^* = \arg\max_X y_i$$
- As the figure above shows, solving the optimization problem $\arg\max_X y_i$ directly yields images that are pure noise. We can add a regularization term to make the images look more like digits (to make people comfortable…). The regularizer $R(X)$ below keeps the number of white pixels small, since a handwritten digit has few strokes and thus few white pixels:

  $$X^* = \arg\max_X \left( y_i + R(X) \right), \qquad R(X) = -\sum_{i,j} |X_{ij}|$$
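The same gradient-ascent recipe with the regularizer added, as a sketch; the weight `lam` and optimizer settings are illustrative assumptions:

```python
import torch

def class_image(model, i, shape=(1, 1, 28, 28), steps=200, lr=0.1, lam=0.01):
    """X* = argmax_X (y_i + R(X)), with R(X) = -lam * sum_ij |X_ij|."""
    x = torch.zeros(shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        y_i = model(x)[0, i]        # class score (logit)
        r = -lam * x.abs().sum()    # R(X): penalize bright pixels
        (-(y_i + r)).backward()
        opt.step()
    return x.detach()
```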
- With more carefully designed regularizers (injecting priors about what real images look like), the visualizations can look much better:
The top left is a flamingo, the bottom left a beetle.
- Constraint from Generator: we can also find $X^*$ through a trained image generator (GAN, VAE…): optimize the latent code $z$ instead of the pixels, i.e. $z^* = \arg\max_z y_i$ with $X = G(z)$, and take $X^* = G(z^*)$:
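A sketch of this generator-constrained version; `generator` and `classifier` are assumed pre-trained models, and the latent dimension is an illustrative choice:

```python
import torch

def class_image_with_generator(generator, classifier, i, z_dim=100, steps=200, lr=0.05):
    """z* = argmax_z y_i(G(z)), then X* = G(z*)."""
    z = torch.randn(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        score = classifier(generator(z))[0, i]  # class score of generated image
        (-score).backward()
        opt.step()
    return generator(z).detach()
```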
Outlook
- Using an interpretable model to mimic the behavior of an uninterpretable model.
- But a linear model obviously cannot match a NN's behavior globally, which is why we have Local Interpretable Model-Agnostic Explanations (LIME): fit the interpretable model only locally, around a single input (a from-scratch sketch follows the links below).
- https://youtu.be/K1mWgthGS-A
- https://youtu.be/OjqIVSwly4k
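For reference, a from-scratch sketch of the LIME idea for tabular inputs; the Gaussian sampling and kernel width are simplifying assumptions, and the real `lime` library is more elaborate:

```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_explain(black_box, x, n_samples=1000, sigma=1.0, width=0.75):
    """Fit a local linear surrogate around one input x (shape: (D,)).

    black_box(X) -> (N,) scores for the class of interest.
    Returns linear coefficients = local feature importances.
    """
    X = x + sigma * np.random.randn(n_samples, x.shape[0])  # perturb around x
    y = black_box(X)                                        # query the black box
    d = np.linalg.norm(X - x, axis=1)
    w = np.exp(-(d ** 2) / width ** 2)                      # proximity kernel
    surrogate = Ridge(alpha=1.0).fit(X, y, sample_weight=w)
    return surrogate.coef_
```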