Paper Reading Notes on Out-of-Distribution (OOD) Detection / Anomaly Detection

supergxt

Last modified: 2022-04-05 11:07:01

Categories: Paper Reading, Anomaly Detection. Tags: Machine Learning, Deep Learning

First published: 2021-06-17 10:31:50

Article link: https://blog.csdn.net/supergxt/article/details/117984137

Learning Confidence for Out-of-Distribution Detection in Neural Networks

Authors: Terrance DeVries, Graham W. Taylor
Paper: https://arxiv.org/abs/1802.04865

OOD : where a network must determine whether or not an input is outside of the set on which it is expected to safely perform

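The paper notes that any agent has limits to what it knows, and that recognizing those limits is what makes it possible to minimize potential risk.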

Proposed:

1 Train neural network classifiers to output a confidence estimate for each input, and use it to differentiate in-distribution from out-of-distribution examples.

2 Misclassified in-distribution examples can be used as a proxy for out-of-distribution examples when calibrating OOD detectors.

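Motivation:

The motivation is a student answering questions on a test. For a question the student is unsure about, they may ask for a hint, but every hint incurs a penalty. High-confidence questions need no hints; low-confidence questions do. Once all questions have been answered, the number of hints requested (equivalently, the total penalty incurred) indicates how confident the model is.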

Model:

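A confidence estimation branch is added after the penultimate layer, so the network has a prediction branch and a confidence branch. They output the prediction probabilities p and the confidence estimate c, respectively:
$p_i, c \in [0, 1], \quad \sum_{i=1}^{M} p_i = 1.$
The hints from the motivation are implemented by interpolating between the original prediction and the target probability distribution y:
$p_i' = c \cdot p_i + (1 - c) \cdot y_i$
This formula is rather neat. The confidence c controls how much of the adjusted prediction comes from hints: if the model's confidence for an input is 1, the adjusted prediction is just the model's own prediction; if the confidence is 0, the model is treated as completely untrustworthy and the adjusted prediction is the ground-truth label.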
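To make the interpolation concrete, here is a minimal sketch in plain Python. The function name `interpolate_with_hint` and the example vectors are illustrative assumptions, not taken from the paper.

```python
def interpolate_with_hint(p, y, c):
    """Blend predicted probabilities p with the one-hot target y using confidence c."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("confidence c must lie in [0, 1]")
    return [c * p_i + (1.0 - c) * y_i for p_i, y_i in zip(p, y)]


p = [0.2, 0.7, 0.1]  # hypothetical model prediction (sums to 1)
y = [1.0, 0.0, 0.0]  # one-hot ground-truth label

# c = 1 keeps the model's prediction, c = 0 returns the label,
# values in between mix the two.
for c in (1.0, 0.5, 0.0):
    print(c, interpolate_with_hint(p, y, c))
```

Because both p and y sum to 1, the blended vector is still a valid probability distribution for any c in [0, 1].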

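The task loss is the usual softmax (negative log-likelihood) loss, except that it is computed on the adjusted prediction probabilities. Using this loss alone has a problem: the model would learn to push c very low, so that its output is always the ground-truth label and the loss keeps shrinking. So c has to be constrained by adding a log penalty, the confidence loss. The total loss is their weighted sum:
$L_s = -\sum_{i=1}^{M} \log(p_i') \, y_i$
$L_c = -\log(c)$
$L = L_s + \lambda L_c$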
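The trade-off between the two loss terms can be checked numerically. The sketch below computes both terms for a single example; the function name `confidence_loss`, the clamping constant `eps`, and the sample values are assumptions made for illustration.

```python
import math


def confidence_loss(p, y, c, lam, eps=1e-12):
    """Return (total, task_loss, conf_loss) for one example.

    p: predicted probabilities, y: one-hot target, c: confidence in [0, 1],
    lam: weight of the confidence penalty. eps guards against log(0).
    """
    p_adj = [c * p_i + (1.0 - c) * y_i for p_i, y_i in zip(p, y)]
    task_loss = -sum(y_i * math.log(max(q_i, eps)) for q_i, y_i in zip(p_adj, y))
    conf_loss = -math.log(max(c, eps))
    return task_loss + lam * conf_loss, task_loss, conf_loss


p = [0.2, 0.7, 0.1]  # a wrong prediction: the true class is index 0
y = [1.0, 0.0, 0.0]

for c in (1.0, 0.5, 0.1):
    total, task, conf = confidence_loss(p, y, c, lam=0.5)
    print(f"c={c:.1f}  task={task:.3f}  conf={conf:.3f}  total={total:.3f}")
```

Lowering c shrinks the task loss (the hint does more of the work) while the confidence penalty grows, which is the tension lambda is meant to balance.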

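Training details

Three key points:

1 Choosing the hyperparameter lambda

During training it is easy for c to converge to 1 for every sample. Hints are then never used, and the model is no different from an ordinary softmax classifier. Ideally, correctly predicted samples should have c -> 1 and incorrectly predicted samples c -> 0.

The paper introduces a budget beta, a preset parameter for the amount of confidence penalty the network is allowed to incur. In practice: when L_c > beta, increase lambda to bring L_c down; when L_c < beta, decrease lambda. The net effect is to keep L_c close to the target beta.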
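The budget rule can be sketched as a small update function. The multiplicative step `factor=1.01`, the function name `update_lambda`, and the sequence of observed L_c values are illustrative assumptions; the notes above only fix the direction of each adjustment.

```python
def update_lambda(lam, conf_loss, beta, factor=1.01):
    """Nudge lambda so that the confidence loss L_c stays near the budget beta."""
    if conf_loss > beta:
        return lam * factor  # penalty too high: make hints more expensive
    if conf_loss < beta:
        return lam / factor  # penalty too low: make hints cheaper
    return lam


lam = 0.1
for observed_lc in (0.9, 0.6, 0.2, 0.1):  # hypothetical per-iteration L_c values
    lam = update_lambda(lam, observed_lc, beta=0.3)
    print(f"L_c={observed_lc:.1f} -> lambda={lam:.5f}")
```

In a real training loop this update would run once per iteration, after the losses for the current batch have been computed.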

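2 Avoiding excessive regularization

Confidence learning acts as a strong regularizer. In some tasks that is a useful guard against overfitting, but it also risks underfitting (the model does not bother to learn complex decision boundaries). The paper therefore applies the hint-adjusted loss described earlier to only half of the data in each batch; the other half uses the standard softmax loss (so only part of the data can receive hints).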
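One way to picture the half-batch rule is to force c to 1 for the samples that do not receive hints, so their adjusted prediction equals the raw prediction. The sketch below does this with the standard library; the helper name `apply_hints_to_half_batch`, the random selection via `random.Random.sample`, and the toy batch are assumptions for illustration.

```python
import random


def apply_hints_to_half_batch(probs, targets, confidences, rng):
    """Interpolate toward the target for a random half of the batch only.

    Samples outside the chosen half use an effective confidence of 1.0,
    which leaves their predictions unchanged (plain softmax loss).
    """
    n = len(probs)
    hinted = set(rng.sample(range(n), n // 2))
    adjusted = []
    for i, (p, y, c) in enumerate(zip(probs, targets, confidences)):
        c_eff = c if i in hinted else 1.0
        adjusted.append([c_eff * p_k + (1.0 - c_eff) * y_k for p_k, y_k in zip(p, y)])
    return adjusted, hinted


probs = [[0.6, 0.4], [0.3, 0.7], [0.9, 0.1], [0.5, 0.5]]
targets = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
confidences = [0.8, 0.2, 0.4, 0.5]

adjusted, hinted = apply_hints_to_half_batch(probs, targets, confidences, random.Random(0))
print("hinted indices:", sorted(hinted))
for row in adjusted:
    print(row)
```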

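3 Retaining misclassified training examples

When a heavily parameterized model is trained on a small dataset, it overfits the training set, so that every training sample ends up classified correctly. In this method, though, misclassified samples matter just as much, because they strongly influence c. The paper therefore uses data augmentation during training.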

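Additional techniques

1 Input preprocessing

The authors want to widen the gap between in-distribution and out-of-distribution inputs before they are scored. The paper takes inspiration from the Fast Gradient Sign Method (FGSM), an adversarial-example technique that adds a small perturbation to the input to make the model more likely to misclassify it. Here the image is instead perturbed in the direction that raises the model's confidence, and in-distribution data gains more confidence than out-of-distribution data.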

we observe that in-distribution examples increase in confidence more than out-of-distribution examples using this procedure, resulting in an easier separation of the two distributions
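The preprocessing step can be illustrated on a toy model where the gradient is available in closed form. The sketch assumes a hypothetical confidence model c = sigmoid(w·x + b), for which the gradient of L_c = -log(c) with respect to x is -(1 - c)·w; the update x' = x - epsilon · sign(∇_x L_c) then moves x toward higher confidence. In a real network the gradient would come from backpropagation, and all names and values here are illustrative.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def sign(v):
    return (v > 0) - (v < 0)


def confidence(x, w, b):
    """Toy confidence model (an assumption): c = sigmoid(w . x + b)."""
    return sigmoid(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)


def preprocess(x, w, b, epsilon):
    """Take one signed-gradient step that lowers L_c = -log(c)."""
    c = confidence(x, w, b)
    grad = [-(1.0 - c) * w_i for w_i in w]  # analytic gradient of L_c w.r.t. x
    return [x_i - epsilon * sign(g_i) for x_i, g_i in zip(x, grad)]


w, b = [1.0, -2.0, 0.5], 0.0
x = [0.1, 0.2, -0.3]
x_new = preprocess(x, w, b, epsilon=0.1)

print("confidence before:", confidence(x, w, b))
print("confidence after: ", confidence(x_new, w, b))
```

The sign of the step is the opposite of the usual FGSM attack: the perturbation decreases the confidence loss instead of increasing a classification loss.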

Experimental datasets

In-distribution datasets: SVHN, CIFAR-10

Out-of-distribution datasets: TinyImageNet, LSUN, iSUN, Uniform Noise, Gaussian Noise

Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks (ICLR 2018)

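Problem statement:

1 The paper again opens by introducing OOD: "However, when deploying neural networks in real-world applications, there is often very little control over the testing data distribution. Recent works have shown that neural networks tend to make high confidence predictions even for completely unrecognizable or irrelevant inputs." (That is, inputs that are barely recognizable or belong to entirely unrelated classes can still receive high confidence. The starting point here is still the most basic OOD approach: deciding by confidence.)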

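This point reminded me of an example from Prof. Hung-yi Lee's lecture on OOD (his lectures really are excellent). In a classification task on characters from The Simpsons, feeding in a character from a different cartoon produced a maximum softmax output of 0.99. The model was highly confident about this sample even though it should have been treated as out of distribution, which is a vivid instance of the problem described above.

2 Work before 2018 still required the training set to include some OOD data (which is hard to obtain). The paper also notes that preserving performance on both ID and OOD data at the same time requires a large network architecture.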

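Foundations of the method:

1 The paper "A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks" shows that no retraining is needed: a well-trained neural network tends to assign higher softmax scores to ID examples than to OOD examples.

2 Temperature scaling and small controlled perturbations of the input widen the softmax score gap between ID and OOD examples.

3 A pretrained model plus the two techniques above is enough to improve detection performance.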
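The baseline idea in point 1 amounts to thresholding the maximum softmax score. A minimal sketch follows; the logits and the threshold of 0.9 are made-up values, and in practice the threshold would be chosen on validation data.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_in_distribution(logits, threshold):
    """Flag an input as ID when its maximum softmax score reaches the threshold."""
    return max(softmax(logits)) >= threshold


peaked_logits = [8.0, 0.5, 0.1]  # hypothetical ID-like input: one class dominates
flat_logits = [1.0, 0.9, 1.1]    # hypothetical OOD-like input: no clear winner

print(is_in_distribution(peaked_logits, threshold=0.9))
print(is_in_distribution(flat_logits, threshold=0.9))
```

The Simpsons example earlier shows where this baseline fails: an OOD input can still produce a peaked softmax, which is what the temperature and perturbation steps try to mitigate.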

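Core of the method:

1 Temperature scaling

For details, see:

On Calibration of Modern Neural Networks

First, a look at what temperature scaling does; this requires some background on knowledge distillation and calibration. The paper proposes that temperature scaling helps separate the maximum softmax scores of ID and OOD inputs. So what is model calibration?

A model normally outputs the index of the largest softmax value, i.e. the predicted class. If we also want the model to report how confident it is in that prediction, the confidence should be calibrated, meaning it matches the actual probability of being correct. For example, in a classification task, collect all samples that the model predicts as class A with a confidence score of 90%, N samples in total, and then count their true classes. If 90% of those N samples really belong to class A, the model is calibrated.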
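The calibration check described above can be sketched by grouping predictions into confidence bins and comparing each bin's average confidence with its accuracy. The function name `reliability_bins`, the choice of 10 equal-width bins, and the synthetic data are assumptions for illustration.

```python
def reliability_bins(confidences, correct, n_bins=10):
    """Group predictions into equal-width confidence bins.

    Returns one (avg_confidence, accuracy, count) tuple per non-empty bin,
    ordered from the lowest to the highest confidence bin.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    rows = []
    for members in bins:
        if members:
            avg_conf = sum(c for c, _ in members) / len(members)
            acc = sum(1 for _, ok in members if ok) / len(members)
            rows.append((avg_conf, acc, len(members)))
    return rows


# Synthetic predictions: 10 at confidence 0.8 (6 correct), 10 at 0.9 (9 correct).
confidences = [0.8] * 10 + [0.9] * 10
correct = [True] * 6 + [False] * 4 + [True] * 9 + [False]

for avg_conf, acc, count in reliability_bins(confidences, correct):
    print(f"confidence={avg_conf:.2f}  accuracy={acc:.2f}  gap={avg_conf - acc:+.2f}  n={count}")
```

A positive gap means the bin is overconfident; a calibrated model has gaps close to zero in every bin.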

[Figure: model confidence (x-axis) versus accuracy (y-axis)]

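In the figure, the x and y axes are confidence and accuracy, the blue bars are the model's outputs, and the gray line represents perfect calibration. The model shown is overconfident (its output confidence is higher than its actual accuracy). For example, among the outputs with confidence 0.8 we would expect 80% to be classified correctly, but only 60% actually are; in other words, "the network is overconfident". With temperature T, the output confidence for class i is:
$\hat{y}_i = \frac{\exp(z_i/T)}{\sum_j \exp(z_j/T)}$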
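The effect of T in this formula is easy to see numerically: T = 1 recovers the ordinary softmax, and larger T flattens the distribution and lowers the maximum score. The sketch below uses made-up logits and temperatures.

```python
import math


def softmax_with_temperature(logits, T=1.0):
    """Compute exp(z_i / T) / sum_j exp(z_j / T) in a numerically stable way."""
    if T <= 0:
        raise ValueError("temperature T must be positive")
    scaled = [z / T for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 1.0, 0.5]  # hypothetical logits z

for T in (1.0, 2.0, 10.0):
    probs = softmax_with_temperature(logits, T)
    print(f"T={T:>4}: max score={max(probs):.3f}")
```

Dividing by T does not change which class has the largest score, so the predicted class is unaffected; only the confidence attached to it changes.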
