The Robustness of Deep Networks A geometrical perspective论文解读

sci论文地址

Introduction

神经网络近些年已经在许多场景中得到了应用,但是一些神经网络的基础属性还没有被理解,这也是近些年研究的重点。更具体来说,神经网络对各种类型perturbation的robustness受到了很大的关注。图1阐述了深度神经网络对small additive perturbation的脆弱性:

在这里插入图片描述
A dual phenomenon was observed in [3],这时人眼无法识别的图片会被深度神经网络以很高的confidence进行分类。

The fundamental challenges raised by the robustness of deep networks to perturbation已经催生出了许多重要的工作。这些工作从实验上和理论上对深度网络在不同种类的perturbation上的robustness进行了研究,这些perturbation包括adversarial perturbations,additive random noise,structured transformations甚至universal perturbation。The robustness is usually measured as the sensitivity of the discrete classification function(即the function that assigns a label to each image)to such perturbation。尽管对于robustness的分析并不是一个新的问题,we provide an overview of the recent works that propose to assess the vulnerability of deep network architectures。除了量化在各种perturbations下网络的robustness,对于robustness的分析has further contributed to developing important insights on the geometry of the complex decision boundary of such classifier, which remain hardly understood due to the very high dimensionality of the problems that they address。实际上,一个分类器的robustness性质和决策边界的几何性质是紧密相关的。例如,the high instability of 深度神经网络to 对抗扰动揭示出数据点reside extremely close to the classifier‘s decision boundary。对于robustness的研究因此不仅能帮助我们从实际角度理解系统的稳定性,还能允许我们理解分类区域的几何特征并且使得我们derives insight toward the improvement of current architectures。

整篇文章有多个目的。First, it provides an accessible review of the recent works in the analysis of the robustness of deep neural network classifiers to different forms of perturbations, with a particular emphasis on image analysis and visual understanding applications. Second, it presents connections between the robustness of deep networks and the geometry of the decision boundaries of such classifiers. Third, the article discusses ways to improve the robustness in deep networks architectures and finally highlights some of the important open problems.

Robustness of classifiers

在大多数的分类设定中,测试数据集中误分类的比例是用于评价分类器的主要performance metric。The empirical test error提供了关于分类器的risk, defined as the probability of misclassification,when considering samples from the data distribution。正式来讲,将 μ \mu μ 定义为a distribution defined over images。分类器 f f f的risk等价于:

R ( f ) = P x ∼ μ ( f ( x ) ≠ y ( x ) ) ( 1 ) R(f)=\mathbb{P}_{x\sim\mu}(f(x)\ne y(x))\quad\quad\quad\quad\quad\quad(1) R(f)=Pxμ(f(x)=y(x))(1)

x x x y ( x ) y(x) y(x)分别对应于图片和图片的label。While the risk captures the error of f f f on the data distribution μ \mu μ,它并没有捕捉到the robustness to small arbitrary perturbations of data points。在视觉分类任务中,现在一个比较迫切的任务是学习到classifiers that achieve robustness to small perturbation to images。

这里定义一些符号,让 X \mathcal{X} X表示the ambient space where images live。让 R \mathcal{R} R表示我们所容许的扰动集合。例如,当考虑geometric perturbation时, R \mathcal{R} R is set to be the group of geometric(即 affine)transformation under study。如果我们想要衡量在任意additive perturbation下的robustness,那么我们设定 R = X \mathcal{R}=\mathcal{X} R=X。对 r ∈ R r\in\mathcal{R} rR,我们定义 T r : X → X T_r:\mathcal{X}\to\mathcal{X} Tr:XXto be the perturbation operator by r r r,对于一个数据点 x ∈ X x\in\mathcal{X} xX T r ( x ) T_r(x) Tr(x)表示被 r r r扰动后的图片 x x x。使用上面这些notations,我们将改变判别器关于图片 x x x的label的最小perturbation定义为:

r ∗ ( x ) = arg min ⁡ r ∈ R ∥ r ∥ R s u b j e c t   t o f ( T r ( x ) ) ≠ f ( x ) ( 2 ) r^*(x)=\argmin\limits_{r\in\mathcal{R}}\Vert r\Vert_{R}\quad\quad\quad\quad\quad\quad\quad\quad\quad\\subject\ to\quad f(T_r(x))\ne f(x)\quad\quad\quad\quad\quad\quad\quad\quad\quad(2) r(x)=rRargminrRsubject tof(Tr(x))=f(x)(2)

这里 ∥ ⋅ ∥ R \Vert \cdot\Vert_R R R \mathcal{R} R 上的metric。为了notation simplicity,我们省略了the dependence of r ∗ ( x ) r^*(x) r(x) on f f f, R \mathcal{R} R, δ \delta δand operator T T T。Moreover,当图片 x x x is clear from the context,we will use r ∗ r^* r to refer to r ∗ ( x ) r^*(x) r(x)。图2对扰动过程进行了阐述:

在这里插入图片描述
The pointwise robustness of f f f at x x x 因此通过 ∥ r ∗ ( x ) ∥ R \Vert r^*(x)\Vert_{\mathcal{R}} r(x)R进行衡量。注意 ∥ ⋅ ∥ R \Vert\cdot\Vert_{\mathcal{R}} R值越大表明 x x x处的robustness越高。尽管这种robustness的定义考虑到了使得分类器 f f f改变分类的结果的smallest perturbation r ∗ ( x ) r^*(x) r(x),其他的工作却采用了一些稍微不同的定义,这些工作中a ‘sufficiently small’ perturbation is sought(instead of the minimal one)[7-9]。为了衡量分类器 f f f的global robustness,我们可以计算 ∥ r ∗ ( x ) ∥ R \Vert r^*(x)\Vert_{\mathcal{R}} r(x)R在数据分布上的期望。这表明global robustness ρ ( f ) \rho(f) ρ(f)可以通过如下的方式进行定义:

ρ ( f ) = E x ∼ μ ( ∥ r ∗ ( x ) ∥ R ( 3 ) \rho(f)=\mathbb{E}_{x\sim\mu}(\Vert r^*(x)\Vert_{\mathcal{R}}\quad\quad\quad\quad\quad\quad(3) ρ(f)=Exμ(r(x)R(3)

需要注意的比较重要的一点是,在我们的robustness setting中,the perturbed point T r ( x ) T_r(x) Tr(x) need not belong to the support of the data distribution。因此,尽管the focus of the risk in(1)is the accuracy on typical images(从 μ \mu μ中进行采样),the focus of the robustness computed from (2)is instead on the distance to the ‘closest’ image(potentially outside the support of μ \mu μ)that changes the label of the classifier。The risk and robustness hence capture two fundamentally different properties of the classifier,如如下所述:

在这里插入图片描述
Observe that classification robustness和支持向量机classifier有较强的关联,后者的目标是最大化robustness,defined as the margin between support vectors. Importantly, the max-margin classifier in a given family of classifiers might, however, still not achieve robustness (in the sense of high ρ ( f ) \rho(f) ρ(f)). An illustration is provided in “Robustness and Risk: A Toy Example,” where a no zero-risk linear classifier—in particular, the max-margin classifier— achieves robustness to perturbations. Our focus in this article is turned toward assessing the robustness of the family of deep neural network classifiers that are used in many visual recognition tasks.

Perturbation forms

Robustness to additive perturbations

We first start by considering the case where the perturbation operator is simply additive,即 T r ( x ) = x + r T_r(x)=x+r Tr(x)=x+r。在这种情况下, the magnitude of the perturbation can be measured with the l p l_p lp norm of the minimal perturbation that is necessary to change the label of a classifier. According to (2), the robustness to additive perturbations of a data point x is defined as:

min ⁡ r ∈ R ∥ r ∥ p s u b j e c t   t o f ( x + r ) ≠ f ( x ) ( 4 ) \min\limits_{r\in\mathcal{R}}\Vert r\Vert_p\quad\quad\quad\quad\quad\quad\quad\quad\\subject\ to\quad f(x+r)\ne f(x)\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad(4) rRminrpsubject tof(x+r)=f(x)(4)

Depending on the conditions that one sets on the set R \mathcal{R} R that supports the perturbations, the additive model leads to different forms of robustness.

Adversarial perturbations

我们首先考虑additive perturbations不受限制的情况,即 R = X \mathcal{R}=\mathcal{X} R=X。通过解决(4)得到的perturbation通常被称作adversarial perturbation, as it corresponds to the perturbation that an adversary (having full knowledge of the model) would apply to change the label of the classifier, while causing minimal changes to the original image.

The optimization problem in (4) is nonconvex, as the constraint involves the (potentially highly complex) classification function f f f. Different techniques exist to approximate adversarial perturbations. In the following, we briefly mention some of the existing algorithms for computing adversarial perturbations:

Regularized variant[1]: 这种方式通过解决a regularized variant of the problem in (4)来获取adversarial perturbation,这个变种问题可以用如下的公式进行描述:

min ⁡ r c ∥ r ∥ p + J ( x + r , y ~ , θ ) ( 5 ) \min\limits_{r}c\Vert r\Vert_p+J(x+r,\tilde{y},\theta)\quad\quad\quad\quad\quad(5) rmincrp+J(x+r,y~,θ)(5)

这里 y ~ \tilde{y} y~是perturbed sample的target label, J J J是loss function, c c c是一个regularization 参数, θ \theta θ是模型参数。在[1]中最原始的公式中,添加了一个额外的变量限制来保证 x + r ∈ [ 0 , 1 ] x+r\in[0,1] x+r[0,1],这在(5)中为了简单性进行了省略。为了解决(5)中的优化问题,a line search is performed over c c c to find the maximum c > 0 c>0 c>0 for which the minimizer of (5) satisfies f ( x + r ) = y ~ f(x+r)=\tilde{y} f(x+r)=y~。尽管可以产生非常精准的估计,这种方法can be costly to compute on high-dimensional and large-scale datasets。Moreover,it computes targeted adversarial perturbation,where the target label is known。

Fast gradient sign(FGS)[11]:这种方式通过going in the direction of the sign of gradient of the loss function来估计an untargeted adversarial perturbation,loss function定义如下:

ϵ s i g n ( ∇ x J ( x , y ( x ) , θ ) ) \epsilon sign(\nabla_xJ(x,y(x),\theta)) ϵsign(xJ(x,y(x),θ))

这里 J J J是loss function,用来训练神经网络并且 θ \theta θ表示模型参数。尽管十分高效,这种one-step的算法仅能提供a coarse approximation to the solution of the optimization problem in (4) for p = ∞ p=\infty p=

DeepFool[5]: 这个算法通过一个迭代的步骤来最小化(4),where each iteration involves the linearization of the constraint。The linearized (constrained)problem is solved in closed form at each iteration,and the current estimate is updated;优化过程当当前的estimate of the perturbation fools the classifier时终止。In practice,DeepFool provides a tradeoff between the accuracy and efficiency of the two previous approaches。

In addition to the aforementioned optimization methods, several other approaches have recently been proposed to compute adversarial perturbations, see, e.g., [9], [12], and [13]. Different from the previously mentioned gradient-based techniques, the recent work in [14] learns a network (the adversarial transformation network) to efficiently generate a set of perturbations with a large diversity, without requiring the computation of the gradients.

使用先前提及到的这些技巧, one can compute the robustness of classifiers to additive adversarial perturbations. Quite surprisingly, deep networks are extremely vulnerable to such additive perturbations; i.e.,
small and even imperceptible adversarial perturbations can be computed to fool them with high probability. For example, the average perturbations required to fool the CaffeNet [15] and GoogleNet [16] architectures on the ILSVRC 2012 task [17] are 100 times smaller than the typical norm of natural images [5] when using the l 2 l_2 l2 norm. The high instability of deep neural networks to adversarial perturbations, which was first highlighted in [1], shows that these networks rely heavily on proxy concepts to classify objects, as opposed to strong visual concepts typically used by humans to distinguish between objects.

To illustrate this idea, we consider once again the toy classification example (see “Robustness and Risk: A Toy Example”), where the goal is to classify images based on the orientation of the stripe. In this example, linear classifiers could achieve a perfect recognition rate by exploiting the imperceptibly small bias that separates the two classes. While this proxy concept achieves zero risk, it is not robust to perturbations: one could design an additive perturbation that is as simple as a minor variation of the bias, which is sufficient to induce data misclassification. On the same line of thought, the high instability of classifiers to additive perturbations observed in [1] suggests that deep neural networks potentially capture one of the proxy concepts that separate the different classes. Through a quantitative analysis of polynomial classifiers, [10] suggests that higher-degree classifiers tend to be more robust to perturbations, as they capture the “stronger” (and more visual) concept that separates the classes (e.g., the orientation of the stripe in Figure S1 in “Robustness and Risk: A Toy Example”). For neural networks, however, the relation between the flexibility of the architecture (e.g., depth and breadth) and adversarial robustness is not well understood and remains an open problem.

Random noise

待补充

Geometric insights from robustness

对robustness的研究允许我们获得关于classifier的insights,更准确地讲,是关于the geometry of the
classification function acting on the high-dimensional input space. f : X → { 1 , ] … , C } f:\mathcal{X}\to\{1,]\dots,C\} f:X{1,],C}指代我们的C-class分类器, 并且我们用 g 1 , … , g c g_1,\dots,g_c g1,,gc 表示分类器对 C C C个类中的每个判定的概率。特别地,对于一个给定的 x ∈ X x\in\mathcal{X} xX, f ( x ) f(x) f(x) is assigned to the class having a maximal score,即 f ( x ) = a r g m a x i { ( g i ( x ) } f(x)=argmax_i\{(g_i(x)\} f(x)=argmaxi{(gi(x)}。对于深度神经网络,函数 g i g_i gi代表网络中最后一层的输出(通常指softmax层)。Note that the classifier f f f can be seen as a mapping that partitions the input space X \mathcal{X} X into classification regions, each of which has a constant estimated label (i.e., f ( x ) f(x) f(x) is constant for each such region). The decision boundary B \mathcal{B} B of the classifier is defined as the union of the boundaries of such classification regions (see Figure 2).

Adversarial perturbations

我们首先聚焦于adversarial perturbation并且highlight这些扰动和决策边界几何结构的关系。这种关系relies on the simple observation shown in “Geometric Properties of Adversarial Perturbations.”。两个几何性质都在图6中进行了展示:

在这里插入图片描述
注意这些几何性质are specific to the l 2 l_2 l2范数 The high instability of classifiers to adversarial perturbations, which we highlighted in the previous section, 揭示了自然图片都位于非常接近于决策边界的位置。尽管这个结果是理解the geometry of the data points with regard to the classifier‘s决策边界的关键,它并没有提供任何有关于决策边界形状的insight . 一个关于决策边界的local geometric描述 (位于 x x x的周围地区) is rather captured通过 r a d v ∗ ( x ) r^*_{adv}(x) radv(x)的方向,due to adversarial perturbation的垂直性. In [18] and [25], 这些adversarial perturbation的几何性质被利用来进行在数据点的周围地区可视化决策边界的typical cross sections. 特别地,a two-dimensional normal section of the decision boundary is illustrated, where the sectioning plane is spanned by the adversarial perturbation (垂直于决策边界) and a random vector in the tangent space. Examples of normal sections of decision boundaries are illustrated in Figure 7:
在这里插入图片描述
我们可以观察到SOTA深度神经网络的决策边界on these two-dimensional cross sections有一个非常低的curvature (注意x轴和y轴的区别n)。换句话说,这些图表揭示了 x x x附近位置的决策边界can be locally well approximated by a hyperplane passing through x + r a d v ∗ ( x ) x+r^*_{adv}(x) x+radv(x) with the normal vector r a d v ∗ ( x ) r^*_{adv}(x) radv(x)。 In [11], it is hypothesized that state-of-the-art classifiers are “too linear,” 导致了决策边界with very small curvature并进一步解释了 the high instability of such classifiers to adversarial perturbations. To motivate the linearity hypothesis of deep networks, the success of the FGS method (which is exact for linear classifiers) in finding adversarial perturbations is invoked. 然而,最近的一些工作对线性假设做出了挑战,例如[26], the authors show that there exist adversarial perturbations that cannot be explained with this hypothesis, and, in [27], the authors provide a new explanation based on the tilting of the decision boundary with respect to the data manifold. We stress here that the low curvature of the decision boundary does not, in general, imply that the function learned
by the deep neural network (as a function of the input image) is linear, or even approximately linear. Figure 8 shows illustrative examples of highly nonlinear functions resulting in flat decision boundaries.:
在这里插入图片描述

Moreover, it should be noted that, while the decision boundary of deep networks is very flat on random two-dimensional cross sections, these boundaries are not flat on all cross sections. That is, there exist directions in which the boundary is very curved. Figure 9 provides some illustrations of such cross sections, where the decision boundary has large curvature and therefore significantly departs from the first-order linear approximation, suggested by the flatness of the decision boundary on random sections in Figure 7.:

在这里插入图片描述

Hence, these visualizations of the decision boundary strongly suggest that the curvature along a small set of directions can be very large and that the curvature is relatively small along random directions in the input space. Using a numerical computation of the curvature, the sparsity of the curvature profile is empirically verified in [28] for deep neural networks, and the directions where the decision boundary is curved are further shown to play a major role in explaining the robustness properties of classifiers. In [29], the authors provide a complementary analysis on the curvature of the decision boundaries induced by deep networks and show that the first principal curvatures increase exponentially with the depth of a random neural network. The analyses of [28] and [29] hence suggest that the curvature profile of deep networks is highly sparse (i.e., the decision boundaries are almost flat along most directions) but can have a very large curvature along a few directions.

Universal perturbations

待补充

### 回答1: 评估机器学习系统的鲁棒性通常需要考虑以下几个方面: 1. 对抗性样本:测试模型对于被针对性地构造的样本的容忍程度。例如,攻击者可能会对原始数据进行微小的修改,以此来欺骗模型,使其做出错误的预测。评估模型的抗性能够帮助我们了解模型的一般化能力和安全性。 2. 数据偏差:测试模型在不同数据集和分布上的表现。例如,如果模型只在特定的数据集上进行训练,那么它可能会在其他数据集上表现较差。评估模型在不同数据集和分布上的表现能够帮助我们了解其在实际应用中的效果。 3. 噪声鲁棒性:测试模型对于输入数据中的噪声的容忍程度。例如,在图像分类任务中,一些像素可能被意外地修改或删除,这可能会使模型难以正确分类图像。评估模型对于噪声的鲁棒性能够帮助我们了解其在实际应用中的可靠性。 4. 模型不确定性:测试模型对于输入数据中的不确定性的处理能力。例如,在自然语言生成任务中,模型可能无法确定某些单词的正确顺序或含义。评估模型对于不确定性的处理能力能够帮助我们了解其在实际应用中的可靠性和稳定性。 综上所述,评估机器学习系统的鲁棒性需要综合考虑多个方面,并且需要在实际应用场景中进行测试和验证。 ### 回答2: 我们可以通过以下几种方式评估一个机器学习系统的稳健性: 1. 鲁棒性测试:我们可以针对各种不同的输入情况对系统进行测试,包括正常输入、异常输入、噪声输入等。如果系统在各种情况下都能保持较好的性能,那么我们可以认为它是鲁棒的。 2. 稳定性分析:我们可以对系统进行随机性测试或者重复性测试,看系统在多次运行中是否产生一致的结果。如果系统的输出结果在不同运行中保持稳定,那么可以认为系统是稳定的。 3. 对抗性测试:我们可以利用对抗样本攻击的方法来测试系统的鲁棒性。通过对输入样本做出微小改动,观察系统预测结果是否发生明显的错误。如果系统能够有效抵御对抗样本攻击,那么可以认为它是鲁棒的。 4. 数据集扩展:我们可以使用更大、更多样的数据集来训练和测试机器学习系统。如果系统在不同数据集上都表现良好,那么可以认为它的鲁棒性更高。 5. 基准测试:我们可以将机器学习系统与其他同类型的系统进行比较,看其在相同任务上的性能差异。如果系统相比其他系统表现更优秀,那么可以认为它具有较高的鲁棒性。 综上所述,我们可以通过以上几种方式来评估一个机器学习系统的稳健性,从而了解其在各种实际应用中的可靠性和表现情况。 ### 回答3: 评估机器学习系统的鲁棒性可以采取以下几种方法: 1. 对抗性测试:通过主动引入干扰或攻击来评估机器学习系统在外部干扰条件下的性能。例如,可以利用对抗样本生成算法创建一些具有微小扰动的输入样本,观察系统是否能正确分类这些样本。 2. 鲁棒性分析:通过观察机器学习系统在不同离群值、异常情况或缺失数据等情况下的表现来评估其鲁棒性。可以通过添加噪声或删除部分训练数据来模拟这些情况,然后观察系统对这些变化的适应能力。 3. 交叉验证:通过将数据集划分为训练集和测试集,使用训练集进行模型训练,再在测试集上进行评估来评估机器学习系统的鲁棒性。通过交叉验证可以检验系统对于不同数据分布的适应能力。 4. 分布偏移检测:通过检测输入数据分布的变化来评估机器学习系统的鲁棒性。当模型在预测时遇到与训练数据分布不同的测试数据时,可能会导致性能下降。可以使用一些分布偏移检测算法来检测这种情况。 5. 敏感性分析:通过评估模型对输入特征的变化的敏感性来评估机器学习系统的鲁棒性。可以逐个改变输入特征,观察模型输出的变化程度,从而获得对系统的敏感性指标。 综上所述,评估一个机器学习系统的鲁棒性需要通过对抗性测试、鲁棒性分析、交叉验证、分布偏移检测和敏感性分析等多种方法来综合评估系统在不同情况下的性能表现。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值