《Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples》
Related work
1、Sensitivity of neural networks to adversarial examples:
Szegedy et al. 2013
Biggio et al. 2013
2、iterative optimization-based attacks
Kurakin et al. 2016a
Madry et al. 2018
Carlini & Wagner 2017c
3、gradient masking
Papernot et al. 2017
4、Expectation Over Transformation
Athalye et al. 2017
5、ResNet
Zagoruyko & Komodakis, 2016
He et al. 2016
6、backpropagation
Rumelhart et al. 1986
7、Denoising away adversarial perturbations
Guo et al. 2018
Evading this defense: Carlini & Wagner, 2017b
8、Three types of obfuscated gradients:
shattered gradients: caused by intentionally non-differentiable operations, or unintentionally by numerical instability
stochastic gradients: arise from test-time randomness in the defense
vanishing/exploding gradients: useless gradients produced by very deep computation
Attack Techniques
1、Backward Pass Differentiable Approximation (BPDA)
Since g(x) ≈ x,

∇_x g(x) ≈ ∇_x x = 1

and therefore

∇_x f(g(x))|_{x=x̂} ≈ ∇_x f(x)|_{x=g(x̂)}
BPDA: find a differentiable approximation g(x) ≈ f_i(x) and substitute g(x) for f_i(x) only on the backward pass; the authors find this works much better than replacing f_i on both the forward and backward passes.
Can be used to defeat shattered gradients.
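A minimal numpy sketch of the BPDA idea (the weights, quantizer, loss, and step size are my own illustrative choices, not from the paper): the forward pass runs the true pipeline f(g(x)), while the backward pass treats the non-differentiable g as the identity and evaluates the gradient of f at g(x̂).

```python
import numpy as np

# Toy "defended" classifier: a non-differentiable preprocessor g (quantization,
# which shatters gradients) followed by a differentiable scorer f.
# All of W, g, f, and the margin loss are illustrative assumptions.
W = np.array([[1.0, -2.0],
              [0.5,  1.5]])

def g(x):
    """Non-differentiable preprocessing: quantize each input to 8 levels."""
    return np.round(x * 7) / 7

def f(x):
    """Differentiable model: two class scores, f(x) = W @ x^2."""
    return W @ (x ** 2)

def margin_loss(scores, label):
    """Score of the wrong class minus score of the true class."""
    return scores[1 - label] - scores[label]

def grad_margin_loss(x, label):
    """Analytic gradient of the margin loss w.r.t. the model's input."""
    return 2.0 * (W[1 - label] - W[label]) * x

def bpda_step(x, label, step=0.05):
    # Forward pass uses the real pipeline f(g(x)); backward pass pretends
    # g is the identity, i.e. evaluates the gradient of f at x = g(x_hat).
    x_pre = g(x)
    grad = grad_margin_loss(x_pre, label)
    return np.clip(x + step * np.sign(grad), 0.0, 1.0)

x = np.array([0.3, 0.6])
label = 0
adv = x.copy()
for _ in range(20):
    adv = bpda_step(adv, label)
# The margin loss of the full pipeline f(g(.)) rises, even though g itself
# provides no useful gradient.
```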
2、Expectation over Transformation
Against defenses that apply a random transformation to the input, use Expectation over Transformation to correctly compute the gradient of the expectation over the transformed input.
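A toy sketch of EOT, assuming a defense that randomly shifts the input (the model, weights, and shift range are illustrative): since ∇_x E_t[f(t(x))] = E_t[∇_x f(t(x))], the attacker averages the gradient over many sampled transformations instead of trusting a single random draw.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative model f(x) = sum(w * x^2); its gradient depends on where it
# is evaluated, so the defense's random shift makes single gradients noisy.
w = np.array([2.0, -1.0])

def f(x):
    return np.sum(w * x ** 2)

def grad_f(x):
    """Analytic gradient of f with respect to its input."""
    return 2.0 * w * x

def random_transform(x):
    """The defense's test-time randomness: a small random shift t(x)."""
    return x + rng.uniform(-0.1, 0.1, size=x.shape)

def eot_gradient(x, n_samples=1000):
    # EOT: estimate grad_x E_t[f(t(x))] by averaging grad_x f(t(x))
    # over many sampled transformations t.
    return np.mean([grad_f(random_transform(x)) for _ in range(n_samples)],
                   axis=0)

x = np.array([0.5, 0.5])
g_single = grad_f(random_transform(x))  # one noisy draw
g_eot = eot_gradient(x)                 # averaged estimate, close to 2*w*x
```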
3、Reparameterization
Solves vanishing/exploding gradients.
Make a change of variable x = h(z) such that g(h(z)) = h(z) for all z, with h(·) differentiable; gradients can then be computed through f(h(z)).
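A minimal sketch of reparameterization, assuming a defense g that projects inputs onto [0, 1] (the model and step sizes are illustrative): with h(z) = sigmoid(z), every h(z) already lies inside the valid set, so g(h(z)) = h(z), and the attacker optimizes over z instead of x.

```python
import numpy as np

# Toy setup: a defense g that projects onto the valid set [0, 1], a linear
# scorer f, and the change of variable x = h(z). All names are illustrative.
w = np.array([1.0, -1.0])

def f(x):
    """Toy model: f(x) = w . x."""
    return np.sum(w * x)

def g(x):
    """Defense: project onto [0, 1] (the identity on points already inside)."""
    return np.clip(x, 0.0, 1.0)

def h(z):
    """Differentiable reparameterization: sigmoid keeps h(z) in (0, 1),
    so g(h(z)) = h(z) and useful gradients flow through h."""
    return 1.0 / (1.0 + np.exp(-z))

def grad_z(z):
    # d/dz f(h(z)) = w * h(z) * (1 - h(z)) by the chain rule.
    s = h(z)
    return w * s * (1.0 - s)

# Maximize f(h(z)) by gradient ascent over z rather than over x directly.
z = np.zeros(2)
for _ in range(200):
    z += 0.5 * grad_z(z)
x_adv = h(z)  # a valid input: g(x_adv) == x_adv by construction
```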
Case study
1、Non-obfuscated Gradients
(1) adversarial training
Train on adversarial examples until they are classified correctly.
Given training data X and a loss function ℓ, standard training selects parameters

θ* = argmin_θ E_{(x,y)∈X} ℓ(x; y; F_θ)

whereas adversarial training instead solves the saddle-point problem

θ* = argmin_θ E_{(x,y)∈X} [ max_{δ∈[-ε,ε]^N} ℓ(x+δ; y; F_θ) ]

(not entirely clear to me)
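A toy instance of the saddle-point problem above, assuming a 1-parameter model ŷ = θ·x with squared loss and PGD for the inner maximization (the data, ε, and learning rates are all illustrative): each outer step first finds a worst-case perturbation δ within the ε-ball, then descends on θ at the perturbed inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data for y = 2x; the model is y_hat = theta * x.
X = rng.uniform(-1.0, 1.0, size=32)
Y = 2.0 * X

def loss(theta, x, y):
    return (theta * x - y) ** 2

def dloss_dx(theta, x, y):
    return 2.0 * theta * (theta * x - y)

def dloss_dtheta(theta, x, y):
    return 2.0 * x * (theta * x - y)

def worst_case_delta(theta, x, y, eps=0.1, alpha=0.03, steps=10):
    """Inner maximization: projected gradient ascent on the perturbation."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        delta += alpha * np.sign(dloss_dx(theta, x + delta, y))
        delta = np.clip(delta, -eps, eps)  # stay inside [-eps, eps]^N
    return delta

def adversarial_train(theta=0.0, lr=0.05, epochs=200):
    """Outer minimization: gradient descent on theta at worst-case inputs."""
    for _ in range(epochs):
        delta = worst_case_delta(theta, X, Y)
        theta -= lr * np.mean(dloss_dtheta(theta, X + delta, Y))
    return theta

theta_adv = adversarial_train()  # ends up near the true slope of 2
```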