Foundations of Adversarial Machine Learning (Paper Reading Notes, Part 3)

The Foundations series consists of three parts, collecting my reading notes on three papers: "Evasion attacks against machine learning at test time", "Intriguing properties of neural networks", and "Explaining and harnessing adversarial examples". This part covers "Explaining and harnessing adversarial examples".

Abstract

Several machine learning models, including neural networks, consistently misclassify adversarial examples—inputs formed by applying small but intentionally worst-case perturbations to examples from the dataset, such that the perturbed input results in the model outputting an incorrect answer with high confidence. Early attempts at explaining this phenomenon focused on nonlinearity and overfitting. We argue instead that the primary cause of neural networks’ vulnerability to adversarial perturbation is their linear nature. This explanation is supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets. Moreover, this view yields a simple and fast method of generating adversarial examples. Using this approach to provide examples for adversarial training, we reduce the test set error of a maxout network on the MNIST dataset.

In short, this paper analyzes the "blind spots" of neural networks identified in "Intriguing properties of neural networks", and argues that the phenomenon is caused not by extreme nonlinearity or overfitting, but by the "linear nature" of neural networks.

Related Work

a) Box-constrained L-BFGS can find adversarial examples that the model misclassifies with high confidence (a minimal sketch follows this list);
b) On some datasets, the adversarial example is indistinguishable from the original to the human eye;
c) Classifiers with different architectures, and classifiers trained on different training sets, misclassify the same adversarial examples;
d) Even shallow softmax regression models are vulnerable to adversarial examples;
e) Training on adversarial examples can improve generalization, but repeatedly solving the constrained optimization problem is computationally expensive.
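
To make (a) and (e) concrete, here is a minimal sketch of the box-constrained L-BFGS attack of Szegedy et al., applied to a toy logistic-regression model. The model, its random weights `w` and `b`, and the trade-off constant `c = 0.1` are hypothetical stand-ins for illustration; the original method additionally line-searches over `c` for the smallest perturbation that changes the label.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy logistic-regression "victim"; w and b are hypothetical stand-ins
# for a trained model's parameters.
n = 100
w = rng.normal(size=n)
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def objective(x_adv, x, y_target, c):
    # c * ||x_adv - x||_2^2 + cross-entropy loss toward the target label
    p = sigmoid(w @ x_adv + b)
    loss = -(y_target * np.log(p + 1e-12) + (1 - y_target) * np.log(1 - p + 1e-12))
    return c * np.sum((x_adv - x) ** 2) + loss

x = rng.uniform(0.3, 0.7, size=n)                 # clean input in [0, 1]^n
y_target = 0 if sigmoid(w @ x + b) > 0.5 else 1   # aim for the opposite class

# Box-constrained L-BFGS: every coordinate of x_adv must stay in [0, 1].
res = minimize(objective, x.copy(), args=(x, y_target, 0.1),
               method="L-BFGS-B", bounds=[(0.0, 1.0)] * n)
x_adv = res.x

print("clean prob of class 1:", sigmoid(w @ x + b))
print("adv   prob of class 1:", sigmoid(w @ x_adv + b))
print("L2 size of perturbation:", np.linalg.norm(x_adv - x))
```

Each adversarial example costs a full optimization run, which is exactly the expense noted in (e) and the motivation for the fast method this paper proposes.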

The Linear Explanation of Adversarial Examples

Let the adversarial example be $\tilde{x} = x + \eta$, where the perturbation satisfies $\|\eta\|_\infty < \epsilon$, small enough that no individual feature change is perceptible. For a linear model with weight vector $W$, the activation on the adversarial input is

$$W^T\tilde{x} = W^Tx + W^T\eta.$$

The perturbation increases the activation by $W^T\eta$, which is maximized under the max-norm constraint by choosing $\eta = \epsilon\,\mathrm{sign}(W)$. If $W$ has $n$ dimensions with average element magnitude $m$, the activation then grows by $\epsilon m n$: although $\|\eta\|_\infty$ stays fixed, the effect of the perturbation grows linearly with the input dimensionality, so a high-dimensional linear model can be pushed to a very different output by many individually imperceptible changes.
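
A minimal numeric check of this linear growth (a sketch, not code from the paper; the dimensionality $n = 1000$, $\epsilon = 0.01$, and the random weight vector are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

n = 1000                      # input dimensionality
epsilon = 0.01                # max-norm bound: ||eta||_inf <= epsilon
W = rng.normal(size=n)        # weights of a single linear unit

x = rng.uniform(size=n)       # arbitrary clean input
eta = epsilon * np.sign(W)    # worst-case perturbation under the constraint

# The activation grows by W^T eta = epsilon * ||W||_1, i.e. linearly in n,
# even though no single coordinate of x moved by more than epsilon.
print("activation change :", W @ (x + eta) - W @ x)
print("epsilon * ||W||_1 :", epsilon * np.abs(W).sum())
```

Both printed values agree: the worst-case direction $\mathrm{sign}(W)$ converts many $\epsilon$-sized coordinate changes into one large activation change.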
