Preface
This is just a small round-up of papers I have read — not yet a proper survey — but it covers many classic adversarial attack algorithms and is handy for review and lookup. The first survey I read was:
Advances in adversarial attacks and defenses in computer vision: A survey
With papers there is really no substitute for reading a lot. I read these last semester and have already forgotten most of them (I will probably have to start over, again and again), and the code needs practice too.
Since I did not attach links for the papers (too time-consuming), I will just mention a few sites for searching the literature.
For code, check whether the paper itself provides a link; otherwise try Papers with Code, which covers almost every one of them. When I have time I will add paper and code links, along with a brief introduction to each algorithm's idea and method; the more classic ones will probably get dedicated paper notes.
The categorization of the algorithms is not strict and may be somewhat off; newly read papers will be added, and the list is continuously updated.
Adversarial Attack Terminology
| Term | Meaning |
|---|---|
| white-box attack | The attacker knows all information about the model |
| black-box attack | The attacker cannot access the model's training process or parameters |
| query-based attack | The attacker can query the target model and use its outputs to optimize the adversarial image |
| score-based attack | Requires the confidence scores in the model's output |
| decision-based attack | Requires only the target model's predicted label (top-1 label) |
| targeted attack | Fools the model into predicting a specific label; by contrast, an un-targeted attack has no specific label and only needs the model to predict incorrectly |
| adversarial training | Injects adversarial examples into the model's training data to make it adversarially robust |
First of all, the paper that originally introduced adversarial attacks: Intriguing properties of neural networks
I. White-Box Attacks
1.FGSM
(1)FGSM——EXPLAINING AND HARNESSING ADVERSARIAL EXAMPLES
(2)I-FGSM——ADVERSARIAL EXAMPLES IN THE PHYSICAL WORLD
(3)MI-FGSM——Boosting Adversarial Attacks with Momentum (works in both white-box and black-box settings)
(4)NI-FGSM, SIM——NESTEROV ACCELERATED GRADIENT AND SCALE INVARIANCE FOR ADVERSARIAL ATTACKS (improves transferability)
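The shared core of the whole FGSM family is a single signed-gradient step. A minimal NumPy sketch of that step — the linear "model", its loss, and all numbers below are invented for illustration; a real attack uses the network's true loss gradient:

```python
import numpy as np

def fgsm(x, grad, eps):
    """One FGSM step: move each input dimension by eps along the sign
    of the loss gradient, then clip back into the valid pixel range."""
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)

# Toy stand-in for a model: score = w @ x, loss = -score, so d(loss)/dx = -w.
w = np.array([0.5, -0.3, 0.2])
x = np.array([0.4, 0.6, 0.5])      # "image" with values in [0, 1]
grad = -w                          # analytic gradient of the toy loss
x_adv = fgsm(x, grad, eps=0.1)
```

The variants build on this same step: I-FGSM repeats it with a smaller step size, MI-FGSM accumulates the gradients with a momentum term, and NI-FGSM takes a Nesterov-style look-ahead before computing the gradient.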
2.JSMA
The Limitations of Deep Learning in Adversarial Settings
3.DeepFool
DeepFool: a simple and accurate method to fool deep neural networks
4.PGD
Towards Deep Learning Models Resistant to Adversarial Attacks
5.CW (often called the strongest white-box attack)
Towards Evaluating the Robustness of Neural Networks
II. Black-Box Attacks
The seminal black-box paper: Practical Black-Box Attacks against Machine Learning
Black-box attacks mainly divide into query-based and transfer-based approaches.
1.Query-based attacks
Query-based attacks further divide into score-based and decision-based. The goals are to reduce the number of queries while raising the attack success rate; decision-based attacks are the more active direction.
1.1 score-based attack
1.1.1 Gradient estimation
Because the setting is black-box, the true gradient is unavailable, so these methods estimate it instead.
(1)ZOO——ZOO: Zeroth Order Optimization Based Black-box Attacks to Deep Neural Networks without Training Substitute Models
(2)AutoZOOM——AutoZOOM: Autoencoder-Based Zeroth Order Optimization Method for Attacking Black-Box Neural Networks
(3)NES——Black-box Adversarial Attacks with Limited Queries and Information(ICML2018)
(4)Bandits——Prior convictions: Black-box adversarial attacks with bandits and priors(ICLR2019)
(5)SignHunter——SIGN BITS ARE ALL YOU NEED FOR BLACK-BOX ATTACKS
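The zeroth-order idea behind ZOO-style methods is plain finite differences: each coordinate's partial derivative is approximated from two queries. A toy sketch, where the quadratic `f` is an invented stand-in for the model's loss:

```python
import numpy as np

def estimate_gradient(f, x, h=1e-4):
    """Estimate the gradient of a query-only function f via symmetric
    differences. Each coordinate costs two queries, which is why ZOO-style
    attacks sample coordinates (or compress the search space, as in
    AutoZOOM) to keep the query count down."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        grad[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return grad

f = lambda x: float(np.sum(x ** 2))   # stand-in for the black-box loss
x = np.array([1.0, -2.0, 0.5])
g = estimate_gradient(f, x)           # close to the true gradient [2, -4, 1]
```

Once an estimate is available, the attack proceeds like a white-box gradient method, spending queries instead of backpropagation.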
1.1.2 Gradient-free
These methods dispense with gradient estimation altogether, probing the model like the blind men and the elephant (and the results are still remarkably good!)
(1)SimBA——Simple Black-box Adversarial Attacks(ICML2019)
(2)Square——Square attack: a query-efficient black-box adversarial attack via random search(ECCV2020)
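SimBA-style random search illustrates the gradient-free idea: pick a random coordinate, try a step of ±eps, and keep whichever candidate lowers the model's score. A minimal sketch — the scoring function below is an invented stand-in, not any real model:

```python
import numpy as np

def simba_step(f, x, eps, rng):
    """One SimBA-style step: perturb a random coordinate by +eps or -eps
    and keep the candidate only if it lowers the score f."""
    i = rng.integers(x.size)
    best_x, best_s = x, f(x)
    for sign in (+1.0, -1.0):
        cand = x.copy()
        cand[i] = np.clip(cand[i] + sign * eps, 0.0, 1.0)
        s = f(cand)
        if s < best_s:
            best_x, best_s = cand, s
    return best_x

rng = np.random.default_rng(0)
f = lambda x: float(np.sum(x))   # stand-in: the score we want to drive down
x = np.full(4, 0.5)
for _ in range(20):
    x = simba_step(f, x, eps=0.1, rng=rng)
```

Each query either improves the score or leaves the iterate unchanged, so the attack never wastes progress; Square attack refines the same idea with square-shaped perturbations and a tuned sampling schedule.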
1.1.3 Transfer + score-query combinations
Use transferability as a prior (for initialization), then refine with query-based methods.
(1)P-RGF——Improving Black-box Adversarial Attacks with a Transfer-based Prior(NIPS2019)
(2)TREMBA——Black-Box Adversarial Attack with Transferable Model-based Embedding
(3)Learning Black-Box Attackers with Transferable Priors and Query Feedback
(4)Switching Transferable Gradient Directions for Query-Efficient Black-Box Adversarial Attacks
(5)MetaSimulator——Simulating Unknown Target Models for Query-Efficient Black-box Attacks
(6)NATTACK——NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks
1.2 decision-based attack
(1)The seminal paper, Boundary Attack——Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models
(2)Opt-Attack——Query-Efficient Hard-label Black-box Attack: An Optimization-based Approach
(3)Guessing Smart: Biased Sampling for Efficient Black-Box Adversarial Attacks
(4)Sign-OPT——SIGN-OPT: A QUERY-EFFICIENT HARD-LABEL ADVERSARIAL ATTACK
(5)qFool——A Geometry-Inspired Decision-Based Attack
(6)NES——Black-box Adversarial Attacks with Limited Queries and Information
(7)HSJA——HopSkipJumpAttack: A Query-Efficient Decision-Based Attack
(8)GeoDA: a geometric framework for black-box adversarial attacks
(9)QEBA: Query-Efficient Boundary-Based Blackbox Attack
(10)SurFree: a fast surrogate-free black-box attack
(11)f-attack——Decision-Based Adversarial Attack With Frequency Mixup
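Boundary-Attack-style methods use only the top-1 decision: start from a point that is already adversarial and walk it toward the original input with random steps, accepting only moves that keep the misclassification. A geometric toy sketch — the "decision" here is just a distance test, purely for illustration:

```python
import numpy as np

def boundary_step(is_adv, x_orig, x_adv, step, rng):
    """One random walk step: add noise scaled to the remaining distance,
    drift toward the original, and accept the move only if the candidate
    is still adversarial (only the top-1 decision is queried)."""
    d = x_orig - x_adv
    noise = rng.normal(size=x_adv.shape) * step * np.linalg.norm(d)
    cand = x_adv + noise + step * d
    return cand if is_adv(cand) else x_adv

# Toy decision oracle: "adversarial" means outside the unit ball at the origin.
is_adv = lambda x: np.linalg.norm(x) > 1.0
rng = np.random.default_rng(1)
x_orig = np.zeros(2)
x_adv = np.array([3.0, 0.0])     # far from the original, but adversarial
for _ in range(200):
    x_adv = boundary_step(is_adv, x_orig, x_adv, 0.05, rng)
```

The iterate stays adversarial by construction while its distance to the original shrinks toward the decision boundary; the listed papers differ mainly in how cleverly they choose the random directions and step sizes to cut queries.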
2.Transfer-based attacks
The core idea: compute perturbations on a local surrogate model such that they also effectively fool other, unseen target models. Work in this area therefore studies how to improve transferability.
(1)Seminal paper: Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples
(2)Delving into Transferable Adversarial Examples and Black-box Attacks
(3)Enhancing the Transferability of Adversarial Attacks through Variance Tuning
(4)Meta-learning: Meta Gradient Adversarial Attack
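Transferability in miniature: a signed-gradient step computed on a surrogate model can also flip the decision of a similar but unseen target model. All weights and the (deliberately large) step size below are invented toy numbers:

```python
import numpy as np

# Two toy linear classifiers with similar weights: a surrogate we can
# differentiate, and a hypothetical target we could only query.
w_sur = np.array([1.0, -1.0, 0.5])
w_tgt = np.array([0.9, -1.1, 0.6])      # close to, but not equal to, the surrogate
predict = lambda w, x: int(w @ x > 0)   # binary top-1 decision

x = np.array([0.6, 0.1, 0.2])           # classified as 1 by both models
grad = w_sur                            # gradient of the surrogate's score
x_adv = x - 0.6 * np.sign(grad)         # white-box step against the surrogate only
# Because the two decision boundaries are similar, the perturbation crafted
# on the surrogate also flips the target's prediction.
```

The listed methods all aim to make this carry-over more reliable for deep networks, e.g. via momentum, variance tuning, input transformations, or meta-learning across surrogates.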
3.Data-free substitute training
What most distinguishes data-free substitute training from ordinary black-box attacks is that the attacker cannot access the training data.
(1)DaST:Data-free Substitute Training for Adversarial Attacks(CVPR2020)
(2)Delving into Data: Effectively Substitute Training for Black-box Attack
(3)Learning Transferable Adversarial Examples via Ghost Networks
(4)Towards Efficient Data Free Black-box Adversarial Attack
4.Others
(1)Universal perturbation UAP——Universal adversarial perturbations
(2)One-pixel attacks: One Pixel Attack for Fooling Deep Neural Networks; Simple Black-Box Adversarial Attacks on Deep Neural Networks
(3)AdvDrop: Adversarial Attack to DNNs by Dropping Information
(4)No-box attack——Practical No-box Adversarial Attacks against DNNs
III. Adversarial Attacks and Object Detection
- Towards Adversarially Robust Object Detection
- DPATCH: An Adversarial Patch Attack on Object Detectors
IV. Adversarial Training & Robustness
- Towards Deep Learning Models Resistant to Adversarial Attacks
- A Closer Look at Accuracy vs. Robustness
- ENSEMBLE ADVERSARIAL TRAINING ATTACKS AND DEFENSES
- Towards Evaluating the Robustness of Neural Networks