Attacks:
Fast Gradient Sign Method (FGSM): generates an adversarial example by adding a perturbation in the direction of the sign of the gradient of the loss with respect to the input:
Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples.
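A minimal PyTorch sketch of the FGSM step; the classifier `model`, the budget `eps`, and the [0, 1] pixel range are assumptions, not the paper's code:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """One-step FGSM: x_adv = x + eps * sign(grad_x loss(model(x), y))."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the sign of the gradient, then clip to the valid pixel range.
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()
```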
Basic Iterative Method (BIM): an iterative refinement of FGSM; instead of a single step, BIM takes multiple smaller steps:
Kurakin, Alexey, Ian Goodfellow, and Samy Bengio. Adversarial examples in the physical world.
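BIM is the same update applied repeatedly with a smaller step size `alpha`, re-clipping into the eps-ball each iteration; a sketch under the same assumptions as the FGSM snippet:

```python
import torch
import torch.nn.functional as F

def bim(model, x, y, eps, alpha, steps):
    """Iterated FGSM: several small signed-gradient steps, each clipped
    back into the eps-ball around the original input."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project back into the eps-ball and the valid pixel range.
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0.0, 1.0)
    return x_adv
```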
PGD (Projected Gradient Descent): an iterative gradient method that, after each step, projects the perturbed input back onto the allowed perturbation ball, usually starting from a random point inside that ball:
Madry, Aleksander, et al. Towards deep learning models resistant to adversarial attacks.
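PGD differs from BIM mainly in the random start (and is often run with several restarts); a sketch:

```python
import torch
import torch.nn.functional as F

def pgd(model, x, y, eps, alpha, steps):
    """PGD: BIM with a random starting point inside the eps-ball."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0.0, 1.0)
    return x_adv
```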
CW: Carlini and Wagner designed an effective optimization objective for finding the smallest perturbation that fools the model:
Carlini, Nicholas, and David Wagner. Towards evaluating the robustness of neural networks.
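A simplified sketch of the C&W L2 attack: optimize in tanh space so pixels stay in [0, 1], trading off L2 distortion against a logit-margin term. The fixed trade-off constant `c` is an assumption; the paper binary-searches it per image:

```python
import torch

def cw_l2(model, x, y, c=1.0, steps=200, lr=0.01, kappa=0.0):
    """Minimize ||x_adv - x||_2^2 + c * max(logit_true - max_other_logit, -kappa)."""
    w = torch.atanh((x * 2 - 1).clamp(-0.999, 0.999)).detach().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        x_adv = (torch.tanh(w) + 1) / 2            # always a valid image
        logits = model(x_adv)
        true_logit = logits.gather(1, y.unsqueeze(1)).squeeze(1)
        other_logit = logits.scatter(1, y.unsqueeze(1), float('-inf')).max(1).values
        margin = torch.clamp(true_logit - other_logit, min=-kappa)
        loss = ((x_adv - x) ** 2).flatten(1).sum(1) + c * margin
        opt.zero_grad()
        loss.sum().backward()
        opt.step()
    return ((torch.tanh(w) + 1) / 2).detach()
```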
DeepFool: modifies the original image by the smallest possible amount needed to fool the model:
Moosavi-Dezfooli, Seyed-Mohsen, Alhussein Fawzi, and Pascal Frossard. DeepFool: a simple and accurate method to fool deep neural networks.
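A simplified DeepFool sketch (L2 variant, batch size 1 assumed): repeatedly linearize the classifier and take the smallest step that crosses the nearest decision boundary:

```python
import torch

def deepfool(model, x, num_classes, max_iter=50, overshoot=0.02):
    """Minimal-perturbation attack via iterative linearization."""
    orig = model(x).argmax(1).item()
    x_adv = x.clone().detach()
    for _ in range(max_iter):
        x_adv.requires_grad_(True)
        logits = model(x_adv)[0]
        if logits.argmax().item() != orig:
            break                                  # already fooled
        grads = torch.stack([torch.autograd.grad(logits[k], x_adv, retain_graph=True)[0]
                             for k in range(num_classes)])
        w = grads - grads[orig]                    # boundary normals vs. true class
        f = (logits - logits[orig]).detach()       # logit gaps (all <= 0)
        dist = f.abs() / w.flatten(1).norm(dim=1).clamp_min(1e-8)
        dist[orig] = float('inf')                  # ignore the true class itself
        k = dist.argmin()                          # nearest boundary
        step = dist[k] * w[k] / w[k].norm().clamp_min(1e-8)
        x_adv = (x_adv.detach() + (1 + overshoot) * step).clamp(0.0, 1.0)
    return x_adv.detach()
```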
One pixel attack: a black-box attack that fools the model by changing only a single pixel:
Su, Jiawei, Danilo Vasconcellos Vargas, and Kouichi Sakurai. One pixel attack for fooling deep neural networks.
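A sketch using SciPy's differential evolution, the paper's search strategy; `predict_fn` (a black-box function mapping an HxWx3 image in [0, 1] to class probabilities) is an assumed interface:

```python
import numpy as np
from scipy.optimize import differential_evolution

def one_pixel_attack(predict_fn, img, true_label):
    """Search (row, col, r, g, b) minimizing confidence in the true label."""
    h, w, _ = img.shape

    def apply(z):
        row, col, r, g, b = z
        out = img.copy()
        out[int(row), int(col)] = [r, g, b]
        return out

    def objective(z):
        return predict_fn(apply(z))[true_label]   # lower = closer to fooling

    bounds = [(0, h - 1), (0, w - 1), (0, 1), (0, 1), (0, 1)]
    result = differential_evolution(objective, bounds, maxiter=30, popsize=20)
    return apply(result.x)
```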
Defenses:
adversarial training: add adversarial images generated by different attack methods to the training set; enlarging the training data this way helps the model learn the distribution over more of the input space
Tramèr, Florian, et al. Ensemble adversarial training: Attacks and defenses.
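A minimal sketch of one adversarial-training step using FGSM examples generated on the fly; note the cited paper's ensemble variant instead draws adversarial examples from separately pre-trained models, which this sketch does not show:

```python
import torch
import torch.nn.functional as F

def adversarial_train_step(model, optimizer, x, y, eps=8 / 255):
    """Train on a 50/50 mix of clean and FGSM-perturbed inputs."""
    # Craft adversarial examples against the current model.
    x_req = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_req), y).backward()
    x_adv = (x + eps * x_req.grad.sign()).clamp(0.0, 1.0).detach()

    # Standard supervised step on both versions of the batch.
    optimizer.zero_grad()
    loss = 0.5 * F.cross_entropy(model(x), y) + 0.5 * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```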
label smoothing: uses soft targets in place of one-hot labels
Warde-Farley, David, and Ian Goodfellow. Adversarial perturbations of deep neural networks.
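A sketch with a hand-rolled soft-target cross-entropy; the smoothing rate `eps=0.1` is a common choice, not from the cited chapter:

```python
import torch
import torch.nn.functional as F

def smooth_labels(y, num_classes, eps=0.1):
    """Soft targets: 1 - eps on the true class, eps spread over the rest."""
    soft = torch.full((y.size(0), num_classes), eps / (num_classes - 1),
                      device=y.device)
    return soft.scatter(1, y.unsqueeze(1), 1.0 - eps)

def smoothed_loss(logits, y, eps=0.1):
    """Cross-entropy against the soft targets instead of one-hot labels."""
    soft = smooth_labels(y, logits.size(1), eps)
    return -(soft * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
```

Recent PyTorch versions build this in as `nn.CrossEntropyLoss(label_smoothing=0.1)`.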
feature squeezing: squeezes the input image, e.g. by reducing the color bit depth of each pixel and applying spatial smoothing, and compares predictions before and after squeezing to detect adversarial examples
Xu, Weilin, David Evans, and Yanjun Qi. Feature squeezing: Detecting adversarial examples in deep neural networks.
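A sketch of the two squeezers plus the detection rule: flag an input when predictions on squeezed copies drift too far from the raw prediction. The threshold is a placeholder that would be tuned on validation data:

```python
import torch
import torch.nn.functional as F

def reduce_bit_depth(x, bits=4):
    """Quantize [0, 1] pixels down to 2**bits levels."""
    levels = 2 ** bits - 1
    return torch.round(x * levels) / levels

def median_smooth(x, k=3):
    """k x k median filter applied per channel."""
    pad = k // 2
    p = F.pad(x, (pad, pad, pad, pad), mode='reflect')
    p = p.unfold(2, k, 1).unfold(3, k, 1)          # B, C, H, W, k, k
    return p.reshape(*x.shape, -1).median(dim=-1).values

def squeeze_detect(model, x, threshold=1.0):
    """Adversarial if the max L1 gap between raw and squeezed predictions is large."""
    p = model(x).softmax(1)
    gap1 = (p - model(reduce_bit_depth(x)).softmax(1)).abs().sum(1)
    gap2 = (p - model(median_smooth(x)).softmax(1)).abs().sum(1)
    return torch.max(gap1, gap2) > threshold       # threshold is a placeholder
```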
PixelDefend: uses a generative model trained on clean images to purify an input, moving it back toward the training distribution before it is fed to the model
Song, Yang, et al. Pixeldefend: Leveraging generative models to understand and defend against adversarial examples.
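A greedy purification sketch in the spirit of PixelDefend (batch size 1 assumed); `pixel_cnn`, assumed to return per-pixel logits over the discrete pixel values, is a hypothetical interface standing in for the paper's PixelCNN:

```python
import torch

@torch.no_grad()
def pixeldefend_purify(pixel_cnn, x, eps=16 / 255, levels=256):
    """Raster-scan the image; set each pixel to its most likely value
    (under the generative model) within eps of the original input."""
    x_pur = x.clone()
    _, C, H, W = x.shape
    for i in range(H):
        for j in range(W):
            # Hypothetical output shape: (1, levels, C, H, W).
            probs = pixel_cnn(x_pur).softmax(1)
            for c in range(C):
                lo = int(((x[0, c, i, j] - eps).clamp(0, 1) * (levels - 1)).round())
                hi = int(((x[0, c, i, j] + eps).clamp(0, 1) * (levels - 1)).round())
                best = lo + int(probs[0, lo:hi + 1, c, i, j].argmax())
                x_pur[0, c, i, j] = best / (levels - 1)
    return x_pur
```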
ComDefend: trained only on a clean dataset, the ComDefend model compresses and then reconstructs the input image, which removes adversarial perturbations
Jia, Xiaojun, et al. Comdefend: An efficient image compression model to defend adversarial examples.
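A sketch of the ComDefend idea: a compression CNN squeezes the image into a low-bit code (noise is added during training so the code survives binarization) and a reconstruction CNN decodes it. Layer sizes here are illustrative, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class ComDefendSketch(nn.Module):
    def __init__(self, code_channels=12):
        super().__init__()
        self.com = nn.Sequential(                   # compression CNN
            nn.Conv2d(3, 32, 3, padding=1), nn.ELU(),
            nn.Conv2d(32, code_channels, 3, padding=1),
        )
        self.rec = nn.Sequential(                   # reconstruction CNN
            nn.Conv2d(code_channels, 32, 3, padding=1), nn.ELU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        code = torch.sigmoid(self.com(x))
        if self.training:
            # Gaussian noise stands in for binarization, keeping gradients alive.
            code = code + torch.randn_like(code) * 0.5
        else:
            code = (code > 0.5).float()             # hard binary code at test time
        return self.rec(code)
```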
HGD: the high-level representation guided denoiser (HGD) trains a denoiser whose loss is measured on the target model's high-level features rather than on raw pixels:
Liao, Fangzhou, et al. Defense against adversarial attacks using high-level representation guided denoiser.
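A sketch of the HGD training loss: instead of matching pixels, match the target model's high-level activations on the denoised adversarial image and on the clean image; `layer_fn`, which extracts the chosen activation, is a stand-in for a forward hook:

```python
import torch

def hgd_loss(target_model, denoiser, x_clean, x_adv, layer_fn):
    """L1 distance between high-level features of denoised and clean inputs."""
    denoised = x_adv - denoiser(x_adv)              # denoiser predicts the noise
    feat_denoised = layer_fn(target_model, denoised)
    feat_clean = layer_fn(target_model, x_clean).detach()
    return (feat_denoised - feat_clean).abs().mean()
```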