Adversarial Watermark (Adv-watermark)
This paper presents a black-box attack that combines adversarial example techniques with image (or text) watermarks to generate natural-looking watermarked adversarial examples. For the watermark, only position and transparency are considered; the rotation angle is not. To improve attack effectiveness, the authors propose an optimization algorithm called Basin Hopping Evolution (BHE), a population- and gene-based global stochastic search algorithm.
Abstract
Recent research has demonstrated that adding some imperceptible perturbations to original images can fool deep learning models. However, the current adversarial perturbations are usually shown in the form of noises, and thus have no practical meaning. Image watermark is a technique widely used for copyright protection. We can regard image watermark as a kind of meaningful noises and adding it to the original image will not affect people’s understanding of the image content, and will not arouse people’s suspicion. Therefore, it will be interesting to generate adversarial examples using watermarks. In this paper, we propose a novel watermark perturbation for adversarial examples (Adv-watermark) which combines image watermarking techniques and adversarial example algorithms. Adding a meaningful watermark to the clean images can attack the DNN models. Specifically, we propose a novel optimization algorithm, which is called Basin Hopping Evolution (BHE), to generate adversarial watermarks in the black-box attack mode. Thanks to the BHE, Adv-watermark only requires a few queries from the threat models to finish the attacks. A series of experiments conducted on ImageNet and CASIA-WebFace datasets show that the proposed method can efficiently generate adversarial examples, and outperforms the state-of-the-art attack methods. Moreover, Adv-watermark is more robust against image transformation defense methods.
Authors: Xiaojun Jia (Institute of Information Engineering, CAS), Xingxing Wei (Beihang University), Xiaochun Cao (Institute of Information Engineering, CAS), Xiaoguang Han (Shenzhen Research Institute of Big Data, CUHK-Shenzhen)
Method Overview
Novel points of this paper:
- Applies image watermarking techniques to the adversarial example domain, making this kind of attack practical to deploy
- A black-box attack driven by a population- and gene-based global stochastic search algorithm
- For the watermark, only position and transparency are considered; rotation is not
The watermark is combined with the host image via alpha blending:
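The blending formula itself is elided in this note; below is a minimal sketch of standard alpha blending, assuming uint8 RGB arrays. The top-left position (x, y) and the transparency alpha are exactly the watermark variables the attack searches over; the function name and array conventions are my own, not the paper's.

```python
import numpy as np

def alpha_blend(host, watermark, x, y, alpha):
    """Paste `watermark` onto `host` at top-left corner (x, y) with
    transparency `alpha` in [0, 1] (1 = fully opaque watermark)."""
    out = host.astype(np.float64).copy()
    h, w = watermark.shape[:2]
    region = out[y:y + h, x:x + w]
    # Per-pixel convex combination: alpha * watermark + (1 - alpha) * host
    out[y:y + h, x:x + w] = alpha * watermark + (1.0 - alpha) * region
    return np.clip(out, 0, 255).astype(np.uint8)
```

With alpha close to 0 the watermark is nearly invisible; the attack trades off visibility (alpha) against the model's loss.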
Objective optimization:
During optimization, the paper proposes BHE, a population- and gene-based global stochastic search algorithm: if child genes are fitter for the current population's evolution (i.e., yield a smaller value of the multivariate objective function), they survive and are passed on to the next generation:
- Population Initialization:
- Basin Hopping: starting from a parent gene, apply random perturbations to generate a locally optimized candidate V_{i,g}, then run a local search to find the local optimum.
- Crossover:
- Selection: compare the parent and child genes on the threat model and keep the fitter one
Overall BHE pipeline:
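The steps above can be sketched as a toy continuous version. All names below are my own, not the paper's; the real BHE operates on discrete watermark parameters (position, transparency) and queries the threat model as its fitness function. The defaults mirror the parameter table in the experiments section (step size 0.5, 3 Basin Hopping iterations, CR = 0.9).

```python
import numpy as np

rng = np.random.default_rng(0)

def basin_hopping(x, fitness, bounds, iters=3, step=0.5):
    """Perturb a candidate a few times and keep the best result
    (a lightweight stand-in for Basin Hopping's local search)."""
    best, best_f = x, fitness(x)
    for _ in range(iters):
        cand = np.clip(x + rng.normal(0.0, step, size=x.shape),
                       bounds[:, 0], bounds[:, 1])
        f = fitness(cand)
        if f < best_f:
            best, best_f = cand, f
    return best

def bhe(fitness, bounds, pop_size=10, generations=20, cr=0.9):
    """Hypothetical sketch of Basin Hopping Evolution: each gene encodes
    the watermark parameters (e.g. x, y, alpha); a child replaces its
    parent only if its fitness (the value to minimize) is lower."""
    dim = len(bounds)
    bounds = np.asarray(bounds, dtype=float)
    # Population initialization: uniform samples inside the search bounds
    pop = rng.uniform(bounds[:, 0], bounds[:, 1], size=(pop_size, dim))
    for _ in range(generations):
        for i in range(pop_size):
            # Basin Hopping: local random search around the parent
            mutant = basin_hopping(pop[i], fitness, bounds)
            # Crossover: mix parent and mutant coordinates with rate CR
            mask = rng.random(dim) < cr
            child = np.where(mask, mutant, pop[i])
            # Selection: the child survives only if it is fitter
            if fitness(child) < fitness(pop[i]):
                pop[i] = child
    return min(pop, key=fitness)
```

In the actual attack, `fitness` would query the black-box model with the watermarked image and return, e.g., the confidence of the true class, so that minimizing it drives misclassification with only a few queries.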
Experiments
Datasets: ImageNet and CASIA-WebFace (1,000 images selected in total);
Threat models: AlexNet, VGG19, SqueezeNet, ResNet101, InceptionV1, InceptionV3;
Black-box attack baselines (implemented with Foolbox: a Python toolbox to benchmark the robustness of machine learning models):
Method | Reference
---|---
spatial attack | Exploring the Landscape of Spatial Robustness
boundary attack | Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models
single-pixel attack | One pixel attack for fooling deep neural networks
pointwise attack | Towards the first adversarially robust neural network model on MNIST
Parameter | Value
---|---
step size | 0.5
α | 100–200
Basin Hopping iterations | 3
CR | 0.9
Experimental setup:
The ACM MM logo, university logos, and text are used as watermarks, with the watermark scale and font size varied.
Comparison with other attack methods:
Performance against SOTA image-transformation defenses:
The adversarial watermarks generated by this method are added to the original database as extra training data for adversarial training, after which Adv-watermark is used to attack again:
Summary
Same as the novel points listed above.