[Paper Reading] Black-Box Dissector: Towards Erasing-based Hard-Label Model Stealing Attack (2021)

Abstract

Previous studies have verified that the functionality of black-box models can be stolen with full probability outputs. However, under the more practical hard-label setting, we observe that existing methods suffer from catastrophic performance degradation. We argue this is due to the lack of rich information in the probability prediction and the overfitting caused by hard labels. To this end, we propose a novel hard-label model stealing method termed black-box dissector, which consists of two erasing-based modules. One is a CAM-driven erasing strategy designed to increase the information capacity hidden in hard labels from the victim model. The other is a random-erasing-based self-knowledge distillation module that utilizes soft labels from the substitute model to mitigate overfitting. Extensive experiments on four widely-used datasets consistently demonstrate that our method outperforms state-of-the-art methods, with an improvement of at most 8.27%. We also validate the effectiveness and practical potential of our method on real-world APIs and against defense methods. Furthermore, our method promotes other downstream tasks, i.e., transfer adversarial attacks.
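Conceptually, the CAM-driven erasing module queries the victim on images whose most salient regions (according to the substitute's own class activation map) have been erased, so each returned hard label carries information the substitute has not yet captured. The sketch below is a minimal PyTorch illustration of this idea, assuming the substitute is a CNN with a linear classifier over its last convolutional features; the function names and the fixed erasing threshold are illustrative choices, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def cam_from_features(features, fc_weight, label):
    """Class activation map (CAM, Zhou et al. 2016) from the last conv layer.

    features:  (B, C, H, W) activations of the substitute's last conv layer
    fc_weight: (num_classes, C) weight of the substitute's final linear layer
    label:     (B,) hard labels, e.g., those returned by the victim model
    """
    w = fc_weight[label]                                  # (B, C)
    cam = F.relu(torch.einsum("bc,bchw->bhw", w, features))
    # normalize each map to [0, 1]
    cam = cam - cam.flatten(1).min(1).values.view(-1, 1, 1)
    cam = cam / (cam.flatten(1).max(1).values.view(-1, 1, 1) + 1e-8)
    return cam

def erase_salient_region(images, cam, threshold=0.6, fill=0.0):
    """Erase the pixels the substitute currently relies on; querying the
    victim on the erased image can then reveal information the hard label
    on the original image did not expose."""
    mask = F.interpolate(cam.unsqueeze(1), size=images.shape[-2:],
                         mode="bilinear", align_corners=False)
    keep = (mask < threshold).float()                     # 1 = keep pixel
    return images * keep + fill * (1.0 - keep)
```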

Method

(Figure from the paper not reproduced here.)

Conclusion

We investigated the problem of model stealing attacks under the hard-label setting and pointed out why previous methods are not effective enough. We presented a new method, termed black-box dissector, which contains a CAM-driven erasing strategy and an RE-based self-KD module. We showed its superiority on four widely-used datasets and verified the effectiveness of our method against defense methods, on real-world APIs, and on the downstream adversarial attack task. Though this paper focuses on image data, our method is general to other tasks as long as CAM and a similar erasing method apply, e.g., synonym saliency-word replacement for NLP tasks [4]. We believe our method can be easily extended to other fields and inspire future researchers. Model stealing attacks pose a threat to deployed machine learning models. We hope this work will draw attention to the protection of deployed models and, furthermore, shed more light on attack mechanisms and prevention methods. Additionally, transformer-based classifiers are becoming popular, and their security issues also deserve attention. Since this kind of classifier divides images into patches and our method works by erasing parts of images, it would be convenient to align with the attention map by masking patches and mining the missing information. We will validate this idea in future work.
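For the RE-based self-KD module mentioned above, the idea is to use the substitute's own soft predictions on clean images as teacher signals for randomly erased views of the same images, which counteracts the overfitting induced by one-hot victim labels. Below is a minimal PyTorch sketch of this mechanism; the temperature and erasing parameters are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn.functional as F
from torchvision import transforms

# always erase one random rectangle per image (tensor input, (C, H, W))
random_erase = transforms.RandomErasing(p=1.0, scale=(0.02, 0.2))

def self_kd_loss(substitute, images, temperature=1.0):
    """Random-erasing-based self-distillation (sketch).

    The substitute's soft prediction on the clean image is detached and
    used as the teacher for a randomly erased view, giving a soft-label
    regularizer under the hard-label-only query setting."""
    with torch.no_grad():
        teacher = F.softmax(substitute(images) / temperature, dim=1)
    erased = torch.stack([random_erase(img) for img in images])
    student = F.log_softmax(substitute(erased) / temperature, dim=1)
    return F.kl_div(student, teacher, reduction="batchmean")
```

In training, this loss would simply be added to the usual cross-entropy on the victim's hard labels, with the erased views regenerated each epoch so the regularization signal varies.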

Paper Link

Black-Box Dissector: Towards Erasing-based Hard-Label Model Stealing Attack
