[Multimodal Adversarial Attacks] VLATTACK: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models

This paper proposes VLATTACK, a method that uses pre-trained vision-language models to attack black-box fine-tuned models. By combining perturbations at both the single-modal and multimodal levels, it substantially improves attack success rates. The work highlights an adversarial blind spot in the practical deployment of pre-trained models.


Paper title: VLATTACK: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models
Code: https://github.com/ericyinyzy/VLAttack
Year: 2023
Venue: NeurIPS


Abstract

Vision-Language (VL) pre-trained models have shown their superiority on many multimodal tasks. However, the adversarial robustness of such models has not been fully explored. Existing approaches mainly focus on exploring the adversarial robustness under the white-box setting, which is unrealistic. In this paper, we aim to investigate a new yet practical task to craft image and text perturbations using pre-trained VL models to attack black-box fine-tuned models on different downstream tasks. Towards this end, we propose VLATTACK to generate adversarial samples by fusing perturbations of images and texts from both single-modal and multimodal levels. At the single-modal level, we propose a new block-wise similarity attack (BSA) strategy to learn image perturbations for disrupting universal representations. Besides, we adopt an existing text attack strategy to generate text perturbations independent of the image-modal attack. At the multimodal level, we design a novel iterative cross-search attack (ICSA) method to update adversarial image-text pairs periodically, starting with the outputs from the single-modal level. We conduct extensive experiments to attack five widely-used VL pre-trained models for six tasks. Experimental results show that VLATTACK achieves the highest attack success rates on all tasks compared with state-of-the-art baselines, which reveals a blind spot in the deployment of pre-trained VL models.


Background

These vision-language (VL) pre-trained models first learn multimodal interactions through pre-training on large-scale unlabeled image-text datasets, and are then fine-tuned with labeled pairs on different downstream VL tasks. Despite their excellent performance, the adversarial robustness of these VL models remains relatively unexplored.

Existing work on adversarial attacks for VL tasks is mainly conducted in the white-box setting, where the attacker has access to the gradient information of the fine-tuned model. In a more realistic scenario, however, a malicious attacker may only have access to the pre-trained model released by a third party, not the fine-tuned model deployed on the downstream task.
