[Multimodal Adversarial] VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models

Paper title: VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models
Code: https://github.com/ericyinyzy/VQAttack
Year: 2024
Venue: AAAI


Abstract

Visual Question Answering (VQA) is a fundamental task in the computer vision and natural language processing fields. Although the "pre-training & fine-tuning" learning paradigm significantly improves VQA performance, the adversarial robustness of such a learning paradigm has not been explored. In this paper, we delve into a new problem: using a pre-trained multimodal source model to create adversarial image-text pairs and then transferring them to attack the target VQA models. Correspondingly, we propose a novel VQATTACK model, which can iteratively generate both image and text perturbations with the designed modules: the large language model (LLM)-enhanced image attack and the cross-modal joint attack module. At each iteration, the LLM-enhanced image attack module first optimizes the latent representation-based loss to generate feature-level image perturbations. Then it incorporates an LLM to further enhance the image perturbations by optimizing the designed masked answer anti-recovery loss. The cross-modal joint attack module will be triggered at a specific iteration, which updates the image and text perturbations sequentially. Notably, the text perturbation updates are based on both the learned gradients in the word embedding space and word synonym-based substitution. Experimental results on two VQA datasets with five validated models demonstrate the effectiveness of the proposed VQATTACK in the transferable attack setting, compared with state-of-the-art baselines. This work reveals a significant blind spot in the "pre-training & fine-tuning" paradigm on VQA tasks. Source codes will be released.


Background

Current VQA models are trained in two mainstream ways; compared with the popular "pre-training & fine-tuning" paradigm, end-to-end trained models usually show inferior performance. In this paradigm, a model is first pre-trained on broadly collected, publicly available image-text pairs, which facilitates learning cross-modal relationships. The model is then fine-tuned on a specific VQA dataset to boost downstream performance. However, the adversarial robustness of VQA models trained under this paradigm remains under-explored. Accordingly, this paper studies a black-box, transfer-based attack: adversarial examples are crafted on a pre-trained source model and then used against target VQA models (a minimal sketch of this protocol follows).
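To make the transfer setting concrete, here is a minimal PGD-style sketch of the protocol: the perturbation is crafted against the source model only, then handed to an unseen target model. The two linear "models" and all hyperparameters are toy stand-ins, not the paper's setup; with random toy models the transfer may or may not succeed, the point is the protocol.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins for the pre-trained source model and the victim target model.
# In VQAttack these would be large multimodal networks.
source_model = nn.Linear(32, 10)
target_model = nn.Linear(32, 10)

x = torch.randn(1, 32)          # stand-in for an input image feature
label = torch.tensor([3])       # ground-truth answer id
eps, alpha, steps = 0.1, 0.02, 10

delta = torch.zeros_like(x, requires_grad=True)
loss_fn = nn.CrossEntropyLoss()

# PGD crafted *only* against the source model (black-box w.r.t. the target).
for _ in range(steps):
    loss = loss_fn(source_model(x + delta), label)
    loss.backward()
    with torch.no_grad():
        delta += alpha * delta.grad.sign()   # ascend the source model's loss
        delta.clamp_(-eps, eps)              # stay inside the L_inf ball
    delta.grad.zero_()

# Transfer test: feed the adversarial input to the unseen target model.
adv = (x + delta).detach()
print("target prediction (clean):", target_model(x).argmax(dim=1).item())
print("target prediction (adv):  ", target_model(adv).argmax(dim=1).item())
```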

This attack scenario is markedly more complex, for two fundamental reasons. C1 – Transferability across models. The pre-trained source model and the victim target VQA models are typically trained for different tasks and on different datasets. While transferability has been extensively validated for image-only models, it has not been thoroughly explored for pre-trained multimodal models. C2 – Joint attack across modalities. The task is multimodal, so perturbations must be injected into both the image and the text question for the attack to be effective. Although prior methods have designed effective attacks for each individual modality separately, the hard part is jointly optimizing perturbations over continuous-valued images and discrete text tokens; this joint attack remains a major obstacle and calls for new solutions (a common resolution for the discrete text side is sketched below).
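A standard way around the discrete half of C2, and the one VQATTACK's text update follows at a high level per the abstract, is to take gradients in the continuous word-embedding space and then project back onto valid tokens via synonym substitution. The sketch below illustrates that idea; the vocabulary, synonym table, and answer head are all made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab = ["what", "color", "is", "the", "large", "big", "huge", "car"]
word2id = {w: i for i, w in enumerate(vocab)}
synonyms = {"large": ["big", "huge"]}       # toy synonym table

embed = nn.Embedding(len(vocab), 16)
classifier = nn.Linear(16, 2)               # toy answer head
loss_fn = nn.CrossEntropyLoss()

question = ["what", "color", "is", "the", "large", "car"]
ids = torch.tensor([word2id[w] for w in question])
label = torch.tensor([1])

# 1) Gradient w.r.t. the continuous word embeddings.
emb = embed(ids).detach().requires_grad_(True)
loss = loss_fn(classifier(emb.mean(dim=0, keepdim=True)), label)
loss.backward()

# 2) Project back to discrete tokens: for each attackable word, pick the
#    synonym whose embedding change best aligns with the loss gradient.
adv_question = list(question)
for pos, word in enumerate(question):
    if word not in synonyms:
        continue
    grad = emb.grad[pos]
    best, best_score = word, 0.0
    for cand in synonyms[word]:
        direction = embed.weight[word2id[cand]].detach() - emb[pos].detach()
        score = torch.dot(grad, direction).item()  # first-order loss increase
        if score > best_score:
            best, best_score = cand, score
    adv_question[pos] = best

print("adversarial question:", " ".join(adv_question))
```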

Innovations

To address these challenges, the authors propose a new method named VQATTACK to explore adversarial transferability between a pre-trained source VQA model and victim target VQA models. As shown in Figure 2 of the paper, VQATTACK generates both image and text perturbations based only on the pre-trained source model F and a novel multi-step attack framework.

After initializing the input image-text pair (I, T), VQATTACK iteratively updates the perturbations through two key modules: the large language model (LLM)-enhanced image attack module and the cross-modal joint attack module. At every iteration, the former first optimizes a latent representation-based loss to craft feature-level image perturbations and then strengthens them with an LLM by optimizing the masked answer anti-recovery loss; the latter is triggered only at specific iterations and updates the image and text perturbations sequentially, driving the text updates with gradients in the word-embedding space followed by synonym-based substitution.
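Putting the pieces together, below is a minimal, runnable sketch of one plausible reading of this loop. Everything here is a toy stand-in rather than the authors' implementation: `image_encoder` and `fusion_head` play the roles of the source model's feature extractor and the LLM's masked-answer head, and `joint_every` is an assumed trigger schedule for the joint attack.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins for pieces of the pre-trained source model F.
image_encoder = nn.Linear(32, 16)   # latent image features
fusion_head = nn.Linear(16, 8)      # stand-in for the LLM answer-recovery head

image = torch.randn(1, 32)
clean_feat = image_encoder(image).detach()
masked_answer = torch.tensor([2])   # id of the masked ground-truth answer token

eps, alpha, total_steps, joint_every = 0.1, 0.02, 12, 4
delta = torch.zeros_like(image, requires_grad=True)
ce = nn.CrossEntropyLoss()

for step in range(1, total_steps + 1):
    # --- LLM-enhanced image attack (runs at every iteration) ---
    # (a) Latent representation-based loss: push the adversarial image's
    #     features away from the clean features (maximized by ascent).
    latent_loss = nn.functional.mse_loss(image_encoder(image + delta), clean_feat)
    latent_loss.backward()
    with torch.no_grad():
        delta += alpha * delta.grad.sign()
        delta.clamp_(-eps, eps)
    delta.grad.zero_()

    # (b) Masked answer anti-recovery loss: make the (stand-in) LLM head fail
    #     to recover the masked answer token (cross-entropy maximized by ascent).
    anti_recovery = ce(fusion_head(image_encoder(image + delta)), masked_answer)
    anti_recovery.backward()
    with torch.no_grad():
        delta += alpha * delta.grad.sign()
        delta.clamp_(-eps, eps)
    delta.grad.zero_()

    # --- Cross-modal joint attack (triggered only at specific iterations) ---
    if step % joint_every == 0:
        # Per the paper's description this updates image and text sequentially;
        # the text step follows the embedding-gradient + synonym substitution
        # idea sketched in the previous section.
        pass

adv_image = (image + delta).detach()
print("final perturbation L_inf norm:", delta.abs().max().item())
```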
