[Multimodal Adversarial Attacks] VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models

Article link: https://blog.csdn.net/nbwjszd/article/details/137780663

Paper title: VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models
Paper code: https://github.com/ericyinyzy/VQAttack
Year: 2024
Venue: AAAI

Abstract

Visual Question Answering (VQA) is a fundamental task in computer vision and natural language process fields. Although the “pre-training & fine-tuning” learning paradigm significantly improves the VQA performance, the adversarial robustness of such a learning paradigm has not been explored. In this paper, we delve into a new problem: using a pretrained multimodal source model to create adversarial image-text pairs and then transferring them to attack the target VQA models. Correspondingly, we propose a novel VQATTACK model, which can iteratively generate both image and text perturbations with the designed modules: the large language model (LLM)-enhanced image attack and the cross-modal joint attack module. At each iteration, the LLM-enhanced image attack module first optimizes the latent representation-based loss to generate feature-level image perturbations. Then it incorporates an LLM to further enhance the image perturbations by optimizing the designed masked answer anti-recovery loss. The cross-modal joint attack module will be triggered at a specific iteration, which updates the image and text perturbations sequentially. Notably, the text perturbation updates are based on both the learned gradients in the word embedding space and word synonym-based substitution. Experimental results on two VQA datasets with five validated models demonstrate the effectiveness of the proposed VQATTACK in the transferable attack setting, compared with state-of-the-art baselines. This work reveals a significant blind spot in the “pre-training & fine-tuning” paradigm on VQA tasks. Source codes will be released.

Background

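VQA models are currently trained in one of two main ways: end-to-end training, or the now-dominant “pre-training & fine-tuning” paradigm. End-to-end models usually perform worse. In the pre-training & fine-tuning paradigm, a model is first pre-trained on image-text pairs collected at scale from the public domain, which lets it learn cross-modal relationships. It is then fine-tuned on a specific VQA dataset to improve its performance on the downstream task. The adversarial robustness of VQA models built under this paradigm, however, remains underexplored. This paper therefore studies a black-box, transfer-based attack: adversarial examples are generated on a pre-trained source model and then used against the target VQA models.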

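This attack scenario is difficult for two basic reasons.

C1: Transferability across models. The pre-trained source model and the victim VQA models are usually trained for different tasks and on different datasets. Transferability has been widely validated for image models, but it has not been thoroughly explored for pre-trained models.

C2: Joint attacks across modalities. The task is multimodal, so perturbations must be added to both the image and the text question to improve attack performance. Earlier methods design effective attacks for each modality separately; the hard part is optimizing a continuous-valued image perturbation and a text perturbation made of discrete tokens at the same time. This joint attack remains a major obstacle and calls for new solutions.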
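
To make the contrast in C2 concrete, here is a minimal sketch of why the two modalities need different update rules. It is not VQAttack's implementation: the image update is a generic signed-gradient step with an L-infinity projection, and the text update is a greedy synonym swap scored by a loss function, used here as a simple stand-in for the gradient-guided substitution the abstract describes. The names `image_step`, `text_step`, `toy_loss`, and the step sizes `2/255` and `8/255` are assumptions chosen for illustration.

```python
import numpy as np


def image_step(image, clean, grad, step=2 / 255, eps=8 / 255):
    """Continuous update: move every pixel along the sign of the gradient,
    then project back into the L-infinity ball of radius eps around `clean`
    and into the valid pixel range [0, 1]."""
    adv = image + step * np.sign(grad)
    adv = np.clip(adv, clean - eps, clean + eps)
    return np.clip(adv, 0.0, 1.0)


def text_step(tokens, synonyms, loss_fn):
    """Discrete update: tokens cannot be nudged by a small step, so try each
    candidate synonym at each position and keep the single swap that raises
    the loss the most. Returns the original tokens if no swap helps."""
    best_tokens, best_loss = list(tokens), loss_fn(tokens)
    for i, tok in enumerate(tokens):
        for cand in synonyms.get(tok, []):
            trial = list(tokens)
            trial[i] = cand
            trial_loss = loss_fn(trial)
            if trial_loss > best_loss:
                best_tokens, best_loss = trial, trial_loss
    return best_tokens


if __name__ == "__main__":
    # Toy inputs only; a real attack would take gradients and losses
    # from the pre-trained source model.
    clean = np.full((2, 2), 0.5)
    grad = np.array([[1.0, -1.0], [0.5, -2.0]])
    adv = clean.copy()
    for _ in range(10):
        adv = image_step(adv, clean, grad)
    print("max pixel change:", np.max(np.abs(adv - clean)))

    def toy_loss(toks):
        # Hypothetical stand-in loss: total character count of the question.
        return sum(len(t) for t in toks)

    question = ["what", "color", "is", "the", "car"]
    synonyms = {"color": ["colour", "hue"], "car": ["automobile"]}
    print(text_step(question, synonyms, toy_loss))
```

The image perturbation can be refined in small increments and stays bounded by `eps`, while the text perturbation can only jump between whole words, which is why a joint attack has to interleave the two kinds of update rather than run one optimizer over both.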