【VQA】Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering

Paper title: Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering
Code: not available
Year: 2024
Venue: CVPR


Abstract

The goal of selective prediction is to allow a model to abstain when it may not be able to deliver a reliable prediction, which is important in safety-critical contexts. Existing approaches to selective prediction typically require access to the internals of a model, require retraining a model, or study only unimodal models. However, the most powerful models (e.g. GPT-4) are typically only available as black boxes with inaccessible internals, are not retrainable by end-users, and are frequently used for multimodal tasks. We study the possibility of selective prediction for vision-language models in a realistic, black-box setting. We propose using the principle of neighborhood consistency to identify unreliable responses from a black-box vision-language model in question answering tasks. We hypothesize that given only a visual question and model response, the consistency of the model’s responses over the neighborhood of a visual question will indicate reliability. It is impossible to directly sample neighbors in feature space in a black-box setting. Instead, we show that it is possible to use a smaller proxy model to approximately sample from the neighborhood. We find that neighborhood consistency can be used to identify model responses to visual questions that are likely unreliable, even in adversarial settings or settings that are out-of-distribution to the proxy model.


Background

Although this paper does not adopt an adversarial task setting, it retains the black-box setting. In commercial scenarios, most models are accessed only as black boxes. Therefore, in high-stakes scenarios we would prefer the model to defer to an expert or abstain from answering rather than give a wrong answer. Many methods exist for selective prediction or for improving the uncertainty of model predictions, such as ensembles, gradient-guided sampling in feature space, retraining the model, or training an auxiliary module on the model's predictions. Selective prediction has typically been studied in unimodal settings and/or on tasks with a closed-world assumption (e.g., image classification), and has only recently been studied for multimodal, open-ended tasks such as visual question answering.

In existing deployments, the training data is private, model features and gradients are unavailable, retraining is not possible, the number of predictions may be limited by the API, training on model outputs is often prohibited, and queries are open-ended. In a black-box setting with these realistic constraints, how can we identify unreliable predictions from a vision-language model?

An intuitive approach is self-consistency: if a human subject is given two semantically equivalent questions, we expect the subject's answers to be the same. Consistency is formally defined as follows: given a classifier f(·) and a point x ∈ R^N in feature space, for sufficiently small ε the classifier's predictions over the ε-neighborhood of x should agree with f(x). Implementing either of these notions is non-trivial. How can we obtain, at scale, visual questions that are "semantically equivalent" to an input visual question? And since we cannot access the internal representations of a black-box model, how can we sample from the neighborhood of an input visual question?
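As a rough illustration only (not the paper's actual implementation; `model` and `sample_neighbors` are hypothetical interfaces, and the agreement threshold is an assumed parameter), neighborhood consistency for selective answering might be sketched as:

```python
def consistency_score(model, image, question, sample_neighbors, k=5):
    """Fraction of neighbor questions (e.g. paraphrases drawn from a
    smaller proxy model) whose answer agrees with the answer to the
    original question. Higher scores suggest a more reliable response."""
    original = model(image, question)
    neighbors = sample_neighbors(question, k)
    answers = [model(image, q) for q in neighbors]
    agree = sum(a == original for a in answers)
    return agree / max(len(answers), 1)

def selective_answer(model, image, question, sample_neighbors, threshold=0.6):
    """Return the model's answer if its neighborhood consistency is high
    enough; otherwise abstain (return None)."""
    score = consistency_score(model, image, question, sample_neighbors)
    if score >= threshold:
        return model(image, question)
    return None  # abstain and defer to a human expert
```

The key design point is that everything here treats `model` as a black box: only its input/output behavior is used, so no features, gradients, or retraining are required.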

Contributions

First, we use a large
