VQA v2.0 dataset: image, question, and answer pairs

1. Image

(The image is COCO_val2014_000000000715.jpg, from the val2014 folder of the VQA v2.0 dataset.)
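
The file name can be derived from the image id: here, id 715 is zero-padded to 12 digits after the `COCO_val2014_` prefix. A minimal sketch, assuming the same pattern holds for the other val2014 images (the helper name `coco_val2014_filename` is made up for illustration):

```python
def coco_val2014_filename(image_id: int) -> str:
    """Build the val2014 file name for an image id (zero-padded to 12 digits)."""
    return f"COCO_val2014_{image_id:012d}.jpg"


print(coco_val2014_filename(715))  # COCO_val2014_000000000715.jpg
```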

2. Questions (question)

""" The questions for the image above are stored in v2_OpenEnded_mscoco_val2014_questions.json, under the v2_Questions_Val_mscoco folder of the VQA v2.0 dataset """


{"image_id": 715, "question": "How many bananas are on display next to the oranges?", "question_id": 715000},
{"image_id": 715, "question": "How many baskets of fruit?", "question_id": 715001},
{"image_id": 715, "question": "How many fruits are shown?", "question_id": 715002}, 
{"image_id": 715, "question": "How many bananas are in this picture?", "question_id": 715003},
{"image_id": 715, "question": "What kind of fruit is this?", "question_id": 715004},
{"image_id": 715, "question": "Is this picture taken inside or outside?", "question_id": 715005}, 
{"image_id": 715, "question": "How are the drinks being kept cold?", "question_id": 715006}
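
To pull these entries out of the questions file, one can filter on `image_id`. The sketch below assumes the parsed JSON is a dict whose `"questions"` key holds the list of entries; the excerpt above shows only the entries themselves, so that top-level key is an assumption. It runs on a small in-memory sample (two entries from above plus one made-up entry for a different image) instead of the real file, and `questions_for_image` is a hypothetical helper.

```python
import json

# Tiny stand-in for the questions file. The top-level "questions" key is an
# assumption, and the image_id 42 entry is a made-up placeholder.
sample = json.loads("""
{"questions": [
  {"image_id": 715, "question": "How many baskets of fruit?", "question_id": 715001},
  {"image_id": 715, "question": "What kind of fruit is this?", "question_id": 715004},
  {"image_id": 42, "question": "Placeholder question for another image?", "question_id": 42000}
]}
""")


def questions_for_image(data: dict, image_id: int) -> list:
    """Return every question entry that belongs to the given image."""
    return [q for q in data["questions"] if q["image_id"] == image_id]


for q in questions_for_image(sample, 715):
    print(q["question_id"], q["question"])
```

With the real file, `sample` would be replaced by the result of `json.load` on v2_OpenEnded_mscoco_val2014_questions.json.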

3. Answers (annotation)

""" The answers to the questions above are stored in v2_mscoco_val2014_annotations.json, under the v2_Annotations_Val_mscoco folder of the VQA v2.0 dataset """

 {

"answer_type": "number", 
"multiple_choice_answer": "17", 
"answers": [{"answer": "20", "answer_confidence": "maybe", "answer_id": 1}, {"answer": "19", "answer_confidence": "yes", "answer_id": 2}, {"answer": "40", "answer_confidence": "maybe", "answer_id": 3}, {"answer": "17", "answer_confidence": "yes", "answer_id": 4}, {"answer": "30", "answer_confidence": "maybe", "answer_id": 5},  {"answer": "17", "answer_confidence": "yes", "answer_id": 6}, {"answer": "19", "answer_confidence": "maybe", "answer_id": 7}, {"answer": "50", "answer_confidence": "maybe", "answer_id": 8}, {"answer": "15", "answer_confidence": "maybe", "answer_id": 9}, {"answer": "16", "answer_confidence": "yes", "answer_id": 10}], 
"image_id": 715, "question_type": "how many", "question_id": 715000

},

 {

"question_type": "how many",
"multiple_choice_answer": "2",
"answers": [{"answer": "2", "answer_confidence": "yes", "answer_id": 1}, {"answer": "2", "answer_confidence": "yes", "answer_id": 2}, {"answer": "2", "answer_confidence": "yes", "answer_id": 3}, {"answer": "6", "answer_confidence": "yes", "answer_id": 4},
{"answer": "2", "answer_confidence": "yes", "answer_id": 5}, {"answer": "5", "answer_confidence": "maybe", "answer_id": 6}, {"answer": "6", "answer_confidence": "yes", "answer_id": 7}, {"answer": "6", "answer_confidence": "yes", "answer_id": 8},
{"answer": "2", "answer_confidence": "yes", "answer_id": 9}, {"answer": "6", "answer_confidence": "yes", "answer_id": 10}], 
"image_id": 715, "answer_type": "number", "question_id": 715001

},

 {

"question_type": "how many", 
"multiple_choice_answer": "3",
"answers": [{"answer": "3", "answer_confidence": "no", "answer_id": 1}, {"answer": "3", "answer_confidence": "yes", "answer_id": 2}, {"answer": "4", "answer_confidence": "yes", "answer_id": 3}, {"answer": "3", "answer_confidence": "yes", "answer_id": 4},
{"answer": "4", "answer_confidence": "yes", "answer_id": 5}, {"answer": "3", "answer_confidence": "yes", "answer_id": 6}, {"answer": "4", "answer_confidence": "no", "answer_id": 7}, {"answer": "many", "answer_confidence": "no", "answer_id": 8},  {"answer": "3", "answer_confidence": "yes", "answer_id": 9}, {"answer": "3", "answer_confidence": "yes", "answer_id": 10}],
"image_id": 715, "answer_type": "number", "question_id": 715002

}, 

{

"answer_type": "number",
 "multiple_choice_answer": "10", 
"answers": [{"answer": "bunches", "answer_confidence": "yes", "answer_id": 1}, {"answer": "16", "answer_confidence": "maybe", "answer_id": 2}, {"answer": "15", "answer_confidence": "yes", "answer_id": 3}, {"answer": "18", "answer_confidence": "maybe", "answer_id": 4}, {"answer": "19", "answer_confidence": "maybe", "answer_id": 5}, {"answer": "over 10", "answer_confidence": "yes", "answer_id": 6}, {"answer": "20", "answer_confidence": "yes", "answer_id": 7}, {"answer": "17", "answer_confidence": "maybe", "answer_id": 8}, {"answer": "10", "answer_confidence": "yes", "answer_id": 9}, {"answer": "30", "answer_confidence": "maybe", "answer_id": 10}],
 "image_id": 715, "question_type": "how many", "question_id": 715003

},

 {

"answer_type": "other",
 "multiple_choice_answer": "banana",
 "answers": [{"answer": "banana", "answer_confidence": "yes", "answer_id": 1}, {"answer": "bananas, oranges, and pineapples", "answer_confidence": "yes", "answer_id": 2}, {"answer": "banana, pineapple, orange", "answer_confidence": "yes", "answer_id": 3}, {"answer": "oranges, bananas and pineapples", "answer_confidence": "yes", "answer_id": 4}, {"answer": "fresh", "answer_confidence": "maybe", "answer_id": 5}, {"answer": "banana,orange,pineapple,", "answer_confidence": "yes", "answer_id": 6}, {"answer": "banana", "answer_confidence": "yes", "answer_id": 7}, {"answer": "banana", "answer_confidence": "maybe", "answer_id": 8},{"answer": "bananas", "answer_confidence": "yes", "answer_id": 9}, {"answer": "banana", "answer_confidence": "yes", "answer_id": 10}], 
"image_id": 715, "question_type": "what kind of", "question_id": 715004

},

 {

"answer_type": "other",
 "multiple_choice_answer": "inside",
 "answers": [{"answer": "outside", "answer_confidence": "yes", "answer_id": 1}, {"answer": "inside", "answer_confidence": "yes", "answer_id": 2}, {"answer": "inside", "answer_confidence": "yes", "answer_id": 3}, {"answer": "inside", "answer_confidence": "yes", "answer_id": 4}, {"answer": "inside", "answer_confidence": "yes", "answer_id": 5}, {"answer": "inside", "answer_confidence": "yes", "answer_id": 6}, {"answer": "inside", "answer_confidence": "yes", "answer_id": 7}, {"answer": "inside", "answer_confidence": "yes", "answer_id": 8}, {"answer": "inside", "answer_confidence": "yes", "answer_id": 9}, {"answer": "inside", "answer_confidence": "yes", "answer_id": 10}],
 "image_id": 715, "question_type": "is this", "question_id": 715005

},

 {

"question_type": "how", 
"multiple_choice_answer": "ice", 
"answers": [{"answer": "sitting in ice", "answer_confidence": "yes", "answer_id": 1}, {"answer": "19", "answer_confidence": "maybe", "answer_id": 2}, {"answer": "ice", "answer_confidence": "yes", "answer_id": 3}, {"answer": "ice", "answer_confidence": "yes", "answer_id": 4}, {"answer": "30", "answer_confidence": "maybe", "answer_id": 5}, {"answer": "ice", "answer_confidence": "maybe", "answer_id": 6}, {"answer": "ice", "answer_confidence": "yes", "answer_id": 7}, {"answer": "ice", "answer_confidence": "yes", "answer_id": 8},{"answer": "ice", "answer_confidence": "yes", "answer_id": 9}, {"answer": "ice", "answer_confidence": "yes", "answer_id": 10}],
 "image_id": 715, "answer_type": "other", "question_id": 715006

}
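
Each annotation carries ten human answers, so a quick tally shows how much the annotators agree. A minimal sketch using the answers listed above for question 715005 (the helper name `tally_answers` is illustrative and not part of any VQA tooling):

```python
from collections import Counter

# The ten answers to question 715005, copied from the annotation above
# (answer_confidence and answer_id are omitted for brevity).
answers_715005 = [{"answer": "outside"}] + [{"answer": "inside"}] * 9


def tally_answers(answers: list) -> Counter:
    """Count how many annotators gave each answer string."""
    return Counter(a["answer"] for a in answers)


counts = tally_answers(answers_715005)
print(counts.most_common())  # [('inside', 9), ('outside', 1)]
```

Here the most frequent answer matches `multiple_choice_answer` ("inside"). For question 715000 the counts are tied ("17" and "19" each appear twice), so a tally alone does not determine that field.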

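Taken together, the image, its questions, and their answers make up one complete image-question-answer group in the VQA v2.0 dataset.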
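
Assembling such a group amounts to joining the two files on `question_id` and deriving the image file name from `image_id`. A rough sketch on trimmed in-memory data; the function name `build_triples`, the output layout, and the 12-digit file-name pattern are assumptions for illustration:

```python
# Trimmed copies of entries shown above (only the fields used here).
questions = [
    {"image_id": 715, "question": "What kind of fruit is this?", "question_id": 715004},
    {"image_id": 715, "question": "Is this picture taken inside or outside?", "question_id": 715005},
]
annotations = [
    {"question_id": 715005, "image_id": 715, "multiple_choice_answer": "inside"},
    {"question_id": 715004, "image_id": 715, "multiple_choice_answer": "banana"},
]


def build_triples(questions: list, annotations: list) -> list:
    """Join questions and annotations on question_id into image/question/answer records."""
    by_qid = {a["question_id"]: a for a in annotations}
    triples = []
    for q in questions:
        ann = by_qid.get(q["question_id"])
        if ann is None:  # skip questions that have no annotation
            continue
        triples.append({
            "image": f"COCO_val2014_{q['image_id']:012d}.jpg",
            "question": q["question"],
            "answer": ann["multiple_choice_answer"],
        })
    return triples


for t in build_triples(questions, annotations):
    print(t)
```

Indexing the annotations by `question_id` first keeps the join independent of the order in which the two lists are stored.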

