The Three Most Common Datasets for Clarifying Questions

Qulac

[aliannejadi/qulac](https://github.com/aliannejadi/qulac): Qulac: A dataset on asking Questions for Lack of Clarity in open-domain information-seeking conversations.

qulac.json:

qulac.json contains the topics, facets, questions, and answers. This is the main file of Qulac. However, it may not be very straightforward to use this file for experiments directly. That is why we have provided some auxiliary data files which we describe in this document. In the qulac.json file, you will find these fields:

  • topic_id: the ID of the topic in TREC Web Track.
  • facet_id: the ID of the facet in TREC Web Track.
  • topic_facet_id: an ID corresponding to a topic and facet pair in the following format: %d-%d. For example, 21-1 corresponds to the first facet (facet_id=1) of the 21st topic in TREC Web Track data.
  • topic_facet_question_id: an ID corresponding to a topic, facet, and question triplet in the following format: %d-%d-%d. For example, 21-1-5 corresponds to the fifth question of the first facet of the 21st topic. Each row of the data is identified by this ID.
  • topic: the TREC topic (query).
  • topic_type: an str value indicating the type of a topic. Possible values are faceted and ambiguous.
  • facet_type: an str value indicating the type of a facet. Possible values are inf (i.e., informational) and nav (i.e., navigational).
  • topic_desc: a full description of the topic as it appears in the TREC Web Track data.
  • facet_desc: a full description of the facet (information need) as it appears in the TREC Web Track data.
  • question: a clarifying question that the system can pose to the user for the current topic and facet.
  • answer: an answer to the clarifying question, assuming that the user is in the context of the current row (i.e., the user’s initial query is topic, their information need is facet, and question has been posed to the user).
| topic_id | facet_id | topic_facet_id | topic_facet_question_id | topic | topic_type | facet_type | topic_desc | facet_desc | question | answer |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 193 | 2 | 193-2 | 193-2-5 | dog clean up bags | faceted | inf | Can I order dog clean-up bags online? | Are there biodegradable products for the dispo… | are you looking for a way to dispose your dog … | im looking for dog waste bags that are biodegr… |
| 144 | 2 | 144-2 | 144-2-5 | trombone for sale | ambiguous | inf | information on where I could buy a new or used… | good places to sell a used trombone | are you looking for a place to sell a used tro… | yes |
| 78 | 3 | 78-3 | 78-3-7 | dieting | ambiguous | inf | Find “reasonable” dieting advice, that is no… | Find crash diet plans that promise quick weigh… | do you want to know if dieting is safe | i would like to know more on quick and safe di… |
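
A minimal sketch of loading qulac.json into a DataFrame. It assumes the file is in a layout that `pandas.read_json` understands (e.g., a column-oriented dict); if not, fall back to `json.load` and build the frame manually:

```python
import pandas as pd

# Load qulac.json; assumes a pandas-readable JSON layout.
qulac = pd.read_json("qulac.json")

# Each row is one (topic, facet, question, answer) record,
# keyed by topic_facet_question_id.
print(qulac[["topic_facet_question_id", "topic", "question", "answer"]].head())

# Example: all clarifying questions collected for one topic-facet pair.
print(qulac[qulac["topic_facet_id"] == "193-2"]["question"].tolist())
```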

qulac_hist012_dict.tar.gz:

qulac_hist012_dict.tar.gz can be used for experiments involving multi-turn conversations. As mentioned in [1], the conversations are artificially generated from the data available in qulac.json. After decompression, the dict has the following structure:

{ <record_id>: 
	{ 
	  'history_id': <the ID of conversation history (context)>,
	  'history_list': [
				{ 'question': <question1 string>,
				  'answer': <answer1 string> },
				{ 'question': <question2 string>,
				  'answer': <answer2 string> },
			    ],
	 'query': <query (topic) string>,
	 'question': <current question string>,
	 'answer': <current answer string>
  }
  ....
}
  • Record ID:

    topic_id - facet_id - past_question_id_1 - past_question_id_2 - current_question_id - answer_flag
    
    • answer_flag indicates whether the record refers to results obtained with (=1) or without (=0) the final answer. Records with shorter histories simply have fewer past-question components (e.g., 25-1-3-8-1 below has a single past question).
 '18-2-1-2-10-1': {	 
	'history_id': '18-2-1-2',
	'history_list': [{'answer': 'no i just want to find spreadsheets and templates',
			'question': 'are you interested in a service for wedding budgeting'},
			{'answer': 'yes i want to find some spreadsheets to help me budget',
			'question': 'are you looking for advice on wedding budgeting'}],
	'query': 'wedding budget calculator',
	'question': 'what is your projected budget for your wedding',
	'answer': 'i need to find a spreadsheet to figure it out'},

'25-1-3-8-1' : {	 
	'history_id': '25-1-3',
	'history_list': [{'answer': 'no i am looking for information on the greek mathematician euclid',
			'question': 'do you need directions to euclid ave'}],
	'query': 'euclid',
	'question': 'do you want to know related people',
	'answer': 'no i only want to know about one particular person'}
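
A short sketch of working with the decompressed dict. The pickle file name is an assumption (use whatever the archive actually unpacks to); the record-ID parsing follows the format above, with a variable number of past-question components:

```python
import pickle

# Assumed file name after decompressing qulac_hist012_dict.tar.gz.
with open("qulac_hist012_dict.pkl", "rb") as f:
    hist_dict = pickle.load(f)

record_id = "18-2-1-2-10-1"
record = hist_dict[record_id]

# topic - facet - zero or more past question IDs - current question ID - answer flag.
topic_id, facet_id, *past_qids, current_qid, answer_flag = record_id.split("-")

# Replay the conversation: context turns first, then the current turn.
for turn in record["history_list"]:
    print("Q:", turn["question"])
    print("A:", turn["answer"])
print("Q:", record["question"])
if answer_flag == "1":  # 1 = record includes the final answer
    print("A:", record["answer"])
```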

MIMICS

[microsoft/MIMICS](https://github.com/microsoft/MIMICS): MIMICS: A Large-Scale Data Collection for Search Clarification.

Each clarification in MIMICS consists of a clarifying question and up to five candidate answers:

| field | value |
| --- | --- |
| query | headaches |
| question | What do you want to know about this medical condition? |
| candidate answers (options) | symptom, treatment, causes, diagnosis, diet |

MIMICS contains three datasets (a loading sketch follows the list):

  • MIMICS-Click includes over 400k unique queries, their associated clarification panes, and the corresponding aggregated user interaction signals (i.e., clicks).

    ['#HASH#value excel', 'What version of Excel are you looking for?', '2010', '2013', '2016', '', '', 'medium', '0', '0.0', '0.0', '0.0', '0.0', '0.0']

    ['%2f', 'What language are you looking for?', 'javascript', 'python', '', '', '', 'medium', '0', '0.0', '0.0', '0.0', '0.0', '0.0']

    ['.net', 'Select one to refine your search', 'powershell .net', 'iis .net', 'windows .net', 'sql .net', 'exchange .net', 'high', '0', '0.0', '0.0', '0.0', '0.0', '0.0']

    ['.net 3.5 framework', 'Select one to refine your search', 'windows', 'powershell', 'xml', 'azure', 'json', 'high', '3', '0.8571428571428572', '0.0', '0.0', '0.14285714285714285', '0.0']

  • MIMICS-ClickExplore is an exploration dataset that includes aggregated user interaction signals for over 60k unique queries, each with multiple clarification panes.

    | Column(s) | Description |
    | --- | --- |
    | query | (string) The query text. |
    | question | (string) The clarifying question. |
    | option_1, …, option_5 | (string) Up to five candidate answers. |
    | impression_level | (string) A three-level impression label (i.e., low, medium, or high). |
    | engagement_level | (integer) A label in [0, 10] representing total user engagements. |
    | option_cctr_1, …, option_cctr_5 | (real) The conditional click probability on each candidate answer. |

    ['0 degrees', 'Select one to refine your search', 'celsius', 'kelvin', 'fahrenheit', '', '', 'medium', '0', '0.0', '0.0', '0.0', '0.0', '0.0']
    ['0 degrees', 'Select one to refine your search', 'fahrenheit', 'celsius', 'kelvin', '', '', 'medium', '4', '1.0', '0.0', '0.0', '0.0', '0.0']
    ['0 degrees', 'Select one to refine your search', 'boots for 0 degrees', 'gloves for 0 degrees', '', '', '', 'medium', '0', '0.0', '0.0', '0.0', '0.0', '0.0']

  • MIMICS-Manual includes over 2k unique real search queries. Each query-clarification pair in this dataset has been manually labeled by at least three trained annotators. It contains graded quality labels for the clarifying question, the candidate answer set, and the landing result page for each candidate answer.

    | Column(s) | Description |
    | --- | --- |
    | query | (string) The query text. |
    | question | (string) The clarifying question. |
    | option_1, …, option_5 | (string) Up to five candidate answers. |
    | question_label | (integer) A three-level quality label for the clarifying question. |
    | options_overall_label | (integer) A three-level quality label for the candidate answer set. |
    | option_label_1, …, option_label_5 | (integer) A three-level quality label for the landing result page of each candidate answer. |

['multiple system atrophy', 'What do you want to know about this medical condition?', 'symptom', 'treatment', 'causes', 'diagnosis', 'diet', '2', '2', '2', '2', '2', '2', '2']

['team fortress 2', 'What would you like to know about this game?', 'team fortress 2 steam', 'team fortress 2 mods', 'team fortress 2 gameplay', 'team fortress 2 cheats', '', '1', '2', '2', '2', '2', '2', '']

['google chrome exe', 'Select one to refine your search', '64 bit', '32 bit', '', '', '', '', '2', '2', '2', '', '', '']
['google chrome exe', 'Select one to refine your search', '32 bit', '64 bit', '', '', '', '', '2', '2', '2', '', '', '']
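
All three files are plain TSVs with the columns documented above, so they load identically. A minimal sketch with pandas; the file name/path is an assumption, adjust to your checkout:

```python
import pandas as pd

# MIMICS-Click: query, question, option_1..5, impression_level,
# engagement_level, option_cctr_1..5 (ClickExplore shares this layout).
click = pd.read_csv("MIMICS-Click.tsv", sep="\t")

# Panes that attracted clicks, with their conditional click distribution.
cctr_cols = [f"option_cctr_{i}" for i in range(1, 6)]
engaged = click[click["engagement_level"] > 0]
print(engaged[["query", "question", "engagement_level"] + cctr_cols].head())
```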

ClariQ

ConvAI3 Data Challenge

ClariQ is a part of this challenge.

The challenge ran in two stages:

  • stage 1: participants were provided with a static dataset consisting mainly of initial user requests, clarifying questions, and user answers
  • stage 2: human-in-the-loop evaluation

Stage1: initial dataset

The dataset consists of:

  • User Request: an initial user request in conversational form, labeled from 1 to 4 to reflect how much clarification is needed
    • 1: no clarification needed
    • 4: clarification is a must
  • Clarifying questions: a set of possible clarifying questions
  • User Answers: each question is supplied with a user answer

Stage2: human-in-the-loop

Stage 2 enabled the top-performing teams of the first stage to evaluate their models with the help of human evaluators. A system's performance is evaluated in two aspects:

  • how much the conversation helps the user find the information they are looking for
  • how natural and realistic the conversation appears to a human evaluator

ClariQ Dataset

[aliannejadi/ClariQ](https://github.com/aliannejadi/ClariQ): ClariQ: SCAI Workshop data challenge on conversational search clarification.

| Feature | Value |
| --- | --- |
| # train (dev) topics | 187 (50) |
| # faceted topics | 141 |
| # ambiguous topics | 57 |
| # single topics | 39 |
| # facets | 891 |
| # total questions | 3,929 |
| # single-turn conversations | 11,489 |
| # multi-turn conversations | ~1 million |
| # documents | ~2 million |

File Format

train.tsv and dev.tsv

They share the same format and contain topics, facets, questions, answers, and clarification-need labels.

  • topic_id: the ID of the topic (initial_request).
  • initial_request: the query (text) that initiates the conversation.
  • topic_desc: a full description of the topic as it appears in the TREC Web Track data.
  • clarification_need: a label from 1 to 4, indicating how much it is needed to clarify a topic.
  • facet_id: the ID of the facet.
  • facet_desc: a full description of the facet (information need) as it appears in the TREC Web Track data.
  • question_id: the ID of the question as it appears in question_bank.tsv.
  • question: a clarifying question that the system can pose to the user for the current topic and facet.
  • answer: an answer to the clarifying question, assuming that the user is in the context of the current row (i.e., the user’s initial query is initial_request, their information need is facet_desc, and question has been posed to the user).
| topic_id | initial_request | topic_desc | clarification_need | facet_id | facet_desc | question_id | question | answer |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 14 | I’m interested in dinosaurs | I want to find information about and pictures of dinosaurs. | 4 | F0159 | Go to the Discovery Channel’s dinosaur site, which has pictures of dinosaurs and games. | Q00173 | are you interested in coloring books | no i just want to find the discovery channels website |
| 14 | I’m interested in dinosaurs | I want to find information about and pictures of dinosaurs. | 4 | F0159 | Go to the Discovery Channel’s dinosaur site, which has pictures of dinosaurs and games. | Q03021 | which dinosaurs are you interested in | im not asking for that i just want to go to the discovery channel dinosaur page |
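
A minimal loading sketch for the single-turn files, assuming standard TSV parsing is sufficient:

```python
import pandas as pd

# train.tsv / dev.tsv share the columns listed above.
train = pd.read_csv("train.tsv", sep="\t")

# Topics that definitely need clarification (label 4), with sample Q/A pairs.
must_clarify = train[train["clarification_need"] == 4]
print(must_clarify[["initial_request", "question", "answer"]].head())
```
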
test.tsv

Contains only the list of test topics and their IDs.

| topic_id | initial_request |
| --- | --- |
| 201 | I would like to know more about raspberry pi |
| 202 | Give me information on uss carl vinson. |
question_bank.tsv

Contains all the questions in the collection. The TSV file has two columns: question_id and question (text).

| question_id | question |
| --- | --- |
| Q00001 | |
| Q02318 | what kind of medium do you want this information to be in |
| Q02319 | what kind of penguin are you looking for |
| Q02320 | what kind of pictures are you looking for |

Note: selecting Q00001 means selecting no question (its question text is empty).

dev_synthetic.pkl.tar.gz & train_synthetic.pkl.tar.gz

These files contain dicts of synthetically built multi-turn conversations (up to three turns).

{<record_id>: {'topic_id': <int>,
  'facet_id': <str>,
  'initial_request': <str>,
  'question': <str>,
  'answer': <str>,
  'conversation_context': [{'question': <str>,
   'answer': <str>},
  {'question': <str>,
   'answer': <str>}],
  'context_id': <int>},
  ...
  }

where

  • <record_id> is an int indicating the ID of the current conversation record.
    • While the dev set has multiple <record_id> values per <context_id>, the test file has only one.
  • 'topic_id', 'facet_id', and 'initial_request' indicate the topic, facet, and initial request of the current conversation, according to the single turn dataset.
  • 'question': current clarifying question that is being posed to the user.
  • 'answer': user’s answer to the clarifying question.
  • 'conversation_context' identifies the context of the current conversation. A context consists of previous turns in a conversation. As we see, it is a list of 'question' and 'answer' items. This list tells us which questions have been asked in the conversation so far, and what has been the answer to them.
  • 'context_id' is the ID of the conversation context. Participants should predict the next utterance for each context_id.
  2288: {'topic_id': 8,
  'facet_id': 'F0969',
  'initial_request': 'I want to know about appraisals.',
  'question': 'are you looking for a type of appraiser',
  'answer': 'yes jewelry',
  'conversation_context': [],
  'context_id': 969},
  
 1570812: {'topic_id': 293,
 'facet_id': 'F0729',
 'initial_request': 'Tell me about the educational advantages of social networking sites.',
 'question': 'which social networking sites would you like information on',
 'answer': 'i don have a specific one in mind just overall educational benefits to social media sites',
 'conversation_context': [{'question': 'what level of schooling are you interested in gaining the advantages to social networking sites',
   'answer': 'all levels'},
  {'question': 'what type of educational advantages are you seeking from social networking',
   'answer': 'i just want to know if there are any'}],
 'context_id': 976573}
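
A sketch of loading the extracted pickle; the inner file name is assumed to match the archive name minus .tar.gz:

```python
import pickle

with open("train_synthetic.pkl", "rb") as f:
    synthetic = pickle.load(f)

# Record IDs are ints; 1570812 is the second example above.
rec = synthetic[1570812]

# Input: initial request plus conversation context; the current question is
# what a system would be expected to produce for this context_id.
print("request:", rec["initial_request"])
for turn in rec["conversation_context"]:
    print("Q:", turn["question"], "| A:", turn["answer"])
print("next Q:", rec["question"], "| A:", rec["answer"])
```
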
single_turn_train_eval.pkl & multi_turn_***_eval.pkl.tar.gz

These files are dicts of pre-computed document relevance results after asking each question.

{ <evaluation_metric>:
    { <context_id>:
        { <question_id>:
            { 'no_answer': <float>,
              'with_answer': <float> },
          ...,
          'MAX':
            { 'no_answer': <float>,
              'with_answer': <float> },
          'MIN':
            { 'no_answer': <float>,
              'with_answer': <float> }
        },
      ...
    }
}
  • MAX and MIN: These refer to the maximum and minimum performance that the retrieval model achieves by asking the “best” and “worst” questions among the candidate questions.
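
A sketch of reading one pre-computed score out of this nested dict. The metric name, context ID, and question ID below are hypothetical placeholders; inspect the dict's keys for the real values:

```python
import pickle

with open("single_turn_train_eval.pkl", "rb") as f:
    eval_dict = pickle.load(f)

# "NDCG20" is an assumed metric name; check eval_dict.keys() for actual ones.
metric = eval_dict["NDCG20"]
context_id, question_id = 969, "Q00173"  # hypothetical lookup

scores = metric[context_id][question_id]
print("with answer:", scores["with_answer"], "| no answer:", scores["no_answer"])

# Headroom between asking the best and worst candidate question in this context.
spread = metric[context_id]["MAX"]["with_answer"] - metric[context_id]["MIN"]["with_answer"]
print("question-selection headroom:", spread)
```
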
top10k_docs_dict.pkl.tar.gz

A dict mapping each topic_id to a list of document IDs; it is useful for obtaining the top 10,000 documents as an initial ranking.

train.qrel & dev.qrel

These files contain the relevance assessments of the ClueWeb09 and ClueWeb12 collections for every facet in the train and dev sets, respectively:

<facet_id> 0 <document_id> <relevance_score>
F0001 0 clueweb09-en0038-74-08250 1
F0001 0 clueweb09-enwp01-17-11113 1
F0002 0 clueweb09-en0001-02-21241 1
F0002 0 clueweb09-en0006-52-11056 1
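
Since this is the standard four-column qrel layout, a small parsing sketch (file name as above):

```python
from collections import defaultdict

# Parse train.qrel into facet_id -> {document_id: relevance}.
qrels = defaultdict(dict)
with open("train.qrel") as f:
    for line in f:
        facet_id, _, doc_id, rel = line.split()
        qrels[facet_id][doc_id] = int(rel)

print(len(qrels["F0001"]), "judged documents for facet F0001")
```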