Paper Reading: STAR: A Schema-Guided Dialog Dataset for Transfer Learning

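Summary: The authors release STAR, a new schema-guided dataset for task-oriented dialog. In particular, they propose a new data collection design that addresses problems of earlier datasets and targets zero-shot generalization across tasks and domains. Here the word schema means "task specification", which differs from the schema in Google's SGD dataset.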



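Task-oriented dialog: unlike open-domain dialog, task-oriented dialog aims at completing a task and consists of a sequence of dialog steps. These steps amount to prior logic that is known in advance and need not be learned from data. Indeed, in a real application it should be possible to change the logic without large amounts of new data. For a given dialog, the ideal sequence of steps that completes the task can be organized as a graph whose nodes are utterances and actions. We call this graph the task schema, or schema for short.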

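【Note: do not confuse schema as used in this paper with schema in other papers. Here it concerns the next system action and is close to a task specification, whereas in other papers schema may refer to the data ontology.】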

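In earlier models, the schema of each training task is captured implicitly in the model parameters through the dialog corpus. When the model has to generalize to a new task, these implicitly memorized schemas no longer apply, which makes transfer to new tasks difficult. In STAR, we instead provide an explicit schema representation for every task, so that the model can condition on the schema.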


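1、Realistic, variable user behavior: real dialogs rarely follow the ideal path, and user behavior is unpredictable. We capture this behavior.

2、Progression of difficulty: we collect three types of dialogs: (1) happy, where the dialog follows a path in the schema; (2) unhappy, where the user behaves erratically and deliberately complicates the task; (3) multi-task, where the dialog spans several domains and tasks. We believe that increasing the complexity of the dialogs helps models transfer to new tasks.

3、Consistency on the system side: in task-oriented dialog, system behavior should be deterministic and should not be led astray by the user. Concretely, we encourage both participants to follow the given task schema as closely as possible.

4、Explicit knowledge base queries: a dialog system needs to call external APIs (e.g. KB queries). In STAR, for the first time, the system<–>KB interaction is also annotated and recorded inside the dialog. A model therefore has to learn and explain when to query, what to query, and how to present the returned result to the user.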

【Comment: I think item 4 is the most important. Explicit knowledge base queries help prevent illogical system actions.】




3、New schema-guided dialog models that use explicit task schemas to generalize to unseen tasks.

Related Work



Data Collection Method


(1)User’s Interface


(2)Wizard’s Interface

Unlike on the user side, we expect the wizard to behave in a consistent and structured manner, and we designed the wizard's interface accordingly.

Wizards are instructed to follow the flow chart representation of the task schema (see Figure 1) as closely as possible.


The items in the red box are knowledge base items; details are omitted here.

Together, the schema flow charts, knowledge base forms, and response suggestions provide a framework for wizards to make their behavior consistent and structured, even when dealing with erratic users.

The STAR Dataset


1、We focused our data collection design choices around grounding dialogs in knowledge base queries, and providing an explicit mechanism that encourages consistent system actions【Providing explicit KB queries encourages consistent and stable system actions.】

2、size of the vocabulary used【Diversity in vocabulary and language.】

3、Evaluating the consistency of the wizard’s behavior is even more challenging, because we did not annotate user intents in the dataset【User intents are not annotated, which makes it harder to evaluate the consistency of wizard behavior.】

4、Here, skipping questions or asking questions twice is allowed, since users might provide more than one (or no) piece of information at a time. On average, in 91% of all single-task dialogs the wizards follow the correct order of actions at the beginning of the dialog【In 91% of single-task dialogs, the wizards follow the correct order of actions at the start of the dialog.】

5、An essential feature of task-oriented dialog is its history dependence, i.e. the next system action hinges on what has been said or decided in multiple previous turns, as this sets it apart from question answering settings. To assess the history dependence of STAR, we train a transformer-based response selector【A quantitative measure of how much context from previous turns influences the next system action.】
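Item 4 above allows skipping or repeating questions while still counting a dialog as following the correct order. One possible reading of that criterion, sketched below, is that the wizard never moves backwards in the prescribed order. The function name `follows_order` and the prescribed list are hypothetical; this is not the authors' evaluation code.

```python
def follows_order(actions, prescribed):
    """Return True if the actions never go backwards in the prescribed order.

    Skipping a question or asking it twice is allowed. Actions that are not
    part of the prescribed list are ignored (an assumption of this sketch).
    """
    position = {action: index for index, action in enumerate(prescribed)}
    last = -1
    for action in actions:
        if action not in position:
            continue
        if position[action] < last:
            return False
        last = position[action]
    return True


# Hypothetical prescribed order, using action labels from the example dialog.
PRESCRIBED = ["party_ask_venue", "ask_name", "party_ask_day"]

skipped = follows_order(["party_ask_venue", "party_ask_day"], PRESCRIBED)
repeated = follows_order(["ask_name", "ask_name", "party_ask_day"], PRESCRIBED)
reversed_order = follows_order(["party_ask_day", "ask_name"], PRESCRIBED)
```

Averaging this check over all single-task dialogs would give a fraction comparable in spirit to the 91% figure, under the stated assumptions.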


In earlier models the task schema is implicit (schema-free), whereas in our models it is explicit (schema-guided). In this section we propose baseline models for next action prediction and response generation, both with and without conditioning on the schema.

1 Schema Representation



dialog state = current position in the schema graph


We constructed the schema graphs by hand, based on the flow charts and a few example dialogs.

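We regard the schema representation as a contribution to the models rather than to the dataset. We expect that schemas for more kinds of tasks can be built by hand in the future, or that data-driven mechanisms for automatically extracting schema representations can be studied.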
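To make "dialog state = current position in the schema graph" concrete, here is a minimal sketch that stores a schema fragment as an adjacency dict. The node names are borrowed from action labels in the example dialog in this post; the edges are assumed purely for illustration and are not the actual STAR schema.

```python
# Hypothetical schema fragment: each node maps to its possible successor nodes.
# The edges are illustrative assumptions, not taken from the STAR schema files.
SCHEMA = {
    "hello": ["party_ask_venue"],
    "party_ask_venue": ["ask_name"],
    "ask_name": ["party_ask_day"],
    "party_ask_day": ["party_ask_confirm_booking"],
    "party_ask_confirm_booking": ["party_booking_successful", "party_booking_failed"],
    "party_booking_successful": [],
    "party_booking_failed": [],
}


def next_actions(state):
    """Return the actions reachable in one step from the current dialog state."""
    return SCHEMA.get(state, [])


def is_happy_path(actions):
    """True if every consecutive pair of actions follows an edge of the schema."""
    return all(b in SCHEMA.get(a, []) for a, b in zip(actions, actions[1:]))
```

In this picture a happy dialog is a walk along the edges, while an unhappy dialog leaves the graph at some point.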

2 Next Action Prediction

We introduce both schema-free and schema-guided BERT baselines for next action prediction on STAR.

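(1)schema-free: encode the dialog context with BERT, take the vector at the CLS position as the sentence-level feature $h_{\text{CLS}}$, and compute a classification distribution over all possible actions.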

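(2)schema-guided: we propose a schema-guided BERT classifier. For every node in the schema, a latent representation is created by encoding the text associated with that node with BERT; these representations form the matrix $K$. For every node, its subsequent node is taken as the corresponding next action and represented as a one-hot vector; these vectors form the matrix $V$. The following probability is then computed:
$$P_{\text{scm}}=\text{softmax}(h^T_{\text{CLS}}K)V$$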
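The formula is an attention over schema nodes followed by a lookup of each node's next action. The pure-Python sketch below makes that explicit on toy numbers. The vectors are made up, there is no BERT encoder, and `schema_guided_probs` is a hypothetical name.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def schema_guided_probs(h_cls, K, V):
    """Compute softmax(h_cls^T K) V.

    h_cls: list of d floats (context encoding)
    K: d x n_nodes matrix, one column per schema node
    V: n_nodes x n_actions matrix, one one-hot row per node's next action
    """
    n_nodes = len(K[0])
    n_actions = len(V[0])
    scores = [sum(h_cls[i] * K[i][j] for i in range(len(h_cls))) for j in range(n_nodes)]
    attention = softmax(scores)  # weight of each schema node
    return [sum(attention[j] * V[j][a] for j in range(n_nodes)) for a in range(n_actions)]


# Toy example: 2-dimensional encodings, 3 schema nodes, 2 possible actions.
h_cls = [1.0, 0.0]
K = [[2.0, 0.0, 1.0],
     [0.0, 2.0, 1.0]]
V = [[0, 1], [1, 0], [0, 1]]  # nodes 0 and 2 lead to action 1, node 1 to action 0
p_scm = schema_guided_probs(h_cls, K, V)
```

Because every row of $V$ is one-hot, the result is a proper distribution over actions: probability mass from nodes that share the same next action is summed.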

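【Comment: although the authors propose new algorithms for both action prediction and response generation, the core novelty is still the schema graph and how the graph is embedded as matrices for a classification task. One open question: BERT is used twice in the schema-guided BERT classifier; are both uses trained as the same BERT during backpropagation?】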

3 Generation

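System responses are generated with a fine-tuned GPT-2, the currently prevailing approach. Given the dialog history $H$, the schema-guided BERT classifier predicts the top 3 actions $a_1, a_2, a_3$, whose response templates are denoted $t_1, t_2, t_3$. The three templates are concatenated with the dialog history, and GPT-2 is trained to generate the ground-truth response.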
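The concatenation step can be pictured with a small string-building sketch. The separator, the field markers ("templates:", "history:", "response:") and the order of the parts are assumptions made here for illustration; the notes above do not specify the exact input format used with GPT-2.

```python
def build_generator_input(history, templates, sep=" | "):
    """Join the top-k response templates and the dialog history into one prompt.

    The layout of the prompt is a hypothetical choice for this sketch.
    """
    template_part = sep.join(templates)
    history_part = sep.join(history)
    return f"templates: {template_part} history: {history_part} response:"


# Utterances and templates copied from the example dialog in this post.
history = [
    "hello",
    "Hello, how can I help?",
    "yes i would like to organize a birthday party",
]
templates = [
    "At what venue would you like to have your party organised?",
    "May I have your name, please?",
    "On what day would you like your party organised?",
]
prompt = build_generator_input(history, templates)
```

During fine-tuning, the ground-truth wizard response would be appended after such a prompt as the generation target.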

【Comment: does top 3 actions mean that the system takes at most 3 actions per turn? What do the response templates look like?】



1 Next Action Prediction

Analysis: our schema-augmented BERT model performs worse than the schema-free model. Still, since the schema is intended to ease transfer learning, slightly lower performance on seen tasks or domains is understandable.

2 Response Generation

Evaluation metrics for this task: BLEU-4, IEM, Entity F-1.

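Analysis: performance is better with an explicit schema. Performance on unhappy dialogs is slightly lower, which is intuitive because users in unhappy dialogs do not follow the schema. The multi-task setting shows the largest improvement, suggesting that an explicit schema is especially effective for multi-task dialogs.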

3 Other Tasks


1、knowledge base query prediction (predicting the correct knowledge base keys, values and operators)

2、schema prediction (predicting the schema of a task, given a collection of dialogs)

3、out-of-domain detection (detecting whether a user has made an out-of-domain request).


4 Zero-Shot Transfer


1、(i) using only happy dialogs and (ii) using both happy and unhappy dialogs

2、experiment with both (i) task transfer and (ii) domain transfer. We expect the overlap between tasks to be higher and the overlap between domains to be lower.



Analysis: (1) building schemas helps few-shot performance; (2) future work needs to explore mechanisms for better leveraging the task-specific schemas to facilitate generalizability to unseen tasks and domains.




A Tasks

1、Descriptions of the 24 tasks

2、distribution of action counts for single- and multi-task dialogs

3、Multi-task scenarios connect tasks. For the user, task instructions are sometimes given during the dialog(Co-occurrence of tasks)

4、the fraction of single-task dialogs per task in which the wizards follow the prescribed order of questions at the beginning of the dialog

B Worker Payments

C Data format

Every dialog file and task schema file is stored as JSON. The core of a dialog file is a list of events. Events can be user utterances, custom wizard responses, or other information as shown in Table 9.

D Multiple-choice tests

E Example Dialogs

F Selected Comments from Turkers


UserGuide instruct
{‘Text’: “Mark says: ‘Yeah, actually, if the weather is good, we could just go out to the park and book a restaurant for the evening’. You agree. So depending on the weather, either continue searching / booking the venue, or ask your assistant to help you find and book a restaurant for Wednesday evening. [instruction 4 of 7]”}

Wizard query
{‘APIName’: ‘weather’, ‘Constraints’: [{‘Day’: ‘“Wednesday”’}]}

KnowledgeBase return_item
{‘APIName’: ‘weather’, ‘Item’: {‘APIName’: ‘weather’, ‘City’: ‘Chicago’, ‘Day’: ‘Wednesday’, ‘TemperatureCelsius’: 18, ‘Weather’: ‘Snowing’, ‘id’: 284}, ‘TotalItems’: -1}
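To summarize: on the agent side, a User event typically follows a UserGuide event. A User action can only be an utterance (utter) or complete, which marks the end of the dialog. The Wizard has many more actions: besides utter, there are request_suggestions, pick_suggestion, query, select_task, and the selection of a primary or a secondary item. A KnowledgeBase event immediately follows a Wizard query.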
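Since a KnowledgeBase event directly follows each Wizard query, the query/result pairs can be recovered with a single pass over the event list. The sketch below assumes each event is a dict with "Agent" and "Action" keys alongside the payload fields shown above. Those two key names are an assumption, because the dump only prints agent and action as a header line.

```python
def pair_queries_with_results(events):
    """Pair each Wizard query with the KnowledgeBase event that directly follows it."""
    pairs = []
    for current, following in zip(events, events[1:]):
        is_query = current["Agent"] == "Wizard" and current["Action"] == "query"
        if is_query and following["Agent"] == "KnowledgeBase":
            pairs.append((current["Constraints"], following["Item"]))
    return pairs


# Events adapted from the example dialog; the "Agent"/"Action" keys are assumed.
events = [
    {"Agent": "User", "Action": "utter", "Text": "How will the weather be Wednesday?"},
    {"Agent": "Wizard", "Action": "select_task", "Task": "weather"},
    {"Agent": "Wizard", "Action": "query", "APIName": "weather",
     "Constraints": [{"Day": '"Wednesday"'}]},
    {"Agent": "KnowledgeBase", "Action": "return_item", "APIName": "weather",
     "Item": {"City": "Chicago", "Day": "Wednesday", "TemperatureCelsius": 18,
              "Weather": "Snowing", "id": 284}},
]
pairs = pair_queries_with_results(events)
```

Each pair gives a model a supervised example of what was queried and what came back, which is the signal behind "when to query, what to query, and how to report the result".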


dial_ID = ‘6617’

0 UserGuide instruct
{‘Text’: ‘Maybe the AI Assistant can help? Say hello and ask if it can help you plan the party. [instruction 1 of 7]’}
【Note: UserGuide is the interface instructing the user what to do at the appropriate moment.】
1 User utter
{‘Text’: ‘hello’}

2 UserGuide instruct
{‘Text’: “Mark says: ‘I think the Southside Venue is quite nice’. [instruction 2 of 7]”}

3 Wizard request_suggestions
{‘Text’: ‘hi how can i help you’}
【Note: suggested responses, i.e. recommended replies.】
4 Wizard pick_suggestion
{‘ActionLabel’: ‘hello’, ‘ActionLabelOptions’: [‘hello’, ‘party_no_venue_available’, ‘party_ask_food’], ‘Text’: ‘Hello, how can I help?’}
5 User utter
{‘Text’: 'yes i would like to organize a birthday party '}

6 Wizard request_suggestions
{‘Text’: ‘What venue?’}

7 Wizard pick_suggestion
{‘ActionLabel’: ‘party_ask_venue’, ‘ActionLabelOptions’: [‘party_ask_venue’, ‘party_ask_drinks’, ‘party_booking_failed’], ‘Text’: ‘At what venue would you like to have your party organised?’}

8 User utter
{‘Text’: ‘Southside Venue’}

9 UserGuide instruct
{‘Text’: ‘It occurs to you that if the weather is nice, you might just have the party outside somewhere. Ask your assistant what the weather will be like on Wednesday. [instruction 3 of 7]’}

10 Wizard request_suggestions
{‘Text’: ‘HOst name?’}

11 Wizard pick_suggestion
{‘ActionLabel’: ‘ask_name’, ‘ActionLabelOptions’: [‘ask_name’, ‘party_ask_venue’, ‘party_ask_starting_time’], ‘Text’: ‘May I have your name, please?’}

12 User utter
{‘Text’: ‘Ben’}

13 UserGuide instruct
{‘Text’: “Mark says: ‘Yeah, actually, if the weather is good, we could just go out to the park and book a restaurant for the evening’. You agree. So depending on the weather, either continue searching / booking the venue, or ask your assistant to help you find and book a restaurant for Wednesday evening. [instruction 4 of 7]”}

14 Wizard request_suggestions
{‘Text’: ‘What day?’}

15 Wizard pick_suggestion
{‘ActionLabel’: ‘party_ask_day’, ‘ActionLabelOptions’: [‘party_ask_day’, ‘out_of_scope’, ‘party_ask_end_time’], ‘Text’: ‘On what day would you like your party organised?’}

16 User utter
{‘Text’: ‘Wednesday’}

17 UserGuide instruct
{‘Text’: ‘Note: You are located in the Center of town, which you may or may not take into account if you search for a restaurant. [instruction 5 of 7]’}

18 Wizard request_suggestions
{‘Text’: ‘What time and how many guests?’}
19 Wizard utter
{‘Text’: ‘What time and how many guests?’}

20 User utter
{‘Text’: 'How will the weather be Wednesday? '}

21 UserGuide instruct
{‘Text’: ‘When the question comes up: you both like Pizza and Burgers, but no Thai food. [instruction 6 of 7]’}

22 Wizard select_task
{‘Task’: ‘weather’}
23 Wizard query
{‘APIName’: ‘weather’, ‘Constraints’: [{‘Day’: ‘“Wednesday”’}]}
24 KnowledgeBase return_item
{‘APIName’: ‘weather’, ‘Item’: {‘APIName’: ‘weather’, ‘City’: ‘Chicago’, ‘Day’: ‘Wednesday’, ‘TemperatureCelsius’: 18, ‘Weather’: ‘Snowing’, ‘id’: 284}, ‘TotalItems’: -1}
25 Wizard request_suggestions
{‘PrimaryItem’: {‘APIName’: ‘weather’, ‘City’: ‘Chicago’, ‘Day’: ‘Wednesday’, ‘TemperatureCelsius’: 18, ‘Weather’: ‘Snowing’, ‘id’: 284}, ‘Text’: ‘chicago 18’}

26 Wizard pick_suggestion
{‘ActionLabel’: ‘weather_ask_location’, ‘ActionLabelOptions’: [‘weather_ask_location’, ‘goodbye_1’, ‘out_of_scope’], ‘PrimaryItem’: {‘APIName’: ‘weather’, ‘City’: ‘Chicago’, ‘Day’: ‘Wednesday’, ‘TemperatureCelsius’: 18, ‘Weather’: ‘Snowing’, ‘id’: 284}, ‘Text’: ‘For what location would you like the weather forecast?’}
27 User utter
{‘Text’: ‘at the Southside Venue’}

28 Wizard query
{‘APIName’: ‘weather’, ‘Constraints’: [{‘Day’: ‘“Wednesday”’}, {‘City’: ‘api.is_equal_to(“Los Angeles”)’}], ‘PrimaryItem’: {‘APIName’: ‘weather’, ‘City’: ‘Chicago’, ‘Day’: ‘Wednesday’, ‘TemperatureCelsius’: 18, ‘Weather’: ‘Snowing’, ‘id’: 284}}
29 KnowledgeBase return_item
{‘APIName’: ‘weather’, ‘Item’: {‘APIName’: ‘weather’, ‘City’: ‘Los Angeles’, ‘Day’: ‘Wednesday’, ‘TemperatureCelsius’: 28, ‘Weather’: ‘Raining’, ‘id’: 251}, ‘TotalItems’: -1}

30 Wizard request_suggestions
{‘PrimaryItem’: {‘APIName’: ‘weather’, ‘City’: ‘Los Angeles’, ‘Day’: ‘Wednesday’, ‘TemperatureCelsius’: 28, ‘Weather’: ‘Raining’, ‘id’: 251}, ‘Text’: ‘los angeles 28’}

31 Wizard pick_suggestion
{‘ActionLabel’: ‘weather_inform_forecast’, ‘ActionLabelOptions’: [‘weather_ask_location’, ‘weather_inform_forecast’, ‘out_of_scope’], ‘PrimaryItem’: {‘APIName’: ‘weather’, ‘City’: ‘Los Angeles’, ‘Day’: ‘Wednesday’, ‘TemperatureCelsius’: 28, ‘Weather’: ‘Raining’, ‘id’: 251}, ‘Text’: ‘It will be Raining all day on Wednesday in Los Angeles, with temperatures of around 28 degrees celsius.’}

32 User utter
{‘Text’: 'okay '}

33 Wizard request_suggestions
{‘PrimaryItem’: {‘APIName’: ‘weather’, ‘City’: ‘Los Angeles’, ‘Day’: ‘Wednesday’, ‘TemperatureCelsius’: 28, ‘Weather’: ‘Raining’, ‘id’: 251}, ‘Text’: ‘anything else?’}

34 Wizard pick_suggestion
{‘ActionLabel’: ‘anything_else’, ‘ActionLabelOptions’: [‘anything_else’, ‘weather_inform_forecast’, ‘goodbye_1’], ‘PrimaryItem’: {‘APIName’: ‘weather’, ‘City’: ‘Los Angeles’, ‘Day’: ‘Wednesday’, ‘TemperatureCelsius’: 28, ‘Weather’: ‘Raining’, ‘id’: 251}, ‘Text’: ‘Is there anything else that I can do for you?’}

35 User utter
{‘Text’: ‘it will be 3 guests and the time will be 2 pm’}
36 UserGuide instruct
{‘Text’: ‘When you are done booking (not just finding!) either a restaurant or a party venue, you can say goodbye and end the dialogue. This might take a while though, e.g. in case the venue is not available. [instruction 7 of 7]’}

37 Wizard select_task
{‘PrimaryItem’: {‘APIName’: ‘weather’, ‘City’: ‘Los Angeles’, ‘Day’: ‘Wednesday’, ‘TemperatureCelsius’: 28, ‘Weather’: ‘Raining’, ‘id’: 251}, ‘Task’: ‘party_plan’}
38 Wizard query
{‘APIName’: ‘party_plan’, ‘Constraints’: [{‘Name’: ‘api.is_equal_to(“Southside Venue”)’}, {‘HostName’: ‘“Ben”’}, {‘Day’: ‘“Wednesday”’}, {‘StartTimeHour’: ‘“2 pm”’}, {‘NumberGuests’: ‘3’}, {‘RequestType’: ‘“Check”’}], ‘PrimaryItem’: {‘APIName’: ‘weather’, ‘City’: ‘Los Angeles’, ‘Day’: ‘Wednesday’, ‘TemperatureCelsius’: 28, ‘Weather’: ‘Raining’, ‘id’: 251}}

39 KnowledgeBase return_item
{‘APIName’: ‘party_plan’, ‘Item’: {‘APIName’: ‘party_plan’, ‘Day’: ‘Wednesday’, ‘Message’: ‘The venue is available.’, ‘Time’: ‘2 pm’, ‘VenueName’: ‘Southside Venue’}, ‘TotalItems’: -1}

40 Wizard request_suggestions
{‘PrimaryItem’: {‘APIName’: ‘party_plan’, ‘Day’: ‘Wednesday’, ‘Message’: ‘The venue is available.’, ‘Time’: ‘2 pm’, ‘VenueName’: ‘Southside Venue’}, ‘Text’: ‘available’}

41 Wizard pick_suggestion
{‘ActionLabel’: ‘party_ask_confirm_booking’, ‘ActionLabelOptions’: [‘party_ask_confirm_booking’, ‘party_ask_day’, ‘party_booking_successful’], ‘PrimaryItem’: {‘APIName’: ‘party_plan’, ‘Day’: ‘Wednesday’, ‘Message’: ‘The venue is available.’, ‘Time’: ‘2 pm’, ‘VenueName’: ‘Southside Venue’}, ‘Text’: ‘OK, the Southside Venue would be happy to accommodate you on Wednesday @ 2 pm.\nCan I book it for you now?’}
42 User utter
{‘Text’: ‘yes’}

43 Wizard query
{‘APIName’: ‘party_plan’, ‘Constraints’: [{‘Name’: ‘api.is_equal_to(“Southside Venue”)’}, {‘HostName’: ‘“Ben”’}, {‘Day’: ‘“Wednesday”’}, {‘StartTimeHour’: ‘“2 pm”’}, {‘NumberGuests’: ‘3’}, {‘RequestType’: ‘“Book”’}], ‘PrimaryItem’: {‘APIName’: ‘party_plan’, ‘Day’: ‘Wednesday’, ‘Message’: ‘The venue is available.’, ‘Time’: ‘2 pm’, ‘VenueName’: ‘Southside Venue’}}
44 KnowledgeBase return_item
{‘APIName’: ‘party_plan’, ‘Item’: {‘APIName’: ‘party_plan’, ‘Day’: ‘Wednesday’, ‘Message’: ‘Your event has been successfully scheduled.’, ‘Time’: ‘2 pm’, ‘VenueName’: ‘Southside Venue’}, ‘TotalItems’: -1}

45 Wizard request_suggestions
{‘PrimaryItem’: {‘APIName’: ‘party_plan’, ‘Day’: ‘Wednesday’, ‘Message’: ‘Your event has been successfully scheduled.’, ‘Time’: ‘2 pm’, ‘VenueName’: ‘Southside Venue’}, ‘Text’: ‘successful’}

46 Wizard pick_suggestion
{‘ActionLabel’: ‘party_booking_successful’, ‘ActionLabelOptions’: [‘party_booking_successful’, ‘party_ask_confirm_booking’, ‘hello’], ‘PrimaryItem’: {‘APIName’: ‘party_plan’, ‘Day’: ‘Wednesday’, ‘Message’: ‘Your event has been successfully scheduled.’, ‘Time’: ‘2 pm’, ‘VenueName’: ‘Southside Venue’}, ‘Text’: ‘Great, your party has been successfully booked at the Southside Venue on Wednesday at 2 pm!’}

47 User utter
{‘Text’: ‘thank you, have a good day’}

48 Wizard request_suggestions
{‘PrimaryItem’: {‘APIName’: ‘party_plan’, ‘Day’: ‘Wednesday’, ‘Message’: ‘Your event has been successfully scheduled.’, ‘Time’: ‘2 pm’, ‘VenueName’: ‘Southside Venue’}, ‘Text’: ‘thanks good bye’}

49 Wizard pick_suggestion
{‘ActionLabel’: ‘goodbye_1’, ‘ActionLabelOptions’: [‘goodbye_1’, ‘out_of_scope’, ‘party_booking_failed’], ‘PrimaryItem’: {‘APIName’: ‘party_plan’, ‘Day’: ‘Wednesday’, ‘Message’: ‘Your event has been successfully scheduled.’, ‘Time’: ‘2 pm’, ‘VenueName’: ‘Southside Venue’}, ‘Text’: ‘Thank you and goodbye.’}

50 User complete

Summary: with the annotation scheme described above, the authors make it possible for a model to learn when to query the knowledge base, what the query should be, and how to explain the returned knowledge base item to the user.





