[WWW2020 Best Paper] Open Intent Extraction from Natural Language Interactions
Accurately discovering user intents from their written or spoken language plays a critical role in natural language understanding and automated dialog response. Most existing research models this as a classification task with a single intent label per utterance, grouping user utterances into a single intent type from a set of categories known beforehand. Going beyond this formulation, we define and investigate a new problem of open intent discovery. It involves discovering one or more generic intent types from text utterances that may not have been encountered during training. We propose a novel domain-agnostic approach, OPINE, which formulates the problem as a sequence tagging task under an open-world setting. It employs a CRF on top of a bidirectional LSTM to extract intents in a consistent format, subject to constraints among intent tag labels. We apply a multi-head self-attention mechanism to effectively learn dependencies between distant words. We further use adversarial training to improve performance and robustly adapt our model across varying domains. Finally, we curate and plan to release an open intent annotated dataset of 25K real-life utterances spanning diverse domains. Extensive experiments show that our approach outperforms state-of-the-art baselines by 5-15% F1 score points. We also demonstrate the efficacy of OPINE in recognizing multiple, diverse domain intents with limited (or even zero) training examples per unique domain.
Building on the existing task of intent extraction, the paper defines a new problem: open intent extraction.
The task is formulated as a sequence tagging problem under an open-world setting.
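The sequence-tagging formulation above can be sketched with a small decoding step: given per-token tags, recover the intent spans. This is a minimal illustration only; the tag names (`B-INTENT`/`I-INTENT`/`O`) are an assumed BIO-style scheme, not necessarily the exact tag set OPINE uses, and the tagger itself (BiLSTM-CRF with self-attention) is omitted.

```python
def extract_intents(tokens, tags):
    """Collect contiguous token spans marked as intents by BIO-style tags.

    Assumed tag scheme (illustrative, not OPINE's exact labels):
      B-INTENT = first token of an intent span
      I-INTENT = continuation of the current span
      O        = outside any intent
    """
    intents, span = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "B-INTENT":
            if span:                      # close a previous span
                intents.append(" ".join(span))
            span = [tok]
        elif tag == "I-INTENT" and span:  # extend the open span
            span.append(tok)
        else:                             # O tag closes any open span
            if span:
                intents.append(" ".join(span))
            span = []
    if span:
        intents.append(" ".join(span))
    return intents

# An utterance with two open intents, showing why single-label
# classification is insufficient:
tokens = ["I", "need", "to", "book", "a", "flight", "and", "rent", "a", "car"]
tags = ["O", "O", "O", "B-INTENT", "I-INTENT", "I-INTENT",
        "O", "B-INTENT", "I-INTENT", "I-INTENT"]
print(extract_intents(tokens, tags))  # → ['book a flight', 'rent a car']
```

Because the tags describe span boundaries rather than a fixed label inventory, the same decoder works for intents never seen during training, which is the point of the open-world setting.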