[Paper Reading] Stochastic Optimization of Text Set Generation for Learning Multiple Query Intent Representations

Stochastic Optimization of Text Set Generation for Learning Multiple Query Intent Representations

Abstract

Learning multiple intent representations for queries has potential applications in facet generation, document ranking, search result diversification, and search explanation. The state-of-the-art model for this task assumes that there is a sequence of intent representations. In this paper, we argue that the model should not be penalized as long as it generates an accurate and complete set of intent representations. Based on this intuition, we propose a stochastic permutation invariant approach for optimizing such networks. We extrinsically evaluate the proposed approach on a facet generation task and demonstrate significant improvements compared to competitive baselines. Our analysis shows that the proposed permutation invariant approach has the highest impact on queries with more potential intents.

Introduction

NMIR ignores the permutation-invariant nature of query intents (in its loss function)

  • it assumes that the query intents should be generated as a sequence

We propose PINMIR, which treats the query intents as a set rather than a sequence (using a permutation-invariant loss)

  • permutation-invariant losses often consider all possible permutations of the predicted output
    • this is computationally inefficient
  • we propose a stochastic variation of our permutation-invariant loss

The Permutation Invariant NMIR

Limitations of NMIR

  • uses the cross-entropy loss function of a seq2seq model, and thus expects the predictions to follow the same order as the ground truth.
  • uses a greedy algorithm for assigning each cluster to a ground-truth query intent during training; therefore, the model’s performance depends on this heuristic cluster-intent assignment algorithm.

In PINMIR, we no longer need the intent-cluster matching algorithm, since the order of the generated intents does not matter.

  • A side benefit: in reality, documents sometimes address more than one query intent, and assigning only one intent to a document would be sub-optimal (if we still use one cluster to generate one intent description, I think there is no such benefit)

Note: we only no longer need the intent-cluster mapping, not the clustering itself! (my personal view)

  • we don’t care about the order of the generated facet descriptions, so we can use one document cluster to generate any facet description

Loss Function

First, we need to define a permutation-invariant loss function for training the model.

Common permutation-invariant loss functions include the Chamfer loss and the Hungarian loss (the latter is sketched below).

  • The Chamfer loss is based on the Chamfer distance and is not applicable to our work due to the design of the decoder for text generation
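For background, the standard Hungarian loss matches predicted and ground-truth set elements one-to-one with minimum total cost. A minimal sketch using `scipy.optimize.linear_sum_assignment`; `pairwise_cost` is a hypothetical user-supplied callable, and this is illustrative background only, not the paper's extended formulation:

```python
# Background sketch: the standard Hungarian loss finds a minimum-cost
# one-to-one matching between predicted and ground-truth set elements.
# `pairwise_cost` is a user-supplied callable (an assumption for illustration).
import numpy as np
from scipy.optimize import linear_sum_assignment


def hungarian_loss(predictions, targets, pairwise_cost):
    """predictions, targets: lists of set elements of equal size.
    pairwise_cost(p, t) -> scalar cost of matching prediction p to target t."""
    cost = np.array([[pairwise_cost(p, t) for t in targets] for p in predictions])
    row_idx, col_idx = linear_sum_assignment(cost)  # optimal assignment
    return cost[row_idx, col_idx].mean()
```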

We extend the Hungarian loss for text set generation. The proposed loss function for a query $q_i$ is:

$$\mathcal{L}(q_i) = \min_{\hat{F}_i \in \pi(F_i)} L_{CE}\big(v_i, \hat{F}_i\big)$$

  • $\pi(F_i)$ denotes all permutations of the ground-truth intents for query $q_i$
  • $v$ denotes the encoder representation
  • $L_{CE}$ is the average seq2seq loss for generating each facet description
  • the full loss is quite expensive to compute, since it iterates over all permutations

They propose a stochastic variation of this loss that, instead of iterating over all possible permutations, takes $s$ samples from the permutation set and computes the loss based on the sampled query intent sequences.
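A minimal PyTorch-style sketch of the exhaustive loss and its stochastic variant; `model.facet_ce` (the per-facet seq2seq cross-entropy) and aggregating the sampled losses with a minimum are assumptions for illustration, not the authors' released code:

```python
# A sketch of the exhaustive permutation-invariant loss and its stochastic
# variant. `model.facet_ce` (per-facet seq2seq cross-entropy) and taking the
# minimum over sampled orderings are assumptions for illustration.
import random
from itertools import permutations

import torch


def seq2seq_set_loss(model, v_i, facet_sequence):
    """L_CE: average seq2seq cross-entropy for generating the facet
    descriptions in the given order, conditioned on encoder representation v_i."""
    losses = [model.facet_ce(v_i, facet) for facet in facet_sequence]
    return torch.stack(losses).mean()


def permutation_invariant_loss(model, v_i, facets):
    """Exhaustive version: minimum L_CE over all |F_i|! orderings of the
    ground-truth facet set (intractable for queries with many facets)."""
    all_losses = [seq2seq_set_loss(model, v_i, perm) for perm in permutations(facets)]
    return torch.stack(all_losses).min()


def stochastic_permutation_invariant_loss(model, v_i, facets, s=4):
    """Stochastic variant: sample s random orderings instead of enumerating
    every permutation, then aggregate the sampled losses."""
    sampled = [random.sample(list(facets), len(facets)) for _ in range(s)]
    sampled_losses = [seq2seq_set_loss(model, v_i, perm) for perm in sampled]
    return torch.stack(sampled_losses).min()
```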

Position Resetting:

Although the order does not matter across intent descriptions, it does matter within each intent description.

We modify the standard Transformer decoder architecture.

  • The decoder generates tokens one by one, and each generated token becomes the decoder’s input for generating the next token. (is this what gets modified?)
  • We reset the decoder’s position embeddings for every new intent description (the position embeddings are the same for every intent description); see the sketch after this list.
    • Thus, the decoder representations for every permutation of a given set of intents would be identical.
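A minimal sketch of position resetting, assuming the facet descriptions are concatenated in the decoder output and separated by a `sep_id` token (both assumptions for illustration):

```python
# A sketch of position resetting in the decoder: position ids restart at 0 for
# every new facet description, so each facet receives the same positional
# embeddings regardless of where it appears in the concatenated output.
# Assumes facets are separated by a `sep_id` token (assumption for illustration).
import torch


def reset_position_ids(decoder_input_ids: torch.Tensor, sep_id: int) -> torch.Tensor:
    """decoder_input_ids: (batch, seq_len) token ids fed to the decoder.
    Returns position ids of the same shape that restart after every separator."""
    batch, seq_len = decoder_input_ids.shape
    position_ids = torch.zeros(batch, seq_len, dtype=torch.long)
    for b in range(batch):
        pos = 0
        for t in range(seq_len):
            position_ids[b, t] = pos
            # restart the position counter right after a facet boundary
            pos = 0 if decoder_input_ids[b, t].item() == sep_id else pos + 1
    return position_ids
```

With identical position embeddings per facet, the decoder-side inputs for a facet are the same under any ordering, which is what makes the generated intents behave like a set.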

Experiment

Data: MIMICS-Click

  • The top retrieved documents for each query are obtained via Bing’s public web search API.
  • Only the documents’ snippets are used to represent each document.

[Results table: facet generation quality under the variable and max facet settings, reported with exact match, term overlap, BLEU, and Set BERT-Score]

variable / max: two settings for the number of facets generated per query

  • The improvements in terms of exact match are marginal, while we observe significant improvements for term overlap F1, BLEU 4-gram, and Set BERT-Score (variable setting)
  • The improvements are statistically significant in nearly all cases, except for term overlap recall and Set BERT-Score recall (max setting)
    • The permutation-invariant model has a higher impact on queries with more intents.