[Paper Translation & Notes] Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users

上杉绘梨衣LC

Tags: artificial intelligence, recommender systems

Li S, Lei W, Wu Q, et al. Seamlessly unifying attributes and items: Conversational recommendation for cold-start users[J]. arXiv preprint arXiv:2005.12979, 2020.

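Abstract: Static recommendation methods such as collaborative filtering have an inherent limitation when it comes to real-time personalization for cold-start users. Online recommendation, e.g. the multi-armed bandit approach, addresses this limitation by interactively exploring the user's preferences online and pursuing the exploration-exploitation (EE) trade-off. However, existing bandit-based methods model recommendation actions homogeneously. Specifically, they only treat items as arms and cannot handle item attributes, even though attributes naturally provide interpretable information about the user's current needs and can effectively filter out undesired items.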

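In this work, we consider conversational recommendation for cold-start users, where the system can interactively both ask the user about attributes and recommend items. This important scenario was studied in a recent work [45]. However, that work uses a hand-crafted function to decide when to ask about attributes and when to make recommendations. Such separate modeling of attributes and items makes the effectiveness of the system depend heavily on the choice of the hand-crafted function, which introduces fragility. To address this limitation, we seamlessly unify attributes and items in the same arm space and achieve their EE trade-off automatically within the Thompson Sampling framework. Our Conversational Thompson Sampling (ConTS) model holistically solves all the questions in conversational recommendation by choosing the arm with the maximal reward to play. Extensive experiments on three benchmark datasets show that ConTS outperforms the state-of-the-art methods Conversational UCB (ConUCB) and Estimation-Action-Reflection (EAR) in both success rate and average number of conversation turns.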

Keywords: Conversational Recommendation; Interactive Recommendation; Recommender System; Dialogue System

1. Introduction

Traditional recommendation methods suffer from the cold-start problem (the cold-start scenario).

Relying on such historical data makes these methods suffer in the cold-start scenario, where new users arrive with little or no interaction history.

conversational recommendation system (CRS)

Scenario: customer service chatbots. Many scenarios that interact with users to facilitate information seeking can be abstracted as a CRS problem.

Examples: 淘宝小蜜 (Taobao's customer-service chatbot); the Kuaishou entry page, where users choose the tags they like.

attribute asking & item recommending

highlight two desired properties for such a CRS:

1) deciding the strategy of asking and recommending in an intelligent way

2) keeping the balance between exploiting known preferences and exploring new interests to achieve successful recommendations.

Conversational Thompson Sampling (ConTS) method

modeling attribute asking and item recommending in the unified framework of Thompson Sampling.

key idea: model attributes and items as undifferentiated arms in the same arm space, where the reward estimation for each arm is jointly decided by the user's representation, candidate items, and candidate attributes to capture the mutual promotion of attributes and items.

two main advantages of ConTS:

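(1) It addresses the conversation policy questions in CRS raised in [21] (what items to recommend, what attributes to ask, and whether to ask or recommend in a given turn) as a single arm-choosing problem, which is optimized to maximize the reward;

(2) It inherits the sampling and updating mechanism of contextual Thompson Sampling [1], so it achieves the EE balance naturally.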

main contributions of this paper:

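We study the new task of conversational recommendation for cold-start users with attribute preferences. By modeling attributes and items as undifferentiated arms in the same space, we propose a holistic solution named ConTS that solves the three conversation policy questions in CRS in an end-to-end manner. We also apply contextual Thompson Sampling to conversational recommendation to keep the EE balance in the cold-start scenario.

We conduct experiments on the existing Yelp and LastFM datasets, and contribute a new evaluation dataset for cold-start conversational recommendation built from Kuaishou video-click records. Extensive experiments show that ConTS outperforms the state-of-the-art CRS methods Conversational UCB (ConUCB) [45] and Estimation-Action-Reflection (EAR) [21] on both metrics, success rate and average turn, for cold-start users. Further analysis shows the importance of exploration and the effectiveness of ConTS's action policy. All code and data will be released to facilitate research on CRS.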

2. Related Work

3. Preliminary

multi-round conversational recommendation (MCR) scenario

3.1 Problem Setting

MCR objective: The objective is to achieve successful recommendations with the fewest conversation turns for each cold-start user u who has no past interaction history.

V denotes all the items and P denotes all item attributes

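When asked about an attribute, the user either likes it or dislikes it.

When recommended items, the user either accepts or rejects the recommendation.

The conversation ends successfully only when the user accepts a recommendation. (Open question: how is "accepting a recommendation" defined? Is it a click or a purchase? Must the user make a purchase within a turn?)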

there are three questions to consider:

(1) what attributes to ask

(2) what items to recommend

(3) whether to ask or recommend in a turn.

3.2 Advantages of Thompson Sampling over UCB methods

Differently from bandit algorithms that use the upper confidence bound (UCB) [4, 22] to model the uncertainty of the learner’s estimation on a user’s preference, Thompson Sampling balances EE by sampling from a posterior distribution of the reward estimation, which is shown to be a more effective and robust method in many situations according to some previous research works [5, 34]

Open question: how should "embedding" be understood here?

On the one hand, this posterior update incorporates past experience into the reward estimation;

on the other hand, it reflects the model’s uncertainty on arms with different context.
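
To make the contrast concrete, here is a minimal sketch of how a linear contextual bandit could score arms with UCB versus Thompson Sampling, starting from the same Gaussian posterior. It uses the commonly seen linear-Gaussian form; the names `B`, `f`, `alpha`, and `v` and the plain inner-product reward are assumptions for illustration, not the reward function or notation used by ConTS.

```python
import numpy as np

def posterior(B, f):
    """Gaussian posterior mean and covariance of the preference vector."""
    cov = np.linalg.inv(B)
    return cov @ f, cov

def ucb_scores(arms, B, f, alpha=1.0):
    """UCB: deterministic score = mean estimate + uncertainty bonus."""
    mu, cov = posterior(B, f)
    # x^T cov x for every arm x (one row of `arms` per arm)
    bonus = np.sqrt(np.einsum("ij,jk,ik->i", arms, cov, arms))
    return arms @ mu + alpha * bonus

def ts_scores(arms, B, f, rng, v=1.0):
    """Thompson Sampling: score arms with one draw from the posterior."""
    mu, cov = posterior(B, f)
    theta = rng.multivariate_normal(mu, v ** 2 * cov)
    return arms @ theta

def update(B, f, x, reward):
    """Rank-one posterior update after observing `reward` for arm `x`."""
    return B + np.outer(x, x), f + reward * x
```

In this sketch the only difference between the two strategies is where the exploration comes from: UCB adds an explicit bonus to the mean estimate, while Thompson Sampling gets its exploration from the randomness of the posterior draw. Both share the same `update` step, which is how past feedback enters the next estimate.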

4. Method

The method has three steps: 1) Initialization and Sampling; 2) Arm Choosing; 3) Updating.

(1) Initialization and Sampling: first turn

(2) Arm choosing:

Attributes and items live in the same arm space.

we make a key contribution to unify the attributes p ⊆ P and items v ⊆ V as the undifferentiated arms in the same space (the whole arm pool is A = P ∪ V).

Key difference between ConTS and ConUCB [45]: in ConUCB, the items and attributes are modelled separately in two different arm spaces.

(3) Updating:
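
The unified arm space can be sketched as below. This is a simplified illustration rather than the paper's implementation: arms are scored with a plain inner product between a sampled user vector and each arm embedding, whereas the reward estimation in ConTS also depends on the candidate items and attributes. The function name, the `"attribute"`/`"item"` labels, and the toy embeddings are all hypothetical.

```python
import numpy as np

def choose_arm(user_sample, attr_emb, item_emb):
    """Pick the highest-scoring arm from the pooled arm set A = P ∪ V.

    attr_emb / item_emb map arm ids to embedding vectors. The returned
    kind ("attribute" or "item") tells the system whether to ask or to
    recommend, so no separate hand-crafted switch is needed.
    """
    pool = [("attribute", a, e) for a, e in attr_emb.items()]
    pool += [("item", v, e) for v, e in item_emb.items()]
    kind, arm_id, _ = max(pool, key=lambda arm: float(user_sample @ arm[2]))
    return kind, arm_id

# Toy embeddings (assumed values, for illustration only).
attr_emb = {"jazz": np.array([1.0, 0.0]), "rock": np.array([0.0, 1.0])}
item_emb = {"artist_1": np.array([0.6, 0.1]), "artist_2": np.array([0.2, 0.5])}
action = choose_arm(np.array([0.9, 0.1]), attr_emb, item_emb)
```

Because both kinds of arm compete in one `max`, the decision "ask or recommend" falls out of the arm choice itself, which is the point of the unified design.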

It firstly updates the current arm pool according to the feedback (e.g., removing items not containing the user’s preferred attributes, see Section 4.4). ConTS then updates the distribution of user embedding based on our unified reward estimation function and the user’s feedback (step (8)).
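
A minimal sketch of the arm-pool part of this update step follows. Only the rule "remove items that do not contain the user's preferred attribute" comes from the text above; dropping an attribute once it has been asked and dropping a rejected item are assumptions added to make the example complete, and all names are hypothetical.

```python
def update_pool(candidate_items, candidate_attrs, item_attrs, arm, kind, accepted):
    """Return the updated (items, attributes) pool after one turn of feedback.

    item_attrs maps each item id to the set of attributes it contains.
    kind is "attribute" if the system asked about `arm`, "item" if it
    recommended `arm`; accepted is the user's yes/no feedback.
    """
    items, attrs = set(candidate_items), set(candidate_attrs)
    if kind == "attribute":
        attrs.discard(arm)  # assumption: never ask the same attribute twice
        if accepted:
            # keep only items that contain the preferred attribute
            items = {v for v in items if arm in item_attrs[v]}
    elif not accepted:
        items.discard(arm)  # assumption: drop the rejected item
    return items, attrs
```

For example, with items `v1` (jazz), `v2` (rock), and `v3` (jazz and rock), a user who confirms "jazz" leaves `v1` and `v3` as candidates and "rock" as the only attribute still worth asking about.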

workflow

4.2 Initialization and Sampling

Initialization with offline FM.

get arm embedding.

Advantages of FMs (factorization machines)

① FMs allow parameter estimation under very sparse data, where SVMs fail.

② FMs have linear complexity, can be optimized in the primal, and do not rely on support vectors like SVMs.

③ FMs are a general predictor that can work with any real-valued feature vector. In contrast, other state-of-the-art factorization models work only on very restricted input data.
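
The linear-complexity claim can be checked with a short sketch of the standard second-order FM prediction. The function names and the parameter layout (`w0` bias, `w` linear weights, `V` an n-by-k factor matrix) are assumptions for illustration; the post does not specify how the offline FM in ConTS is implemented.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order FM prediction in O(k * n) time.

    Uses the identity
      sum_{i<j} <V_i, V_j> x_i x_j
        = 0.5 * sum_f [ (sum_i V_if x_i)^2 - sum_i V_if^2 x_i^2 ].
    """
    linear = w0 + w @ x
    xv = V.T @ x  # shape (k,)
    pairwise = 0.5 * np.sum(xv ** 2 - (V ** 2).T @ (x ** 2))
    return linear + pairwise

def fm_predict_naive(x, w0, w, V):
    """Same model with the explicit O(k * n^2) sum over feature pairs."""
    n = len(x)
    pairwise = sum((V[i] @ V[j]) * x[i] * x[j]
                   for i in range(n) for j in range(i + 1, n))
    return w0 + w @ x + pairwise
```

Both functions compute the same value; the first simply avoids enumerating feature pairs, which is what makes FMs practical on sparse, high-dimensional inputs.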

5. Experiments

datasets:

Yelp: a dataset for business recommendation.

LastFM: a dataset for music artist recommendation.

Kuaishou: a new dataset constructed from the video-click records of cold-start users in the Kuaishou app.

5.2 Experimental Setting

baselines:

Abs Greedy

EAR: a state-of-the-art model for the MCR scenario with warm-start users.

ConUCB: Conversational UCB is a recently proposed method that applies bandits to the conversational recommendation scenario. It models attributes and items as different arms and chooses them separately using different ranking scores.

To validate the key component of our ConTS design, we also compare with the following variants, each with one component ablated:

ConTS-uinit

ConTS-P_u

ConTS-exp

User simulator: the paper builds a user simulator that simulates a user's feedback based on the user's historical user-item interaction records.

Evaluation metrics: success rate at turn t (SR@t) and average turn (AT).
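
One way such a simulator could work is sketched below. It assumes that each simulated conversation is built around a single observed user-item interaction, that the user likes exactly the attributes of that target item, and that a recommendation is accepted when the target appears in the recommended list. These rules and the class name are assumptions for illustration; the notes above do not spell out the paper's exact simulation rules.

```python
class SimulatedUser:
    """Answers questions as a user whose ground truth is one target item."""

    def __init__(self, target_item, item_attrs):
        # item_attrs maps each item id to the set of attributes it contains.
        self.target = target_item
        self.liked = set(item_attrs[target_item])

    def answer_attribute(self, attr):
        """Like the attribute only if the target item has it."""
        return attr in self.liked

    def answer_items(self, items):
        """Accept the recommendation only if it contains the target item."""
        return self.target in items
```

A simulator of this shape gives deterministic yes/no feedback, which keeps experiments reproducible but cannot model users who change their mind or answer noisily.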
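
The two metrics can be computed as in the sketch below. It assumes each conversation is recorded as the turn at which a recommendation was accepted, or `None` if it never succeeded, and that failed conversations count as `max_turn` turns when averaging; that last convention is an assumption, not something stated in these notes.

```python
def success_rate_at(turns, t):
    """SR@t: share of conversations that succeeded within t turns.

    `turns` holds the turn of success, or None for a failed conversation.
    """
    return sum(1 for n in turns if n is not None and n <= t) / len(turns)

def average_turn(turns, max_turn):
    """AT: mean conversation length; failures count as max_turn (assumption)."""
    return sum(max_turn if n is None else n for n in turns) / len(turns)
```

A higher SR@t and a lower AT both indicate a better system: the first rewards succeeding at all, the second rewards succeeding quickly.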

Implementation Details

The key contribution in our ConTS is the holistic modeling: we seamlessly unify the attributes and items in the same arm space, naturally fitting them into the framework of contextual Thompson Sampling.