[Paper Translation & Notes] Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users

Li S, Lei W, Wu Q, et al. Seamlessly unifying attributes and items: Conversational recommendation for cold-start users[J]. arXiv preprint arXiv:2005.12979, 2020.
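Abstract: Static recommendation methods such as collaborative filtering are inherently limited in performing real-time personalization for cold-start users. Online recommendation, e.g., multi-armed bandit approaches, addresses this limitation by interactively exploring user preferences online and pursuing the exploration-exploitation (EE) trade-off. However, existing bandit-based methods model recommendation actions homogeneously. Specifically, they only consider items as arms and cannot handle item attributes, even though attributes naturally provide interpretable information about a user's current demands and can effectively filter out undesired items.
In this work, we consider conversational recommendation for cold-start users, where the system can interactively both ask the user about attributes and recommend items. This important scenario was studied in a recent work [45]. However, that work uses a hand-crafted function to decide when to ask about attributes and when to make recommendations. Such separate modeling of attributes and items makes the effectiveness of the system depend heavily on the choice of the hand-crafted function, which makes the system fragile. To address this limitation, we seamlessly unify attributes and items in the same arm space and achieve their EE trade-off automatically within the Thompson Sampling framework. Our Conversational Thompson Sampling (ConTS) model holistically solves all questions in conversational recommendation by choosing the arm with the maximal reward to play. Extensive experiments on three benchmark datasets show that ConTS outperforms the state-of-the-art methods ConUCB and Estimation–Action–Reflection in both success rate and average number of conversation turns.
Keywords: Conversational Recommendation; Interactive Recommendation; Recommender System; Dialogue System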

1.Introduction

Traditional recommendation methods suffer from the cold-start problem (cold-start scenario).
Relying on historical interaction data makes these methods suffer in the cold-start scenario, where new users come with zero or few past historical data.
conversational recommendation system (CRS)
Scenario: customer service chatbots. Many scenarios that interact with users to facilitate information seeking can be abstracted as a CRS problem.
Examples: 淘宝小蜜 (Taobao's customer service chatbot); choosing favorite tags on the Kuaishou entry page.
attribute asking & item recommending
highlight two desired properties for such a CRS:
1) deciding the strategy of asking and recommending in an intelligent way
2) keeping the balance between exploiting known preferences and exploring new interests to achieve successful recommendations.
Conversational Thompson Sampling (ConTS) method
modeling attribute asking and item recommending in the unified framework of Thompson Sampling.
key idea:  model attributes and items as undifferentiated arms in the same arm space, where the reward estimation for each arm is jointly decided by the user's representation, candidate items, and candidate attributes to capture the mutual promotion of attributes and items.
two main advantages of ConTS:
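(1) It treats the conversation policy questions in CRS raised in [21] (what items to recommend, what attributes to ask, and whether to ask or recommend in a turn) as a single arm-choosing problem, which is optimized to maximize the reward;
(2) It inherits the sampling and updating mechanism of contextual Thompson Sampling [1], so it achieves the EE balance naturally.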
main contributions of this paper:
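We study the new task of conversational recommendation for cold-start users with attribute preferences. By modeling attributes and items as undifferentiated arms in the same space, we propose a holistic solution named ConTS that solves the three conversation policy questions in CRS in an end-to-end manner. At the same time, we apply contextual Thompson Sampling to conversational recommendation to keep the EE balance in the cold-start scenario.
We conduct experiments on the existing Yelp and LastFM datasets, and contribute a new dataset for evaluating cold-start conversational recommendation, built from Kuaishou video-click records. Extensive experiments show that ConTS outperforms the state-of-the-art CRS methods Conversational UCB (ConUCB) [45] and Estimation–Action–Reflection (EAR) [21] on both metrics, success rate and average turn, for cold-start users. Further analysis shows the importance of exploration and the effectiveness of ConTS's action policy. All code and data will be released to facilitate research on CRS in the community.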
2.Related Work

3.Preliminary

multi-round conversational recommendation (MCR) scenario
3.1  Problem Setting
MCR objective: The objective is to achieve successful recommendations with the fewest conversation turns for each cold-start user u who has no past interaction history.
V denotes all the items and P denotes all item attributes
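When asked about an attribute, the user either likes or dislikes it.
When given a recommendation, the user either accepts or rejects it.
The conversation ends successfully only when the user accepts a recommendation. (How is "accepting a recommendation" defined: a click or a purchase? Must the user make a purchase within a turn?)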
there are three questions to consider: 
(1) what attributes to ask
(2) what items to recommend
(3) whether to ask or recommend in a turn.
3.2 Advantages of Thompson Sampling over UCB methods
Unlike bandit algorithms that use the upper confidence bound (UCB) [4, 22] to model the uncertainty of the learner's estimate of a user's preference, Thompson Sampling balances EE by sampling from a posterior distribution of the reward estimate, which previous work [5, 34] has shown to be more effective and robust in many situations.
How should the embedding be understood?
On the one hand, this posterior update incorporates past experience into the reward estimation;
 on the other hand, it reflects the model’s uncertainty on arms with different context.
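To make the contrast concrete, here is a minimal sketch on a non-contextual Bernoulli bandit, which is a deliberate simplification of the contextual setting used in the paper. UCB adds a deterministic confidence bonus to each arm's mean, while Thompson Sampling draws one random sample per arm from its posterior and plays the best sample. The function names, the Beta(1, 1) prior, and the arm means used in the usage note are illustrative assumptions, not taken from the paper.

```python
import math
import random


def thompson_choose(successes, failures, rng):
    """Draw one sample per arm from its Beta posterior and play the best sample."""
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def ucb1_choose(successes, failures, t):
    """Play the arm with the highest empirical mean plus confidence bonus (UCB1)."""
    scores = []
    for s, f in zip(successes, failures):
        n = s + f
        if n == 0:
            return len(scores)  # every arm is played once before bonuses apply
        scores.append(s / n + math.sqrt(2 * math.log(t) / n))
    return max(range(len(scores)), key=scores.__getitem__)


def run(policy, true_means, rounds=2000, seed=0):
    """Simulate a policy ("ts" or "ucb") and return the pull count of each arm."""
    rng = random.Random(seed)
    successes = [0] * len(true_means)
    failures = [0] * len(true_means)
    for t in range(1, rounds + 1):
        if policy == "ts":
            arm = thompson_choose(successes, failures, rng)
        else:
            arm = ucb1_choose(successes, failures, t)
        if rng.random() < true_means[arm]:  # simulated Bernoulli reward
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]
```

Calling `run("ts", [0.2, 0.5, 0.8])` and `run("ucb", [0.2, 0.5, 0.8])` returns per-arm pull counts; both policies should concentrate on the last arm, but Thompson Sampling gets there through posterior sampling rather than an explicit bonus term.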

4.Method

The method has three steps: 1) Initialization and Sampling, 2) Arm Choosing, 3) Updating.
(1)Initialization and Sampling: first turn
(2)Arm choosing:
Attributes and items live in the same arm space.
we make a key contribution to unify the attributes p ⊆ P and items v ⊆ V as the undifferentiated arms in the same space (the whole arm pool is A = P ∪ V).
This is the key difference between ConTS and ConUCB [45]: in ConUCB, items and attributes are modeled separately in two different arm spaces.
(3)updating:
It first updates the current arm pool according to the feedback (e.g., removing items not containing the user's preferred attributes, see Section 4.4). ConTS then updates the distribution of the user embedding based on the unified reward estimation function and the user's feedback (step (8)).
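The arm-choosing and pool-updating steps can be sketched as follows. This is a sketch under stated assumptions, not the authors' implementation: arms are identified by strings, the score of an arm is assumed to be the dot product of a (sampled) user vector with the arm embedding plus the arm's affinity to attributes the user already liked, and the posterior update of the user embedding is left out. All function and variable names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def choose_arm(user_vec, arm_vecs, liked_attrs, cand_items, cand_attrs):
    """Score every arm in the unified pool A = P ∪ V; return (action, arm_id)."""
    def score(arm):
        x = arm_vecs[arm]
        # Assumed scoring rule: user affinity plus affinity to liked attributes.
        return dot(user_vec, x) + sum(dot(x, arm_vecs[p]) for p in liked_attrs)

    pool = sorted(cand_attrs) + sorted(cand_items)
    best = max(pool, key=score)
    # The type of the winning arm decides whether the system asks or recommends.
    return ("ask" if best in cand_attrs else "recommend"), best


def update_pool(action, arm, accepted, cand_items, cand_attrs, liked_attrs, item_attrs):
    """Shrink the candidate sets according to the user's feedback on the chosen arm."""
    if action == "ask":
        cand_attrs = cand_attrs - {arm}  # never ask the same attribute twice
        if accepted:
            liked_attrs = liked_attrs | {arm}
            # Keep only items that carry the attribute the user just confirmed.
            cand_items = {v for v in cand_items if arm in item_attrs[v]}
    elif not accepted:
        cand_items = cand_items - {arm}  # drop the rejected recommendation
    return cand_items, cand_attrs, liked_attrs


# Toy data (made up for illustration): two attributes and two items.
arm_vecs = {"rock": [1.0, 0.0], "jazz": [0.0, 1.0], "v1": [0.9, 0.1], "v2": [0.1, 0.9]}
item_attrs = {"v1": {"rock"}, "v2": {"jazz"}}
user_vec = [0.2, 1.0]  # stands in for a sample from the user-embedding posterior

items, attrs, liked = {"v1", "v2"}, {"rock", "jazz"}, set()
first = choose_arm(user_vec, arm_vecs, liked, items, attrs)
items, attrs, liked = update_pool(first[0], first[1], True, items, attrs, liked, item_attrs)
second = choose_arm(user_vec, arm_vecs, liked, items, attrs)
```

In this toy run the first turn asks about "jazz"; once the user confirms it, "v1" is filtered out and the second turn recommends "v2". No separate rule decides between asking and recommending: the decision falls out of which arm scores highest.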
workflow 
4.2 Initialization and Sampling
Initialization with offline FM.
get arm embedding.

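Background:

Factorization Machines (FM): theory and practice
CTR (click-through rate) prediction; feature engineering
FM mainly addresses the feature-interaction problem under sparse data. Its prediction cost is linear, and it works well with both continuous and discrete features.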

Advantages of FM

① FMs allow parameter estimation under very sparse data where SVMs fail.

② FMs have linear complexity, can be optimized in the primal, and do not rely on support vectors like SVMs.

③ FMs are a general predictor that can work with any real-valued feature vector. In contrast, other state-of-the-art factorization models work only on very restricted input data.
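The linear-complexity claim follows from a standard identity for the second-order FM model: the pairwise term, the sum over i < j of <v_i, v_j> x_i x_j, equals half of the sum over factors f of (sum_i v_if x_i)^2 minus sum_i (v_if x_i)^2. The sketch below evaluates the model both ways so the two can be compared. The function names and the toy parameters are illustrative assumptions.

```python
def fm_predict_naive(x, w0, w, V):
    """Direct evaluation of the second-order FM model: O(k * n^2)."""
    n = len(x)
    y = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            # Interaction weight of features i and j is the dot product <v_i, v_j>.
            y += sum(a * b for a, b in zip(V[i], V[j])) * x[i] * x[j]
    return y


def fm_predict_linear(x, w0, w, V):
    """Equivalent evaluation via the sum-of-squares identity: O(k * n)."""
    n, k = len(x), len(V[0])
    y = w0 + sum(w[i] * x[i] for i in range(n))
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        y += 0.5 * (s * s - s_sq)
    return y


# Toy parameters: n = 3 features, k = 2 latent factors, sparse input.
x = [1.0, 0.0, 2.0]
w0, w = 0.5, [0.1, 0.2, 0.3]
V = [[1.0, 2.0], [3.0, 4.0], [0.5, -1.0]]
naive = fm_predict_naive(x, w0, w, V)
fast = fm_predict_linear(x, w0, w, V)
```

Both functions give the same prediction on the toy input. With sparse inputs the sums only need to run over the non-zero entries of x, which is why FM stays cheap on one-hot-heavy CTR features.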

5.Experiments

datasets:
Yelp: a dataset for business recommendation.
LastFM: a dataset for music artist recommendation.
Kuaishou: a new dataset we construct from the video-click records of cold-start users in the Kuaishou app.
5.2 Experimental Setting
baselines:
Abs Greedy
EAR: a state-of-the-art model in the MCR scenario for warm-start users.
ConUCB: Conversational UCB is a recently proposed method that applies bandits to the conversational recommendation scenario. It models attributes and items as different arms and chooses them separately using different ranking scores.
To validate the key component of our ConTS design, we also compare with the following variants, each with one component ablated:
ConTS-u_init
ConTS-P_u
ConTS-exp
User simulator: we build a user simulator that produces a user's feedback based on the user's historical user-item interaction records.
Evaluation Metrics: success rate at turn t (SR@t) and average turn (AT).
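A minimal sketch of how the two metrics could be computed from simulated conversations. The log format (the turn at which each conversation succeeded, or None if it never did) and the convention of charging the maximum turn to failed conversations when computing AT are assumptions here, as are the function names.

```python
def success_rate_at(turns_to_success, t):
    """SR@t: share of conversations that ended successfully within t turns."""
    hits = sum(1 for n in turns_to_success if n is not None and n <= t)
    return hits / len(turns_to_success)


def average_turn(turns_to_success, max_turn):
    """AT: mean conversation length; failed conversations count as max_turn (assumed)."""
    total = sum(n if n is not None else max_turn for n in turns_to_success)
    return total / len(turns_to_success)


# Made-up logs: success at turns 3, 7 and 15, plus one failed conversation.
logs = [3, 7, None, 15]
sr5 = success_rate_at(logs, 5)
sr15 = success_rate_at(logs, 15)
at = average_turn(logs, max_turn=15)
```

On these made-up logs, SR@5 is 0.25, SR@15 is 0.75 and AT is 10.0. A higher SR@t and a lower AT both indicate a better conversation policy.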
Implementation Details
The key contribution in our ConTS is the holistic modeling: we seamlessly unify the attributes and items in the same arm space, naturally fitting them into the framework of contextual Thompson Sampling.