[Paper Notes (0)] A case study of batch and incremental recommender systems in supermarket data under concept drifts and cold start

Article link: https://blog.csdn.net/qq_42901861/article/details/121692701

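The paper examines the challenges recommender systems face when user preferences change (concept drift) and when new users or items are introduced (cold start). It compares traditional batch learning with streaming learning; the latter adapts better to data streams and mitigates the cold start problem. The experiments show that streaming recommender systems cope better with concept drift, while neural network models generalize well in cold start scenarios.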


Ⅰ Paper information

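The paper is A case study of batch and incremental recommender systems in supermarket data under concept drifts and cold start, published in August 2021 in the journal Expert Systems With Applications.

The journal is ranked JCR Q2. Its impact factor has risen in recent years, and it sits around the middle of Q2.

The paper mainly studies the combination of streaming-based models with recommender systems, aiming to improve performance while mitigating the concept drift and cold start problems.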

Ⅱ Paper structure

This paper is divided as follows. Section 2 describes recommender systems, types of feedback, the problem of concept drift, and the challenge of cold start. Section 3 describes existing works on positive-only recommender systems in both batch and stream learning scenarios. Section 4 details a new dataset we make available regarding supermarket transactions, which exhibits concept drift and cold start characteristics. Section 5 describes the experiments undertaken to perform the proposed analysis of recommender systems. Section 6 analyzes existing works in recommender systems to answer whether streaming recommender systems overcome batch approaches w.r.t. concept drifts and cold start. Finally, Section 7 concludes this paper and describes future works.

1. Introduction

CF has become the research mainstream because it needs little information (only past user-item interactions)
In CF, the only data required is a list of user-item interactions, while in CBF, items’ details are required. Consequently, CF is less restrictive and has been the target of many works over the years.
How concept drift shows up in recommender systems
In recommender systems, concept drift reflects changes in the interactions between customers and items, either because (i) customers’ preferences change, (ii) new items become available for purchase, etc.
batch fashion & streaming fashion
BATCH fashion:
Recommender systems are traditionally trained in a batch fashion, which means that given a training set composed of interactions between users and items, a static model is learned and deployed ad eternum.
STREAMING fashion:
Consequently, it is relevant to tailor recommender systems that can be incremented over time, assuming that the interactions between users and items are made available as a stream of events.
The incremental ability of the streaming-based recommender systems allows a better recovery when cold start is present.

2. Recommender systems

2.1 Concept drift

The most efficient way to deal with potentially drifting scenarios is to increment the model as user-item interactions are made available.

2.2 The cold start problem

In recommender systems, the cold start problem includes three cases:
(1) cold start of users (how to recommend items to a user who recently entered the system);
(2) cold start of items (how to recommend a new item recently introduced into the system to interested users);
(3) cold start of the system (how to realize accurate recommendation in a new system).
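
As a rough illustration of cases (1) and (2), the sketch below flags which events in a test stream involve a user or an item not seen before. The function name `cold_start_flags` and the `(user, item)` tuple format are assumptions made for this note, not something taken from the paper.

```python
def cold_start_flags(train_events, test_events):
    """Flag cold-start users and items in a stream of (user, item) events.

    Returns one (is_new_user, is_new_item) pair per test event. A user or
    item counts as "new" if it has not appeared in the training events or
    in any earlier test event.
    """
    seen_users = {user for user, _ in train_events}
    seen_items = {item for _, item in train_events}
    flags = []
    for user, item in test_events:
        flags.append((user not in seen_users, item not in seen_items))
        # After the event is observed, the user and item are no longer cold.
        seen_users.add(user)
        seen_items.add(item)
    return flags


train = [("u1", "milk")]
test = [("u1", "tea"), ("u2", "milk"), ("u2", "tea")]
print(cold_start_flags(train, test))
```

With an empty training list every first occurrence is flagged, which roughly corresponds to case (3), the cold start of the whole system.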

3. Positive-only approaches for recommender systems

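【positive-only】Only positive interactions are observed (for example, a customer purchased an item); there are no explicit ratings such as 0-5 and no negative signals.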

3.1 Batch approaches

This part mainly covers the traditional MF models (SVD, BPRMF), as well as recent methods that incorporate neural networks (GMF, MLP, NeuMF).

3.2 Streaming approaches

In streaming approaches, the recommender system is updated in a single-pass manner that preserves the natural arrival order of the data. This both reduces computational cost and allows drift adaptation.

This subsection mainly introduces two methods that work incrementally: Incremental Stochastic Gradient Descent (ISGD) (Vinagre et al., 2014) and Incremental Bayesian Personalized Ranking for MF (IBPRMF) (Rendle et al., 2012).
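
To make the single-pass idea concrete, here is a minimal sketch in the spirit of ISGD for positive-only feedback: every observed (user, item) pair is treated as a target of 1, and the two latent vectors are nudged once per event. The class name, the hyperparameter values, the random initialization, and the simultaneous update of both vectors are assumptions for illustration, not the paper's implementation.

```python
import numpy as np


class ISGD:
    """Sketch of incremental SGD matrix factorization for positive-only data."""

    def __init__(self, n_factors=10, lr=0.05, reg=0.01, seed=0):
        self.n_factors = n_factors
        self.lr = lr      # learning rate (assumed value)
        self.reg = reg    # L2 regularization strength (assumed value)
        self.rng = np.random.default_rng(seed)
        self.user_vecs = {}  # user id -> latent vector
        self.item_vecs = {}  # item id -> latent vector

    def _get(self, table, key):
        # New users/items get a small random vector, so they can be handled
        # as soon as they first appear in the stream.
        if key not in table:
            table[key] = self.rng.normal(0.0, 0.1, self.n_factors)
        return table[key]

    def update(self, user, item):
        """One SGD step for a single observed (user, item) interaction."""
        p = self._get(self.user_vecs, user)
        q = self._get(self.item_vecs, item)
        err = 1.0 - p @ q          # positive-only: the target is always 1
        p_old = p.copy()
        p += self.lr * (err * q - self.reg * p)      # in-place update
        q += self.lr * (err * p_old - self.reg * q)  # in-place update

    def recommend(self, user, n=10):
        """Return up to n known items ranked by predicted score."""
        if user not in self.user_vecs or not self.item_vecs:
            return []
        items = list(self.item_vecs)
        p = self.user_vecs[user]
        scores = np.array([p @ self.item_vecs[i] for i in items])
        return [items[k] for k in np.argsort(-scores)[:n]]


model = ISGD()
for user, item in [("u1", "milk"), ("u1", "bread"), ("u2", "milk")]:
    model.update(user, item)  # each event is seen exactly once
print(model.recommend("u1", n=2))
```

Because each event triggers only one small update, the model never needs to revisit past data, which is what keeps the cost low and lets it follow drifting preferences.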

5. Experimental protocol

The figure below shows how the paper splits the dataset and trains on it for the batch and streaming methods.

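The dataset covers four months: the first two months are used for training and the last two for testing. Regardless of the phase, the stream protocol keeps learning from the incoming data stream.
[Figure: dataset split and training under the batch and stream protocols]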
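
The difference between the two protocols can be sketched as follows. This is only an illustration: `PopularityModel` is a hypothetical stand-in recommender, `run_protocol` is a made-up helper, and the test-then-train order during the test phase is an assumption about how "keeps learning" is realized.

```python
from collections import Counter


class PopularityModel:
    """Hypothetical stand-in recommender: ranks items by how often they occur."""

    def __init__(self):
        self.counts = Counter()

    def update(self, user, item):
        self.counts[item] += 1

    def recommend(self, user, n):
        return [item for item, _ in self.counts.most_common(n)]


def run_protocol(model, train_events, test_events, n=2, incremental=False):
    """Train on train_events, then score each test event with a 0/1 hit.

    With incremental=True the model also learns from every test event
    right after it has been scored (stream protocol); otherwise the model
    stays frozen after training (batch protocol).
    """
    for user, item in train_events:
        model.update(user, item)
    hits = []
    for user, item in test_events:
        hits.append(1.0 if item in model.recommend(user, n) else 0.0)
        if incremental:
            model.update(user, item)
    return hits


train = [("u1", "milk")] * 3
test = [("u2", "tea")] * 5  # "tea" is a new item that appears only in testing
print(run_protocol(PopularityModel(), train, test, incremental=False))
print(run_protocol(PopularityModel(), train, test, incremental=True))
```

In this toy stream the frozen model can never recommend the new item, while the incremental one picks it up after the first test event.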
The paper uses the Recall@N metric to evaluate the quality of the recommendations obtained.
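
For positive-only data, one common way to compute Recall@N is per event: the score is 1 if the item the user actually interacted with appears in the top-N list, and 0 otherwise, averaged over all events. The sketch below assumes this per-event definition; the paper's exact formula is not reproduced in these notes, and the function names are made up for illustration.

```python
def recall_at_n(recommended, actual_item, n):
    """Return 1.0 if actual_item is among the first n recommended items, else 0.0."""
    return 1.0 if actual_item in recommended[:n] else 0.0


def mean_recall_at_n(events, n):
    """Average Recall@N over (recommended_list, actual_item) pairs."""
    hits = [recall_at_n(recommended, actual, n) for recommended, actual in events]
    return sum(hits) / len(hits) if hits else 0.0


events = [
    (["milk", "bread", "eggs"], "bread"),  # hit within the top 2
    (["milk", "bread", "eggs"], "tea"),    # miss
]
print(mean_recall_at_n(events, n=2))  # 0.5
```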

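Two evaluation methods are used:
(1) basic evaluator: tests on the entire test set, which allows recommender systems to be compared using hypothesis testing.
(2) window-based evaluator: reports recall per window (each window is 1% of the test set). The rationale is to allow the assessment of recommender systems over time.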
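
The window-based evaluator can be sketched as below, assuming a list of per-event 0/1 hits is already available. The function name and the `window_percent` parameter are assumptions for illustration; the only detail taken from the notes is the window size of 1% of the test set.

```python
def windowed_recall(hits, window_percent=1):
    """Mean recall per consecutive window of the test stream.

    hits: per-event 0/1 scores in arrival order.
    window_percent: window size as a percentage of the test set
    (integer arithmetic avoids floating-point rounding surprises).
    """
    size = max(1, len(hits) * window_percent // 100)
    curve = []
    for start in range(0, len(hits), size):
        window = hits[start:start + size]
        curve.append(sum(window) / len(window))
    return curve


# 200 events: the model hits on the first half and misses on the second half.
hits = [1.0] * 100 + [0.0] * 100
curve = windowed_recall(hits)
print(len(curve), curve[0], curve[-1])  # 100 1.0 0.0
```

Plotting such a curve over time makes it visible when performance drops after a drift, which a single number over the whole test set would hide.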

6. Experimental results and analysis

The discrepancy between streaming algorithms and the corresponding batch counterpart depicts the importance of constantly updating the recommender system as new data becomes available.
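Recommender models based on neural networks exhibit interesting behavior in cold-start scenarios even though they are not continuously updated.
A possible explanation is that the higher-order embeddings learned by the neural network can generalize users' latent behavior, which makes them perform better than traditional MF.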

Ⅲ Takeaways

Streaming recommender systems can handle the cold start issue effectively.
One direction to try is combining explicit drift detection on matrix factorization with neural models in terms of recommendation techniques.