Paper notes: "How to Retrain Recommender System? A Sequential Meta-Learning Approach" (SIGIR 2020)
Introduction
Recommender systems play an increasingly important role in the current Web 2.0 era, which faces serious information overload issues. The key technique in a recommender system is the personalization model, which estimates the preference of a user on items based on the historical user-item interactions [14, 33]. Since users keep interacting with the system, new interaction data is collected continuously, providing the latest evidence on user preference. Therefore, it is important to retrain the model with the new interaction data, so as to provide timely personalization and avoid becoming stale [36]. With the increasing complexity of recommender models, it is technically challenging to apply real-time updates to the models in an online fashion, especially for those expressive but computationally expensive deep neural networks [13, 26, 43]. As such, a common practice in industry is to perform model retraining periodically, for example, on a daily or weekly basis. Figure 1 illustrates the model retraining process.
Intuitively, the historical interactions provide more evidence on a user's long-term (e.g., inherent) interest, while the newly collected interactions are more reflective of the user's short-term preference. To date, three retraining strategies are most widely adopted, depending on how the data is utilized:
- Fine-tuning, which updates the model based on the new interactions only [35, 41]. This way is memory- and time-efficient, since only the new data needs to be handled. However, it ignores the historical data that contains the long-term preference signal, and thus can easily cause overfitting and forgetting issues [6].
- Sample-based retraining, which samples historical interactions
and adds them to the new interactions to form the training data [6, 42]. The sampled interactions are expected to retain the long-term preference signal, so they need to be carefully selected to be representative. In terms of recommendation accuracy, this strategy is usually worse than using all historical interactions, due to the information loss caused by sampling [42].
- Full retraining, which trains the model on the whole data, including all historical and new interactions. This method costs the most resources and training time, but it provides the highest model fidelity since all available interactions are utilized.
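The difference between the three strategies lies only in how each retraining round assembles its training set. A minimal sketch of that assembly step, with illustrative function names and a simple uniform sample standing in for the more careful selection the cited works use:

```python
import random

def build_training_data(strategy, historical, new, sample_size=1000, seed=0):
    """Assemble the training set for one retraining round.

    `historical` and `new` are lists of interaction records, e.g.
    (user, item) pairs. Names and the uniform-sampling choice are
    illustrative, not taken from the paper.
    """
    if strategy == "fine_tuning":
        # Only the newly collected interactions: cheap, but ignores
        # the long-term preference signal in the historical data.
        return list(new)
    if strategy == "sample_based":
        # Mix a sample of historical interactions with the new ones.
        # Uniform sampling is the simplest choice; the cited works
        # select representative interactions more carefully.
        k = min(sample_size, len(historical))
        return random.Random(seed).sample(historical, k) + list(new)
    if strategy == "full":
        # All available interactions: highest fidelity, highest cost.
        return list(historical) + list(new)
    raise ValueError(f"unknown strategy: {strategy}")
```

Fine-tuning scales with the size of the new data only, while the cost of full retraining grows with the entire accumulated history, which is exactly the trade-off the three strategies navigate.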
While the above three strategies have their pros and cons, we argue that a key limitation is that they lack an explicit optimization towards the retraining objective, i.e., the retrained model should serve well for the recommendations of the next time period. In practice, the user interactions of the next time period provide the most important evidence on the generalization performance of the current model, and are usually used for model selection or validation. As such, an effective retraining method should take this objective into account and formulate the retraining process towards optimizing it, a much more principled way than manually crafting heuristics to select data examples [6, 35, 40, 42].
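The evaluation protocol implied here is sequential: after retraining at period t, the model is scored on the interactions of period t+1. A minimal sketch, assuming user-supplied `retrain` and `evaluate` callables (both illustrative names, not from the paper):

```python
def sequential_evaluation(periods, retrain, evaluate):
    """Sequential protocol: retrain on data up to period t, then
    measure generalization on the interactions of period t+1.

    `periods` is a list of per-period interaction sets; `retrain`
    maps (historical, new) -> model, and `evaluate` maps
    (model, test_set) -> a scalar metric.
    """
    scores = []
    historical = []
    for t in range(len(periods) - 1):
        model = retrain(historical, periods[t])
        # Next-period interactions act as the held-out test set.
        scores.append(evaluate(model, periods[t + 1]))
        historical.extend(periods[t])
    return scores
```

Under this protocol, the next-period data plays the role of the validation set, which is precisely why the paper argues the retraining process itself should be optimized towards it.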