Paper Quick Read | NoteLLM: A Retrievable Large Language Model for Note Recommendation (WWW '24)

Article link: https://blog.csdn.net/Romaga/article/details/145228825

Paper: https://arxiv.org/abs/2403.01744
BibTeX citation:

@misc{zhang2024notellmretrievablelargelanguage,
      title={NoteLLM: A Retrievable Large Language Model for Note Recommendation},
      author={Chao Zhang and Shiwei Wu and Haoxin Zhang and Tong Xu and Yan Gao and Yao Hu and Di Wu and Enhong Chen},
      year={2024},
      eprint={2403.01744},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2403.01744},
}

Keywords: Large Language Model; Recommendation; Hashtag Generation

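Some takeaways:

I2I recommendation either pre-constructs an I2I index or retrieves relevant items online with an (approximate) k-nearest-neighbor method.
Traditional I2I recommendation usually relies solely on collaborative signals from user behavior, which causes a cold-start issue when user-item interactions are missing. To address this, many works optimize from the content-based side, and this paper goes a step further from the text-based angle. (Problem definition: measure the similarity between items from their textual content.)
State of research on text-based I2I recommendation: term-based sparse vector matching [35, 37] → neural networks that map texts into embeddings in the same latent space and measure their relationship by embedding similarity → naive use of LLMs (treated only as encoders that generate embeddings [10, 26, 28]).
There are three main ways to combine LLMs with recommendation [18, 50].
- The first uses LLMs to augment data [21, 29, 51]. Thanks to the rich world knowledge in LLMs, the augmented data are more prominent and diverse than the raw data [23, 46, 51]. However, these methods require continuous preprocessing of the test data to stay aligned with the augmented training data, and they depend heavily on the quality of the LLM's generation.
- The second uses LLMs to recommend directly. These methods design special prompts [9, 20, 43] or use supervised fine-tuning [1, 2, 59] to induce LLMs to answer the given questions. However, because of the limited context length, they focus only on the reranking stage [7, 59], which contains just dozens of candidate items.
- The last adopts LLMs as encoders to generate embeddings that represent specific items [15, 49]. These methods extract information effectively, but they all discard the generative capabilities of LLMs.
Generating hashtags/categories from text (which helps create identifiers for untagged notes or suggest options to users based on their preferences) follows three main approaches: extractive, classification, and generative.
- Extractive: identifies key phrases in the text as hashtags or categories [61, 63], but cannot produce phrases that do not appear in the original text.
- Classification: treats the task as a text classification problem [14, 58, 60]. Because human-written hashtags are diverse and free-form, this may yield sub-optimal results.
- Generative: generates hashtags/categories directly from the input text [4, 44, 45]. However, these methods are limited to the hashtag/category generation task itself.
- NoteLLM: the LLM performs multi-task learning, carrying out I2I recommendation and hashtag/category generation at the same time. Because the two tasks are similar, learning to generate hashtags/categories can also enhance I2I recommendation.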

Abstract

People enjoy sharing “notes” including their experiences within online communities. Therefore, recommending notes aligned with user interests has become a crucial task. Existing online methods only input notes into BERT-based models to generate note embeddings for assessing similarity. However, they may underutilize some important cues, e.g., hashtags or categories, which represent the key concepts of notes. Indeed, learning to generate hashtags/categories can potentially enhance note embeddings, both of which compress key note information into limited content. Besides, Large Language Models (LLMs) have significantly outperformed BERT in understanding natural languages. It is promising to introduce LLMs into note recommendation. In this paper, we propose a novel unified framework called NoteLLM, which leverages LLMs to address the item-to-item (I2I) note recommendation. Specifically, we utilize Note Compression Prompt to compress a note into a single special token, and further learn the potentially related notes’ embeddings via a contrastive learning approach. Moreover, we use NoteLLM to summarize the note and generate the hashtag/category automatically through instruction tuning. Extensive validations on real scenarios demonstrate the effectiveness of our proposed method compared with the online baseline and show major improvements in the recommendation system of Xiaohongshu.

Introduction

Focused on user-generated content (UGC) and providing a more authentic and personalized user experience, social media like Xiaohongshu and Lemon8 have gained significant popularity among users. These platforms encourage users to share their product reviews, travel blogs, and life experiences, among other content, also referred to as “notes”. By providing more personalized notes based on user preferences, note recommendation plays a crucial part in enhancing user engagement [16, 34, 48, 64]. Item-to-item (I2I) note recommendation is a classic way to retrieve notes of potential interest to the user from the millions-level notes pool [19, 65]. Given a target note, I2I methods select the relevant notes according to content [65] or collaborative signals [19].

Existing online methods of I2I note recommendation usually input whole note content into BERT-based models [3] to generate embeddings of notes, and recommend relevant notes based on embedding similarity [11, 36]. However, these methods merely treat hashtags/categories as a component of note content, underutilizing their potential. As shown in Figure 1, hashtags/categories (e.g., # Singapore) represent the central ideas of notes, which are crucial in determining whether two notes contain related content. In fact, we find that generating hashtags/categories is similar to producing note embeddings. Both compress the key note information into limited content. Therefore, learning to generate hashtags/categories can potentially enhance the quality of embeddings. Besides, Large Language Models (LLMs) have recently exhibited powerful abilities in natural languages [10, 24, 42, 54] and recommendations [1, 2, 34, 59]. However, there is a scarcity of research investigating the application of LLMs in I2I recommendations. Utilizing LLMs to improve I2I note recommendations holds considerable promise.

Inspired by the above insights, we propose a unified multi-task approach called NoteLLM in this paper. Based on LLMs, NoteLLM learns from the I2I note recommendation and hashtag/category generation tasks, aiming to enhance the I2I note recommendation ability by learning to extract condensed concepts. Specifically, we first construct a unified Note Compression Prompt for each note sample and then decode via pre-trained LLMs (e.g., LLaMA 2 [42]), which utilize a special token to compress the note content and generate hashtags/categories simultaneously. To construct the related note pairs, we count the co-occurrence scores for all note pairs from user behaviours, and form the set of co-occurrence scores for each note. We select notes with the highest co-occurrence scores in the set as the related notes for a given note. Further, to recommend the relevant notes for each sample, Generative-Contrastive Learning (GCL) utilizes the compressed tokens as the embedding of each note, and then trains the LLMs to identify the related notes from in-batch negatives. Simultaneously, we employ Collaborative Supervised Fine-tuning (CSFT) approach to train models to generate hashtags/categories for each note. Since both the compression token learned by the I2I note recommendation task and the hashtag/category generation task aim to extract the key concept of the note content, CSFT can enhance note embeddings effectively.
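
The pair-construction step described above (count co-occurrence scores from user behaviour, then keep the highest-scoring notes as related notes) can be sketched roughly as follows. This is only an illustration under stated assumptions: the input format `sessions`, the function name `build_related_notes`, the parameter `top_n`, and the use of raw unweighted counts are all hypothetical choices made here, and the paper's actual scoring may differ.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_related_notes(sessions, top_n=1):
    """Return, for each note id, the top_n notes it co-occurs with most often.

    sessions: iterable of lists of note ids that one user interacted with
    (a hypothetical input format, not taken from the paper).
    """
    scores = defaultdict(Counter)
    for clicked in sessions:
        unique = sorted(set(clicked))  # count each note once per session
        for a, b in permutations(unique, 2):
            scores[a][b] += 1  # assumption: plain counts, no weighting

    related = {}
    for note, counter in scores.items():
        # Highest score first; ties are broken by note id for determinism.
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        related[note] = [other for other, _ in ranked[:top_n]]
    return related


sessions = [
    ["n1", "n2", "n3"],
    ["n1", "n2"],
    ["n2", "n3"],
    ["n1", "n2", "n4"],
]
related = build_related_notes(sessions, top_n=1)
print(related["n1"])  # ['n2']
```

Each (note, related note) pair produced this way would then serve as a positive pair for contrastive training.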

2. Related Work

2.1 I2I Recommendation

I2I recommendation is a crucial technique that can recommend a ranked list of items from a large-scale item pool based on a target item. I2I recommendation either pre-constructs the I2I index [55] or retrieves relevant items online using the approximate k-nearest neighbor method [12]. Traditional I2I recommendations typically rely solely on collaborative signals from user behaviors [55, 67]. However, these methods cannot manage cold-start items due to lack of user-item interaction [65]. To address this issue, numerous studies have investigated content-based I2I recommendations [8, 65]. We focus on the text-based I2I recommendation system, which measures the similarity of items based on their textual content. Initially, representation of text-based I2I recommendation relied on a term-based sparse vector matching mechanism [35, 37]. With the advent of deep learning, neural networks have proven more adept at representing text information [3, 27]. Previous works [13, 25, 52, 53] transform texts into embeddings in the same latent space to measure their relationship through embedding similarity. LLMs have recently gained great attention for their impressive abilities [33, 54, 56]. However, the application of LLMs in I2I recommendation remains unexplored. Besides, some studies treat LLMs solely as encoders for generating embeddings [10, 26, 28], failing to leverage their full potential for generation. In NoteLLM, we utilize LLMs to generate hashtags/categories, which can enhance note embeddings.
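
To make "measure their relationship through embedding similarity" concrete, here is a minimal sketch of online retrieval by cosine similarity. It does an exact brute-force search for readability, whereas the text above refers to an approximate k-nearest-neighbor method for large pools. The toy embeddings, the item ids, and the function name `retrieve_top_k` are illustrative assumptions.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve_top_k(target_id, embeddings, k=2):
    """Rank all other items by cosine similarity to the target item."""
    target = embeddings[target_id]
    scored = (
        (cosine(target, vec), item_id)
        for item_id, vec in embeddings.items()
        if item_id != target_id  # never recommend the target itself
    )
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


# Toy 3-dimensional embeddings (made up for illustration).
embeddings = {
    "travel_sg": [0.9, 0.1, 0.0],
    "food_sg": [0.8, 0.3, 0.1],
    "hiking": [0.5, 0.5, 0.1],
    "skincare": [0.0, 0.2, 0.9],
}
print(retrieve_top_k("travel_sg", embeddings, k=2))  # ['food_sg', 'hiking']
```

In a production setting the brute-force loop would typically be replaced by an approximate nearest-neighbor index; the ranking criterion stays the same.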

2.2 LLMs for Recommendation

LLMs have recently made significant advancements [31, 41, 42]. Consequently, numerous studies incorporate LLMs into recommendation tasks [5, 18, 50]. There are three main methods of integrating LLMs with recommendations [18, 50]. The first method is utilizing LLMs to augment data [21, 29, 51]. Due to the abundant world knowledge contained by LLMs, the augmented data are more prominent and diverse than the raw data [23, 46, 51]. However, these methods require continuous preprocessing of the testing data to align with the augmented training data and are highly dependent on the quality of LLMs’ generation. The second method is leveraging LLMs to recommend directly. These methods design special prompts [9, 20, 43] or use supervised finetuning [1, 2, 59] to induce LLMs to answer the given questions. Nevertheless, because of the limited context length, these methods only focus on the reranking stage [7, 59], which only contains dozens of candidate items. The last method is adopting LLMs as the encoders to generate embeddings representing specific items [15, 49]. Although these methods are effective to extract information, they all discard the generative capabilities of LLMs. In contrast to above methods, NoteLLM employs LLMs during the recall phase and learns hashtag generation to improve LLMs’ ability to produce embeddings.

2.3 Hashtag/Category Generation from Text

Hashtags and categories, as tagging mechanisms on social media, streamline the identification of topic-specific messages and aid users in finding themed content. Generating these from text can assist in creating identifiers for untagged notes or suggesting options to users based on their preferences. In this domain, there are three main methods: extractive, classification, and generative methods. Extractive methods identify key phrases in texts as hashtags or categories [61, 63], but cannot obtain those not present in the original text. Classification methods view this task as a text classification problem [14, 58, 60]. However, these may yield sub-optimal results due to the diverse, free-form nature of human-generated hashtags. Generative methods generate the hashtags/categories directly according to input texts [4, 44, 45]. Whereas, these methods are limited to solving the hashtag/category generation task. In NoteLLM, LLMs perform multi-task learning, simultaneously executing I2I recommendation and hashtag/category generation. Due to the similarity of these two tasks, learning to generate the hashtag/category can also enhance the I2I recommendation.

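Figure 2: The NoteLLM framework uses a unified prompt for both I2I note recommendation and hashtag/category generation. Notes are compressed via the Note Compression Prompt and processed by a pre-trained LLM. Related note pairs are constructed with a co-occurrence mechanism, and Generative-Contrastive Learning is used to train the I2I recommendation task. NoteLLM also extracts the key concepts of a note for hashtag/category generation, which in turn enhances the I2I recommendation task.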
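
The idea of training the model to "identify the related notes from in-batch negatives" can be illustrated with a standard InfoNCE-style loss. This section does not give the paper's exact loss, so the sketch below is a generic formulation under assumptions: cosine similarity, a hypothetical `temperature` of 0.1, and plain Python lists standing in for the embeddings taken from the compression token.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_contrastive_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of related note pairs.

    anchors[i] and positives[i] are embeddings of a related pair;
    positives[j] with j != i act as in-batch negatives for anchors[i].
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over the whole batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
aligned = in_batch_contrastive_loss(anchors, positives)
shuffled = in_batch_contrastive_loss(anchors, positives[::-1])
print(aligned < shuffled)  # True
```

The loss is small when each note is closest to its own related note and large when the pairing is wrong, which is the signal that pulls related notes together in embedding space.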

Problem Definition

In this section, we introduce the problem definition. We assume $N=\{n_{1}, n_{2}, ..., n_{m}\}$ as the note pool, where $m$ is the number of notes. Each note contains a title, hashtag, category, and content. We denote the $i$-th note as $n_{i}=(t_{i}, tp_{i}, c_{i}, ct_{i})$, where $t_{i}$, $tp_{i}$, $c_{i}$, $ct_{i}$ mean the title, the hashtag, the category and the content respectively. In the I2I note recommendation task, given a target note $n_{i}$, the LLM-based retriever aims to rank the top-$k$ notes, which are similar to the given note, from the note pool $N \setminus \{n_{i}\}$. In the hashtag/category generation task, the LLM is utilized to generate the hashtag $tp_{i}$ according to $t_{i}$ and $ct_{i}$. Besides, in the category generation task, the LLM is to generate the category $c_{i}$ according to $t_{i}$, $tp_{i}$ and $ct_{i}$.
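
The notation above maps naturally onto a small data structure. The sketch below is only a reading aid: the class `Note`, the helper names, and the sample notes are hypothetical, and the functions merely spell out which fields are inputs and which are targets in each task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    title: str     # t_i
    hashtag: str   # tp_i
    category: str  # c_i
    content: str   # ct_i


def candidate_pool(pool, i):
    """Candidates for the I2I task: the pool N with the target n_i removed."""
    return [note for j, note in enumerate(pool) if j != i]


def hashtag_example(note):
    """Hashtag generation: inputs (t_i, ct_i), target tp_i."""
    return (note.title, note.content), note.hashtag


def category_example(note):
    """Category generation: inputs (t_i, tp_i, ct_i), target c_i."""
    return (note.title, note.hashtag, note.content), note.category


pool = [
    Note("Three days in Singapore", "#Singapore", "Travel", "Day 1: Marina Bay ..."),
    Note("Hawker food guide", "#SingaporeFood", "Food", "Start with chicken rice ..."),
    Note("Winter skincare routine", "#Skincare", "Beauty", "Cleanser first ..."),
]
print(len(candidate_pool(pool, 0)))  # 2
print(hashtag_example(pool[0])[1])   # #Singapore
```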
在本节中，我们介绍问题的定义。我们假设 $N = \{n_{1}, n_{2},..., n_{m}\}$ 为笔记池，其中(m)是笔记的数量。每篇笔记包含一个标题、一个话题标签、一个类别和内容。我们将第(i)篇笔记表示为 $n_{i}=(t_{i}, tp_{i}, c_{i}, ct_{i})$ ，其中 $t_{i}$