2022_WWW_Improving Graph Collaborative Filtering with Neighborhood-enriched Contrastive Learning

This paper proposes a novel contrastive learning approach, Neighborhood-enriched Contrastive Learning (NCL), for improving graph collaborative filtering. NCL incorporates both the structural neighbors (from the graph structure) and the semantic neighbors (from representation similarity) of users and items into contrastive pairs to alleviate data sparsity. The structure-contrastive objective uses the outputs of different GNN layers as positive pairs, while the prototype-contrastive objective mines semantic neighborhoods with K-means clustering. NCL is optimized with an EM algorithm, applies to various graph collaborative filtering methods, and is shown experimentally to significantly improve recommendation performance on multiple datasets.

[Paper Reading Notes] 2022_WWW_Improving Graph Collaborative Filtering with Neighborhood-enriched Contrastive Learning

Paper link: https://doi.org/10.1145/3485447.3512104
Venue: WWW
Published: 2022
Authors and affiliations:

  • Zihan Lin1†, Changxin Tian1†, Yupeng Hou2†, Wayne Xin Zhao2,3B
  • 1School of Information, Renmin University of China, China
  • 2Gaoling School of Artificial Intelligence, Renmin University of China, China
  • 3Beijing Key Laboratory of Big Data Management and Analysis Methods, China {zhlin, tianchangxin, houyupeng}@ruc.edu.cn, batmanfly@gmail.com

Datasets (as introduced in the paper):

  • MovieLens-1M(ML-1M) ---- (2015. The movielens datasets: History and context.)
  • Yelp https://www.yelp.com/dataset
  • Amazon Books ---- (2015. Image-based recommendations on styles and substitutes.)
  • Gowalla ---- (2011. Friendship and mobility: user movement in location-based social networks.)
  • Alibaba-iFashion ---- ( 2019. POG: personalized outfit generation for fashion recommendation at Alibaba iFashion.)

Code: https://github.com/RUCAIBox/NCL

Other:

Write-ups by others:

Brief summary of the novelty: NCL can be viewed as an improvement of how positive contrastive pairs are constructed; negatives are still uniformly sampled.
(1) It combines (i) structural space/neighbors and (ii) semantic space/neighbors.
(2) These correspond to a structure-contrastive objective (structural neighbors extracted via the GNN and treated as positive pairs) and a prototype-contrastive objective (prototypes obtained with K-means).
(3) Optimization uses the EM algorithm, because this process cannot be optimized end-to-end (K-means itself is an instance of EM).
(4) NCL is a model-agnostic contrastive learning framework.
(5) The overall loss is a weighted sum of three losses, trained with a multi-task learning strategy (a formula sketch follows below).
(6) No social information is used; user-user relations are reached via even-numbered propagation steps (even hops) on the interaction graph.
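As a rough formalization of point (5), here is a minimal LaTeX sketch of the overall multi-task objective, assuming a BPR-style ranking loss as the base recommendation loss and writing the two weights as $\lambda_1$ and $\lambda_2$ (the base loss and the symbol names are my assumptions, not quoted from the paper):

```latex
% Overall objective (sketch): base ranking loss plus the two contrastive
% objectives, combined under a multi-task learning strategy.
\begin{equation}
  \mathcal{L} \;=\; \mathcal{L}_{\mathrm{BPR}}
      \;+\; \lambda_1 \, \mathcal{L}_{\mathrm{structure}}
      \;+\; \lambda_2 \, \mathcal{L}_{\mathrm{proto}}
\end{equation}
```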

  • We propose a novel contrastive learning approach, named Neighborhood-enriched Contrastive Learning (NCL), which explicitly incorporates the potential neighbors into contrastive pairs.
    • Specifically, we introduce the neighbors of a user (or an item) from the graph structure and the semantic space, respectively.
    • For the structural neighbors on the interaction graph, we develop a novel structure-contrastive objective that regards users (or items) and their structural neighbors as positive contrastive pairs.
    • In implementation, the representations of users (or items) and neighbors correspond to the outputs of different GNN layers.
    • Furthermore, to excavate the potential neighbor relations in the semantic space, we assume that users with similar representations are within the semantic neighborhood, and incorporate these semantic neighbors into the prototype-contrastive objective.
    • The proposed NCL can be optimized with the EM algorithm and generalized to apply to graph collaborative filtering methods.
  • We propose a model-agnostic contrastive learning framework named NCL, which incorporates both structural and semantic neighbors for improving neural graph collaborative filtering.
    • We propose to learn representative embeddings for both kinds of neighbors, such that contrastive learning only needs to be performed between a node and the corresponding representative embeddings, which largely improves the algorithm's efficiency.

ABSTRACT

  • (1) Recently, graph collaborative filtering methods have been proposed as an effective recommendation approach, which can capture users' preference over items by modeling the user-item interaction graphs.

    • Despite the effectiveness, these methods suffer from data sparsity in real scenarios.
    • In order to reduce the influence of data sparsity, contrastive learning is adopted in graph collaborative filtering for enhancing the performance.
    • However, these methods typically construct the contrastive pairs by random sampling, which neglects the neighboring relations among users (or items) and fails to fully exploit the potential of contrastive learning for recommendation.
  • (2) To tackle the above issue, we propose a novel contrastive learning approach, named Neighborhood-enriched Contrastive Learning (NCL), which explicitly incorporates the potential neighbors into contrastive pairs.

    • Specifically, we introduce the neighbors of a user (or an item) from the graph structure and the semantic space, respectively.
    • For the structural neighbors on the interaction graph, we develop a novel structure-contrastive objective that regards users (or items) and their structural neighbors as positive contrastive pairs.
    • In implementation, the representations of users (or items) and neighbors correspond to the outputs of different GNN layers.
    • Furthermore, to excavate the potential neighbor relations in the semantic space, we assume that users with similar representations are within the semantic neighborhood, and incorporate these semantic neighbors into the prototype-contrastive objective.
    • The proposed NCL can be optimized with the EM algorithm and generalized to apply to graph collaborative filtering methods.
    • Extensive experiments on five public datasets demonstrate the effectiveness of the proposed NCL, notably with 26% and 17% performance gain over a competitive graph collaborative filtering base model on the Yelp and Amazon-book datasets, respectively. Our implementation code is available at: https://github.com/RUCAIBox/NCL.

CCS CONCEPTS

• Information systems → Recommender systems.

KEYWORDS

Recommender System, Collaborative Filtering, Contrastive Learning, Graph Neural Network

1 INTRODUCTION

  • (1) In the age of information explosion, recommender systems occupy an important position to discover users' preferences and deliver online services efficiently [23].

    • As a classic approach, collaborative filtering (CF) [10, 24] is a fundamental technique that can produce effective recommendations from implicit feedback (e.g., expression, click, transaction).
    • Recently, CF has been further enhanced by powerful graph neural networks (GNN) [9, 31], which model the interaction data as graphs (e.g., the user-item interaction graph) and then apply a GNN to learn effective node representations [9, 31] for recommendation, called graph collaborative filtering.
  • (2) Despite the remarkable success, existing neural graph collaborative filtering methods still suffer from two major issues.

    • Firstly, user-item interaction data is usually sparse or noisy, and reliable representations may not be learnable, since graph-based methods are potentially more vulnerable to data sparsity [33].
    • Secondly, existing GNN-based CF approaches rely on explicit interaction links for learning node representations, while high-order relations or constraints (e.g., user or item similarity) cannot be explicitly utilized for enriching the graph information, which has been shown to be essentially useful in recommendation tasks [24, 27, 35].
    • Although several recent studies leverage contrastive learning to alleviate the sparsity of interaction data [33, 39], they construct the contrastive pairs by randomly sampling nodes or corrupting subgraphs.
    • These methods lack consideration of how to construct more meaningful contrastive learning tasks tailored for the recommendation task [24, 27, 35].
  • (3) Besides direct user-item interactions, there exist multiple kinds of potential relations (e.g., user similarity) that are useful to the recommendation task, and we aim to design more effective contrastive learning approaches for leveraging such useful relations in neural graph collaborative filtering.

    • Specifically, we consider node-level relations w.r.t. a user (or an item), which is more efficient than graph-level relations.
    • We characterize these additional relations as the enriched neighborhood of nodes, which can be defined in two aspects:
      • (1) structural neighbors refer to nodes structurally connected by high-order paths,
      • and (2) semantic neighbors refer to semantically similar neighbors which may not be directly reachable on the graph.
    • We aim to leverage these enriched node relations for improving the learning of node representations (i.e., encoding user preference or item characteristics).
  • (4) To integrate and model the enriched neighborhood, we propose Neighborhood-enriched Contrastive Learning (NCL for short), a model-agnostic contrastive learning framework for recommendation.

    • As introduced before, NCL constructs node-level contrastive objectives based on two kinds of extended neighbors.
    • We present a comparison between NCL and existing contrastive learning methods in Figure 1.
    • However, node-level contrastive objectives usually require pairwise learning for each node pair, which is time-consuming for large-sized neighborhoods.
    • Considering the efficiency issue, we learn a single representative embedding for each kind of neighbor, such that the contrastive learning for a node can be accomplished with two representative embeddings (either structural or semantic).
  • (5) To be specific,

    • for structural neighbors, we note that the output of the $k$-th layer of the GNN involves the aggregated information of $k$-hop neighbors.
      • Therefore, we utilize the $k$-th layer output of the GNN as the representation of the $k$-hop neighbors of a node.
      • We design a structure-aware contrastive learning objective that pulls together the representation of a node (a user or item) and the representative embedding of its structural neighbors (code sketches of both contrastive objectives follow this list).
    • For the semantic neighbors, we design a prototypical contrastive learning objective to capture the correlations between a node (a user or item) and its prototype.
      • Roughly speaking, a prototype can be regarded as the centroid of a cluster of semantically similar neighbors in the representation space.
      • Since the prototypes are latent, we further propose to use an expectation-maximization (EM) algorithm [19] to infer them.
    • By incorporating these additional relations, our experiments show that NCL can largely improve the original GNN-based approaches (and also outperforms existing contrastive learning methods) for implicit feedback recommendation.
  • (6) Our contributions can be summarized threefold:

    • We propose a model-agnostic contrastive learning framework named NCL, which incorporates both structural and semantic neighbors for improving neural graph collaborative filtering.
    • We propose to learn representative embeddings for both kinds of neighbors, such that contrastive learning only needs to be performed between a node and the corresponding representative embeddings, which largely improves the algorithm's efficiency.
    • Extensive experiments are conducted on five public datasets, demonstrating that our approach is consistently better than a number of competitive baselines, including GNN- and contrastive learning-based recommendation methods.
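Following up on the structure-contrastive objective sketched in point (5): below is a minimal PyTorch-style illustration, assuming an InfoNCE-style loss between a node's ego embedding $z^{(0)}$ and its $k$-th (even) GNN layer output, with all other nodes' ego embeddings serving as negatives and a temperature `tau`. Function and argument names are my own, not the authors' code.

```python
import torch
import torch.nn.functional as F


def structure_contrastive_loss(z_0, z_k, batch_idx, tau=0.1):
    """InfoNCE-style structure-contrastive loss (sketch).

    z_0: [N, d] ego embeddings (layer-0 outputs) for all users (or items).
    z_k: [N, d] k-th layer GNN outputs, with k even, so that z_k aggregates
         homogeneous (user-user or item-item) structural neighbors.
    batch_idx: [B] indices of the nodes in the current mini-batch.
    """
    anchor = F.normalize(z_k[batch_idx], dim=1)    # node + structural neighborhood
    positive = F.normalize(z_0[batch_idx], dim=1)  # own ego embedding as positive
    all_ego = F.normalize(z_0, dim=1)              # all ego embeddings as candidates

    pos_score = torch.exp((anchor * positive).sum(dim=1) / tau)   # [B]
    ttl_score = torch.exp(anchor @ all_ego.t() / tau).sum(dim=1)  # [B]
    return -torch.log(pos_score / ttl_score).mean()
```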
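And here is a rough sketch of the prototype-contrastive step from point (5), assuming the E-step clusters the learned embeddings with K-means to obtain prototypes (centroids) and the M-step minimizes an InfoNCE loss between each node and its assigned prototype. scikit-learn's `KMeans`, the cluster count, and the function names are purely illustrative choices, not the paper's implementation.

```python
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans


def e_step_prototypes(embeddings, num_clusters=1000):
    """E-step (sketch): cluster node embeddings to obtain latent prototypes."""
    kmeans = KMeans(n_clusters=num_clusters, n_init=10)
    assignments = kmeans.fit_predict(embeddings.detach().cpu().numpy())
    centroids = torch.tensor(kmeans.cluster_centers_,
                             dtype=embeddings.dtype, device=embeddings.device)
    return torch.as_tensor(assignments, device=embeddings.device), centroids


def prototype_contrastive_loss(z, assignments, centroids, batch_idx, tau=0.1):
    """M-step loss (sketch): pull each node toward its own prototype,
    contrasted against all other prototypes."""
    anchor = F.normalize(z[batch_idx], dim=1)       # [B, d] node embeddings
    protos = F.normalize(centroids, dim=1)          # [K, d] prototype embeddings
    positive = protos[assignments[batch_idx]]       # [B, d] assigned prototype

    pos_score = torch.exp((anchor * positive).sum(dim=1) / tau)   # [B]
    ttl_score = torch.exp(anchor @ protos.t() / tau).sum(dim=1)   # [B]
    return -torch.log(pos_score / ttl_score).mean()
```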

2 PRELIMINARY

  • (1) As the fundamental recommender system, collaborative filtering (CF) aims to recommend relevant items that users might be interested in based on the observed implicit feedback (e.g., expression, click and transaction).

    • Specifically, given the user set $\mathcal{U} = \{u\}$ and the item set $\mathcal{I} = \{i\}$, the observed implicit feedback matrix is denoted as $R \in \{0, 1\}^{|\mathcal{U}| \times |\mathcal{I}|}$,
      • where each entry $R_{u,i} = 1$ if there exists an interaction between user $u$ and item $i$, and $R_{u,i} = 0$ otherwise.
    • Based on the interaction data $R$, the learned recommender system can predict potential interactions for recommendation.
    • Furthermore, Graph Neural Network (GNN) based collaborative filtering methods organize the interaction data $R$ as an interaction graph $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$,
    • where $\mathcal{V} = \mathcal{U} \cup \mathcal{I}$ denotes the set of nodes and $\mathcal{E} = \{(u, i) \mid u \in \mathcal{U}, i \in \mathcal{I}, R_{u,i} = 1\}$ denotes the set of edges.
  • (2) In general, GNN-based collaborative filtering methods [9, 31, 32] produce informative representations for users and items based on an aggregation scheme, which can be formulated as two stages:

    $$z_u^{(l+1)} = f_{\text{propagate}}\left(\{z_i^{(l)} \mid i \in \mathcal{N}_u\}\right), \qquad z_u = f_{\text{readout}}\left(\left[z_u^{(0)}, z_u^{(1)}, \ldots, z_u^{(L)}\right]\right)$$

    • where $\mathcal{N}_u$ denotes the neighbor set of user $u$ in the interaction graph $\mathcal{G}$,
    • and $L$ denotes the number of GNN layers.
    • Here, $z_u^{(0)}$ is initialized by the learnable embedding vector $e_u$. (A minimal code sketch of this two-stage scheme is given below.)
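As referenced above, here is a minimal PyTorch-style sketch of the two-stage (propagate/readout) GNN-based CF encoder, assuming a LightGCN-like propagation without feature transformations; `norm_adj` is assumed to be the symmetrically normalized adjacency matrix of the user-item graph. This is my own illustrative code, not the authors' implementation.

```python
import torch
import torch.nn as nn


class LightGCNEncoder(nn.Module):
    """Minimal sketch of the two-stage GNN-based CF encoder:
    (1) propagate: aggregate neighbor embeddings over the interaction graph
        for L layers, and
    (2) readout: combine the layer outputs [z^(0), ..., z^(L)] into one vector.
    """

    def __init__(self, num_users, num_items, dim=64, num_layers=3):
        super().__init__()
        # z^(0) is initialized by the learnable embedding vectors e_u / e_i.
        self.embedding = nn.Embedding(num_users + num_items, dim)
        nn.init.xavier_uniform_(self.embedding.weight)
        self.num_layers = num_layers

    def forward(self, norm_adj):
        # norm_adj: sparse, symmetrically normalized (|U|+|I|) x (|U|+|I|)
        # adjacency matrix of the user-item interaction graph.
        z = self.embedding.weight                  # z^(0)
        layer_outputs = [z]
        for _ in range(self.num_layers):
            # f_propagate: LightGCN-style aggregation (no feature transform).
            z = torch.sparse.mm(norm_adj, z)
            layer_outputs.append(z)
        # f_readout: here simply the mean over all layer outputs.
        z_final = torch.stack(layer_outputs, dim=0).mean(dim=0)
        return z_final, layer_outputs
```

The per-layer outputs are returned alongside the readout because, as described in Section 1, the structure-contrastive objective contrasts a node's ego embedding with its $k$-th layer output.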