[Paper Reading Notes] 2020 KDD: Dual Channel Hypergraph Collaborative Filtering

Paper link: https://doi.org/10.1145/3394486.3403253
Venue: SIGKDD
Publish time: 2020
Authors and affiliations:

  • Shuyi Ji Tsinghua University jisy19@mails.tsinghua.edu.cn
  • Yifan Feng Xiamen University evanfeng97@gmail.com
  • Rongrong Ji Xiamen University rrji@xmu.edu.cn
  • Xibin Zhao Tsinghua University zxb@tsinghua.edu.cn
  • Wanwan Tang∗ Baidu, Inc. tangwanwan01@baidu.com
  • Yue Gao∗ Tsinghua University gaoyue@tsinghua.edu.cn

Datasets:

  • MovieLens100K,
    • The Movielens Datasets: History and Context. ACM Transactions on Interactive Intelligent Systems 5, 4 (2015), 1–19
  • CiteUlike-A,
    • Collaborative Topic Regression with Social Regularization for Tag Recommendation. In Proc. of the 23rd International Joint Conference on Artificial Intelligence (IJCAI)
  • Baidu-Feed (collected from the mobile Baidu app), and
  • Baidu-news (collected from the mobile Baidu app).

Code:

Other write-ups:

Summary of the main contributions:

  • (1) Hypergraph-based modeling.
  • (2) Learning high-order correlations.
  • (3) Proposes the dual channel hypergraph collaborative filtering (DHCF) model.
    • Users and items are modeled separately and then combined through collaborative filtering.
    • A divide-and-conquer methodology: dividing users and items is exactly what gives rise to the two channels.
  • (4) Proposes the jump hypergraph convolution (JHConv) method.
  • To learn distinct representations for users and items, the divide-and-conquer methodology is introduced to CF, which can integrate users and items together for recommendation while still maintaining their specific properties, leading to a dual channel collaborative filtering framework.
  • We propose to employ the hypergraph for explicitly modeling the high-order correlations among users and items. We also propose a novel jump hypergraph convolution (JHConv) method to efficiently propagate the embeddings by additionally introducing prior information.

ABSTRACT

  • Collaborative filtering (CF) is one of the most popular and important recommendation methodologies at the heart of numerous recommender systems today.
  • Although widely adopted, existing CF-based methods, ranging from matrix factorization to the emerging graph-based methods, suffer inferior performance, especially when the data for training are very limited.
  • In this paper, we first pinpoint the root causes of such deficiency and observe two main disadvantages that stem from the inherent designs of existing CF-based methods, i.e.,
    • 1) inflexible modeling of users and items, and
    • 2) insufficient modeling of high-order correlations among the subjects.
  • Under such circumstances, we propose a dual channel hypergraph collaborative filtering (DHCF) framework to tackle the above issues.
    • First, a dual channel learning strategy, which holistically leverages the divide-and-conquer strategy, is introduced to learn the representations of users and items so that these two types of data can be elegantly interconnected while still maintaining their specific properties.
    • Second, the hypergraph structure is employed for modeling users and items with explicit hybrid high-order correlations. The jump hypergraph convolution (JHConv) method is proposed to support the explicit and efficient embedding propagation of high-order correlations.
    • Comprehensive experiments on two public benchmarks and two new real-world datasets demonstrate that DHCF can achieve significant and consistent improvements against other state-of-the-art methods.

1 INTRODUCTION

  • (1) The ever-growing demand for valuable, attractive and personalized information has driven the development and deployment of various recommender systems in diverse fields [6, 16]. The core of a recommender system is a series of recommendation algorithms that efficiently sift the exploding information for users based on their personal characteristics. Collaborative filtering (CF) [22] is one of the most popular and widely-adopted methods in both industry and research communities.

  • (2) In a nutshell, CF holds a basic assumption when providing recommendations to users: those who behave similarly (e.g., visiting the same website frequently) are likely to share similar preferences on items (e.g., music, videos, websites). To fulfill this, a typical CF-based method performs a two-step strategy:

    • it first distinguishes similar users and items by leveraging the historical interactions;
    • then, based on the information gathered above, it generates recommendations for specific users.
  • (3) Specifically, existing CF methods can be classified into three categories.

    • The first kind of CF methods [29] are user-based methods that yield recommendations merely based on similarities between users, i.e., user-user correlations that describe the relationship between different users' interactions with the same item.
    • Similarly, the item-based methods, as the second kind of methods [20], only employ the item-item correlations for recommendation.
    • Both user-based and item-based methods only adopt part of the historical information when predicting the items attractive to the users, thus inevitably bearing inferior performance.
    • The third kind of CF methods, including matrix factorization (MF) [14] and graph-based methods [2, 25], have made efforts to indiscriminately integrate the users and the items together for recommendation.
      • The matrix factorization method models both users and items in a shared space.
      • The graph-based methods [2, 25] formulate both users and items in a graph, which can jointly investigate the correlations among these users and items and further lead to performance improvement.
  • (4) Although CF methods have been investigated for years, limitations still exist, especially when the prior knowledge for training is very limited. To understand such deficiencies, we dig deep into the inherent mechanisms of existing CF methods and obtain the following observations of their limitations:

    • Inflexible modeling of users and items. Although the graph-based CF methods model the users and items as indistinguishable nodes, there exist necessary distinctions between users and items that such modeling ignores. When an item is connected with plenty of users, the item can be rather popular. Contrarily, when a user is linked with diverse items, it does not indicate that the user is popular. Under such circumstances, more flexible modeling of users and items is needed.
    • Insufficient modeling of high-order correlations. High-order correlations between users and items are essential for data modeling. Existing methods attempt to incorporate high-order correlations, while the employed graph structure has constraints on high-order correlation modeling and processing, as only pairwise connections can be represented in a graph.
  • (5) These two issues may become more severe in the case where the training data is very limited.

  • (6) To tackle the above issues, we propose a dual channel hypergraph collaborative filtering (DHCF) framework, which can learn better high-order representations for users and items.

  • (7) To handle the distinct representation issues for users and items, we employ the divide-and-conquer strategy into the modeling process so as to integrate users and items together while still maintaining their individual properties.

  • (8) Specifically, as depicted in Figure 1, first we construct multiple connection groups based on the given data for both users and items.

    • Here the connection generation rules can be regarded as a new perspective to describe the raw data, which can be defined flexibly.
      • For instance, it can associate users who have similar behaviors but without direct connections, and thus the relation constructed based on such an association rule in a connection group can represent a high-order correlation, leading to a hyperedge accordingly.
    • Based on these generated connection groups, i.e., hyperedges, we can construct two hypergraphs for users and items, i.e., the two channels' representations, respectively.
  • (9) Here, a novel jump hypergraph convolution (JHConv) is introduced, which aggregates the embeddings of neighbors and additionally introduces prior information to efficiently conduct information propagation on the hypergraph. The learned representations can be further integrated to generate recommendations.
    [Figure 1: Comparison between graph-based CF and the proposed dual channel hypergraph CF]

  • (10) Figure 1 provides a comparison between graph-based CF and the proposed dual channel hypergraph CF.

    • As shown in the figure, given the raw user-item connections, graph-based methods generate a graph structure to learn the representation and the recommendation results.
    • Different from these methods, the proposed DHCF framework can learn the representations of users and items using the high-order information in two hypergraphs, respectively.
  • The two hypergraphs, i.e., the user hypergraph and the item hypergraph, can be more flexible in modeling complex data correlations and in incorporating different types of data. We have conducted extensive experiments on two public benchmarks and two new real-world datasets to evaluate the performance of the proposed DHCF framework.

  • (11) The main contributions are summarized as follows:

    • To learn distinct representations for users and items, the divide-and-conquer methodology is introduced to CF, which can integrate users and items together for recommendation while still maintaining their specific properties, leading to a dual channel collaborative filtering framework.
    • We propose to employ the hypergraph for explicitly modeling the high-order correlations among users and items. We also propose a novel jump hypergraph convolution (JHConv) method to efficiently propagate the embeddings by additionally introducing prior information.

2 PRELIMINARY OF HYPERGRAPH

We first briefly introduce the preliminaries of hypergraph.

2.1 Hypergraph Definition

  • (1) For a graph structure, an edge only connects two vertices, while in a hypergraph, a hyperedge connects two or more vertices [1]. A hypergraph is usually defined as $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ represents the vertex set and $\mathcal{E}$ denotes the hyperedge set. An incidence matrix $H \in \{0, 1\}^{|\mathcal{V}| \times |\mathcal{E}|}$ is used to represent the connections among vertices on the hypergraph, with each entry $h(v, e)$ indicating whether a vertex $v$ is connected by a hyperedge $e$:
    $$h(v, e) = \begin{cases} 1, & \text{if } v \in e \\ 0, & \text{otherwise} \end{cases}$$
  • (2) For a vertex $v \in \mathcal{V}$ and a hyperedge $e \in \mathcal{E}$, their degrees can be defined, respectively, as $d(v) = \sum_{e \in \mathcal{E}} h(v, e)$ and $\delta(e) = \sum_{v \in \mathcal{V}} h(v, e)$. Further, two diagonal matrices $D_v \in \mathbb{N}^{|\mathcal{V}| \times |\mathcal{V}|}$ and $D_e \in \mathbb{N}^{|\mathcal{E}| \times |\mathcal{E}|}$ represent the vertex degrees and the hyperedge degrees, respectively.
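As a concrete illustration of these definitions, below is a minimal NumPy sketch (ours, not from the paper) that builds a toy incidence matrix $H$ and derives the degree matrices $D_v$ and $D_e$ from it.

```python
import numpy as np

# Toy hypergraph: 4 vertices, 2 hyperedges.
# Hyperedge e1 = {v0, v1, v2}, hyperedge e2 = {v2, v3}.
H = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1]], dtype=float)   # |V| x |E| incidence matrix

d_v = H.sum(axis=1)        # vertex degrees d(v) = sum_e h(v, e)  -> [1, 1, 2, 1]
delta_e = H.sum(axis=0)    # hyperedge degrees delta(e) = sum_v h(v, e) -> [3, 2]

D_v = np.diag(d_v)         # |V| x |V| vertex degree matrix
D_e = np.diag(delta_e)     # |E| x |E| hyperedge degree matrix
```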

2.2 Hypergraph Convolution

  • (1) Given a hypergraph constructed from the dataset, the incidence matrix $H$ and the vertex features $X^{(l)}$ are fed into a hypergraph neural network (HGNN) [3], whose hypergraph convolutional layer (HGNNConv) is defined as:
    $$X^{(l+1)} = \sigma\left(D_v^{-1/2} H D_e^{-1} H^\top D_v^{-1/2} X^{(l)} \Theta^{(l)}\right) \tag{2}$$

    • in which $\Theta^{(l)} \in \mathbb{R}^{C^{(l)} \times C^{(l+1)}}$ is a trainable parameter,
    • $C^{(l)} / C^{(l+1)}$ is the input/output vertex feature dimension at layer $l$,
    • $\sigma(\cdot)$ denotes an arbitrary nonlinear activation function (e.g., ReLU), and
    • $X^{(l+1)}$ is the output of layer $l$, which can be further used for deeper vertex feature extraction or final vertex classification.
  • (2) It can also be viewed as a two-stage refinement performing a vertex-hyperedge-vertex feature transform upon the hypergraph structure.

    • The hypergraph incidence matrix $H$ defines the message passing paths from hyperedges (columns) to vertices (rows).
    • Similarly, $H^\top$ defines the paths from vertices (columns) to hyperedges (rows).
  • (3) Then, we describe hypergraph convolution through the information propagation process on the hypergraph (see the code sketch after this list).

    • The two diagonal matrices $D_v$ and $D_e$ are used for normalization in Eq. (2); they have no impact on the message passing paths on the hypergraph, so we do not discuss them here.
    • First, with message passing guided by $H^\top$, the vertex features are gathered along each hyperedge to form the hyperedge features.
    • Then, by aggregating the related hyperedge features through $H$, refined vertex features are generated.
    • Finally, the trainable $\Theta$ and the nonlinear activation function $\sigma(\cdot)$ are applied.
  • (4) To sum up, compared with a simple graph, a hypergraph naturally possesses the ability to model higher-order connections.

    • Besides, hypergraph convolution can handle high-order correlation structures.
    • As an effective and deep operation, hypergraph convolution enables high-level information interaction among vertices by leveraging the vertex-hyperedge-vertex transform.
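To make the vertex-hyperedge-vertex transform concrete, here is a small NumPy sketch of one HGNNConv-style propagation step. It assumes the symmetric normalization $D_v^{-1/2} H D_e^{-1} H^\top D_v^{-1/2}$ discussed above; the function and variable names are ours, not taken from the HGNN code release.

```python
import numpy as np

def hgnn_conv(H, X, Theta):
    """One hypergraph convolution step: vertex -> hyperedge -> vertex.

    H:     (|V|, |E|) incidence matrix
    X:     (|V|, C_in) vertex features
    Theta: (C_in, C_out) trainable weights
    """
    d_v = H.sum(axis=1)                                 # vertex degrees
    d_e = H.sum(axis=0)                                 # hyperedge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d_v, 1e-12)))
    De_inv = np.diag(1.0 / np.maximum(d_e, 1e-12))

    edge_feat = De_inv @ H.T @ Dv_inv_sqrt @ X          # gather vertex features per hyperedge
    vertex_feat = Dv_inv_sqrt @ H @ edge_feat           # scatter hyperedge features back to vertices
    return np.maximum(vertex_feat @ Theta, 0.0)         # linear transform + ReLU
```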

3 METHODOLOGY

  • In this section, the proposed DHCF framework is introduced. We first briefly introduce the general framework of DHCF. Then we discuss a series of individual components of DHCF in detail. Afterwards, we show the specific configurations and optimizations that we adopt in this method.

3.1 A General DHCF Framework

[Figure 2: The overall framework of DHCF]

  • (1) Figure 2 sketches the overall framework of DHCF.
    • At a high level, DHCF first learns two groups of embeddings for users and items via a dual channel hypergraph framework, upon which DHCF further figures out the user-item preference matrix by calculating the inner product of the users' and items' embedding look-up tables. Based on such a preference matrix, DHCF estimates how likely a user is to be interested in an item.
  • (2) In order to obtain accurate modeling and representation for both users and items, DHCF employs a dual channel hypergraph embedding framework.
    • Specifically, given the initial embedding vector $e_u \in \mathbb{R}^d$ (resp. $e_i \in \mathbb{R}^d$) of a single user (resp. item), where $d$ denotes the dimension of the embedding vector, we refine the embeddings of users and items separately with hypergraph convolution by propagating them on the hypergraphs in two phases:
      • high-order message passing, and
      • joint message updating.

3.1.1 Initialization.

  • (1) Given user-item interactions with $N$ users and $M$ items, we first construct initial representations and hypergraph structures for users and items separately as the inputs of the DHCF framework.

    • Specifically, we create two embedding look-up tables for users and items:
      • $E_u = [e_{u_1}, \dots, e_{u_N}]$
      • $E_i = [e_{i_1}, \dots, e_{i_M}]$
  • (2) The $k$ hyperedge groups $\{\mathcal{E}_{r_1}, \dots, \mathcal{E}_{r_k}\}$ can be generated based on a self-defined association rule list $\{r_1, \dots, r_k\}$ for users and items, respectively. Beyond the interactions directly presented by the observed instances, such association rules can be regarded as a new perspective to describe the raw data.

    • For example, it can associate users who have similar behaviors but without direct connections, and thus the constructed relation in a hyperedge group based on such an association rule is able to capture high-order information rather than pairwise relationships.
  • (3) Then we can construct a hypergraph incidence matrix $H$ with hybrid high-order correlations by fusing the different hyperedge groups:
    $$H = f\left(H_{\mathcal{E}_{r_1}}, \dots, H_{\mathcal{E}_{r_k}}\right)$$

    • in which $f(\cdot)$ indicates the hyperedge fusion operation (for more details refer to Section 3.4.2).
      Now, the users'/items' initial embeddings and hypergraphs are prepared for later propagation.

3.1.2 Phase 1: High-order Message Passing.

  • (1) To aggregate the neighboring messages upon the pre-defined hybrid high-order relations, we perform high-order message passing as follows:
    $$M_u = \mathrm{HConv}(E_u, H_u), \qquad M_i = \mathrm{HConv}(E_i, H_i)$$
    • where $\mathrm{HConv}(\cdot, \cdot)$ indicates an arbitrary hypergraph convolution operation such as HGNNConv. Specifically, the $\mathrm{HConv}(\cdot, \cdot)$ here is purely an information propagation on the hypergraph without any trainable parameters ($\Theta^{(l)}$ in Eq. (2)). The outputs $M_u$ and $M_i$ have learned the complex correlations from their high-order neighbors, respectively. Note that the neighbors here are an abstract notion that goes beyond the direct connections in the historical interactions and can reveal similarity in the latent behavior space. We then jointly update $E_u$ and $E_i$ using $M_u$ and $M_i$.

3.1.3 Phase 2: Joint Message Updating.

  • (1) To distill discriminative information, we conduct joint message updating for users and items, defined as:
    $$E_u^{\mathrm{new}} = \mathrm{JU}(M_u, M_i), \qquad E_i^{\mathrm{new}} = \mathrm{JU}(M_i, M_u)$$
    • where $\mathrm{JU}(\cdot, \cdot)$ can be an arbitrary learnable feed-forward neural network designed to update the first argument with the second one.
  • (2) To sum up, the two above-mentioned phases constitute an integrated DHCF layer, which allows explicitly modeling and encoding high-order correlations on both users and items, and further updates and generates more accurate embeddings through the powerful hypergraph structure. Such refined embeddings can be further applied to various downstream tasks in recommender systems.

3.2 Jump Hypergraph Convolution

  • Inspired by some instructive methods like Weisfeiler-Lehman algorithm [21] and GCN [12], especially GraphSAGE [7], which concatenates a node’s current representation with its aggregated neighborhood vectors to generate the learned representations, we introduce a novel JHConv operator:
    $$X^{(l+1)} = \sigma\left(D_v^{-1/2} H D_e^{-1} H^\top D_v^{-1/2} X^{(l)} \Theta^{(l)} + X^{(l)}\right) \tag{6}$$
    • in which the notations have been introduced in Section 2.2.
Comparison with HGNN [3].
  • Compared with the traditional HGNNConv in [3], JHConv allows the model to simultaneously consider its original features and the aggregated related representations, as can be seen from Equations (2) and (6).
  • On the other hand, such a ResNet-like skip connection enables the model to avoid the information dilution caused by integrating many additional connections.
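The sketch below is one possible reading of JHConv in code: the same parameter-free normalized propagation as HGNNConv, plus a skip connection that adds the input features back. How exactly the skip connection and the weight matrix interact is our assumption rather than the authors' released implementation.

```python
import numpy as np

def normalized_propagation(H, X):
    """Parameter-free part shared by HGNNConv and JHConv:
    D_v^{-1/2} H D_e^{-1} H^T D_v^{-1/2} X."""
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(H.sum(axis=1), 1e-12)))
    De_inv = np.diag(1.0 / np.maximum(H.sum(axis=0), 1e-12))
    return Dv_inv_sqrt @ H @ De_inv @ H.T @ Dv_inv_sqrt @ X

def jhconv(H, X, Theta):
    """Jump hypergraph convolution (our reading of Eq. (6)):
    sigma(prop(X) @ Theta + X). Theta is assumed square so that the
    skip connection (+X) is dimension-compatible."""
    return np.maximum(normalized_propagation(H, X) @ Theta + X, 0.0)
```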

3.3 High-order Connectivity Definition

  • In this subsection, we introduce a way of constructing high-order connectivity on the user-item bipartite graph for users and items, respectively. A user-item bipartite graph can be represented by an incidence matrix $H \in \{0, 1\}^{N \times M}$.

3.3.1 On Users.

Definition 1: Item's k-order reachable neighbors.
  • In a user-item interaction graph, more specifically the bipartite graph, $item_i$ ($item_j$) is $item_j$'s ($item_i$'s) $k$-order reachable neighbor if there exists a sequence of adjacent vertices (i.e., a path) between $item_i$ and $item_j$, and the number of users in this path is smaller than $k$.
Definition 2: Item's k-order reachable users.
  • In an item-user bipartite graph, $user_j$ is $k$-order reachable from $item_i$ if there exists a direct interaction between $user_j$ and $item_k$, and $item_k$ is $item_i$'s $k$-order reachable neighbor.

  • (1) For $item_i$, its $k$-order reachable user set is termed $B_u^k(i)$. Mathematically speaking, a hypergraph can be defined upon a set family, in which each set represents a hyperedge. Thus, a hyperedge here can be built from an item's $k$-order reachable user set. Then we can construct a high-order hyperedge group upon the $k$-order reachable rule among users, which can be formulated as:
    $$\mathcal{E}_{B_u^k} = \left\{ B_u^k(1),\, B_u^k(2),\, \dots,\, B_u^k(M) \right\}$$

  • (2) The $k$-order reachable matrix of items can be denoted as $A_i^k \in \{0, 1\}^{M \times M}$, which takes the following form:
    [Equation: definition of the items' $k$-order reachable matrix $A_i^k$ via $\mathrm{pow}(\cdot, k)$ and $H$]

  • (3) where $\mathrm{pow}(M, k)$ is the function that calculates the $k$-th power of a given matrix $M$, and $H \in \{0, 1\}^{N \times M}$ denotes the incidence matrix of the user-item bipartite graph. Then the hyperedge group incidence matrix $H_{B_u^k} \in \{0, 1\}^{N \times M}$ constructed by the $k$-order reachable rule among users can be formulated as:
    [Equation: definition of $H_{B_u^k}$ from $H$ and $A_i^k$]

  • (4) Supposing we have several hyperedge groups built upon users via the $k$-order reachable rule, the final hybrid high-order connections among users can be represented by the hypergraph $\mathcal{G}_u$ that fuses these hyperedge groups. Due to the advantage of hypergraphs in multi-modality fusion [3], a simple concatenation operation $\cdot \| \cdot$ on the hyperedge groups' incidence matrices can be applied as the hyperedge group fusion $f(\cdot)$. Finally, the hypergraph incidence matrix $H_u$ for users can be formulated as (a code sketch follows at the end of this subsection):
    [Equation: $H_u$ obtained by concatenating the user-side hyperedge group incidence matrices]
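The following NumPy sketch shows one plausible way to realize the construction above: the item-item $k$-order reachability is approximated by binarizing $\mathrm{pow}(H^\top H, k)$, each item's $k$-order reachable user set becomes one hyperedge, and the resulting hyperedge groups are fused by concatenation (with $k = 1, 2$ as in Section 3.4.1). Both the binarization and the use of $H^\top H$ are our interpretation of the definitions, not a quote of the paper's exact formulas.

```python
import numpy as np

def binarize(M):
    # Clip any positive path count to 1 so the result stays a 0/1 incidence-style matrix.
    return (M > 0).astype(float)

def user_hyperedge_group(H, k):
    """Hyperedge group on users built from each item's k-order reachable user set.

    H: (N, M) user-item incidence matrix.
    Returns an (N, M) matrix whose column j is the hyperedge formed by the
    users that are k-order reachable from item j (our reading of the text).
    """
    A_i_k = binarize(np.linalg.matrix_power(H.T @ H, k))  # (M, M) item k-order reachability
    return binarize(H @ A_i_k)                            # (N, M) reachable users per item

def build_user_hypergraph(H, ks=(1, 2)):
    # Hybrid high-order structure for users: H_u = H_{B_u^1} || H_{B_u^2},
    # fused by simple concatenation along the hyperedge axis.
    return np.concatenate([user_hyperedge_group(H, k) for k in ks], axis=1)
```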

3.3.2 On Items.

  • (1) User’s k k k-order reachable neighbors and User’s k k k-order reachable items can be symmetrically defined in the similar way. Specifically, the k k k-order reachable matrix of user can be denoted as A u k ∈ { 0 , 1 } N × N A^k_u \in {\{0, 1\}}^{N \times N} Auk{0,1}N×N which can be written as:
    在这里插入图片描述
  • (2) The hyperedge group incidence matrix $H_{B_i^k} \in \{0, 1\}^{M \times N}$ constructed by the $k$-order reachable rule among items can be formulated as:
    [Equation: definition of $H_{B_i^k}$]
  • (3) Then, the hypergraph incidence matrix $H_i$ for items can be formulated as:

[Equation: $H_i$ obtained by concatenating the item-side hyperedge group incidence matrices]

  • (4) In this way, we can define the high-order connectivity for both users and items. Figure 3 illustrates an example of a user's high-order connectivity for $k = 1$ and $k = 2$, respectively.
    [Figure 3: An example of a user's high-order connectivity for k = 1 and k = 2]

3.4 DHCF Layer Configurations

Here we introduce a specific configuration of the DHCF framework.

3.4.1 Construction of Hybrid High-order Connections.

  • (1) For each user/item, we set $k$ to 1 and 2 respectively to capture its $k$-order reachable neighbors, namely $B^1$ and $B^2$, and then construct two kinds of high-order correlations for each user/item. We then fuse these two high-order correlations to construct the hypergraph for users and the hypergraph for items, whose incidence matrices are thereby denoted as $H_u$ and $H_i$, respectively.
  • (2) It is worth noting that the proposed DHCF framework is generally applicable and extensible. In other words, although we adopt $B^1$ and $B^2$ to explore high-order connectivity information here, our model has no specific requirements in this regard, which means other flexible high-order correlation definitions can also be incorporated into the proposed framework.

3.4.2 Configuration of an Integrated DHCF Layer.

  • Given the user embeddings $E_u$, the item embeddings $E_i$, and the hypergraph incidence matrices $H_u$ for users and $H_i$ for items, we now provide a detailed definition of high-order message passing and joint message updating.
Phase 1:
  • For high-order message passing, we adopt the JHConv introduced in Section 3.2 to propagate user/item embeddings on the hypergraphs while retaining the original information. Here the user representation and the item representation are learned through this dual-channel scheme.
    [Equation: Phase 1 — JHConv applied to $(E_u, H_u)$ and $(E_i, H_i)$]
Phase 2:
  • For joint message updating, we apply a shared fully connected layer to the output of the previous phase. The detailed configuration is formulated as:
    [Equation: Phase 2 — a shared fully connected layer applied to the Phase 1 outputs]
    • where $\cdot \| \cdot$ is the concatenation operation, and
    • $\mathrm{MLP}_1(\cdot)$ denotes a fully connected layer with trainable $\Theta$, which only adopts the first argument of $\mathrm{JU}(\cdot, \cdot)$.
Matrix Form of Propagation Rule
  • To provide a general understanding of the embedding propagation rule on hypergraphs, we give its matrix form here (a code sketch follows this list):
    [Equation: matrix form of the dual channel embedding propagation rule]
    • where $H \in \{0, 1\}^{N \times M}$ is the initial incidence matrix of the user-item bipartite graph;
    • $D_{u_v}$, $D_{u_e}$ and $D_{i_v}$, $D_{i_e}$ are diagonal matrices representing the vertex degrees and hyperedge degrees of the user hypergraph and the item hypergraph, respectively;
    • $E_u^{(l)}$, $E_i^{(l)}$ and $E_u^{(l+1)}$, $E_i^{(l+1)}$ are the input/output user embeddings and item embeddings at layer $l$, respectively.
    • Given the raw incidence matrix $H$ of the user-item bipartite graph, the hypergraphs $H_u$ and $H_i$ with hybrid high-order correlations are first constructed for users and items, respectively.
    • Then $E_u^{(l)}$, $H_u$, $E_i^{(l)}$, and $H_i$ are fed into Phase 1 and Phase 2 to generate the updated user embeddings $E_u^{(l+1)}$ and item embeddings $E_i^{(l+1)}$ for the next propagation or the final link prediction.
    • More detailed descriptions of generating user/item embeddings with the DHCF layer can be found in Appendix A.
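Putting the two phases together, the sketch below shows one possible reading of an integrated DHCF layer: parameter-free JHConv-style propagation on each channel, followed by a fully connected layer shared by users and items. The class and function names, and the placement of the nonlinearity, are our assumptions rather than the authors' released code.

```python
import torch
import torch.nn as nn

def norm_prop(H, X):
    """Parameter-free propagation D_v^{-1/2} H D_e^{-1} H^T D_v^{-1/2} X plus the skip (+X)."""
    Dv = torch.diag(H.sum(dim=1).clamp(min=1e-12).pow(-0.5))
    De = torch.diag(H.sum(dim=0).clamp(min=1e-12).pow(-1.0))
    return Dv @ H @ De @ H.t() @ Dv @ X + X

class DHCFLayer(nn.Module):
    """One dual channel layer: JHConv-style message passing on each hypergraph,
    then a fully connected layer shared by the user and item channels."""
    def __init__(self, dim):
        super().__init__()
        self.shared_fc = nn.Linear(dim, dim)   # joint message updating, shared across channels

    def forward(self, E_u, E_i, H_u, H_i):
        M_u = norm_prop(H_u, E_u)                    # Phase 1 on the user channel
        M_i = norm_prop(H_i, E_i)                    # Phase 1 on the item channel
        E_u_next = torch.relu(self.shared_fc(M_u))   # Phase 2: shared FC + nonlinearity
        E_i_next = torch.relu(self.shared_fc(M_i))
        return E_u_next, E_i_next
```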

3.5 Optimization

  • In this paper we focus on limited implicit feedback, which is more pervasive in practice but also more deficient and inherently noisy compared with explicit feedback such as ratings and reviews.
    • Specifically, in implicit feedback, only positive feedback is visible and the rest of the data are all treated as missing data, as there is no way to tell whether a user dislikes or simply neglects an item.
    • Therefore, we opt for pairwise optimization instead of pointwise approaches. It holds the pairwise assumption that a given user $u$ prefers the observed item $i^+$ to the unobserved item $i^-$, and therefore the observed ones should be ranked higher.
    • Despite still being inherently noisy, the sheer volume of unobserved items compensates for this drawback and helps to build robust recommender systems. We further leverage the pairwise Bayesian Personalized Ranking (BPR) [18] optimization criterion as the loss function (a code sketch follows this list):
      $$\mathcal{L}_{BPR} = \sum_{(u, i^+, i^-) \in \mathcal{T}} -\ln \sigma\left(\hat{y}_{u i^+} - \hat{y}_{u i^-}\right) + \lambda \lVert \Theta \rVert_2^2$$
    • where $\mathcal{T}$ indicates the pairwise training data, in which it is assumed that user $u$ prefers item $i^+$ to item $i^-$, and $\hat{y}_{ui}$ is the predicted preference of user $u$ for item $i$ (the inner product of their final embeddings);
    • $\sigma(\cdot)$ is the logistic sigmoid function;
    • $\Theta$ denotes all model parameters; and
    • $\lambda$ is the model-specific regularization parameter to avoid overfitting.
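A minimal PyTorch sketch of this BPR objective, assuming the preference score is the inner product of user and item embeddings; the function name and the regularization handling are illustrative.

```python
import torch

def bpr_loss(user_emb, pos_item_emb, neg_item_emb, params, lam=1e-4):
    """Pairwise BPR loss: -log sigmoid(score(u, i+) - score(u, i-)) + L2 regularization.

    user_emb, pos_item_emb, neg_item_emb: (batch, d) embeddings of sampled triples (u, i+, i-).
    params: iterable of model parameters to regularize; lam is the regularization weight.
    """
    pos_scores = (user_emb * pos_item_emb).sum(dim=1)    # inner-product preference for i+
    neg_scores = (user_emb * neg_item_emb).sum(dim=1)    # inner-product preference for i-
    ranking = -torch.nn.functional.logsigmoid(pos_scores - neg_scores).mean()
    reg = sum(p.pow(2).sum() for p in params)
    return ranking + lam * reg
```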

4 EXPERIMENTS

  • To evaluate the effectiveness of the proposed method, we have conducted experiments on two public benchmarks and two newly-collected real-world datasets. Ablation studies are also presented for better investigation of our proposed model.

4.1 Datasets and Evaluation Metrics

  • (1) Four datasets are employed in our experiments, including
    • MovieLens100K
    • CiteUlike-A,
    • Baidu-Feed, and
    • Baidu-news.
  • (2) The former two datasets are publicly used benchmarks, and the latter two are newly-collected ones from Baidu app. For MovieLens-100K and CiteUlike-A datasets, we transform the original explicit feedback into implicit data, where each entry is marked as 0 or 1 indicating whether the user has rated or cited. The characteristics of the four datasets are summarized in Table 1.
    [Table 1: Statistics of the four datasets]
MovieLens-100K.
  • The MovieLens dataset [8] has been widely used for evaluation in different recommendation tasks. We select the implicit-feedback version of MovieLens-100K, which contains the interactions of 943 users with 1,682 movies on the MovieLens website.
CiteUlike-A.
  • The CiteUlike dataset [24] is another commonly-used benchmark in recommender systems. It has two versions collected independently from CiteULike and Google Scholar, i.e., CiteUlike-A and CiteUlike-T. In our experiments, the CiteUlike-A data is used.
Baidu-feed.
  • The Baidu app is a one-stop mobile application with more than 700 million active users, comprising a series of vertical services such as web search, social networking, and entertainment. Here we first present a formal definition of the feed-stream recommendation. The feed-stream recommendation is a kind of service that continuously integrates the contents (e.g., news and streaming media) that users may be interested in, and further pushes them to the client side. We collected one-day feed-stream click traces of 2,000 anonymous users on the Baidu app in December 2019, i.e., the Baidu-feed dataset. Different from other public benchmark datasets, the user-item interactions in this dataset are not gathered from one specific vertical service but from a one-stop mobile application (the Baidu app). Therefore, the items in the Baidu-feed dataset possess high heterogeneity in terms of data type (including videos, news, ads, etc.), which is consistent with the real data that users acquire in the wild.
Baidu-news.
  • This dataset is also collected from the mobile Baidu app, but different from the Baidu-feed dataset, its items refer exclusively to news. Note that the Baidu-news dataset is not a subset of the Baidu-feed dataset.

  • (3) In the experiments, 10% of all observed interactions of each user are randomly selected for training and the remaining data are used for testing. Such a setting increases the difficulty of the CF task, as the model can only fetch very limited observed interactions. In addition, due to the high sparsity of the data, it can well evaluate a model's ability to dig out useful information from limited implicit datasets. For all four datasets, we use the 20-core setting to ensure data quality, so that each user has at least two interactions available for training. During the training process, each positive instance (an observed interaction) is paired with a negative item sampled from the unobserved interactions.

  • (4) For evaluation, four widely-used metrics, including precision@K, recall@K, normalized discounted cumulative gain (ndcg@K), and hit@K, are calculated to comprehensively compare the performance of different methods on the top-$K$ recommendation and ranking task. In our experiments, $K$ is set to 20.
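For reference, here is a small sketch of how recall@K and ndcg@K can be computed for a single user from a ranked item list; this is the standard binary-relevance formulation, not the authors' evaluation script.

```python
import numpy as np

def recall_at_k(ranked_items, ground_truth, k=20):
    """Fraction of this user's held-out items that appear in the top-k ranked list."""
    hits = len(set(ranked_items[:k]) & set(ground_truth))
    return hits / max(len(ground_truth), 1)

def ndcg_at_k(ranked_items, ground_truth, k=20):
    """Binary-relevance NDCG: DCG of the top-k list divided by the ideal DCG."""
    gt = set(ground_truth)
    dcg = sum(1.0 / np.log2(rank + 2) for rank, item in enumerate(ranked_items[:k]) if item in gt)
    idcg = sum(1.0 / np.log2(rank + 2) for rank in range(min(len(gt), k)))
    return dcg / idcg if idcg > 0 else 0.0
```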

4.2 Compared Methods

Four recent competitive methods are selected for comparison.

  • BPR-MF [18] is the classic and popular matrix factorization method using BPR as the loss function, which optimizes the pairwise ranking between the positive instances and sampled negative items.
  • GC-MC [2] employs a graph auto-encoder framework for the matrix completion task in recommender systems, in which a graph convolution layer is introduced into the encoder to generate user and item embeddings through message passing.
  • PinSage [28] learns node embeddings through a random-walk GCN on the pins-boards graph. Here we apply it to the user-item interaction graph for comparison.
  • NGCF [25] is a state-of-the-art graph-based model, which encodes the collaborative signal into the user-item interaction graph structure. It adopts multiple graph convolution layers and performs embedding propagation to explore high-order connectivity.

4.4.1 Overall Performance Comparison.

  • DHCF yields the consistently best performance on all datasets. In particular, DHCF outperforms the strongest baseline by 6.1%, 28.4%, 172.2%, and 186.8% in terms of recall@20 on the MovieLens-100K, CiteUlike-A, Baidu-feed, and Baidu-news datasets, respectively. We attribute the improvement mainly to the flexible and explicit modeling of hybrid high-order connections for both users and items within the extensible DHCF framework. Meanwhile, it largely cuts down the number of learnable parameters in the model (about one sixth of NGCF), which can effectively prevent overfitting on small datasets. Specifically, NGCF has three embedding propagation layers where each layer has two trainable matrices. By contrast, DHCF only adopts one propagation layer with one learnable matrix, and is thus more lightweight and efficient. We can also observe that when the data density becomes lower, the gains of the proposed method are higher. Lower data density indicates that useful connections are fewer and correlation exploration becomes even more difficult. This result justifies the effectiveness of the proposed dual channel hypergraph method for high-order correlation modeling.
    [Table: Overall performance comparison of all methods on the four datasets]

4.4.2 Performance Comparison on Convergence.

  • Figure 4 shows the convergence curves of DHCF and all baselines, taking recall@20 and ndcg@20 as examples. From the results we can observe that DHCF yields superior performance at every epoch and improves rapidly. The proposed DHCF framework requires about 60 epochs to achieve stable performance.
    [Figure 4: Convergence curves of DHCF and all baselines in terms of recall@20 and ndcg@20]

4.5 Ablation Studies

4.5.1 On the Effect of JHConv.

  • To investigate the effectiveness of JHConv, we consider variants of DHCF with different hypergraph convolution schemes, i.e., JHConv and HGNNConv [3]. The experimental results are reported in Table 3. From these results we can observe that DHCF-JHConv consistently outperforms DHCF-HGNNConv, while DHCF-HGNNConv still yields competitive performance compared with the other baselines. We attribute the improvement to the ResNet-like skip connection during convolution, which HGNNConv does not consider. Such a skip connection can prevent the current node's information from being diluted during message passing, thus achieving effective learning.
    [Table 3: Ablation on the hypergraph convolution scheme (JHConv vs. HGNNConv)]

4.5.2 On the Number of Layers.

  • To explore how the number of layers affects the performance, we vary the model depth in the range of $\{1, 2, 3\}$. Experimental results are shown in Table 4, where DHCF-1 denotes the proposed model with one embedding propagation layer, and the notation is similar for -2 and -3. We observe that DHCF-1 outperforms DHCF-2 on the CiteUlike-A dataset, while on MovieLens-100K DHCF-2 shows a slight improvement over DHCF-1. Such performance fluctuations are normal and depend on the intrinsic structure of the dataset. Usually, DHCF can yield satisfactory performance with only one hidden layer. Stacking multiple layers, such as DHCF-3 on MovieLens-100K, may introduce noise and further lead to overfitting. Nevertheless, the DHCF framework has the potential to be applied in large-scale sparse real-world scenarios without the necessity of stacking a very deep GCN-based architecture to capture high-order connectivity.
    [Table 4: Ablation on the number of DHCF layers]

4.5.3 On the Effect of High-order Connectivity.

  • We establish different high-order associations on users and items, and conduct experiments to explore the effect of different combinations of high-order correlations on performance. Experimental results are shown in Table 5. From these results we can observe that the configuration simultaneously considering $B^1$ and $B^2$ for both users and items yields the best performance, which verifies the effectiveness of modeling hybrid high-order connections.
    [Table 5: Ablation on the combinations of high-order connectivity]

5 RELATED WORK

5.1 Model-based CF.

  • (1) A typical model-based CF framework aims to develop a model through partially observed user-item interactions, enabling the recommender system to identify more complex patterns and further generate recommendations accordingly.
  • (2) There are several model-based CF algorithms, including
    • association algorithms [15],
    • clustering approaches [23],
    • Bayesian networks [27], and
    • latent factor models like singular value decomposition (SVD) [13, 17].
  • (3) Among these methods, matrix factorization (MF) [14, 19] is one of the most popular and effective CF methods due to its stable performance and high scalability. It projects users and items into a shared latent factor space, and then estimates possible interactions through the inner products of their vector representations. Despite being widely adopted, MF still bears the shortcoming that it fails to tell the unobserved positive pairs apart from the unobserved negative ones.
  • (4) Further, Rendle et al. devised BPR [18], which uses the pairwise log-sigmoid function to directly optimize the AUC in a Bayesian manner. In addition, considering that MF's linear inner product cannot comprehensively reflect the intricate inter-relationships between users and items, several works have investigated non-linear interaction functions [10] with the support of deep neural networks. Typically, He et al. [10] present Neural MF, which leverages a multi-layer perceptron to learn a user-item interaction function as an alternative to the linear inner product.
  • (5) Although some of the aforementioned approaches achieve decent performance, they still fall short of generating representative embeddings for users and items. Such deficiency lies in that they do not directly encode the collaborative signals, and thus cannot ensure that users and items which are potentially connected have similar properties in the embedding space.

5.2 Graph-based CF.

  • Recently, many researchers have explored graph structures to model user-item interactions, with each user or item represented as a vertex in the graph and each interaction as an edge between them.
  • One typical prior work is SimRank [11], whose basic idea is that two objects referenced by similar objects tend to be similar too.
  • In addition, some other works borrow the idea of label propagation, for example, ItemRank [5] and BiRank [9].
  • Recently, the emerging GCN [12] has injected new vitality into CF.
    • GC-MC [2] applies the GCN to user-item interactions.
    • PinSage [28] embeds the GCN into commercial recommender systems by adopting the idea of GraphSAGE [7], coupled with employing multiple GCN layers on the pins-boards graph.
    • HOP-Rec [26] is proposed based on the original BPR-MF. It explicitly defines high-order connectivities on the CF model, which, however, are only adopted for enriching the dataset. In order to endow GNNs with high-order connectivities, NGCF [25] empowers the collaborative signal with high-order connectivity during the embedding propagation process.

6 CONCLUSIONS AND FUTURE WORK

  • In this work, we propose a DHCF framework, which can model latent collaborative signals in a high-order, divide-and-conquer way to address CF with implicit feedback.
  • Specifically, it allows explicitly modeling hybrid high-order connections for users and items respectively in hypergraphs, and thus can yield more accurate embeddings using the proposed JHConv operator, which conducts efficient information propagation on hypergraphs.
  • Extensive experiments on two public benchmarks and two new real-world datasets demonstrate significant improvements over competitive baselines. As shown in the evaluations, we can conclude that the high-order information is useful for data representation and the proposed dual channel hypergraph model is effective.
  • Future work will focus on the learnable parameters that can balance the weight of hyperedge groups defined by different association rules to make DHCF more flexible.


Appendix

A ALGORITHM OF DHCF FRAMEWORK

Algorithm 1 details how stacked DHCF layers generate user/item embeddings (a sketch follows Algorithm 1 below). We first construct hybrid high-order structures with specified rules (e.g., the $k$-order reachable rule) for users and items, respectively. Then their initial embeddings $E_u^{(0)}$, $E_i^{(0)}$ and hypergraph incidence matrices $H_u$, $H_i$ are generated. After that, $L$ DHCF layers are adopted to explore the high-order correlations among users/items. Each layer includes two phases.
- Phase 1: High-order Message Passing. Here JHConv is applied to incorporate high-order structure information into the user/item embeddings, respectively.
- Phase 2: Joint Message Updating. A shared learnable weight $W^{(l)}$ and bias $b^{(l)}$ are employed to jointly generate expressive embeddings for users and items. After $L$ layers of propagation, the initial embedding and the output embedding of each layer are combined to generate the final representation of each user/item.

[Algorithm 1: Generating user/item embeddings with stacked DHCF layers]
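Below is a sketch of how stacked DHCF layers could be chained and how the per-layer outputs are concatenated with the initial embeddings to form the final representations, following the description of Algorithm 1 above. The `DHCFLayer` module refers to the sketch given in Section 3.4.2; everything else here is illustrative.

```python
import torch

def dhcf_forward(E_u0, E_i0, H_u, H_i, layers):
    """Run L stacked DHCF layers and concatenate all intermediate embeddings.

    E_u0, E_i0: initial user/item embedding tables.
    layers: a list of DHCFLayer modules (see the sketch in Section 3.4.2).
    """
    user_outs, item_outs = [E_u0], [E_i0]
    E_u, E_i = E_u0, E_i0
    for layer in layers:
        E_u, E_i = layer(E_u, E_i, H_u, H_i)   # one round of Phase 1 + Phase 2
        user_outs.append(E_u)
        item_outs.append(E_i)
    # Final representations: initial embedding || output of every layer.
    final_u = torch.cat(user_outs, dim=1)
    final_i = torch.cat(item_outs, dim=1)
    # User-item preference matrix via inner product (cf. Figure 2).
    return final_u, final_i, final_u @ final_i.t()
```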

B HYPERPARAMETER SETTINGS

  • In our experiments, we adopt the same training settings for all methods, as shown in Table 1. The initial learning rate is 0.001 and is decayed by 0.5 at epochs 10, 40, 60, and 80 (a configuration sketch is given after the tables below). Table 2 illustrates the detailed hyperparameter settings of each model, such as the number of layers and the embedding dimension (in Table 2, "dim." abbreviates dimension). For all methods, we concatenate the output of each layer with the initial embeddings to generate the final representations of both users and items.

[Tables 1 and 2 (appendix): training settings shared by all methods and per-model hyperparameter settings]
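The reported schedule (initial learning rate 0.001, halved at epochs 10, 40, 60, and 80) maps onto a standard PyTorch setup such as the sketch below; the choice of the Adam optimizer and the stand-in model are our assumptions.

```python
import torch
import torch.nn as nn

model = nn.Linear(64, 64)  # stand-in for the recommender model being trained
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)      # initial LR 0.001 (Adam is our assumption)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[10, 40, 60, 80], gamma=0.5)          # halve the LR at epochs 10, 40, 60, 80

for epoch in range(100):
    # ... one epoch of BPR training: forward pass, loss.backward(), optimizer.step() ...
    scheduler.step()                       # apply the decay schedule once per epoch
    print(epoch, scheduler.get_last_lr())  # inspect the current learning rate
```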
