2019_RecSys_Deep Social Collaborative Filtering

[Paper reading notes] 2019_RecSys_Deep Social Collaborative Filtering

Paper download link: https://doi.org/10.1145/3298689.3347011
Venue: RecSys
Publish time: 2019
Authors and affiliations:

  • Wenqi Fan, Department of Computer Science, City University of Hong Kong, wenqifan03@gmail.com
  • Yao Ma, Data Science and Engineering Lab, Michigan State University, mayao4@msu.edu
  • Dawei Yin, JD.com, yindawei@acm.org
  • Jianping Wang, Department of Computer Science, City University of Hong Kong, jianwang@cityu.edu.hk
  • Jiliang Tang, Data Science and Engineering Lab, Michigan State University, tangjili@msu.edu
  • Qing Li, Department of Computing, The Hong Kong Polytechnic University, csqli@comp.polyu.edu.hk

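Datasets: described in the main text

Code:

Other:

Articles written by others

Brief summary of the innovations: the user embedding and item embedding are obtained as in traditional CF; (1) the main work is on the rating embedding, (2) two attention mechanisms are introduced, and (3) the Bi-LSTM from NLP is brought in.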

  • (1) We have presented a Deep Social Collaborative Filtering (DSCF) framework, which can exploit social information from various aspects for recommendations.
    • Particularly, we propose to utilize random walks to generate item-aware social sequences, which consider information from not only direct neighbors but also distant neighbors.
    • In addition, we also introduce a novel way to capture neighbors’ opinions when modeling user-item interactions.
    • Finally, a Bi-LSTM with an attention mechanism is proposed to extract features from the social sequences.
    • Our experiments reveal that the item-aware sequences and the opinion information play a crucial role in modeling social information.
  • As in traditional collaborative filtering methods, we embed users and items into low-dimensional latent vectors. The embeddings for user $u_i$ and item $v_j$ are denoted as $p_{[i]} \in \mathbb{R}^d$ and $q_{[j]} \in \mathbb{R}^d$ respectively, where $d$ is the length of the embedding.

ABSTRACT

  • (1) Recommender systems are crucial to alleviate the information overload problem in online worlds. Most modern recommender systems capture users’ preferences towards items via their interactions, based on collaborative filtering techniques. In addition to the user-item interactions, social networks can also provide useful information for understanding users’ preferences, as suggested by social theories such as homophily and influence.
  • (2) Recently, deep neural networks have been utilized for social recommendations, leveraging both the user-item interactions and the social network information.
  • (3) However, most of these models cannot take full advantage of the social network information. They only use information from direct neighbors, but distant neighbors can also provide helpful information.
    • Meanwhile, most of these models treat neighbors’ information equally without considering the specific recommendation case.
    • However, for a specific recommendation case, the information relevant to the specific item would be helpful.
    • Besides, most of these models do not explicitly capture the neighbors’ opinions on items for social recommendations, while different opinions could affect the user differently.
  • (4) In this paper, to address the aforementioned challenges, we propose DSCF, a Deep Social Collaborative Filtering framework, which can exploit social relations from various aspects for recommender systems.
  • (5) Comprehensive experiments on two real-world datasets show the effectiveness of the proposed framework.

CCS CONCEPTS

• Information systems → Social recommendation; • Computing methodologies → Neural networks.

KEYWORDS

Social Recommendation, Recommender Systems, Social Network, Recurrent Neural Network, Random Walk, Neural Networks

1 INTRODUCTION

  • (1) Recommender systems play a crucial role in alleviating information overload in the era of information explosion. Collaborative filtering is one of the most popular techniques for building modern recommender systems; it models users’ preferences towards items by utilizing the history of user-item interactions such as ratings [27].

    • In addition to the user-item interactions, social relations between users provide another stream of potential information about users’ preferences.
    • As argued in social theories, people in social networks are influenced by their social connections, which leads to the homophily phenomenon of similar preferences among social neighbors [3, 8, 10, 23].
    • More specifically, information diffuses through social interactions, and users tend to acquire and disseminate information through social networks. Thus, social relations can play an important role in describing the preferences of users, which, in turn, can help build good recommender systems.
    • In fact, social relations have been shown to boost the performance of recommender systems [8, 10, 16, 32].
  • (2) Recent years have witnessed the great success of deep neural networks in various areas such as computer vision (CV) [39], speech recognition [15] and Natural Language Processing (NLP) [17]. It is not surprising that deep neural networks are adopted to enhance recommender systems.

    • Some recently proposed recommender systems use deep neural networks as feature learning tools to extract useful features from auxiliary information such as text descriptions of items [5, 18, 37], audio of music [34, 42] and visual information of images [45],
    • while others [14] try to utilize deep neural networks to capture the non-linearity of user-item interactions. There are some recent works utilizing deep neural networks for social recommendations [6, 8–10, 40].
      • For example, GraphRec [10] proposes a graph neural network framework for social recommendation, which aggregates both user-item interaction information and social interaction information when performing prediction;
      • DASO [8] harnesses the power of adversarial learning to dynamically generate “difficult” negative samples and learn the bidirectional mappings between the social domain and the item domain.
  • (3) Although the aforementioned deep social recommender systems use social network information to enhance recommendation performance, they do not take full advantage of it.

    • First, most of them only involve direct neighbors, while information from users that are a few hops away could also be helpful [31, 32]. The reasons are as follows:
      • (1) information diffuses through the social network and users might be affected by indirect neighbors; and
      • (2) users might refer to distant neighbors (or weak ties) when the direct neighbors cannot share useful information. Therefore, it is desirable to consider distant social relations for recommender systems.
    • Second, most of the aforementioned methods treat neighbors’ information equally for all recommendation cases. However, not all information from neighbors is useful when the recommender system is performing a specific recommendation.
      • For example, when predicting whether a user will purchase an iPhone X, the interactions between his/her friends and the iPhone X or other iPhone-related items might be helpful, while the interactions between his/her friends and Nike shoes might not be relevant. Therefore, it is necessary to filter information from neighbors.
    • Finally, most deep social recommender systems do not consider the users’ opinions towards items, which are usually expressed in the form of reviews or ratings. It is obvious that bad and good opinions from a user’s friends would affect the user’s decision in tremendously different ways. Hence, it is desirable to carefully consider the opinions expressed in user-item interactions.
  • (4) While sufficiently exploiting social network information for recommendations has great potential, it faces tremendous challenges.

    • First, the social interactions in distant social relations are complex, and it is difficult to properly extract helpful information for recommendations.
    • Second, it is not trivial to select relevant information from neighbors, as they could have interactions with many different items.
    • Finally, it is challenging to capture the user’s opinions while modeling the user-item interactions.
  • (5) In this paper, to tackle the aforementioned challenges, we propose a deep social collaborative filtering framework, DSCF, which can sufficiently exploit social network information for recommendations. Our contributions can be summarized as follows:

    • We propose a principled way based on deep neural networks to extract helpful information from distant social relations for recommendations;
    • We introduce a novel way to capture users’ opinions while modeling user-item interactions;
    • We propose a deep social collaborative filtering framework which can sufficiently exploit social network information for recommendations; and
    • We conduct comprehensive experiments on two real-world datasets to show the effectiveness of the proposed framework.
  • (6) The remainder of this paper is organized as follows. We introduce the proposed framework in Section 2. In Section 3, we conduct experiments on two real-world datasets to illustrate the effectiveness of the proposed method. In Section 4, we review work related to our framework. Finally, we conclude our work with future directions in Section 5.

2 THE PROPOSED FRAMEWORK

[Figure 1: Overview of the proposed DSCF framework]

  • (1) In this section, we introduce the proposed deep social collaborative filtering framework DSCF. As discussed earlier, to exploit social networks for recommendations, we need to

    • (a) consider information from not only direct neighbors but also distant neighbors;
    • (b) select relevant information from each neighbor for recommending a specific item; and
    • (c) capture neighbors’ opinions towards items when modeling user-item interactions.
  • An overview of the proposed framework is shown in Figure 1. It consists of four layers: the random walk layer, which is designed to address challenges (a) and (b); the embedding layer, which is designed to address challenge (c); the sequence learning layer; and the output layer. Next we give the details of each layer.

  • (2) Before introducing the details of each layer, we first introduce the definitions and notations that are used throughout the paper.

    • Let $U = \{u_1, u_2, \ldots, u_N\}$ and $V = \{v_1, v_2, \ldots, v_M\}$ denote the sets of users and items respectively,
    • where $N$ is the number of users,
    • and $M$ is the number of items.
    • Let $R \in \mathbb{R}^{N \times M}$ be the rating matrix (or the user-item interaction matrix),
      • where the $(i, j)$-th element $r_{i,j}$ is the rating score of item $v_j$ given by user $u_i$.
      • If user $u_i$ has not rated item $v_j$, then $r_{i,j}$ is set to 0, which means the rating is unknown.
    • The social network between users can be described by a matrix $T \in \mathbb{R}^{N \times N}$,
      • where $T_{i,j} = 1$ if there is a social relation between user $u_i$ and user $u_j$, and 0 otherwise.
    • Given the rating matrix $R$ and the social network $T$, we aim to predict the unknown ratings in $R$.
    • As in traditional collaborative filtering methods, we embed users and items into low-dimensional latent vectors. The embeddings for user $u_i$ and item $v_j$ are denoted as $p_{[i]} \in \mathbb{R}^d$ and $q_{[j]} \in \mathbb{R}^d$ respectively, where $d$ is the length of the embedding.
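To make the notation concrete, here is a minimal sketch with made-up toy data (3 users, 3 items; the values are hypothetical and not from the paper). It only shows how the unknown entries of $R$ and the neighbor lists implied by $T$ could be read off.

```python
# Toy data (hypothetical): 3 users, 3 items. R[i][j] is the rating of item j
# by user i, with 0 meaning "unknown"; T[i][j] == 1 marks a social relation.
R = [
    [5, 0, 3],
    [0, 4, 0],
    [1, 0, 0],
]
T = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

# Entries with r_ij == 0 are the unknown ratings the model has to predict.
unknown = [(i, j) for i, row in enumerate(R) for j, r in enumerate(row) if r == 0]

# Adjacency lists derived from T: neighbors[i] holds every j with T_ij == 1.
neighbors = {i: [j for j, t in enumerate(row) if t == 1] for i, row in enumerate(T)}

print(unknown)
print(neighbors)
```

The adjacency-list form of $T$ is simply a convenient representation for the random walks of Section 2.1; the paper itself only defines the matrix.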

2.1 The Random walk layer: generating item-aware social sequences

  • (1) In social recommendation, when we try to perform recommendation for a given user $u$, not only his/her direct neighbors can provide useful information, but also his/her distant neighbors that are within a few hops (i.e., neighbors in his/her local neighborhood) can help.

    • Furthermore, neighbors at different distances from user $u$ are likely to be of different importance for the recommendation. Thus, it is also necessary to differentiate the neighbors of $u$ according to their distance to user $u$ when including them for recommendations.
    • Random walk is a popular tool for exploring the local neighborhood of networks [20, 30]. Additionally, a random walk explores the neighborhood in the form of node sequences (user sequences) [12, 24], which naturally maintains the order of neighbors according to their distance to user $u$. Thus, we can effectively utilize random walks to generate distant user sequences from social networks.
    • More specifically, a user sequence can be generated by a random walk starting from user $u$ and ending after $l$ steps, where $l$ is the length of the random walk.
    • The generated user sequence can be denoted as $S^u_{(i)} = \{u_{[1]}, \ldots, u_{[l]}\}$, where the subscript $(i)$ indicates that $S^u_{(i)}$ is the $i$-th user sequence generated for user $u$ (we need to generate multiple user sequences to sufficiently explore the neighborhood of $u$), and $[k]$ means that user $u_{[k]}$ is the $k$-th user in the user sequence.
  • (2) While the user sequences contain the information of neighbors, they are not specific to a given recommendation case, i.e., predicting the preference of user $u$ for item $v$, as such information is shared by all the recommendation cases involving user $u$.

    • However, not all information from the neighbors is helpful for recommending the specific item $v$. Only the information related to item $v$ would be useful.
    • Thus, we need to select an item related to item $v$ for each user in the generated user sequences and form an item-aware social sequence, denoted as $S^{u,v}_{(i)} = \{(u_{[1]}, v_{[1]}), \ldots, (u_{[l]}, v_{[l]})\}$.
    • Note that only the most relevant item is exploited for a specific recommendation case. The reasons are two-fold.
      • First, the most relevant item matters most for the decision on the target item (item $v$), while other items may not be helpful since they may bring in noise.
      • Second, multiple user sequences are generated by the random walk process to sufficiently explore different relevant items for a specific recommendation case, which, in turn, can help form these item-aware social sequences.
    • More specifically, for each user $u_{[k]}$ in one user sequence, we choose the item $v_{[k]}$ from the set of items that user $u_{[k]}$ has interacted with as:
      $$v_{[k]} = \arg\max_{v_h \in V_{u_{[k]}}} sim(v_h, v)$$
    • where $V_{u_{[k]}}$ denotes the set of items that user $u_{[k]}$ has interacted with, and $sim(v_h, v)$ is a function that measures the similarity between item $v_h$ and item $v$.
  • (3) In this paper, we empirically select cosine similarity as follows
    $$sim(v_h, v) = \frac{f(v_h)^\top f(v)}{\lVert f(v_h) \rVert \, \lVert f(v) \rVert}$$

    • where the function $f$ generates appropriate features $x_m$ for item $v_m$. Different feature sources, such as textual descriptions, the visual content of images and the user-item interactions, could be used to represent the items.
  • (4) In this paper, we adopt the user-item interactions to represent the items, since auxiliary information such as textual descriptions and visual content is not available. More specifically, we use the item embeddings learned by NeuMF [14] as the item features to measure the similarity between items.

  • (5) The set of all item-aware social sequences generated for predicting the rating of $(u, v)$ is $S^{u,v} = \{S^{u,v}_{(i)}\}_{i=1}^{H}$, where $H$ is the number of social sequences generated for this recommendation case.

  • (6) The advantages of the item-aware social sequences for predicting the interaction between user $u$ and item $v$ are twofold.

    • First, the social sequences contain not only direct neighbors but also distant neighbors.
    • Second, these sequences are specific to the recommendation case of user $u$ and item $v$.
    • An example of the process of generating an item-aware social sequence is shown in Figure 2. We are predicting the rating of user $u_1$ for item $v_3$ (Spider-Man). As shown in the figure, starting from the source user $u_1$, we perform a random walk over the direct neighbors. The random walk generates a possible user sequence, denoted as $S^{u_1}_{(1)} = \{u_{[2]}, u_{[3]}, u_{[6]}, u_{[7]}\}$.
    • For each user in the user sequence, we need to collect the item most similar to $v_3$. The generated item-aware social sequence is $S^{u_1, v_3}_{(1)} = \{(u_{[2]}, v_{[3]}), (u_{[3]}, v_{[5]}), (u_{[6]}, v_{[5]}), (u_{[7]}, v_{[3]})\}$. To prevent clutter, we suppose here that item $v_5$ (Captain America) is the most similar to item $v_3$ (Spider-Man) in our example, and the length of the random walk is 4.
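The random walk layer described above can be sketched in plain Python as follows. This is only an illustration under assumptions, not the authors' implementation: the function names (`random_walk`, `cosine`, `item_aware_sequence`), the toy graph, the rated-item sets and the 2-dimensional item features are all hypothetical. The toy graph is a directed chain so that the walk is deterministic and reproduces the sequence of the Figure 2 example; in the paper the item features come from NeuMF embeddings.

```python
import math
import random


def random_walk(neighbors, start, length, rng):
    """Walk `length` steps from `start` and return the visited users (start excluded)."""
    walk, current = [], start
    for _ in range(length):
        candidates = neighbors.get(current, [])
        if not candidates:
            break  # dead end: stop the walk early (an assumption of this sketch)
        current = rng.choice(candidates)
        walk.append(current)
    return walk


def cosine(x, y):
    """Cosine similarity between two feature vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def item_aware_sequence(walk, rated_items, item_features, target):
    """Pair each user in the walk with the rated item most similar to `target`."""
    sequence = []
    for user in walk:
        items = rated_items.get(user, [])
        if not items:
            continue  # assumption: users without interactions are skipped
        best = max(items, key=lambda v: cosine(item_features[v], item_features[target]))
        sequence.append((user, best))
    return sequence


# Hypothetical toy data mirroring the Figure 2 example.
neighbors = {"u1": ["u2"], "u2": ["u3"], "u3": ["u6"], "u6": ["u7"], "u7": ["u1"]}
rated_items = {"u2": ["v1", "v3"], "u3": ["v1", "v5"], "u6": ["v5"], "u7": ["v3", "v1"]}
item_features = {"v1": [0.0, 1.0], "v3": [1.0, 0.0], "v5": [0.9, 0.1]}

walk = random_walk(neighbors, "u1", 4, random.Random(0))
sequence = item_aware_sequence(walk, rated_items, item_features, "v3")
print(walk)
print(sequence)
```

Calling `random_walk` $H$ times from the same start user on a real (undirected) social graph would give the $H$ different sequences of $S^{u,v}$.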

2.2 The embedding layer: modeling user-item interactions

  • (1) The item-aware sequences consist of user-item interactions from the user’s neighbors; hence, we first need to model the user-item interactions.

    • When modeling the user-item interactions, it is important to carefully consider the opinions the users expressed in the interactions. Obviously, bad and good opinions from the user’s social neighbors can affect the user’s opinion towards the item in tremendously different ways. Thus, we propose to include the user’s opinion towards the item when modeling the user-item interaction in the sequence. The opinions are usually expressed in the form of ratings.
    • For example, as shown in Figure 2, both user $u_3$ and user $u_6$ interact with the same item $v_5$ (Captain America); however, user $u_6$ likes $v_5$ while user $u_3$ dislikes $v_5$.
  • (2) To model the ratings, we propose to embed each discrete rating value into a rating embedding vector.

    • Therefore, if there are $I$ different rating levels, there would be $I$ rating embedding vectors.
    • Note that the rating embeddings are also parameters of the framework. The rating embedding of the rating value $o$ is denoted as $r_{\{o\}} \in \mathbb{R}^d$, with $d$ the embedding length.
    • For an interaction $(u_{[k]}, v_{[k]})$ in the item-aware social sequence $S^{u,v}_{(i)}$, the non-zero rating score of this interaction can be found in the rating matrix $R$; let us denote it as $o_{u_{[k]}, v_{[k]}}$.
    • Then the corresponding rating embedding is $r_{o_{u_{[k]}, v_{[k]}}}$, which we denote as $r_{[k]}$ for convenience.
    • The interaction between user and item is highly non-linear, and including the rating information further adds to the complexity.
    • Hence, we use a multi-layer perceptron (MLP) to fuse the interaction information with the rating information. The MLP takes the concatenation of the user embedding $p_{[k]}$, the rating embedding $r_{[k]}$ and the item embedding $q_{[k]}$ as input, and outputs the user-item interaction embedding $e_{[k]}$ of interaction $(u_{[k]}, v_{[k]})$. The procedure can be briefly represented as follows
      $$e_{[k]} = MLP([p_{[k]}, r_{[k]}, q_{[k]}])$$
      • where $[p_{[k]}, r_{[k]}, q_{[k]}]$ denotes the concatenation of $p_{[k]}$, $r_{[k]}$ and $q_{[k]}$.
  • (3) Following this procedure, we process each sequence $S^{u,v}_{(i)} = \{(u_{[1]}, v_{[1]}), \ldots, (u_{[l]}, v_{[l]})\}$ and get a sequence of fused interaction embeddings $E^{u,v}_{(i)} = \{e_{[1]}, \ldots, e_{[l]}\}$. The set of all sequences of fused interaction embeddings from neighbors for predicting the rating of $(u, v)$ can be denoted as $\mathcal{E}^{u,v}$.
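The fusion step of the embedding layer can be sketched in plain Python as below. This is a rough illustration under assumptions rather than the paper's configuration: the layer sizes, the ReLU activation, the random initialization and the helper names (`init_layer`, `linear`, `relu`, `fuse_interaction`) are all hypothetical, and in the real model every embedding and weight would be learned jointly.

```python
import random


def init_layer(n_in, n_out, rng):
    """Randomly initialized dense layer: (weights of shape n_out x n_in, zero bias)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(x, weights, bias):
    """Affine map W x + b on plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, xi) for xi in x]


def fuse_interaction(p, r, q, layers):
    """e = MLP([p, r, q]): concatenate the three embeddings and pass them through the MLP."""
    h = p + r + q  # list concatenation of user, rating and item embeddings
    for index, (weights, bias) in enumerate(layers):
        h = linear(h, weights, bias)
        if index < len(layers) - 1:
            h = relu(h)  # non-linearity on hidden layers only (assumed)
    return h


rng = random.Random(0)
d = 4  # embedding length (toy value)
p_k = [rng.uniform(-1, 1) for _ in range(d)]  # user embedding p_[k]
q_k = [rng.uniform(-1, 1) for _ in range(d)]  # item embedding q_[k]
# One embedding vector per discrete rating level, here I = 5 levels (1..5).
rating_embeddings = {o: [rng.uniform(-1, 1) for _ in range(d)] for o in range(1, 6)}
layers = [init_layer(3 * d, 8, rng), init_layer(8, d, rng)]

e_like = fuse_interaction(p_k, rating_embeddings[5], q_k, layers)
e_dislike = fuse_interaction(p_k, rating_embeddings[1], q_k, layers)
print(e_like)
print(e_dislike)
```

The two calls feed the same user and item with different rating embeddings, which is how the opinion (like versus dislike) enters the fused interaction embedding.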

2.3 The sequence learning layer: learning representation for item-aware social sequences

  • (1) After generating the item-aware social sequences for $(u, v)$ and transforming each user-item interaction, together with its opinion information, into a fused interaction embedding, we proceed to the sequence learning layer.

    • The sequence learning layer aims to extract features for each sequence and then combine the extracted features of all the sequences to obtain a unified representation, which can be used to predict the rating for $(u, v)$ in the output layer.
  • (2) As all the neighbors in the sequence would affect the prediction of $(u, v)$, for distant neighbors, we need to capture the distant social information between them and user $u$.

    • Furthermore, in social networks, users influence each other.
    • Hence, we need to capture the bi-directional influence in the model.
    • Recently, bi-directional long short-term memory network (Bi-LSTM) based language models [1, 44] have been proposed to capture the long-range bi-directional semantic dependencies between words in a sentence in the NLP domain.
    • Inspired by these models, we regard the sequence as a “sentence” and the elements in this sequence as “words”, and adopt a similar Bi-LSTM model to extract features from the sequence of fused interaction embeddings. The bi-directional LSTM contains a forward LSTM $\overrightarrow{LSTM}$, which reads the sequence $E^{u,v}_{(i)}$ from $e_{[1]}$ to $e_{[l]}$, and a backward LSTM $\overleftarrow{LSTM}$, which reads from $e_{[l]}$ to $e_{[1]}$,
      $$\overrightarrow{h^{(i)}_{[k]}} = \overrightarrow{LSTM}(e_{[k]}), \quad \overleftarrow{h^{(i)}_{[k]}} = \overleftarrow{LSTM}(e_{[k]}), \quad k = 1, \ldots, l$$
      • where $\overrightarrow{h^{(i)}_{[k]}}$ and $\overleftarrow{h^{(i)}_{[k]}}$ are the hidden states of $\overrightarrow{LSTM}$ and $\overleftarrow{LSTM}$, respectively.
  • (3) These hidden states, which correspond to the neighbors in the sequence, are then combined using an attention mechanism [10, 35, 36] to generate the features $s^{u,v}_{(i)}$ of the sequence $E^{u,v}_{(i)}$.
    $$s^{u,v}_{(i)} = \sum_{k=1}^{l} \alpha_k h^{(i)}_{[k]}$$

    • where $h^{(i)}_{[k]}$ is $[\overrightarrow{h^{(i)}_{[k]}} \oplus \overleftarrow{h^{(i)}_{[k]}}]$, the concatenation of $\overrightarrow{h^{(i)}_{[k]}}$ and $\overleftarrow{h^{(i)}_{[k]}}$.
    • Specifically, we parameterize the attention weight $\alpha_k$ with a one-layer network, which picks out the user (neighbor)-item interaction embeddings that are important for learning the representation of the item-aware social sequence.
  • (4) The normalized importance weight $\alpha_k$ is calculated through a softmax function as follows
    [Equation: softmax normalization, over $k = 1, \ldots, l$, of the attention scores that a one-layer network with the context vector $a_u$ assigns to each $h^{(i)}_{[k]}$]

    • where the neighbor-level context vector $a_u$ can be seen as a high-level representation of the fixed query “what is the informative neighbor-item interaction embedding?” over all the neighbor-item interaction embeddings in the item-aware social sequence. Note that the neighbor-level context vector $a_u$ is a parameter of the framework and needs to be jointly learned during the training process.
  • (5) We then combine the representations of all the user-item inter-action embedding sequences to generate the unifed representation of item-aware social sequences for ( u , v ) (u,v) (u,v) as (然后,我们结合所有用户项目交互嵌入序列的表示,生成 ( u , v ) (u,v) (u,v)的项目感知社会序列的统一表示)
    在这里插入图片描述

    • where we adopt an attention mechanism to diferentiate the importance weight β i \beta_i βi of item-aware social sequences as follows (其中我们采用注意机制区分重要性权重 β i \beta_i βi项目感知社交序列如下所示)
      在这里插入图片描述
  • (6) Similar to eq. (9), z u z_u zu can be seen as a high level representation query “which is the informative item-aware social sequence?” over all the social sequences. (与 等式(9) 类似, z u z_u zu可以被视为一个高级表示查询,“哪个是信息项感知的社交序列?”在所有的社交场合。)

  • The reason we introduce two attention mechanisms is that

    • not all user-item interactions with opinion information in one item-aware social sequence contribute equally to the representation of that sequence;
    • and not all of these sequences contribute equally to the unified representation of item-aware social sequences for $(u,v)$.
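
The two attention levels described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the scoring function (a tanh followed by a dot product with the context vector) stands in for the paper's one-layer network, whose exact form is in the equation images, and the helper `attention_pool` and all toy numbers are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_pool(vectors, context):
    """Return (weighted sum of vectors, softmax weights).

    Assumption: each vector is scored by context . tanh(vector); the
    paper's one-layer scoring network is not reproduced here.
    """
    scores = [dot(context, [math.tanh(x) for x in v]) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights

# Toy Bi-LSTM hidden states: 2 sequences, 3 neighbor positions each, dimension 2.
hidden = [
    [[0.1, 0.2], [0.9, 0.4], [0.3, 0.3]],
    [[0.5, 0.1], [0.2, 0.8], [0.7, 0.6]],
]
a_u = [1.0, -0.5]  # neighbor-level context vector (a learned parameter in the model)
z_u = [0.3, 0.7]   # sequence-level context vector (a learned parameter in the model)

seq_reprs = []
for h_seq in hidden:
    s_i, alpha = attention_pool(h_seq, a_u)  # s^{u,v}_{(i)} and the weights alpha_k
    seq_reprs.append(s_i)
s_uv, beta = attention_pool(seq_reprs, z_u)  # s^{u,v} and the weights beta_i
```

The same pooling routine is reused at both levels; only the context vector and the set of vectors being pooled change.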

2.4 The output layer: rating prediction

  • (1) In the output layer, we design recommendation tasks to learn model parameters.
    • There are various recommendation tasks, such as item ranking and rating prediction.
  • (2) In this work, we apply the proposed DSCF model to the recommendation task of rating prediction. We finally predict the rating score of user $u$ for item $v$.
  • (3) The input of the output layer includes the user embedding $p_{[u]}$, the item embedding $q_{[v]}$, and the unified item-aware social representation $s^{u,v}$ learned in the sequence learning layer.
  • (4) As shown in the output layer in Figure 1, a multi-layer perceptron (MLP) is first used to combine the user embedding $p_{[u]}$ and the unified item-aware social representation $s^{u,v}$.
    • Let us denote this MLP as $f_{u,s}$.
  • (5) Then, another MLP, denoted as $f_{u,v}$, is used to predict the rating score of $(u,v)$. The prediction procedure, which takes $p_{[u]}$, $q_{[v]}$, and $s^{u,v}$ as input, can be represented as
    [Equation image: $r'_{u,v}$ computed by $f_{u,v}$ and $f_{u,s}$ from $p_{[u]}$, $q_{[v]}$, and $s^{u,v}$]
    • where $[\,,]$ denotes the concatenation operation, and $r'_{u,v}$ is the predicted rating from user $u$ to item $v$.
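
A rough sketch of this two-stage prediction follows. The nesting used here, $f_{u,v}([f_{u,s}([p_{[u]}, s^{u,v}]), q_{[v]}])$, is an assumption inferred from the description above, since the exact equation sits in the image. The layer sizes, the single hidden layer per MLP, and the helpers `linear`, `init_layer` and `predict_rating` are all hypothetical.

```python
import random

def linear(x, weights, bias):
    """Affine map: one output per (weight row, bias) pair."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def init_layer(n_in, n_out, rng):
    """Randomly initialized weights and zero biases (illustrative only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def predict_rating(p_u, q_v, s_uv, layer_us, layer_uv, layer_out):
    # f_{u,s}: combine the user embedding with the social representation.
    h_us = relu(linear(p_u + s_uv, *layer_us))   # list "+" is concatenation
    # f_{u,v}: combine that result with the item embedding (assumed nesting).
    h_uv = relu(linear(h_us + q_v, *layer_uv))
    return linear(h_uv, *layer_out)[0]           # scalar predicted rating

d = 4  # embedding size; hidden size is set equal to it
rng = random.Random(0)
p_u = [rng.random() for _ in range(d)]
q_v = [rng.random() for _ in range(d)]
s_uv = [rng.random() for _ in range(d)]
layer_us = init_layer(2 * d, d, rng)
layer_uv = init_layer(2 * d, d, rng)
layer_out = init_layer(d, 1, rng)

rating = predict_rating(p_u, q_v, s_uv, layer_us, layer_uv, layer_out)
```

With untrained weights the output is meaningless as a rating; the sketch only shows how the three inputs flow through the two MLPs.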

2.5 Model Training

  • (1) To estimate the parameters of the DSCF framework, we need to specify an objective function to optimize. Since the task we focus on in this work is rating prediction, a commonly used objective function is formulated as
    [Equation image: loss over the observed interactions $O$, comparing $r'_{i,j}$ with $r_{i,j}$]

    • where $O$ denotes all the observed user-item interactions, $|O|$ is the number of interactions in $O$, $r'_{i,j}$ is the predicted rating, and $r_{i,j}$ is the ground-truth rating assigned by user $u_i$ to item $v_j$.
  • (2) To optimize the objective function, we adopt Adaptive Moment Estimation (Adam) [7] as the optimizer in our implementation.

  • (3) We also adopt the dropout strategy [28] to alleviate overfitting when optimizing deep neural network models.

  • (4) There are three embeddings in our model:

    • item embedding $q_j$,
    • user embedding $p_i$, and
    • rating embedding $r_o$.
  • They are randomly initialized and jointly learned during the training stage.

  • We do not use one-hot vectors to represent each user and item, since the raw features are very large and highly sparse. By embedding high-dimensional sparse features into a low-dimensional latent space, the model becomes easier to train [14, 38].

  • The rating embedding matrix $\mathcal{r}$ depends on the rating scale of the system. For example, for a 5-star rating system, $\mathcal{r}$ contains 5 different embedding vectors to denote the scores in {1, 2, 3, 4, 5}.
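
As a small illustration of the objective in step (1), the snippet below computes a squared-error loss averaged over the observed interactions $O$. The factor 1/2 is an assumption (the exact formula is in the equation image), and a constant factor does not change the minimizer. The function name `rating_loss` and the toy ratings are hypothetical.

```python
def rating_loss(predicted, observed):
    """Squared error summed over observed (user, item) pairs, divided by 2|O|.

    Assumption: the 1/2 factor follows a common convention; the notes only
    state that the loss compares r'_{i,j} with r_{i,j} over O.
    """
    if predicted.keys() != observed.keys():
        raise ValueError("predictions must cover exactly the observed pairs")
    total = sum((predicted[pair] - observed[pair]) ** 2 for pair in observed)
    return total / (2 * len(observed))

# Toy data: ground-truth ratings on a 1-5 scale and model predictions.
observed = {("u1", "v1"): 5.0, ("u1", "v2"): 3.0, ("u2", "v1"): 4.0}
predicted = {("u1", "v1"): 4.0, ("u1", "v2"): 3.0, ("u2", "v1"): 2.0}

loss = rating_loss(predicted, observed)  # (1 + 0 + 4) / (2 * 3)
```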

3 EXPERIMENTS

In this section, we conduct experiments to verify the effectiveness of our model. We first introduce the experimental settings, then discuss the performance comparison of various recommender systems, and finally study the impact of different components of our model.

3.1 Experimental Settings

3.1.1 Datasets.

  • (1) In our experiments, two representative datasets, Ciao and Epinions, are used to verify the effectiveness of our model. They are taken from the product review sites Ciao (www.ciao.co.uk) and Epinions (www.epinions.com). Each site allows users to rate items and add friends to their ‘Circle of Trust’. Therefore, they provide a large amount of rating information and social information. The rating scale is from 1 to 5. We randomly initialize the rating embedding with 5 different embedding vectors, one for each score in {1, 2, 3, 4, 5}. The statistics of these two datasets are presented in Table 1.
    [Table 1: statistics of the Ciao and Epinions datasets]

3.1.2 Evaluation Metrics.

In order to evaluate the quality of the recommendation algorithms, two popular metrics are adopted to evaluate the predictive accuracy, namely

  • Mean Absolute Error (MAE) and
  • Root Mean Square Error (RMSE) [10].
  • Smaller values of MAE and RMSE indicate better predictive accuracy. Note that a small improvement in RMSE or MAE can have a significant impact on the quality of the top-few recommendations [19].
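
For reference, both metrics can be computed directly from paired lists of predicted and ground-truth ratings. The function names and the toy ratings below are illustrative only.

```python
import math

def mae(predicted, truth):
    """Mean Absolute Error: average of |r' - r|."""
    return sum(abs(p - t) for p, t in zip(predicted, truth)) / len(truth)

def rmse(predicted, truth):
    """Root Mean Square Error: square root of the average of (r' - r)^2."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, truth)) / len(truth))

predicted = [4.0, 3.0, 5.0, 2.0]
observed = [5.0, 3.0, 4.0, 4.0]

mae_value = mae(predicted, observed)    # (1 + 0 + 1 + 2) / 4
rmse_value = rmse(predicted, observed)  # sqrt((1 + 0 + 1 + 4) / 4)
```

Because RMSE squares each error before averaging, the single error of 2 dominates it, while MAE weights all errors linearly.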

3.1.3 Baselines.

  • (1) To evaluate the performance, we compared DSCF with three groups of methods:

    • traditional recommender systems,
    • traditional social recommender systems, and
    • deep neural network based recommender systems.
  • (2) For each group, we select representative baselines, detailed below.

    • PMF [26]: Probabilistic Matrix Factorization uses only the user-item rating matrix and models the latent factors of users and items with Gaussian distributions.
    • SoRec [21]: Social Recommendation performs co-factorization on the user-item rating matrix and the user-user social relation matrix.
    • SoReg [22]: Social Regularization models social network information as regularization terms to constrain the matrix factorization framework.
    • SocialMF [16]: It incorporates trust information and the propagation of trust information into the matrix factorization model for recommender systems.
    • TrustMF [43]: This method adopts a matrix factorization technique that maps users into two low-dimensional spaces, the truster space and the trustee space, by factorizing trust networks according to the directional property of trust.
    • NeuMF [14]: This method is a state-of-the-art matrix factorization model with a neural network architecture. The original implementation is for the recommendation ranking task, and we adjust its loss to the squared loss for rating prediction.
    • DeepSoR [9]: This model employs deep learning to learn representations of each user from social relations, and integrates them into probabilistic matrix factorization for rating prediction.
    • GCMC+SN [4]: This model is a state-of-the-art recommender system with a graph neural network architecture. In order to incorporate social network information into GCMC, we use node2vec [12] to generate user embeddings as user side information, instead of using the raw social connection features ($T \in R^{n\times n}$) directly. The reason is that the raw feature input vectors are highly sparse and high-dimensional. Network embedding techniques compress the raw input feature vector into a low-dimensional dense vector, which makes the model easier to train.
  • (3) PMF and NeuMF are pure collaborative filtering models without social information for rating prediction, while the others are social recommendation methods.

  • Besides, we compared DSCF with two state-of-the-art neural network based social recommender systems, i.e., DeepSoR and GCMC+SN.

3.1.4 Parameter Settings.

  • We implemented our proposed model in PyTorch.
  • For each dataset, we used $x$% as the training set to learn parameters, (1 - $x$%)/2 as the validation set to tune hyper-parameters, and (1 - $x$%)/2 as the testing set for the final performance comparison, where $x$ was varied in {80%, 60%} [10].
  • For the embedding size $d$, we tested the values {8, 16, 32, 64, 128, 256}.
  • The batch size and learning rate were searched in {16, 32, 64, 128, 512} and {0.0005, 0.001, 0.005, 0.01, 0.05, 0.1}, respectively.
  • Moreover, we empirically set the size of the hidden layer to be the same as the embedding size (the dimension of the latent factor) and used ReLU as the activation function.
  • Unless otherwise mentioned, we employed three hidden layers for all the neural components.
  • We used early stopping: training stopped if the RMSE on the validation set increased for 5 successive epochs. The parameters of the baseline algorithms were initialized as suggested in the corresponding papers, and were then carefully tuned to achieve optimal performance.
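
The split and the early-stopping rule above can be sketched as follows. This is one reading of the description, not the authors' code: "increased for 5 successive epochs" is interpreted as the validation RMSE rising relative to the previous epoch five times in a row (another common reading is "no improvement over the best value so far"). The function names and the validation curve are hypothetical.

```python
import random

def split_interactions(interactions, train_fraction, seed=0):
    """Shuffle, then split into train / validation / test.

    The remainder after the training set is divided equally between
    validation and test, mirroring the x% / (1-x%)/2 / (1-x%)/2 scheme.
    """
    items = list(interactions)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * train_fraction)
    n_valid = (len(items) - n_train) // 2
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]
    return train, valid, test

def early_stopping_epoch(valid_rmse_per_epoch, patience=5):
    """Return the epoch index at which training stops, or None if it never does."""
    increases = 0
    for epoch in range(1, len(valid_rmse_per_epoch)):
        if valid_rmse_per_epoch[epoch] > valid_rmse_per_epoch[epoch - 1]:
            increases += 1
        else:
            increases = 0  # the streak of successive increases is broken
        if increases == patience:
            return epoch
    return None

train, valid, test = split_interactions(range(100), train_fraction=0.8)

# Hypothetical validation curve: improves for two epochs, then keeps rising.
curve = [1.00, 0.90, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91]
stop_epoch = early_stopping_epoch(curve, patience=5)
```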

3.2 Performance Comparison

[Table 2: performance comparison of the recommendation methods (RMSE and MAE) on Ciao and Epinions]

  • (1) We first compare the recommendation performance of all methods. Table 2 shows the overall rating prediction error w.r.t. RMSE and MAE among the recommendation methods on the Ciao and Epinions datasets. We have the following findings:
    • SoRec, SoReg, SocialMF and TrustMF improve over PMF. All of these methods are based on matrix factorization. SoRec, SoReg, SocialMF and TrustMF leverage both the user-item interactions and social information, while PMF only uses user-item interactions. These improvements show the effectiveness of incorporating social information into recommender systems.
    • NeuMF achieves much better performance than PMF. Both of them use the user-item interactions only. NeuMF is based on a deep architecture, while PMF is a traditional method with a shallow architecture. This suggests the power of employing deep architectures for recommendation.
    • The two deep models, DeepSoR and GCMC+SN, obtain better performance than SoRec, SoReg, SocialMF, and TrustMF, which are based on matrix factorization with shallow architectures. These improvements further reflect the power of employing deep architectures for recommendation.
    • DSCF outperforms NeuMF. This result further supports that social information is complementary to user-item interactions for recommendation.
    • Our model DSCF consistently outperforms all the baseline methods. Compared with DeepSoR and GCMC+SN, our model proposes advanced model components to integrate user-item interactions and social information. In addition, our model introduces ways to capture users’ opinions while modeling user-item interactions. We provide further investigations in the following subsection to better understand the contributions of the model components to the proposed framework.

3.3 Model Component Analysis

  • (1) In the previous subsection, we demonstrated the effectiveness of the proposed framework. To understand DSCF more deeply, we compare it with five variants, i.e., DSCF-Opinion, DSCF-Item&Opinion, DSCF-ATT, DSCF-Averaging and DSCF-Shuffling, which are defined as follows:

    • DSCF-Opinion: This variant uses the item-aware social sequences to represent the user’s social information, while ignoring the opinions on the user-item interactions.
    • DSCF-Item&Opinion: Based on DSCF-Opinion, it further eliminates the associated items in the social sequence.
    • DSCF-ATT: This variant studies the impact of the attention mechanisms on learning $s^{u,v}_{(i)}$ and $s^{u,v}$. The attention mechanisms $\alpha$ and $\beta$ are removed in this variant.
    • DSCF-Averaging: This variant replaces the Bi-LSTM with averaging the elements of the input sequence in the sequence learning layer.
    • DSCF-Shuffling: This variant randomly shuffles the order of the elements of the sequence in the sequence learning layer.
  • (2) The variant DSCF-Averaging assumes that all users in the sequence have the same influence on the target user, while DSCF-Shuffling assumes that the influence is not related to the distance to the target user. These two variants are designed to understand the benefit of adopting a Bi-LSTM to capture the item-aware social sequences.

  • (3) The results on Ciao are given in Figure 3 and Figure 4. We do not show the results on Epinions since similar observations can be made. From the results, we have the following findings:
    [Figure 3: effect of opinions, item-aware sequences, and attention mechanisms on Ciao]
    [Figure 4: effect of Bi-LSTM on Ciao]

    • Item-aware Social Sequences with Opinions. We now focus on analyzing the effectiveness of opinions on interactions. From Figure 3, we can see that the performance of DSCF drops significantly when the opinions on the user-item interactions in the social sequence are ignored (i.e., DSCF-Opinion), which suggests that it is necessary to consider opinions on interactions. In other words, different opinions from a user’s friends affect the user’s decision in very different ways.
    • Item-aware Social Sequences. To recommend a specific item, not all information from users in the sequence is useful; the interactions of these users with related items are more useful. From the results in Figure 3, DSCF-Item&Opinion performs worse than DSCF and DSCF-Opinion. These observations support the importance of generating item-aware sequences. In other words, not all information from neighbors is useful for recommending a specific item (e.g., Spider-man). Only the information related to this item is useful (e.g., Captain America).
    • Attention Mechanisms. We conducted experiments to verify the effectiveness of the attention mechanisms. From the results in Figure 3, we can observe that DSCF-ATT performs worse than DSCF. The reason is that not all the user (neighbor)-item interactions in one social sequence contribute equally to learning the representation of the item-aware social sequence, and not all of these item-aware social sequences have the same importance for the unified representation of all item-aware social sequences. These results demonstrate the benefits of the attention mechanisms for learning $s^{u,v}_{(i)}$ and $s^{u,v}$.
    • Bi-LSTM. Figure 4 presents the effect of the Bi-LSTM on the Ciao dataset. The performance of both DSCF-Averaging and DSCF-Shuffling drops significantly. This suggests that the Bi-LSTM component is better at learning representations of item-aware social sequences. The reason is that the social sequence reflects the information diffusion to the target user, and the influence on the target user should be heterogeneous and related to the distance.
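
One way to see why DSCF-Averaging discards distance information: averaging gives the same vector for any ordering of the sequence, so it cannot tell a direct neighbor from a distant one, whereas an order-sensitive encoder such as a Bi-LSTM can. The toy vectors and the helper `mean_pool` below are illustrative and not taken from the paper.

```python
import random

def mean_pool(sequence):
    """Element-wise average of equally sized vectors (order-independent)."""
    dim = len(sequence[0])
    return [sum(vec[d] for vec in sequence) / len(sequence) for d in range(dim)]

# Toy interaction embeddings along one social sequence, nearest neighbor first.
sequence = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]

shuffled = sequence[:]
random.Random(0).shuffle(shuffled)

original_repr = mean_pool(sequence)
shuffled_repr = mean_pool(shuffled)  # same vector, whatever the order
```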

3.4 Parameter Analysis

  • (1) There are two important parameters in the proposed framework, i.e.,
    • the length of each item-aware social sequence and
    • the number of item-aware social sequences.
    • In this subsection, we investigate the impact of these parameters by examining how the performance changes when varying one parameter and fixing the others. As before, we only show results on Ciao.

Effect of the length of sequences l. Figure 5 shows the performance with varied sequence lengths on Ciao. If the sequence length is one, our model reduces to using only the direct neighbors. As the sequence length increases, the performance tends to increase at first. This indicates that the direct neighbors cannot sufficiently capture the useful social information and that including distant neighbors helps. However, when the sequences become too long, the performance degrades, as the distant neighbors may introduce too much noise.
[Figure 5: performance with varied sequence length on Ciao]
Effect of the number of sequences H. Figure 6 shows how the number of sequences affects the recommendation performance. Generally, more sequences explore the neighborhood of users more thoroughly, which helps us understand social information better; however, generating too many is also risky, since they may introduce noise as well.
[Figure 6: effect of the number of sequences on recommendation performance]
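
To make the roles of the two parameters concrete, here is a deliberately simplified random-walk sampler over a social graph: `num_sequences` plays the role of H and `length` the role of l. It ignores the item-aware part of the method (pairing each visited neighbor with a related item and opinion), and the graph, the function name `sample_social_sequences` and the uniform neighbor choice are all assumptions made for illustration.

```python
import random

def sample_social_sequences(graph, start_user, num_sequences, length, seed=0):
    """Sample `num_sequences` random walks of at most `length` steps.

    `graph` maps each user to a list of social neighbors. With length == 1,
    every walk contains a single direct neighbor of `start_user`.
    """
    rng = random.Random(seed)
    sequences = []
    for _ in range(num_sequences):
        walk, current = [], start_user
        for _ in range(length):
            neighbors = graph.get(current, [])
            if not neighbors:
                break  # dead end: stop this walk early
            current = rng.choice(neighbors)
            walk.append(current)
        sequences.append(walk)
    return sequences

# Hypothetical social graph (undirected, stored as adjacency lists).
graph = {"u": ["a", "b"], "a": ["u", "c"], "b": ["u"], "c": ["a"]}

sequences = sample_social_sequences(graph, "u", num_sequences=3, length=4)
direct_only = sample_social_sequences(graph, "u", num_sequences=3, length=1)
```

Longer walks reach users such as `c` that are not direct neighbors of `u`, which is the intuition behind the gain from increasing l, while very long walks wander far from the target user.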

4 RELATED WORK

  • (1) In this section, we briefly review some research related to our work. Collaborative filtering [11], which captures users’ preferences towards items using user-item interactions, is the most popular approach to building modern recommender systems.

    • In addition to the user-item interactions, social relations also have the potential to help understand users’ preferences. Many social recommendation methods [13, 21, 25, 31, 33, 41, 46] have shown the effectiveness of including social relations for recommendation. Among them,
      • SoRec [21] co-factorizes the rating matrix (user-item interaction matrix) and the social relation matrix for recommendation by sharing user latent vectors between them.
      • SoDimRec [33] exploits the heterogeneity of social relations and the weak dependency connections in social networks for recommendation. A comprehensive survey on social recommendation can be found in [32].
  • (2) Recently, deep neural networks have been adopted to enhance recommender systems [2, 47]. Most of them use deep neural networks as feature learning tools to extract features from auxiliary information, such as the text description of an item [5, 18, 37] and the visual information of images [45].

    • NeuMF [14] is a matrix factorization based deep recommendation method, which uses deep neural networks to explore the non-linearity in user-item interactions.
    • NSCR [40] extends the NeuMF model by using the social network information as a graph regularization, which forces nearby neighbors to have similar latent vectors. NSCR addresses the task of cross-domain recommendation with ranking metrics, and focuses on how to distill useful signals from an external social network (e.g., Facebook and Twitter) for the cross-domain task, while our model focuses on how to learn the social information from the user-user interactions within the same e-commerce platform, rather than from an external social network.
    • ARSE [29] proposes the problem of temporal social recommendation with ranking metrics, and has a dynamic part and a static part to model the dynamic and static preferences of users. ARSE targets the dynamic preferences in recommendation, rather than the social information.
  • (3) The neural network based methods most related to our task include DLMF [6], GCMC [4], DeepSoR [9], GraphRec [10] and DASO [8].

    • DeepSoR [9] first represents users using a pre-trained node embedding technique, and then uses deep neural networks to capture non-linear features in social relations and integrate them into probabilistic matrix factorization.
    • DASO [8] proposes a deep adversarial social recommendation framework, which adopts a bidirectional mapping method to transfer users’ information between the social domain and the item domain using adversarial learning.
    • GraphRec [10] harnesses the power of graph neural network (GNN) techniques to model graph data in social recommendation by aggregating both user-item interaction information and direct social neighbors.
    • However, these deep social recommendation methods cannot take full advantage of social networks.
    • In this paper, we propose a deep social recommendation framework which can sufficiently exploit the social network information for recommendation.

5 CONCLUSION AND FUTURE WORK

  • (1) We have presented Deep Social Collaborative Filtering (DSCF), a framework which can exploit various aspects of social information for recommendation.
    • In particular, we propose to use random walks to generate item-aware social sequences, which consider information from not only direct neighbors but also distant neighbors.
    • In addition, we introduce a novel way to capture neighbors’ opinions when modeling user-item interactions.
    • Finally, a Bi-LSTM with an attention mechanism is proposed to extract features from the social sequences.
    • Our experiments reveal that the item-aware sequences and the opinion information play a crucial role in modeling social information.
  • (2) Comprehensive experiments on two real-world datasets show the effectiveness of our model.
  • (3) In this work, we only use the user-item interactions to measure the similarity between items, while rich side information may be associated with items, such as textual descriptions and the visual content of images. Therefore, incorporating side information is an interesting future direction.
