Paper Notes: Gemini

Gemini: A Novel and Universal Heterogeneous Graph Information Fusing Framework for Online Recommendations
  • LINK: https://doi.org/10.1145/3394486.3403388

  • CLASSIFICATION: RECOMMENDER-SYSTEM, HETEROGENEOUS NETWORK, GCN

  • YEAR: Submitted on Aug 2020

  • FROM: KDD 2020

  • WHAT PROBLEM TO SOLVE: Researchers have tried to use auxiliary information (e.g., social relations of users) to improve recommendation performance. However, such auxiliary information is not available in every recommendation scenario, so it is difficult to apply in industrial settings where generality is required. Moreover, the heterogeneity between users and items makes network information fusion harder. In addition, the sparsity of user-item interactions is an urgent problem that needs to be solved.

  • SOLUTION: To solve the above problems, we propose a universal and effective framework named Gemini, which only relies on the common interaction logs, avoiding the dependence on auxiliary information and ensuring a better generality.

  • CORE POINT:

    • The Main Contributions:

      1. We propose a new heterogeneous graph fusing framework, Gemini, which does not rely on any auxiliary information and handles heterogeneous graphs more effectively through a novel network transformation. Thus, Gemini can be applied to all kinds of recommendation scenarios and achieve satisfactory results. To the best of our knowledge, this is the first work to transform a heterogeneous graph into two semi-homogeneous graphs without losing any key topology information.
      2. We propose a GCN-based algorithm that effectively processes graph edges consisting of heterogeneous nodes by capturing the global and local importance of these nodes. Through an attention function, the algorithm also focuses on the more important homogeneous neighbors in the aggregation stage. In addition, adding edge information while aggregating information from neighbor nodes exchanges heterogeneous topology information between Gemini-U and Gemini-I, so the information fusion processes on the two graphs are interdependent. To the best of our knowledge, this is also the first work to take these optimizations into account.
      3. To some extent, Gemini alleviates the sparsity problem of user-item interactions because, in addition to the first-order user-item neighbor relations, the second-order user-user and item-item neighbor relations are introduced into Gemini-U and Gemini-I.
      4. We design a training algorithm, Gemini-Collaboration, that enables the Gemini framework to run on a large-scale dataset.
      5. We conduct extensive offline experiments and deploy online A/B tests at DiDi Chuxing. Experimental results show the superiority of Gemini over state-of-the-art algorithms.
    • Network Transformation

      image.png

      We transform the user-item heterogeneous graph into two semi-homogeneous graphs, Gemini-U and Gemini-I, from the perspectives of users and items, respectively.
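
      To make the transformation concrete, here is a minimal sketch in Python. It assumes that two users are linked in Gemini-U when they share at least one item, and that the shared items become the attribute nodes (Att-U) on that edge; Gemini-I is built symmetrically. The function name and data layout are hypothetical, and repeat interactions are collapsed here for simplicity.

```python
from collections import defaultdict
from itertools import combinations


def build_semi_homogeneous_graphs(interactions):
    """Build Gemini-U and Gemini-I style graphs from (user, item) pairs.

    Returns two dicts mapping an edge (a, b) to the set of attribute
    nodes on that edge. This layout is an assumption for illustration.
    """
    items_of = defaultdict(set)  # user -> items the user interacted with
    users_of = defaultdict(set)  # item -> users who interacted with it
    for user, item in interactions:
        items_of[user].add(item)
        users_of[item].add(user)

    # Gemini-U: user-user edges, attributed with the shared items (Att-U).
    gemini_u = {}
    for u1, u2 in combinations(sorted(items_of), 2):
        shared = items_of[u1] & items_of[u2]
        if shared:
            gemini_u[(u1, u2)] = shared

    # Gemini-I: item-item edges, attributed with the shared users (Att-I).
    gemini_i = {}
    for i1, i2 in combinations(sorted(users_of), 2):
        shared = users_of[i1] & users_of[i2]
        if shared:
            gemini_i[(i1, i2)] = shared
    return gemini_u, gemini_i


log = [("u1", "i1"), ("u2", "i1"), ("u2", "i2"), ("u3", "i2")]
gemini_u, gemini_i = build_semi_homogeneous_graphs(log)
```

      The pairwise loop is quadratic in the number of nodes, so it is only meant to show the structure, not to scale to production logs.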

    • Edge Embedding

      First of all, the edge attributes (i.e., nodes in Att-U and Att-I) describe the original first-order neighbor relationship and can be used to measure the strength of neighbor relationships of nodes in Gemini-U or Gemini-I. Second, through sharing node embeddings, the edge attributes can provide information of heterogeneous nodes in another graph.
      This brings two advantages. First, the topology information of Gemini-U and Gemini-I, especially the high-order neighbor relationships, is exchanged between the two graphs. Second, the separate graphs, Gemini-U and Gemini-I, become closely related, so the network information fusion on one affects the other.

      • Sum Pooling (quantity)

        image.png

        The downside of this approach is that it ignores the differing importance of the attribute nodes.
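
        As a rough illustration of sum pooling, the sketch below adds up the embeddings of all attribute nodes on one edge. The function name and the toy two-dimensional embeddings are assumptions for illustration only.

```python
def sum_pool(attribute_nodes, embedding):
    """Sum the embeddings of the attribute nodes on a single edge."""
    dim = len(next(iter(embedding.values())))
    pooled = [0.0] * dim
    for node in attribute_nodes:
        for d, value in enumerate(embedding[node]):
            pooled[d] += value
    return pooled


# Hypothetical item embeddings; "i2" appears twice on this edge.
item_embedding = {"i1": [1.0, 0.0], "i2": [0.0, 2.0]}
edge_vector = sum_pool(["i1", "i2", "i2"], item_embedding)
```

        Every occurrence contributes with the same weight, which is exactly the limitation noted above.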

      • Local & Global Information (quality)

        The number of times a node appears on an edge describes the importance of the node to that edge; we call this Local Information because it is measured from the perspective of a single edge. Conversely, the more edges a node appears on, the less important it is. We call the IDF of a node Global Information because it is measured from the perspective of all edges.

      • TF-IDF Pooling

        image.png
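
        The idea can be sketched as a weighted sum in which each attribute node is weighted by TF times IDF. The exact formula from the paper is not reproduced in these notes, so the sketch assumes the textbook definitions: TF is the node's share of occurrences on the edge, and IDF is the log of the total number of edges divided by the number of edges containing the node. All names are hypothetical.

```python
import math
from collections import Counter


def tfidf_pool(edges, embedding):
    """edges: dict mapping an edge to its list of attribute nodes (repeats allowed)."""
    n_edges = len(edges)
    # Global information: on how many edges does each attribute node appear?
    edge_count = Counter(node for nodes in edges.values() for node in set(nodes))
    dim = len(next(iter(embedding.values())))
    pooled = {}
    for edge, nodes in edges.items():
        vector = [0.0] * dim
        for node, count in Counter(nodes).items():
            tf = count / len(nodes)                     # local information
            idf = math.log(n_edges / edge_count[node])  # global information
            for d, value in enumerate(embedding[node]):
                vector[d] += tf * idf * value
        pooled[edge] = vector
    return pooled


edges = {("u1", "u2"): ["i1", "i1", "i2"], ("u2", "u3"): ["i2"]}
item_embedding = {"i1": [1.0, 0.0], "i2": [0.0, 1.0]}
pooled = tfidf_pool(edges, item_embedding)
```

        With unsmoothed IDF, a node that appears on every edge (here "i2") receives zero weight, so an edge carrying only such nodes pools to the zero vector. A smoothed IDF would avoid this; whether the paper uses one is not stated in these notes.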

    • Information Convolution

      • Attention based Aggregating

        Our aggregator function is an attention layer that combines edge embeddings and node embeddings, and can be formulated as follows:

        image.png

        Edge Vectors:

        image.png

        The attention aggregator can be calculated as follows:

        image.png
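
        Since the formulas above survive only as images, the following is a generic sketch of attention-based aggregation and not the paper's exact scoring function. It assumes that each neighbor's message is the element-wise sum of its node embedding and the edge vector, and that the attention score is a dot product with the center node's embedding followed by a softmax.

```python
import math


def attention_aggregate(self_vec, neighbors):
    """neighbors: list of (neighbor_vec, edge_vec) pairs with the same dimension."""
    # Combine node and edge information (element-wise sum is an assumption).
    messages = [[n + e for n, e in zip(nv, ev)] for nv, ev in neighbors]
    # Dot-product scores against the center node (also an assumption).
    scores = [sum(s * m for s, m in zip(self_vec, msg)) for msg in messages]
    # Numerically stable softmax over the neighbors.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Attention-weighted sum of the messages.
    dim = len(self_vec)
    aggregated = [
        sum(w * msg[d] for w, msg in zip(weights, messages)) for d in range(dim)
    ]
    return aggregated, weights


self_vec = [1.0, 0.0]
neighbors = [([1.0, 0.0], [0.0, 0.0]), ([0.0, 1.0], [0.0, 0.0])]
aggregated, weights = attention_aggregate(self_vec, neighbors)
```

        The neighbor whose message aligns with the center node receives the larger weight, which is the behavior the attention aggregator is meant to provide.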

      • Edge CONV

        We pass the neighbor information to the node itself through the following convolution function:

        image.png

    • Gemini Framework

      image.png

      Lines 2-12 are the sampling stage of Gemini. Each set $U^k$ contains the nodes needed to compute the representations of the nodes $u \in U^{k+1}$, and the same holds for each set $V^k$. Lines 15-20 and 21-27 correspond to the aggregation stages of the user nodes and the item nodes, respectively.
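
      The layered sets can be sketched as follows, in the style of GraphSAGE minibatch sampling. This is an assumption about the general shape of the sampling stage, shown for the user side only; the function name, the `fanout` parameter, and the toy graph are hypothetical.

```python
import random


def sample_layers(batch, neighbors, num_layers, fanout, seed=0):
    """Return [U^0, ..., U^K], where U^K is the batch and each U^k
    holds every node needed to compute the representations in U^{k+1}."""
    rng = random.Random(seed)
    layers = [set(batch)]  # starts as [U^K]
    for _ in range(num_layers):
        needed = set(layers[0])  # a node needs its own previous-layer state
        for node in layers[0]:
            pool = sorted(neighbors.get(node, ()))
            needed.update(rng.sample(pool, min(fanout, len(pool))))
        layers.insert(0, needed)  # prepend the next lower layer
    return layers


# Hypothetical Gemini-U adjacency.
user_neighbors = {"u1": {"u2"}, "u2": {"u1", "u3"}, "u3": {"u2"}}
layers = sample_layers(["u1"], user_neighbors, num_layers=2, fanout=2)
```

      Aggregation then runs bottom-up from `layers[0]` to `layers[-1]`, so only the sampled nodes are ever touched.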

    • Gemini-Collaboration Framework

      The core idea of Gemini-Collaboration is that, within one iteration, when the $z$-th layer embedding ($z \in \{1, \cdots, K\}$) is calculated, the embeddings of its attribute nodes take the values calculated in this or the previous iteration, and the attribute embeddings are not updated in this iteration.

      image.png

    • Experiments

      image.png

  • EXISTING PROBLEMS: 404

  • IMPROVEMENT IDEAS: Should the edge embedding produced by TF-IDF Pooling be normalized?
