Neural Collaborative Filtering [Paper Notes]

1 Abstract

  • We strive to develop techniques based on neural networks to tackle the key problem in recommendation — collaborative filtering — on the basis of implicit feedback.

  • We propose to leverage a multi-layer perceptron to learn the user–item interaction function.

2 Introduction

  • The key to a personalized recommender system is in modelling users’ preference on items based on their past interactions (e.g., ratings and clicks), known as collaborative filtering.

  • The inner product may not be sufficient to capture the complex structure of user interaction data.

2.1 Implicit Data

  • The recommendation problem with implicit feedback is formulated as the problem of estimating the scores of unobserved entries in the user–item interaction matrix.

  • They can be abstracted as learning $\hat{y}_{ui} = f(u, i \mid \Theta)$, where $\hat{y}_{ui}$ denotes the predicted score of interaction $y_{ui}$, $\Theta$ denotes model parameters, and $f$ denotes the function that maps model parameters to the predicted score.

  • To estimate parameters $\Theta$, two types of objective functions are most commonly used in the literature: pointwise loss and pairwise loss.

  • Methods on pointwise learning usually follow a regression framework, minimizing the squared loss between $\hat{y}_{ui}$ and its target value $y_{ui}$.

  • For pairwise learning, the idea is that observed entries should be ranked higher than the unobserved ones. As such, instead of minimizing the loss between $\hat{y}_{ui}$ and $y_{ui}$, pairwise learning maximizes the margin between an observed entry $\hat{y}_{ui}$ and an unobserved entry $\hat{y}_{uj}$.
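To make the two objectives concrete, here is a minimal PyTorch sketch (tensor names are mine, not from the paper's code) contrasting a pointwise log loss with a BPR-style pairwise loss:

```python
import torch
import torch.nn.functional as F

# Pointwise: treat each (u, i) entry as a binary label and fit the score to it.
# y_hat: predicted scores (logits) for a batch of (u, i) pairs; y: 0/1 labels.
def pointwise_log_loss(y_hat: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    return F.binary_cross_entropy_with_logits(y_hat, y)

# Pairwise (BPR-style): only the *order* matters -- an observed item i should
# score higher than a sampled unobserved item j for the same user u.
def bpr_pairwise_loss(pos_scores: torch.Tensor, neg_scores: torch.Tensor) -> torch.Tensor:
    return -F.logsigmoid(pos_scores - neg_scores).mean()

# Toy usage: 4 observed scores, 4 sampled unobserved scores.
pos, neg = torch.randn(4), torch.randn(4)
labels = torch.tensor([1., 1., 0., 1.])
print(pointwise_log_loss(pos, labels), bpr_pairwise_loss(pos, neg))
```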

2.2 Matrix Factorization

  • MF models the two-way interaction of user and item latent factors, assuming each dimension of the latent space is independent of the others and combining them linearly with the same weight. As such, MF can be deemed a linear model of latent factors (see the sketch at the end of this section).

  • If the Jaccard coefficient is used as the similarity measure between two user vectors, MF's limitation becomes apparent ([Figure 1: an example illustrating MF's limitation]): in the figure, $u_4$ is most similar to $u_1$, followed by $u_3$ and then $u_2$; yet in the latent space, placing $p_4$ closest to $p_1$ makes $p_4$ closer to $p_2$ than to $p_3$, contradicting the true ranking. This example shows the limitation of the inner product for modelling complex user–item interactions in a low-dimensional latent space.

  • We note that one way to resolve the issue is to use a large number of latent factors K. However, this may adversely hurt the generalization of the model, especially in sparse settings.
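For reference, a minimal NumPy sketch of MF's scoring rule (toy random data, variable names are mine): the prediction is just the inner product of the user and item latent vectors, i.e., a linear combination of latent dimensions with equal weight.

```python
import numpy as np

K = 8                                # number of latent factors
n_users, n_items = 5, 7
rng = np.random.default_rng(0)

P = rng.normal(size=(n_users, K))    # user latent vectors p_u
Q = rng.normal(size=(n_items, K))    # item latent vectors q_i

# MF prediction: y_hat[u, i] = p_u . q_i -- each latent dimension contributes
# independently and with the same weight, hence a *linear* latent model.
Y_hat = P @ Q.T
print(Y_hat[3, 2], P[3] @ Q[2])      # same value, scored two ways
```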

3 Neural Collaborative Filtering

3.1 Neural CF

[Figure 2: the Neural CF framework]

  • We adopt a multi-layer representation to model a user–item interaction $y_{ui}$, as shown in Figure 2.

  • The obtained user (item) embedding can be seen as the latent vector for the user (item) in the context of the latent factor model.

  • The final output layer is the predicted score $\hat{y}_{ui}$, and training is performed by minimizing the pointwise loss between $\hat{y}_{ui}$ and its target value $y_{ui}$.

  • Another way to train the model is to perform pairwise learning, e.g., using Bayesian Personalized Ranking or a margin-based loss; the authors leave this as future work.
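A minimal sketch of the framework in Figure 2 (layer sizes and names are illustrative, not the paper's reference implementation): one-hot user/item IDs go through embedding layers, the embeddings feed a stack of neural CF layers, and a sigmoid output is trained with the pointwise log loss.

```python
import torch
import torch.nn as nn

class NCF(nn.Module):
    """Generic NCF skeleton: embeddings -> neural CF layers -> score."""
    def __init__(self, n_users: int, n_items: int, dim: int = 16):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)   # latent vector p_u
        self.item_emb = nn.Embedding(n_items, dim)   # latent vector q_i
        self.cf_layers = nn.Sequential(              # "neural CF layers"
            nn.Linear(2 * dim, dim), nn.ReLU(),
            nn.Linear(dim, 1),
        )

    def forward(self, u: torch.Tensor, i: torch.Tensor) -> torch.Tensor:
        x = torch.cat([self.user_emb(u), self.item_emb(i)], dim=-1)
        return torch.sigmoid(self.cf_layers(x)).squeeze(-1)  # y_hat in (0, 1)

model = NCF(n_users=100, n_items=200)
u, i, y = torch.tensor([0, 1]), torch.tensor([5, 42]), torch.tensor([1., 0.])
loss = nn.functional.binary_cross_entropy(model(u, i), y)  # pointwise log loss
loss.backward()
```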

3.2 Generalized Matrix Factorization (GMF)

  • MF can be interpreted as a special case of the NCF framework. We define the mapping function of the first neural CF layer as
    $$\phi_1(\mathbf{p}_u, \mathbf{q}_i) = \mathbf{p}_u \odot \mathbf{q}_i \tag{1}$$
    where $\odot$ denotes the element-wise product of vectors. This vector is then projected to the output layer:
    $$\hat{y}_{ui} = a_{out}\left(\mathbf{h}^T (\mathbf{p}_u \odot \mathbf{q}_i)\right) \tag{2}$$
    where $a_{out}$ is the activation function and $\mathbf{h}$ the weight vector of the output layer. Here we use $\sigma(x) = 1/(1+e^{-x})$ as $a_{out}$ and learn $\mathbf{h}$ from data with the log loss.
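A minimal GMF sketch following Eq. (1)–(2) (names and sizes are mine): the element-wise product $\mathbf{p}_u \odot \mathbf{q}_i$ is weighted by a learned vector $\mathbf{h}$ and squashed by a sigmoid. Fixing $\mathbf{h}$ to all ones and dropping the sigmoid would recover plain MF.

```python
import torch
import torch.nn as nn

class GMF(nn.Module):
    def __init__(self, n_users: int, n_items: int, dim: int = 8):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.item_emb = nn.Embedding(n_items, dim)
        self.h = nn.Linear(dim, 1, bias=False)          # output weights h

    def forward(self, u, i):
        phi = self.user_emb(u) * self.item_emb(i)       # Eq. (1): p_u ⊙ q_i
        return torch.sigmoid(self.h(phi)).squeeze(-1)   # Eq. (2): a_out = sigmoid

scores = GMF(100, 200)(torch.tensor([0]), torch.tensor([7]))
print(scores)
```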

3.3 Multi-Layer Perceptron (MLP)

  • A common solution is to follow a tower pattern, where the bottom layer is the widest.

  • We empirically implement the tower structure, halving the layer size for each successive higher layer.
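A minimal sketch of the tower construction (the widths are illustrative): starting from the concatenated embedding width, each successive hidden layer halves the width of the previous one.

```python
import torch.nn as nn

def tower_mlp(in_dim: int = 64, n_layers: int = 3) -> nn.Sequential:
    """Build a tower MLP: each hidden layer is half as wide as the one below."""
    layers, width = [], in_dim
    for _ in range(n_layers):
        layers += [nn.Linear(width, width // 2), nn.ReLU()]
        width //= 2
    return nn.Sequential(*layers)

# Linear(64->32)/ReLU -> Linear(32->16)/ReLU -> Linear(16->8)/ReLU
print(tower_mlp())
```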

3.4 Fusing GMF and MLP (NeuMF)

  • Sharing the embeddings of GMF and MLP might limit the performance of the fused model; it implies that GMF and MLP must use the same embedding size, so for datasets where the optimal embedding sizes of the two models differ greatly, this solution may fail to obtain the optimal ensemble.

    This is exactly where NeuMF differs from DeepFM, whose FM and deep components share the same embedding.

[Figure 3: the NeuMF model]
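A minimal NeuMF-style sketch (layer sizes are mine): GMF and the MLP each get their own embedding tables, and their last hidden layers are concatenated before the final prediction, so the two branches are free to use different embedding sizes.

```python
import torch
import torch.nn as nn

class NeuMF(nn.Module):
    def __init__(self, n_users, n_items, gmf_dim=8, mlp_dim=16):
        super().__init__()
        # Separate embeddings: GMF and MLP need not share an embedding size.
        self.gmf_user = nn.Embedding(n_users, gmf_dim)
        self.gmf_item = nn.Embedding(n_items, gmf_dim)
        self.mlp_user = nn.Embedding(n_users, mlp_dim)
        self.mlp_item = nn.Embedding(n_items, mlp_dim)
        self.mlp = nn.Sequential(nn.Linear(2 * mlp_dim, mlp_dim), nn.ReLU())
        self.out = nn.Linear(gmf_dim + mlp_dim, 1)   # fuses both branches

    def forward(self, u, i):
        gmf_vec = self.gmf_user(u) * self.gmf_item(i)                 # GMF branch
        mlp_vec = self.mlp(torch.cat([self.mlp_user(u), self.mlp_item(i)], dim=-1))
        fused = torch.cat([gmf_vec, mlp_vec], dim=-1)                 # concat last layers
        return torch.sigmoid(self.out(fused)).squeeze(-1)

y_hat = NeuMF(100, 200)(torch.tensor([0, 1]), torch.tensor([3, 9]))
print(y_hat)
```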

4 Results

4.1 Is Deep Learning Helpful?

[Figures: MLP performance with different numbers of hidden layers]
This result is highly encouraging, indicating the effectiveness of using deep models for collaborative recommendation.

5 Related Work

  • While Wide&Deep has focused on incorporating various features of users and items, we target exploring DNNs for pure collaborative filtering systems.

6 Conclusion

  • We devised a general framework, NCF, and proposed three instantiations — GMF, MLP and NeuMF — that model user–item interactions in different ways.

  • This work complements the mainstream shallow models for collaborative filtering, opening up a new avenue of research possibilities for recommendation based on deep learning.

  • Another emerging direction is to explore the potential of recurrent neural networks and hashing methods [46] for providing efficient online recommendation.
