Overview of Recommender Systems, Part 5

相国大人

Categories: MachineLearning, Dive into ML/DL

Article link: https://blog.csdn.net/github_36326955/article/details/71407058

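Writing this post took a lot of time and effort. If it helped you, please consider tipping the author, 相国大人. Even 0.1 yuan is a kind gesture, and it also helps build respect for intellectual property across the industry. It lets me stay in closer touch with you and offer more material and technical support in related areas.

Tips will go to a charity book drive for children in Lhasa.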

See also: 《春天里，我们的拉萨儿童图书馆，需要大家的帮助》 ("This spring, our Lhasa children's library needs everyone's help")

This is Part 5 of the series.

This part covers:

- References
- Circle-based recommendation in online social networks

1. References

[1.139] N. Srebro and T. Jaakkola, “Weighted low-rank approximations,” in Proc. 20th Int. Conf. Mach. Learn., 2003, pp. 720–727.

[1.141] X. Yang, H. Steck, and Y. Liu, “Circle-based recommendation in online social networks,” in Proc. 18th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2012, pp. 1267–1275.

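2. Circle-Based Recommendation in Online Social Networks

2.1 Contributions of the Paper

Sun Xiangguo (孙相国) comments:

For the Douban dataset, context awareness is clearly obtainable, since a user's action stream is one form of context. As for trust awareness, I think that apart from the network among VIP users, non-VIP users probably show little of it, because Douban's social features are on the whole not well developed and users do not stay socially active on Douban for long.

Methodologically, the paper belongs to the trust-aware family. A "circle" here is essentially a group seen from one particular user's point of view. Extracting circles from existing data is very hard ("Unfortunately, in most existing multi-category rating datasets, a user’s social connections from all categories are mixed together. Even if the circles were explicitly known, they may not correspond to particular item categories that a recommender system may be concerned with."), so the paper's contribution is a method for inferring circles, on top of which it builds a trust-aware recommender system.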

We propose a set of algorithms to infer category specific circles of friends and to infer the trust value on each link based on user rating activities in each category. To infer the trust value of a link in a circle, we first estimate a user’s expertise level in a category based on the rating activities of herself as well as all users trusting her. We then assign to users trust values proportional to their expertise levels. The reconstructed trust circles are used to develop a low-rank matrix factorization type of RS.

2.2 Related Work

For matrix factorization, the paper writes the prediction as:

$\hat{R}=r_m+QP^T$

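The objective function to minimize is then

$\frac{1}{2}\sum_{(u,i)\in obs.}(R_{u,i}-\hat{R}_{u,i})^2+\frac{\lambda}{2}\left(\left \|P \right \|_F^2+\left \|Q \right \|_F^2\right)$

where $r_m$ is the baseline prediction; for background, see Section 5.3 (p. 104) of 《推荐系统：技术、评估及高效算法》 (the Chinese edition of the Recommender Systems Handbook).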
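As a rough illustration of the objective above, the following sketch evaluates it in plain Python. The function names `predict`, `frobenius_sq` and `mf_objective`, the dict layout for observed ratings, and the toy numbers are all assumptions made for this example; none of them come from the paper.

```python
def predict(r_m, Q, P, u, i):
    """Predicted rating for (u, i): one entry of r_m + Q P^T."""
    return r_m + sum(q * p for q, p in zip(Q[u], P[i]))


def frobenius_sq(M):
    """Squared Frobenius norm of a matrix stored as a list of rows."""
    return sum(x * x for row in M for x in row)


def mf_objective(ratings, r_m, Q, P, lam):
    """Half the squared error on observed ratings plus L2 regularization.

    ratings: dict mapping (user, item) -> observed rating (hypothetical layout).
    """
    err = sum((r - predict(r_m, Q, P, u, i)) ** 2
              for (u, i), r in ratings.items())
    return 0.5 * err + 0.5 * lam * (frobenius_sq(P) + frobenius_sq(Q))


# Toy data: 2 users, 2 items, 2 latent factors.
ratings = {(0, 0): 5.0, (1, 1): 3.0}
Q = [[1.0, 0.0], [0.0, 1.0]]    # user factors, one row per user
P = [[1.0, 0.0], [0.0, -1.0]]   # item factors, one row per item
loss = mf_objective(ratings, r_m=4.0, Q=Q, P=P, lam=0.1)
print(loss)
```

In this toy setup both observed ratings are reproduced exactly, so only the regularization term contributes to the loss.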

The baseline model compared in the experiments is SocialMF, cited here as the 36th reference, [18] A. P. Singh and G. J. Gordon, "Relational learning via collective matrix factorization," in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’08), pages 650–658, 2008. Note that SocialMF itself is usually credited to M. Jamali and M. Ester (RecSys 2010).

The social network information is represented by a matrix $S \in \mathbb{R}^{u_0 \times u_0}$ , where $u_0$ is the number of users. The directed and weighted social relationship of user $u$ with user $v$ (e.g. user $u$ trusts/knows/follows user $v$ ) is represented by a positive value $S_{u,v} \in (0, 1]$ . An absent or unobserved social relationship is reflected by $S_{u,v} = s_m$ , where typically $s_m = 0$ . Each of the rows of the social network matrix $S$ is normalized to 1, resulting in the new matrix $S^∗$ with $S_{u,v}^∗ \propto S_{u,v}$ and $\sum_v S_{u,v}^∗ = 1$ for each user $u$ . The idea underlying SocialMF is that neighbors in the social network may have similar interests. This similarity is enforced by the second term in the following objective function, which says that user profile $Q_u$ should be similar to the (weighted) average of his/her friends’ profiles $Q_v$ (measured in terms of the square error):

$\frac{1}{2}\sum_{(u,i)\in obs.}(R_{u,i}-\hat{R}_{u,i})^2+\frac{\beta}{2}\sum_{u}\left((Q_u-\sum_vS^*_{u,v}Q_v)(Q_u-\sum_vS^*_{u,v}Q_v)^T\right)+\frac{\lambda}{2}\left(\left \|P \right \|_F^2+\left \|Q \right \|_F^2\right) \tag{2}$

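In the second term above, $Q_u$ is the point representing user $u$ in the latent factor space, and $\sum_vS^*_{u,v}Q_v$ is the weighted point of $u$'s neighbors (the weights being $u$'s trust in each neighbor). The paper assumes that this weighted average of a user's neighbors should be close to the user's own profile, so the second term is expressed as a squared error. The parameters of the objective can be estimated with stochastic gradient descent.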
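To make the social term more concrete, here is a small sketch that row-normalizes a trust matrix and then sums the squared distance between each user's profile and the trust-weighted average of their friends' profiles. The names `row_normalize` and `social_penalty`, and the sparse dict-of-dicts layout for $S$, are assumptions for this illustration only.

```python
def row_normalize(S):
    """Scale each user's outgoing trust weights to sum to 1 (S -> S*)."""
    S_star = {}
    for u, friends in S.items():
        total = sum(friends.values())
        if total > 0:
            S_star[u] = {v: w / total for v, w in friends.items()}
    return S_star


def social_penalty(Q, S_star):
    """Sum over users u of ||Q_u - sum_v S*_{u,v} Q_v||^2.

    A user with no friends is compared against the zero vector, which is
    what the formula gives literally when the whole row of S* is zero.
    """
    total = 0.0
    for u, q_u in enumerate(Q):
        avg = [0.0] * len(q_u)
        for v, w in S_star.get(u, {}).items():
            avg = [a + w * x for a, x in zip(avg, Q[v])]
        total += sum((a - b) ** 2 for a, b in zip(q_u, avg))
    return total


# Toy data: user 2 trusts users 0 and 1 equally and sits exactly between them.
S = {0: {1: 1.0}, 1: {0: 0.5}, 2: {0: 0.2, 1: 0.2}}
Q = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
S_star = row_normalize(S)
penalty = social_penalty(Q, S_star)
print(S_star[2], penalty)
```

User 2 contributes nothing to the penalty because their profile already equals the average of their friends' profiles; users 0 and 1 each trust only the other and contribute the squared distance between their two profiles.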

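2.3 Circle-Based Recommendation Model

The paper argues that a user may trust a given friend in some categories but not necessarily in others. (Sun Xiangguo: perhaps we could partition the R matrix by movie genre, just as the paper partitions it by category.)

2.3.1 Inferring Trust Circles

2.3.2 Assigning Trust Values

This section of the paper gives three ways to assign trust values.

(1) Equal trust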

$S^{(c)*}_{u,v}=\frac{1}{|\mathcal{C}_u^{(c)}|},\forall v \in \mathcal{C}_u^{(c)} \tag{3}$
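Formula (3) can be sketched in a few lines. The function name `equal_trust` and the use of a Python set for the circle $\mathcal{C}_u^{(c)}$ are assumptions made for this example.

```python
def equal_trust(circle):
    """Formula (3): every friend in the circle gets weight 1 / |circle|."""
    if not circle:
        return {}  # an empty circle yields no trust weights
    weight = 1.0 / len(circle)
    return {v: weight for v in circle}


# Hypothetical circle of user u in some category c.
weights = equal_trust({"alice", "bob", "carol", "dave"})
print(weights)
```

Because every friend gets the same weight, the row of $S^{(c)*}$ is already normalized and no further step is needed.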

(2) Expertise-based trust

The goal is to assign a higher trust value or weight to the friends that are experts in the circle / category. As an approximation to their level of expertise, we use the numbers of ratings they assigned to items in the category. The idea is that an expert in a category may have rated more items in that category than users who are not experts in that category.

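Let $\mathcal{C}_u^{(c)}$ be the set of friends that user $u$ trusts in category $c$ ($u$'s trust circle), let $\mathcal{F}_u^{(c)}$ be the set of users who trust $u$ in category $c$, and let $E_u^{(c)}$ denote $u$'s expertise level in category $c$.

Method 1:

Take the expertise level of user $v$ to be the number of ratings $v$ has given in the category: $E_v^{(c)}=N_v^{(c)}$. This gives: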

$S_{u,v}^{(c)}=\left\{\begin{matrix} N_v^{(c)}, & v\in \mathcal{C}_u^{(c)}\\ 0, & \text{otherwise} \end{matrix}\right. \tag{4}$

Normalizing gives:

$S_{u,v}^{(c)*}=\frac{S_{u,v}^{(c)}}{\sum_v S_{u,v}^{(c)}}\tag{5}$
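A minimal sketch of Method 1, combining formulas (4) and (5). The function name `expertise_trust`, the `n_ratings` dict, and the choice to return all-zero weights when nobody in the circle has rated anything are assumptions of this example.

```python
def expertise_trust(circle, n_ratings):
    """Formulas (4)-(5): weight friends by rating count, then normalize.

    circle:    friends trusted by u in category c.
    n_ratings: dict v -> number of ratings v gave in category c.
    """
    raw = {v: float(n_ratings.get(v, 0)) for v in circle}
    total = sum(raw.values())
    if total == 0:
        # Assumption: no rating activity in the circle means zero weights.
        return {v: 0.0 for v in circle}
    return {v: s / total for v, s in raw.items()}


# Hypothetical rating counts in category c.
trust = expertise_trust(["v1", "v2"], {"v1": 30, "v2": 10})
print(trust)
```

Here `v1` rated three times as many items as `v2` in the category, so `v1` receives three times the trust weight.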

Method 2:

In this case, the expertise level of user $v$ in category $c$ is the product of two components: the first component is the number of ratings that $v$ assigned in category $c$ , the second component is some voting value in category $c$ from all her followers in $\mathcal{F}_v^{(c)}$ . The intuition is that if most of $v$ ’s followers have lots of ratings in category $c$ , and they all trust $v$ , it is a good indication that $v$ is an expert in category $c$ .

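(3) Trust splitting

From (1) and (2) we can see that $u$'s trust in $v$ is computed independently per category: when considering how much $u$ trusts $v$ in category $c$, only category $c$ is taken into account. But is independence across categories really a good idea? Consider an example: suppose $u$ and $v$ both have ratings in $c_1$ and $c_2$, and $v$ has far more ratings in $c_1$ than in $c_2$. Then $u$'s trust in $v$ should arguably be higher in $c_1$ than in $c_2$. Neither of the two schemes above accounts for this, so the trust values in the two categories can be out of proportion with reality. Hence a third scheme: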

$S_{u,v}^{(c)}=\left\{\begin{matrix} \frac{N_v^{(c)}}{\sum_c N_v^{(c)}}, & v\in \mathcal{C}_u^{(c)}\\ 0, & \text{otherwise} \end{matrix}\right. \tag{6}$

The formula says that we look not only at the current category $c$ but also compare across the shared categories to determine the relative magnitude. As before, the values are then normalized:

$S_{u,v}^{(c)*}=\frac{S_{u,v}^{(c)}}{\sum_v S_{u,v}^{(c)}}\tag{7}$
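Formulas (6) and (7) can be sketched as follows. The function name `split_trust` and the nested dict `n_by_cat` are assumptions for this example, and the sketch assumes that `n_by_cat[v]` lists only the categories over which the sum in (6) should run.

```python
def split_trust(circle, n_by_cat, c):
    """Formulas (6)-(7): a friend's share of ratings in c, then normalize.

    circle:   friends trusted by u in category c.
    n_by_cat: dict v -> {category: number of ratings by v in that category}.
    """
    raw = {}
    for v in circle:
        counts = n_by_cat.get(v, {})
        total_v = sum(counts.values())
        raw[v] = counts.get(c, 0) / total_v if total_v else 0.0
    norm = sum(raw.values())
    return {v: (s / norm if norm else 0.0) for v, s in raw.items()}


# Hypothetical counts: v1 rates mostly movies, v2 is evenly split.
n_by_cat = {
    "v1": {"movies": 90, "books": 10},
    "v2": {"movies": 10, "books": 10},
}
movie_trust = split_trust(["v1", "v2"], n_by_cat, "movies")
book_trust = split_trust(["v1", "v2"], n_by_cat, "books")
print(movie_trust, book_trust)
```

With these numbers `v1` gets the larger weight for movies but the smaller one for books, even though both friends rated the same number of books. That cross-category comparison is what the first two schemes leave out.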

2.3.3 Model Training

Training a per-category model

See formula (2) in Section 2.2. We now apply this formula to train on each category $c$ separately, which gives:
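The per-category formula itself does not appear in this post. Rewriting formula (2) with every quantity restricted to category $c$, as the sentence above describes, gives the following sketch. The superscript notation and the symbol $L^{(c)}$ are assumptions made here and should be checked against the paper [1.141]:

$L^{(c)}=\frac{1}{2}\sum_{(u,i)\in obs.}(R^{(c)}_{u,i}-\hat{R}^{(c)}_{u,i})^2+\frac{\beta}{2}\sum_{u}\left((Q^{(c)}_u-\sum_vS^{(c)*}_{u,v}Q^{(c)}_v)(Q^{(c)}_u-\sum_vS^{(c)*}_{u,v}Q^{(c)}_v)^T\right)+\frac{\lambda}{2}\left(\left \|P^{(c)} \right \|_F^2+\left \|Q^{(c)} \right \|_F^2\right)$

with $\hat{R}^{(c)}=r_m^{(c)}+Q^{(c)}P^{(c)T}$, where $S^{(c)*}$ is the normalized trust matrix of category $c$ from Section 2.3.2.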

Parameters are estimated with gradient descent.
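As a hedged illustration of what such training could look like for a single category, the sketch below runs full-batch gradient descent on an objective of the form of formula (2). Everything concrete here is an assumption for this example: the function names, the dict layouts for the ratings `R_c` and the trust weights `S_c`, the hyperparameters, and the use of full-batch updates instead of whatever schedule the paper uses.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def residuals(Q, S):
    """d_u = Q_u - sum_v S[u][v] * Q_v for every user u."""
    out = []
    for u, q_u in enumerate(Q):
        avg = [0.0] * len(q_u)
        for v, w in S.get(u, {}).items():
            avg = [a + w * x for a, x in zip(avg, Q[v])]
        out.append([a - b for a, b in zip(q_u, avg)])
    return out


def objective(R, r_m, Q, P, S, beta, lam):
    err = sum((r - r_m - dot(Q[u], P[i])) ** 2 for (u, i), r in R.items())
    social = sum(dot(d, d) for d in residuals(Q, S))
    reg = sum(dot(row, row) for row in P + Q)
    return 0.5 * err + 0.5 * beta * social + 0.5 * lam * reg


def gradient_step(R, r_m, Q, P, S, beta, lam, lr):
    """One full-batch gradient descent step; returns the updated (Q, P)."""
    d = residuals(Q, S)
    # Regularization plus the direct social term beta * d_u.
    gQ = [[lam * q + beta * dk for q, dk in zip(Q[u], d[u])]
          for u in range(len(Q))]
    gP = [[lam * p for p in row] for row in P]
    # Indirect social term: u also appears in the average of each w who trusts u.
    for w, friends in S.items():
        for u, s_wu in friends.items():
            gQ[u] = [g - beta * s_wu * dk for g, dk in zip(gQ[u], d[w])]
    # Squared-error term over the observed ratings.
    for (u, i), r in R.items():
        e = r - r_m - dot(Q[u], P[i])
        gQ[u] = [g - e * p for g, p in zip(gQ[u], P[i])]
        gP[i] = [g - e * q for g, q in zip(gP[i], Q[u])]
    new_Q = [[q - lr * g for q, g in zip(Q[u], gQ[u])] for u in range(len(Q))]
    new_P = [[p - lr * g for p, g in zip(P[i], gP[i])] for i in range(len(P))]
    return new_Q, new_P


# Toy category: 3 users, 3 items, 2 latent factors.
random.seed(0)
R_c = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0,
       (1, 2): 1.0, (2, 1): 2.0, (2, 2): 1.0}
r_m = sum(R_c.values()) / len(R_c)          # mean rating as the offset
S_c = {0: {1: 1.0}, 1: {0: 0.5, 2: 0.5}}    # normalized trust; user 2 trusts nobody
Q = [[random.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(3)]
P = [[random.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(3)]

before = objective(R_c, r_m, Q, P, S_c, beta=0.5, lam=0.1)
for _ in range(300):
    Q, P = gradient_step(R_c, r_m, Q, P, S_c, beta=0.5, lam=0.1, lr=0.02)
after = objective(R_c, r_m, Q, P, S_c, beta=0.5, lam=0.1)
print(before, after)
```

Training one model per category then amounts to calling the same loop once per category, each time with that category's ratings and trust weights.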

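Training a global model

Given the sparsity of the data, we would like to extend the first line of the training formula above to the global level, that is: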

For the sections that follow, see [Overview of Recommender Systems, Part 6](http://blog.csdn.net/github_36326955/article/details/71408429).
