Recommender Systems Overview 6

Latest recommended article published on 2023-04-26 07:56:17

相国大人


Categories: MachineLearning, Dive into ML/DL

Article link: https://blog.csdn.net/github_36326955/article/details/71408429


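Writing this post took a lot of time and effort. If it helped you, please consider tipping the author, 相国大人. Even 0.1 yuan is a kind gesture. Support of this kind also helps build awareness of intellectual property across the industry. In return, I can stay in closer contact with you and offer more materials and technical support in related fields.

Donations will go to a children's book charity drive in Lhasa.

See also: 《春天里，我们的拉萨儿童图书馆，需要大家的帮助》 ("In Spring, Our Lhasa Children's Library Needs Everyone's Help")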

This is Part 6 of the series.

This part covers:

- References
- Hybrid recommendation based on social trust

References

[1.140] H. Ma, I. King, and M. R. Lyu, “Learning to recommend with social trust ensemble,” in Proc. 32nd Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, 2009, pp. 203–210.

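Hybrid recommendation based on social trust

Motivation

Problems common to modern recommendation algorithms:

- The user-item matrix is sparse. In many commercial recommender systems its density is below 1%.
- Users are usually assumed to be independent and identically distributed, i.e. a single model is used for every user's ratings. This ignores the social relationships between users and does not match the real world.
- Cold start (孙相国)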

Mining the user-item matrix alone therefore does not give good recommendations. The paper introduces trust awareness and combines it with the user's own tastes: a user's final decision is a balance between his or her own tastes and the trusted friends' favors. In the user-item matrix $R$, the paper regards $R_{i,j}$ as a blend of user $u_i$'s own tastes and the tastes of his or her trusted friends on the item $v_j$.

In terms of the users’ own tastes, we factorize the user-item matrix and learn two low-dimensional matrices, which are the user-specific latent matrix and the item-specific latent matrix. For the social trust graph, based on the intuition that users always prefer the items recommended by the friends they trust, we infer and formulate the recommendation problem purely based on their trusted friends’ favors.

[Question] The paper already assumes that $R$ is a blend of the user's tastes and the trusted friends' favors. Why, then, is factorizing $R$ said to yield only the user's tastes? Shouldn't the result be the blend of both?

Possible improvement: obtain the user's tastes from the user's action stream.

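Related work

The paper reviews two lines of work: 1. recommendation based on collaborative filtering; 2. recommendation based on social trust.

1. Recommendation based on collaborative filtering

(For more detail, see the author's other document 《推荐算法概述》, "Overview of Recommendation Algorithms".)

Traditional collaborative filtering focuses on the user-item matrix and has three branches: user-based, item-based, and model-based. The user-based and item-based branches are together called memory-based (or neighborhood-based) collaborative filtering.

Memory-based collaborative filtering predicts the target user's rating of an unseen item from the ratings of similar users (user-based), or from the ratings the target user gave to similar items (item-based), or from a combination of the two.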
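The user-based variant can be illustrated with a minimal sketch. The function name `predict_user_based`, the use of cosine similarity, the fallback to the user's mean rating, and the toy matrix are all assumptions made for illustration; the post does not prescribe a particular similarity measure.

```python
import numpy as np

def predict_user_based(R, mask, user, item, eps=1e-12):
    """Predict R[user, item] as a similarity-weighted average of other users' ratings.

    R    : (m, n) rating matrix (values where mask is False are ignored)
    mask : (m, n) boolean array, True where a rating was observed
    """
    Rz = np.where(mask, R, 0.0)                         # unobserved entries contribute 0
    norms = np.linalg.norm(Rz, axis=1)                  # per-user vector norm
    sims = Rz @ Rz[user] / (norms * norms[user] + eps)  # cosine similarity to the target user
    sims[user] = 0.0                                    # exclude the target user
    weights = sims * mask[:, item]                      # keep only users who rated the item
    total = weights.sum()
    if total <= eps:                                    # nobody similar rated it: use the user's mean
        return float(Rz[user].sum() / max(mask[user].sum(), 1))
    return float(weights @ Rz[:, item] / total)

# Toy example: 3 users, 3 items, 0 marks a missing rating.
R = np.array([[5, 3, 0],
              [5, 3, 4],
              [1, 5, 2]], dtype=float)
mask = R > 0
pred = predict_user_based(R, mask, user=0, item=2)
```

In this toy example user 0 is more similar to user 1 than to user 2, so the prediction is pulled toward user 1's rating of 4 rather than user 2's rating of 2.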

Model-based:

Model-based approaches include aspect models [7, 8, 21], the latent factor model [4], the Bayesian hierarchical model [24] and the ranking model [11]. Matrix factorization methods [16, 18, 19, 22] also belong here.

These methods focus on factorizing the user-item rating matrix using low-rank representations, and then utilize them to make further predictions. The motivation behind a low-dimensional factorization model is that there is only a small number of factors that are important, and a user’s preference vector is determined by how each factor applies to that user.

2. Recommendation based on social trust

[1,2,13,14,15]

[1] developed a set of five natural axioms that a trust-based recommendation system might be expected to satisfy, and then proved that no system can simultaneously satisfy all the axioms.

[14, 15] studied trust-aware recommender systems. Their work replaces the similarity finding process with the use of a trust metric, which is able to propagate trust over the trust network and to estimate a trust weight. The experiments on a large real dataset show that this work increases the coverage (number of ratings that are predictable) while not reducing the accuracy (the error of predictions).

[2] proposed a trust-based recommender system for the Semantic Web; this system runs on a server with the knowledge distributed over the network in the form of ontologies, and uses the Web of trust to generate the recommendations.

[13] developed a factor analysis method based on the probabilistic graphical model which fuses the user-item matrix with the users’ social trust networks by sharing a common latent low-dimensional user feature matrix.

1. Problem description

In the real world, a recommendation scenario includes two central elements:

1. the trust network and the favors of these friends, in Fig. 1(a);

2. the user-item rating matrix, in Fig. 1(b).

The problem we study in this paper is how to predict the missing values for the users effectively and efficiently by employing the trust graph and the user-item rating matrix.

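2. User feature description

In the paper, user features are obtained by factorizing the user-item matrix. In fact, we could also use third-party information and learn user features with an autoencoder. For the general workflow and the essence of recommendation by matrix factorization, see the notes of 2017/3/21.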

As noted earlier (2017/3/21), recommendation by matrix factorization amounts to finding the matrices $U,V$ that minimize the following loss (the formula was covered on 2017/3/21 and is not explained again here):

$Loss(U,V)=\mathcal{D}(R,g(U^TV))+\mathcal{R}(U,V)$

One feasible way to find suitable $U,V$ is Bayesian estimation. That is, we want to maximize the posterior

$p(U,V|R,\overrightarrow{\theta})$

which gives the maximum a posteriori (MAP) estimate of $U,V$.

By Bayes' theorem:

$p(U,V|R,\overrightarrow{\theta})\propto p(R|U,V,\overrightarrow{\theta_1})\,p(U,V|\overrightarrow{\theta_2})$

The paper assumes that $U$ and $V$ are independent, so this simplifies further to:

$p(U,V|R,\overrightarrow{\theta})\propto p(R|U,V,\overrightarrow{\theta_1})\,p(U|\overrightarrow{\theta_2})\,p(V|\overrightarrow{\theta_2})\qquad (6)$

So three probability distributions are needed: $p(R|U,V,\overrightarrow{\theta_1})$, $p(U|\overrightarrow{\theta_2})$ and $p(V|\overrightarrow{\theta_2})$. The rest of the paper follows this line of reasoning.

Following [1.140.19], the paper takes the conditional distribution of the observed rating matrix to be:

$p\left(R|U,V,\sigma_R^2\right)=\prod_{i=1}^m\prod_{j=1}^n\left[\mathcal{N}\left(R_{ij}|g\left(U_i^TV_j\right),\sigma_R^2\right)\right]^{I_{ij}^R}$
Reading the formula:

$\mathcal{N}\left(x|\mu, \sigma^2\right)$ is the probability density function of the Gaussian distribution (see note1) with mean $\mu$ and variance $\sigma^2$, and $I_{ij}^R$ is the indicator function that is equal to 1 if user $u_i$ rated item $v_j$ and equal to 0 otherwise. The function $g\left(x\right)$ is the logistic function (see note2) $g(x) = 1/(1 + \exp(-x))$, which makes it possible to bound the range of $U_i^TV_j$ within the range [0,1].

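note1: Why a Gaussian distribution?

Note that $R\approx U^TV$. We generally regard $U^TV$ as the main component of $R$; correspondingly $R_{ij}\approx U_i^TV_j$, i.e. $U_i^TV_j$ is the main component of $R_{ij}$ and the remainder is treated as noise around it. A Gaussian centered on that main component is a reasonable way to express this.

note2: The logistic function can be used to construct pseudo-probabilities and to normalize values. It is used here because the paper assumes $R_{ij}\in (0,1]$; since $U_i^TV_j$ can fall outside that range, it is normalized with the logistic function.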
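The observation model described above can be sketched in a few lines of NumPy. The function names, the random toy data, and the rescaling of 1..5 ratings into $(0,1]$ by dividing by 5 are assumptions made for this example; the post only states that $R_{ij}\in(0,1]$. Users and items are stored as columns of `U` and `V` so that `U.T @ V` matches $U_i^TV_j$.

```python
import numpy as np

def g(x):
    """Logistic function g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def log_likelihood(R, mask, U, V, sigma_r):
    """Sum of log N(R_ij | g(U_i^T V_j), sigma_r^2) over the observed entries.

    U : (d, m) user factors as columns, V : (d, n) item factors as columns.
    """
    mean = g(U.T @ V)                       # (m, n) Gaussian means, each inside (0, 1)
    sq_err = (R - mean) ** 2
    per_entry = -0.5 * np.log(2 * np.pi * sigma_r**2) - sq_err / (2 * sigma_r**2)
    return float(np.sum(per_entry[mask]))   # the indicator I_ij^R is realised as a boolean mask

rng = np.random.default_rng(0)
d, m, n = 2, 4, 5
U = 0.1 * rng.standard_normal((d, m))
V = 0.1 * rng.standard_normal((d, n))
R_raw = rng.integers(1, 6, size=(m, n)).astype(float)  # toy ratings on a 1..5 scale
mask = rng.random((m, n)) < 0.5                         # which ratings are observed
R = R_raw / 5.0                                         # assumed rescaling into (0, 1]
ll = log_likelihood(R, mask, U, V, sigma_r=0.5)
```

Because only the masked entries enter the sum, missing ratings have no influence on the likelihood, which is exactly the role of the exponent $I_{ij}^R$ in the formula.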

Prior (generating) distributions of $U,V$:

$p(U|\sigma_U^2)=\prod_{i=1}^m\mathcal{N}(U_i|0,\sigma_U^2I),\quad p(V|\sigma_V^2)=\prod_{j=1}^n\mathcal{N}(V_j|0,\sigma_V^2I)$

Here $I$ is the identity matrix, so $\sigma_U^2I$ and $\sigma_V^2I$ are the covariance matrices of zero-mean spherical Gaussians placed on each user vector $U_i$ and each item vector $V_j$.

Then, by equation $(6)$, we have:

$\begin{aligned}p(U,V|R,\sigma_R^2,\sigma_U^2,\sigma_V^2)&\propto p(R|U,V,\sigma_R^2)\,p(U|\sigma_U^2)\,p(V|\sigma_V^2)\\&=\prod_{i=1}^m\prod_{j=1}^n\left[\mathcal{N}\left(R_{ij}|g\left(U_i^TV_j\right),\sigma_R^2\right)\right]^{I_{ij}^R}\times\prod_{i=1}^m\mathcal{N}(U_i|0,\sigma_U^2I)\times\prod_{j=1}^n\mathcal{N}(V_j|0,\sigma_V^2I)\qquad (9)\end{aligned}$

The probabilistic graphical model corresponding to equation $(9)$ is shown in Fig. 2(a).
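As a worked step that the post does not spell out, taking the logarithm of posterior $(9)$ with the variances held fixed shows how this Bayesian view connects back to the loss-minimization view. The symbols $C$, $E$, $\lambda_U$ and $\lambda_V$ are introduced here only for this derivation:

$\ln p(U,V|R,\sigma_R^2,\sigma_U^2,\sigma_V^2)=-\frac{1}{2\sigma_R^2}\sum_{i=1}^m\sum_{j=1}^n I_{ij}^R\left(R_{ij}-g(U_i^TV_j)\right)^2-\frac{1}{2\sigma_U^2}\sum_{i=1}^m U_i^TU_i-\frac{1}{2\sigma_V^2}\sum_{j=1}^n V_j^TV_j+C$

where $C$ collects every term that does not depend on $U$ or $V$. Multiplying by $-\sigma_R^2$, maximizing the posterior is equivalent to minimizing

$E(U,V)=\frac{1}{2}\sum_{i=1}^m\sum_{j=1}^n I_{ij}^R\left(R_{ij}-g(U_i^TV_j)\right)^2+\frac{\lambda_U}{2}\sum_{i=1}^m\|U_i\|^2+\frac{\lambda_V}{2}\sum_{j=1}^n\|V_j\|^2,\quad \lambda_U=\frac{\sigma_R^2}{\sigma_U^2},\ \lambda_V=\frac{\sigma_R^2}{\sigma_V^2}$

This has the shape $\mathcal{D}+\mathcal{R}$ of the loss introduced at the start of this section: a squared error over the observed ratings plus an L2 penalty on the factors.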

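3. Recommendation based on trusted friends

Let $\mathcal{G} = (\mathcal{U},\mathcal{E})$ denote the social trust graph, where $\mathcal{U}$ is the set of $m$ users and $\mathcal{E}$ is the set of trust relations between users. Let $S = \{S_{ij}\}$ be the $m \times m$ social trust matrix, where $S_{ij} \in (0,1]$ is the degree to which user $u_i$ trusts user $u_j$ (and $S_{ij}=0$ when there is no trust relation). Note that $S$ is asymmetric, because $u_i$ trusting $u_j$ does not imply that $u_j$ trusts $u_i$.

Building on the previous section, the core idea of recommendation based on trusted friends is to take a weighted average of the trusted friends' ratings as the estimated rating of the target user. Concretely, if the trusted friends of user $i$ form the set $\mathcal{T}(i)$, the estimated rating of user $i$ for item $k$ is: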

$\hat{R}_{ik}=\frac{\sum_{j\in \mathcal{T}(i)}R_{jk}S_{ij}}{\left|\mathcal{T}(i)\right|}\qquad (10)$

Since $|\mathcal{T}(i)|$ is the same for every entry of the row $S_{i,:}$, it can be absorbed into $S$ by setting $S_{ij}\leftarrow S_{ij}/|\mathcal{T}(i)|$ for $j \in \mathcal{T}(i)$. Equation $(10)$ then simplifies to:

$\hat{R}_{ik}=\sum_{j\in \mathcal{T}(i)}R_{jk}S_{ij}$

In matrix form, with $S_{ij}=0$ for $j\notin\mathcal{T}(i)$:

$\begin{pmatrix} \hat{R}_{i1}\\ \hat{R}_{i2}\\ \vdots\\ \hat{R}_{in} \end{pmatrix}=\begin{pmatrix} R_{11} & R_{21} & \cdots & R_{m1}\\ R_{12} & R_{22} & \cdots & R_{m2}\\ \vdots & \vdots & \ddots & \vdots\\ R_{1n} & R_{2n} & \cdots & R_{mn} \end{pmatrix}\begin{pmatrix} S_{i1}\\ S_{i2}\\ \vdots\\ S_{im} \end{pmatrix}$

Hence:

$\hat{R}=SR$
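A small numeric sketch of this trust-weighted average follows. The helper name `normalize_trust` and the toy values are assumptions for illustration, and every friend's rating is taken to be observed, which real data does not guarantee.

```python
import numpy as np

def normalize_trust(S):
    """Divide each row i of S by |T(i)|, the number of users that user i trusts."""
    counts = (S > 0).sum(axis=1, keepdims=True)   # |T(i)| for every row
    return np.divide(S, counts, out=np.zeros_like(S), where=counts > 0)

# Toy data: 3 users, 2 items. S[i, j] > 0 means user i trusts user j.
S = np.array([[0.0, 1.0, 0.5],
              [0.0, 0.0, 0.0],    # user 1 trusts nobody
              [0.8, 0.0, 0.0]])
R = np.array([[0.2, 0.4],
              [1.0, 0.6],
              [0.4, 0.8]])

S_norm = normalize_trust(S)
R_hat = S_norm @ R    # R_hat[i, k] = sum_j S_norm[i, j] * R[j, k]
```

For user 0 this gives `R_hat[0] = 0.5 * R[1] + 0.25 * R[2]`. The all-zero row for user 1 shows a limitation of the purely trust-based estimate: a user without trusted friends gets no information from it.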
Since $\hat{R}$ is also an approximation of $R$ (regarded as its main component), the observation distribution of $R$ can again be taken to be Gaussian. Note that $R_{jk}$ in equation $(10)$ is replaced by $U_j^TV_k$, because the trusted friends of user $i$ have not always rated item $k$. In addition, the goal of this subsection is likewise to obtain the matrices $U,V$. The observation distribution of $R$ is therefore:

$p(R|S,U,V,\sigma_S^2)=\prod_{i=1}^m\prod_{j=1}^n\left[\mathcal{N}\left(R_{ij}|g\left(\sum_{k\in \mathcal{T}(i)}S_{ik}U_k^TV_j\right),\sigma_S^2\right)\right]^{I_{ij}^R}\qquad (14)$
where $S_{ik}$ is normalized by $|\mathcal{T}(i)|$, which is the number of trusted friends of user $u_i$ in the set $\mathcal{T}(i)$. $I_{ij}^R$ is the indicator function that is equal to 1 if user $i$ rated item $j$ and equal to 0 otherwise.

From equation $(14)$ we obtain a Bayesian estimate analogous to that of the previous section:

$p(U,V|R,S,\sigma_S^2,\sigma_U^2,\sigma_V^2)\propto p(R|S,U,V,\sigma_S^2)\,p(U|S,\sigma_U^2)\,p(V|S,\sigma_V^2)$

Assume that the users' trust network is independent of $U,V$ (in practical terms, the trust network among users is taken to be unrelated to the user-movie rating matrix, which the author regards as a natural assumption). The expression then simplifies further to:

$\begin{aligned}p(U,V|R,S,\sigma_S^2,\sigma_U^2,\sigma_V^2)&\propto p(R|S,U,V,\sigma_S^2)\,p(U|\sigma_U^2)\,p(V|\sigma_V^2)\\&=\prod_{i=1}^m\prod_{j=1}^n\left[\mathcal{N}\left(R_{ij}|g\left(\sum_{k\in \mathcal{T}(i)}S_{ik}U_k^TV_j\right),\sigma_S^2\right)\right]^{I_{ij}^R}\times\prod_{i=1}^m\mathcal{N}(U_i|0,\sigma_U^2I)\times\prod_{j=1}^n\mathcal{N}(V_j|0,\sigma_V^2I)\qquad (16)\end{aligned}$

The probabilistic graphical model corresponding to equation $(16)$ is shown in Fig. 2(b).
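The mean of the trust-based Gaussian can be computed for all user-item pairs at once. The sketch below assumes the trust-weighted form with the normalized weights $S_{ik}$ described in the "where" note, with $S_{ik}=0$ outside $\mathcal{T}(i)$; the function name `trust_only_mean` and the toy factors are hypothetical.

```python
import numpy as np

def g(x):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

def trust_only_mean(S, U, V):
    """Mean of the trust-based model: g(sum_k S_ik U_k^T V_j) for every pair (i, j)."""
    return g(S @ (U.T @ V))

S = np.array([[0.0, 0.5, 0.5],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])    # rows already normalized; user 2 trusts nobody
U = np.array([[0.2, -0.4, 1.0],
              [0.6,  0.8, 0.0]])   # columns are U_0, U_1, U_2
V = np.array([[1.0, 0.0],
              [0.0, 1.0]])         # columns are V_0, V_1

mean = trust_only_mean(S, U, V)
```

User 1 trusts only user 0, so its predicted means are exactly user 0's own scores passed through `g`. User 2 trusts nobody, so every prediction collapses to `g(0) = 0.5`, which is one reason to mix this model with the user's own factors.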

4. Social trust ensemble

With sections 2 and 3 in place, this subsection combines the two models, hence the name "ensemble":

$\begin{aligned}&p(U,V|R,S,\sigma^2,\sigma_U^2,\sigma_V^2)\\&\propto\prod_{i=1}^m\prod_{j=1}^n\left[\mathcal{N}\left(R_{ij}|g\left(\alpha U_i^TV_j+(1-\alpha)\sum_{k\in \mathcal{T}(i)}S_{ik}U_k^TV_j\right),\sigma^2\right)\right]^{I_{ij}^R}\\&\quad\times\prod_{i=1}^m\mathcal{N}(U_i|0,\sigma_U^2I)\times\prod_{j=1}^n\mathcal{N}(V_j|0,\sigma_V^2I)\end{aligned}$

The corresponding probabilistic graphical model is shown in Fig. 2(c).

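The parameters here are $\sigma,\sigma_U,\sigma_V$, and the hyperparameter is $\alpha$, which balances the user's own taste against the tastes of the trusted friends. Parameter estimation is then carried out by gradient descent.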
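A minimal gradient-descent sketch for the ensemble model follows. It assumes the objective obtained from the negative log posterior (squared error on observed ratings plus L2 penalties) and the trust-weighted mean with normalized weights $S_{ik}$, zero outside $\mathcal{T}(i)$. The names `rste_loss_and_grads` and `fit`, the regularization weights `lam_u` and `lam_v`, the value of `alpha`, the learning rate, and the toy data are all illustrative assumptions, not settings taken from the paper.

```python
import numpy as np

def g(x):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

def rste_loss_and_grads(R, mask, S, U, V, alpha, lam_u, lam_v):
    """Loss and gradients for U (d, m) and V (d, n), stored as column vectors."""
    m = S.shape[0]
    A = alpha * np.eye(m) + (1.0 - alpha) * S    # own taste mixed with trusted friends
    P = g(A @ (U.T @ V))                         # predicted ratings, shape (m, n)
    err = np.where(mask, P - R, 0.0)             # only observed entries contribute
    loss = (0.5 * np.sum(err**2)
            + 0.5 * lam_u * np.sum(U**2)
            + 0.5 * lam_v * np.sum(V**2))
    E = err * P * (1.0 - P)                      # chain rule through g, since g' = g (1 - g)
    grad_U = V @ E.T @ A + lam_u * U
    grad_V = U @ A.T @ E + lam_v * V
    return loss, grad_U, grad_V

def fit(R, mask, S, d=2, alpha=0.4, lam=0.001, lr=0.1, steps=500, seed=0):
    """Plain batch gradient descent from a small random initialisation."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = 0.1 * rng.standard_normal((d, m))
    V = 0.1 * rng.standard_normal((d, n))
    history = []
    for _ in range(steps):
        loss, gU, gV = rste_loss_and_grads(R, mask, S, U, V, alpha, lam, lam)
        history.append(loss)
        U -= lr * gU
        V -= lr * gV
    return U, V, history

# Toy data: 4 users, 5 items, ratings rescaled into (0, 1], rows of S already normalized.
rng = np.random.default_rng(1)
R = rng.integers(1, 6, size=(4, 5)) / 5.0
mask = rng.random((4, 5)) < 0.6
S = np.array([[0.0, 0.5, 0.5, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.5, 0.0, 0.5, 0.0]])
U, V, history = fit(R, mask, S)
```

Setting `alpha=1.0` reduces the mixing matrix `A` to the identity and recovers the plain matrix-factorization model of section 2, while `alpha=0.0` recovers the trust-only model of section 3.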

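5. Complexity analysis

It is worth noting that, when analyzing complexity, the paper takes into account that real online social networks follow a power-law distribution and estimates the complexity on that basis, which gives a better estimate. This practice is worth borrowing.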

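6. Possible improvements

Given the model shown in Fig. 2(c): a paper you read earlier was based on multiple contexts, so its probabilistic graphical model could be merged with this one.

In addition, predicting user behavior from the user's action stream amounts to deriving the user's preference sequence over time from that stream; adding a decay function then gives a user-item similarity.

For the following sections, see [Recommender Systems Overview 7](http://blog.csdn.net/github_36326955/article/details/71409631).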
