推荐系统概述2


凸共轭函数

扩展实值函数\(f: \mathbb{R}^n \mapsto [-\infty, \infty]\)的共轭凸函数\(f^*: \mathbb{R}^n \mapsto [-\infty, \infty]\)为:
\[ \begin{align*} f^*(\boldsymbol{y}) = \sup_{\boldsymbol{x} \in \mathbb{R}^n} \{ \boldsymbol{x}^\top \boldsymbol{y} - f(\boldsymbol{x}) \}, \ \boldsymbol{y} \in \mathbb{R}^n. \end{align*} \]
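作为补充,这里举一个简单的例子(非原文内容):取 \(f(\boldsymbol{x})=\frac{1}{2}\|\boldsymbol{x}\|_2^2\),对 \(\boldsymbol{x}\) 求导并令 \(\boldsymbol{y}-\boldsymbol{x}=0\),得 \(\boldsymbol{x}=\boldsymbol{y}\),于是
\[ \begin{align*} f^*(\boldsymbol{y}) = \sup_{\boldsymbol{x} \in \mathbb{R}^n} \left\{ \boldsymbol{x}^\top \boldsymbol{y} - \tfrac{1}{2}\|\boldsymbol{x}\|_2^2 \right\} = \tfrac{1}{2}\|\boldsymbol{y}\|_2^2, \end{align*} \]
也就是说这个函数的凸共轭是它自身。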

布雷格曼散度

[6]这篇文章描述了布雷格曼散度的适用场景。

If you have some abstract way of measuring the “distance” between any two points and, for any choice of distribution over points the mean point minimises the average distance to all the others, then your distance measure must be a Bregman divergence.
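为了直观感受这段话,下面用一小段 Python 做个数值验证(自行补充的示意,非原文内容):取 \(\phi(\boldsymbol{x})=\frac{1}{2}\|\boldsymbol{x}\|^2\),对应的 Bregman 散度就是欧氏距离平方的一半,此时均值点确实使平均散度最小。

```python
import numpy as np

def bregman(phi, grad_phi, x, y):
    """Bregman 散度 D_phi(x, y) = phi(x) - phi(y) - <∇phi(y), x - y>。"""
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

phi = lambda x: 0.5 * np.dot(x, x)      # phi 取二次函数
grad_phi = lambda x: x

rng = np.random.default_rng(0)
points = rng.normal(size=(200, 3))
mean = points.mean(axis=0)

avg_div = lambda c: np.mean([bregman(phi, grad_phi, x, c) for x in points])
# 均值点的平均散度应当小于任意其他候选点
print(avg_div(mean), min(avg_div(rng.normal(size=3)) for _ in range(5)))
```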


布雷格曼散度在矩阵上的扩充定义:

(公式图片缺失)


矩阵分解

矩阵分解在推荐系统中的通用流程:

(流程图缺失)

SVD分解及其在推荐系统中的应用

SVD分解的原理推导和python实现在我之前的博客里详细介绍过。请参考下面这个链接:

关于奇异值以及奇异值分解SVD的思考
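这里顺手补一个最简的截断 SVD 重构示例(自行补充,仅作演示;缺失评分先粗暴地用 0 填充,更合理的补全方式见后文"补全稀疏矩阵"部分):

```python
import numpy as np

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)          # 0 表示缺失,这里先直接当作 0

U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2                                              # 截断的秩
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]      # 低秩重构,作为缺失评分的预测
print(np.round(R_hat, 2))
```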

异构网络中,矩阵分解做推荐


In domains with more than one relation matrix, one could fit each relation separately; however, this approach would not take advantage of any correlations between relations. For example, a domain with users, movies, and genres might have two relations: an integer matrix representing users’ ratings of movies on a scale of 1–5, and a binary matrix representing the genres each movie belongs to. If users tend to rate dramas higher than comedies, we would like to exploit this correlation to improve prediction.

(From reference [3])

CMF

(公式图片缺失)

运用牛顿法进行参数估计,得到 U、V、Z。

HeteroMF

CMF 的一个缺点是:

\[ X=UV^T\\ Y=VW^T \]
这意味着,中间那种类型的实体,其 latent factor 在不同的 context 中是同一个 \(V\)。

1. 如果它在其中一个 context 中存在冷启动,那么这个 \(V\) 的学习将主要来自另一个 context。

2. 即便没有冷启动,如果两个 context 的数据不均衡,\(V\) 也会主要由 dominant context 决定。
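为了帮助理解"共享 \(V\)"带来的耦合,下面给出一个极简的 CMF 交替最小二乘示意(自行补充的草图,并非论文的牛顿法实现,变量名均为示意):

```python
import numpy as np

# 玩具数据:X ≈ U V^T(用户-物品),Y ≈ V W^T(物品-属性),两个 context 共享物品因子 V
rng = np.random.default_rng(0)
n_u, n_i, n_a, k = 30, 40, 10, 5
X = rng.random((n_u, n_i))
Y = rng.random((n_i, n_a))

U = rng.normal(scale=0.1, size=(n_u, k))
V = rng.normal(scale=0.1, size=(n_i, k))
W = rng.normal(scale=0.1, size=(n_a, k))
lam = 0.1  # L2 正则系数

def ridge_solve(A, B, lam):
    """argmin_Z ||A - Z B^T||_F^2 + lam ||Z||_F^2 的闭式解。"""
    return A @ B @ np.linalg.inv(B.T @ B + lam * np.eye(B.shape[1]))

for _ in range(50):
    U = ridge_solve(X, V, lam)        # U 只由 X 决定
    W = ridge_solve(Y.T, V, lam)      # W 只由 Y 决定
    # V 同时出现在两个分解中,由两个 context 的数据共同决定
    V = (X.T @ U + Y @ W) @ np.linalg.inv(U.T @ U + W.T @ W + lam * np.eye(k))

loss = np.linalg.norm(X - U @ V.T) ** 2 + np.linalg.norm(Y - V @ W.T) ** 2
print(round(float(loss), 3))
```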


读论文中,关于似然函数书写的一点心得。

你能猜出来接下来他要干嘛吗?

EM algorithm

论文研读2017/3/22

参考文献

[1] X. Yu, et al., “Recommendation in heterogeneous information networks with implicit user feedback,” in Proc. 7th ACM Conf. Recommender Syst., 2013, pp. 347–350.

[2] Y. Sun, J. Han, X. Yan, P. S. Yu, and T. Wu, “PathSim: Meta path-based top-k similarity search in heterogeneous information networks,” in Proc. VLDB, 2011.

[3] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, “BPR: Bayesian personalized ranking from implicit feedback,” in Proc. 25th Conf. Uncertainty Artif. Intell., 2009.

文献依赖关系

[1]首先把观测到的用户倾向沿着可能的元路径进行扩散[2]。然后使用矩阵分解的技术对扩散矩阵进行分解,并得到相应用户和项目的隐含特征。接着,基于这些隐含特征构建一个混合推荐模型,并采用贝叶斯排名优化技术[3]进行参数估计。

[1]的缺点是没有像HeteroMF那样考虑到多个上下文。

PathSim[2]

\[ s(x, y) = \frac{2 \times |\{ p_{x \rightsquigarrow y} : p_{x \rightsquigarrow y} \in \mathcal{P} \}|}{|\{ p_{x \rightsquigarrow x} : p_{x \rightsquigarrow x} \in \mathcal{P} \}| + |\{ p_{y \rightsquigarrow y} : p_{y \rightsquigarrow y} \in \mathcal{P} \}|} \]

参考前面说过的HeteSim。这个公式中\(:\)相当于\(|\)(集合描述中的"使得"),分母是自己到自己的路径数,分子是彼此之间的路径数(Tom->Mary 与 Mary->Tom 都算,所以要乘以 2)。
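下面用一小段 Python 做个示意(自行补充,非论文代码):设 \(M\) 是沿某条元路径的路径计数矩阵(例如元路径 U-M-U 下 \(M = W W^\top\),\(W\) 为用户-电影邻接矩阵),PathSim 可以直接由 \(M\) 的对角元和非对角元算出。

```python
import numpy as np

def pathsim(M):
    """M[x, y]:实体 x 与 y 之间沿元路径的路径条数(对称矩阵)。"""
    d = np.diag(M).astype(float)
    return 2.0 * M / (d[:, None] + d[None, :])

W = np.array([[1, 1, 0],     # 用户-电影邻接矩阵
              [1, 1, 1],
              [0, 0, 1]])
M = W @ W.T                  # 元路径 U-M-U 的路径计数
print(np.round(pathsim(M), 3))
```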

补全稀疏矩阵

  1. 定义user feedback matrix

(公式图片缺失)

  2. 用 PathSim 或类似的相似性测量方法得到 item pairs 的相似性矩阵

    By measuring the similarity of all item pairs with one meta-path, we can generate a symmetric similarity matrix, denoted as \(S \in R^{n×n}\). With \(L\) different meta-paths, we can calculate \(L\) different similarity matrices with different semantics accordingly, denoted as \(S^{(1)}, S^{(2)}, \cdots, S^{(L)}\).

  3. 对原始的稀疏矩阵\(R\)进行补全(原文把它称为 diffuse;见下方示意代码)

  4. 对补全后的矩阵进行低秩分解,得到用户和项目的低秩表示(对应下一节的目标函数)
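按我的理解,"diffuse"大致就是用元路径相似度把已观测偏好扩散到相似 item 上。下面是一个最小示意(自行补充;用矩阵乘法 \(\tilde{R} = R S\) 来表示扩散只是我的假设,具体形式以论文为准):

```python
import numpy as np

# R:稀疏的 user-item 反馈矩阵;S:某条元路径下的 item-item 相似度矩阵
R = np.array([[1, 0, 0],
              [0, 1, 0]], dtype=float)
S = np.array([[1.0, 0.6, 0.0],
              [0.6, 1.0, 0.3],
              [0.0, 0.3, 1.0]])

R_diffused = R @ S           # 已观测偏好沿相似度扩散到未观测 item
print(R_diffused)
```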

补充:

这个方法也可以用在传统 SVD 推荐中,可以参考一下。传统 SVD 对缺失值的补全方法:
1. 赋值为 0;
2. 赋值为均值;
3. 本文的方法。
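前两种补全方式非常直接,给一个小例子(自行补充;这里按列即按 item 取均值,实际也可以按行取):

```python
import numpy as np

R = np.array([[5.0, np.nan, 3.0],
              [np.nan, 4.0, np.nan],
              [1.0, 2.0, np.nan]])

R_zero = np.nan_to_num(R, nan=0.0)                     # 方式 1:缺失值补 0
col_mean = np.nanmean(R, axis=0)                       # 方式 2:按 item 均值补全
R_mean = np.where(np.isnan(R), col_mean, R)

U, s, Vt = np.linalg.svd(R_mean, full_matrices=False)  # 之后再做截断 SVD
print(np.round(R_mean, 2))
```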

目标函数

(公式图片缺失)

By repeating the above process for all L similarity matrices, we can now generate L pairs of representations of users and items \(\left(U^{(1)}, V^{(1)}, \cdots, U^{(L)}, V^{(L)}\right)\). Each low-rank pair represents users and items under a specific similarity semantics due to the user preference diffusion process. Considering that different similarity semantics could have different importance when making recommendations, we define the recommendation model as follows:

(公式图片缺失)
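根据上面引用的描述,这个混合推荐模型应当大致形如下式(自行补充的重建,\(\theta_l\) 为第 \(l\) 条元路径的权重,确切写法请以论文[1]为准):
\[ \hat{r}(u_i, e_j) = \sum_{l=1}^{L} \theta_l \cdot U_i^{(l)} \cdot {V_j^{(l)}}^{\top} \]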

参数估计

在上一节的公式3中,我们得到的是用户对某一个商品的估计评分。

我们的最终目标是希望向用户推荐topk个商品,并且这些商品要按照可能性来进行rank。因此接下来的任务,是构建一个衡量用户对给定排序认可程度的指标。也就是特定排序下的概率[3]。

  1. We use \(p(e_a > e_b; u_i|\theta)\) to denote the probability that user \(u_i\) prefers \(e_a\) over \(e_b\).

    具体的说:
    \[ p(e_a > e_b; u_i|\theta)=logistic\left( \hat {r}\left(u_i,e_a\right)- \hat {r}\left(u_i,e_b\right)\right) \]

  2. likelihood function:

    (公式图片缺失)

  3. objective function:

    (公式图片缺失)

  4. (公式图片缺失)

  5. 随机梯度下降估计参数
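把上面第 1–5 步串起来,下面给一个 BPR 式随机梯度下降的极简示意(自行补充,只演示单一隐因子对 \(U, V\) 的情形,没有体现论文中多条元路径的混合;变量与数据均为虚构):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 20, 30, 8
U = rng.normal(scale=0.1, size=(n_users, k))
V = rng.normal(scale=0.1, size=(n_items, k))
lr, reg = 0.05, 0.01

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# 训练三元组 (u, a, b):用户 u 偏好 item a 胜过 item b(这里用随机三元组示意)
triples = [(rng.integers(n_users), rng.integers(n_items), rng.integers(n_items))
           for _ in range(2000)]

for u, a, b in triples:
    uu, va, vb = U[u].copy(), V[a].copy(), V[b].copy()
    x_uab = uu @ (va - vb)               # \hat r(u,a) - \hat r(u,b)
    g = 1.0 - sigmoid(x_uab)             # d log sigmoid(x) / dx
    U[u] += lr * (g * (va - vb) - reg * uu)
    V[a] += lr * (g * uu - reg * va)
    V[b] += lr * (-g * uu - reg * vb)
```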

论文研读2017/3/23

参考文献

[1] C. Shi, C. Zhou, X. Kong, P. S. Yu, G. Liu, and B. Wang, “HeteRecom: A semantic-based recommendation system in heterogeneous networks,” in Proc. 18th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2012, pp. 1552–1555.

[2] C. Shi, X. Kong, P. S. Yu, and S. Xie, “Relevance search in heterogeneous networks,” in Proc. Int. Conf. Extending Database Technol., 2012.

[3] C. Shi, Z. Zhang, P. Luo, P. S. Yu, Y. Yue, and B. Wu, “Semantic path based personalized recommendation on weighted heterogeneous information networks,” in Proc. 24th ACM Int. Conf. Inf. Knowl. Manage., 2015, pp. 453–462.

文献依赖关系

[1]这篇论文中,作者运用基于路径的相关性测量,构建了一个异构网络下的 non-personalized recommendation。论文测量相似性的方法来自于 [2] 的 HeteSim。

基于语义的大众化推荐

[1]这篇论文中,作者运用基于路径的相关性测量,构建了一个异构网络下的 non-personalized recommendation。该推荐系统可以做 semantic recommendation 和 relevance recommendation,如下图所示:

(系统架构图缺失)

  1. Data extraction: it extracts data from different data source (e.g., database and web) to construct the network.
  2. Network modeling: it constructs the HIN with a given network schema. According to the structure of data, users can specify the network schema (e.g., bipartite, star or arbitrary schema) to construct the HIN database. The database provides the store and index functions of the node table and edge table of the HIN.
  3. Network analysis: it analyzes the HIN and provides the recommendation services. It first computes and stores the relevance matrix of object pairs by the path-based relevance measure. Based on the relevance matrix and efficient computing strategies, the system can provide the online semantic recommendation service. Through the weight learning method, it can combine the relevance information from different semantic paths and provide online relevance recommendation service.
  4. Recommendation service: it provides the succinct and friendly interface of recommendation services.

HeteSim[2]

(公式图片缺失)

  1. Essentially, HeteSim(s; t|P) is a pair-wise random walk based measure, which evaluates how likely s and t will meet at the same node when s follows along the path and t goes against the path.
  2. Since relevance paths embody different semantics, users can specify the path according to their intents. The semantic recommendation calculates the relevance matrix with HeteSim and recommends the top k objects.
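为了更直观地理解"pair-wise random walk"和"相遇"的含义,这里给一个针对长度为 2 的关联路径 A-B-C 的最简示意(自行补充的草图,按我对 HeteSim 归一化形式的理解写成,细节请以论文[2]为准):

```python
import numpy as np

def row_normalize(W):
    s = W.sum(axis=1, keepdims=True)
    return np.divide(W, s, out=np.zeros_like(W, dtype=float), where=s > 0)

def hetesim_abc(W_ab, W_cb):
    """路径 A-B-C:s(A 类)沿 A->B 前进,t(C 类)逆着路径沿 C->B 前进,
    比较二者到达中间类型 B 各结点的概率分布(做余弦式归一化)。"""
    Pa = row_normalize(W_ab.astype(float))
    Pc = row_normalize(W_cb.astype(float))
    num = Pa @ Pc.T
    norm = np.linalg.norm(Pa, axis=1)[:, None] * np.linalg.norm(Pc, axis=1)[None, :]
    return np.divide(num, norm, out=np.zeros_like(num), where=norm > 0)

W_ab = np.array([[1, 1, 0], [0, 1, 1]])   # 2 个 A 类结点 x 3 个 B 类结点
W_cb = np.array([[1, 0, 0], [0, 1, 1]])   # 2 个 C 类结点 x 3 个 B 类结点
print(np.round(hetesim_abc(W_ab, W_cb), 3))
```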

举一个例子:

(示例图缺失)

weight learning method

There are many relevance paths connecting the query object and related objects, so the relevance recommendation should comprehensively consider the relevance measures based on all relevance paths. It can be depicted as follows.

\[ Rel(s, t) = \sum_{i} \omega_i \, Rel_{\mathcal{P}_i}(s, t) \]

Although there can be infinite relevance paths connecting two objects, we only need to consider those short paths, since the long paths are usually less important.

现在的问题是我们如何决定 \(\omega_i\)。论文认为,\(\omega_i\) 由关联路径的重要性来表达;而关联路径的重要性可以用该路径的长度和强度来表达;路径的强度又由组成该路径的各个关系的强度来表达。

关系强度:

(公式图片缺失)

where \(O(A|R)\) is the average out-degree of type \(A\) and \(I(B|R)\) is the average in-degree of type \(B\) based on relation \(R\).
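\(O(A|R)\) 和 \(I(B|R)\) 可以直接从关系 \(R\) 的二部邻接矩阵统计出来,示意如下(自行补充;是否计入度为 0 的结点原文没有明确,这里简单地对全部结点取平均):

```python
import numpy as np

def avg_degrees(W):
    """W[a, b] = 1 表示类型 A 的结点 a 经关系 R 连到类型 B 的结点 b。"""
    O_A_R = W.sum(axis=1).mean()   # O(A|R):类型 A 沿 R 的平均出度
    I_B_R = W.sum(axis=0).mean()   # I(B|R):类型 B 沿 R 的平均入度
    return O_A_R, I_B_R

W = np.array([[1, 1, 0, 0],        # 3 个 A 类结点 x 4 个 B 类结点
              [0, 1, 1, 1],
              [0, 0, 0, 1]])
print(avg_degrees(W))              # (2.0, 1.5)
```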

路径强度:

(公式图片缺失)

关联路径重要性:

(公式图片缺失)

权重:

(公式图片缺失)

Efficient Computing Strategies

  1. 对于频繁使用的关联路径,进行离线计算,线上查询的策略。
  2. 快速矩阵乘法
  3. 矩阵稀疏化:去掉那些不太重要的结点。

基于语义的个性化推荐[3]

相似度测量方案的改进

在第2部分中,我们介绍过PathSim,但是它的测量是没有权重的。[3]将PathSim的方案扩展到了加权元路径中。

(示例图缺失)

但是,从情景来看,u1 和 u2 应该是最不相同的(u1 喜欢的,u2 都不喜欢),可是在这里它们的相似度却为 1。这是因为我们仅仅考虑了路径个数,却没有考虑路径上的分数值(权重)。因此本文对传统的这种相似性测量方案提出了改进。改进的措施是:将路径按照评分值(权重)进行分类,每一个类别的路径叫做 atomic meta path。当我们考虑评分值为 1 的路径时,就假定其他路径不存在,然后用传统的计算方案得到 u1 和 u2 在"评分值为 1"这一路径集下的相似度;接着依次考虑评分值为 2,...,5 的路径,逐一计算 u1 和 u2 对应的相似度。最后把这些相似度进行加和,得到改进后的相似度,如图:

(图片缺失)
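按上面的描述,把"按评分值拆分 atomic meta path、分别计算 PathSim 再求和"的过程写成一小段示意代码(自行补充,非论文实现):

```python
import numpy as np

def pathsim(M):
    d = np.diag(M).astype(float)
    denom = d[:, None] + d[None, :]
    return np.divide(2.0 * M, denom, out=np.zeros_like(denom), where=denom > 0)

def weighted_pathsim(R, scores=range(1, 6)):
    """R[u, i]:用户对 item 的评分,0 表示未评分。
    把元路径 U-I-U 按评分值拆成 atomic meta path,分别算 PathSim 再加和。"""
    total = np.zeros((R.shape[0], R.shape[0]))
    for s in scores:
        W = (R == s).astype(float)        # 只保留评分恰为 s 的那些边
        total += pathsim(W @ W.T)
    return total

R = np.array([[5, 5, 1],      # u1
              [1, 1, 5],      # u2:u1 喜欢的它都不喜欢
              [5, 4, 1]])     # u3
print(np.round(weighted_pathsim(R), 3))
```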

论文研读2017/3/24

1. 参考文献

[1.139] N. Srebro and T. Jaakkola, “Weighted low-rank approximations,” in Proc. 20th Int. Conf. Mach. Learn., 2003, pp. 720–727.

[1.141] X. Yang, H. Steck, and Y. Liu, “Circle-based recommendation in online social networks,” in Proc. 18th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2012, pp. 1267–1275.

2. 基于圈的在线社交推荐

2.1 本文贡献

孙相国说:

对于豆瓣数据集来说,情景感知这一方面,是可以明显得到的,即用户的动作流就是一种。而对于trust-aware(信任感知),我以为,除了VIP之间的网络也许有体现外,非VIP也许并没有什么体现。因为豆瓣总体上来看,社交功能并不是非常发达。用户在豆瓣上进行社交的周期也不长。

本文在方法上隶属于 trust-aware 这一范畴。这里的"圈"(circle)实际上就是基于特定用户视角下的 group。由于从已有数据提取 circle 难度非常大(Unfortunately, in most existing multi-category rating datasets, a user’s social connections from all categories are mixed together. Even if the circles were explicitly known, they may not correspond to particular item categories that a recommender system may be concerned with.),本文的贡献在于提供了推断 circle 的方法,并在此基础上建立基于 trust-aware 的推荐系统。

We propose a set of algorithms to infer category specific circles of friends and to infer the trust value on each link based on user rating activities in each category. To infer the trust value of a link in a circle, we first estimate a user’s expertise level in a category based on the rating activities of herself as well as all users trusting her. We then assign to users trust values proportional to their expertise levels. The reconstructed trust circles are used to develop a low-rank matrix factorization type of RS.

2.2 相关工作

关于矩阵分解,论文在这里表达为:

\(\hat{R}=r_m+QP^T\)

这样最小化目标函数为

\(\frac{1}{2}\sum_{(u,i)\in obs.}(R_{u,i}-\hat{R}_{u,i})^2+\frac{\lambda}{2}\left(\left \|P \right \|_F^2+\left \|Q \right \|_F^2\right)\)

其中\(r_m\)是基准预测,相关概念可以参考《推荐系统:技术、评估及高效算法》104页的5.3节.

本文在实验环节对比的 baseline model 是第36篇参考文献[18](A. P. Singh and G. J. Gordon. Relational learning via collective matrix factorization. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD’08), pages 650–658, 2008)中提出的 SocialMF Model。

The social network information is represented by a matrix \(S \in \mathbb{R}^{u_0 \times u_0}\), where \(u_0\) is the number of users. The directed and weighted social relationship of user \(u\) with user \(v\) (e.g. user \(u\) trusts/knows/follows user \(v\) ) is represented by a positive value \(S_{u,v} \in (0, 1]\). An absent or unobserved social relationship is reflected by \(S_{u,v} = s_m\), where typically \(s_m = 0\). Each of the rows of the social network matrix \(S\) is normalized to 1, resulting in the new matrix \(S^∗\) with \(S_{u,v}^∗ \propto S_{u,v}\)and \(\sum_v S_{u,v}^∗ = 1\) for each user \(u\) . The idea underlying SocialMF is that neighbors in the social network may have similar interests. This similarity is enforced by the second term in the following objective function, which says that user profile \(Q_u\) should be similar to the (weighted) average of his/her friends’ profiles \(Q_v\) (measured in terms of the square error):

\[ \frac{1}{2}\sum_{(u,i)\in obs.}(R_{u,i}-\hat{R}_{u,i})^2+\frac{\beta}{2}\sum_{u}\left((Q_u-\sum_vS^*_{u,v}Q_v)(Q_u-\sum_vS^*_{u,v}Q_v)^T\right)+\frac{\lambda}{2}\left(\left \|P \right \|_F^2+\left \|Q \right \|_F^2\right) \tag{2} \]

上式第二项中,\(Q_u\)是用户\(u\)在隐因子空间中的点,\(\sum_vS^*_{u,v}Q_v\)表达的是用户\(u\)的邻居加权后的点(权重是\(u\)对邻居的信任值)。论文认为用户邻居的加权平均应该与用户本身相似,因此第二项用最小平方误差来表达。我们可以采用随机梯度下降法对目标函数进行参数估计。
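下面针对公式 (2) 给一个梯度下降更新的极简示意(自行补充的草图;为了代码简短这里用的是全量梯度而不是随机梯度,基准预测 \(r_m\) 也简化为全局均值,数据均为虚构):

```python
import numpy as np

rng = np.random.default_rng(0)
n_u, n_i, k = 15, 20, 5
R = rng.integers(1, 6, size=(n_u, n_i)).astype(float)
obs = rng.random((n_u, n_i)) < 0.3                     # 观测到的 (u, i) 掩码
S = (rng.random((n_u, n_u)) < 0.1).astype(float)       # 社交/信任关系
np.fill_diagonal(S, 0)
S_star = S / np.maximum(S.sum(axis=1, keepdims=True), 1.0)   # 行归一化

r_m = R[obs].mean()                                    # 基准预测(这里取全局均值)
P = rng.normal(scale=0.1, size=(n_i, k))
Q = rng.normal(scale=0.1, size=(n_u, k))
lr, beta, lam = 0.01, 1.0, 0.1

for _ in range(200):
    E = obs * (r_m + Q @ P.T - R)                      # 只在观测处计算误差
    D = Q - S_star @ Q                                 # 用户与其好友加权平均之差
    Q -= lr * (E @ P + beta * (D - S_star.T @ D) + lam * Q)
    P -= lr * (E.T @ Q + lam * P)
```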

2.3 基于圈的推荐模型

本文认为,一个用户在某些类别中,可能会信任他的某个朋友,但未必在其他类别中,也信任同样的这个朋友。(孙相国:我们也许可以根据电影类别,来划分R矩阵,正如同本文根据category来划分R矩阵一样)

2.3.1 信任圈的推断

(图片缺失)

2.3.2 信任值的设定

这一节,论文提供了三种设定信任值的方法。

(1)均衡信任

\[ S^{(c)*}_{u,v}=\frac{1}{|\mathcal{C}_u^{(c)}|},\forall v \in \mathcal{C}_u^{(c)} \tag{3} \]

(2)基于专家的信任

The goal is to assign a higher trust value or weight to the friends that are experts in the circle / category. As an approximation to their level of expertise, we use the numbers of ratings they assigned to items in the category. The idea is that an expert in a category may have rated more items in that category than users who are not experts in that category.

设用户\(u\)在\(c\)领域信任的朋友集合为\(\mathcal{C}_u^{(c)}\)(\(u\)的信任圈 circle),在\(c\)领域信任\(u\)的用户集合为\(\mathcal{F}_u^{(c)}\),\(u\)在领域\(c\)的专家水平定义为\(E_u^{(c)}\)。

方法1:

认为用户\(v\)的专家水平等于其在该领域的 ratings 数量:\(E_v^{(c)}=N_v^{(c)}\),这样有:
\[ S_{u,v}^{(c)}=\left\{\begin{matrix} N_v^{(c)},v\in \mathcal{C}_u^c\\ 0,otherwise \end{matrix}\right. \tag{4} \]

归一化得到:
\[ S_{u,v}^{(c)*}=\frac{S_{u,v}^{(c)}}{\sum_v S_{u,v}^{(c)}}\tag{5} \]

方法2:

In this case, the expertise level of user \(v\) in category \(c\) is the product of two components: the first component is the number of ratings that \(v\) assigned in category \(c\), the second component is some voting value in category \(c\) from all her followers in \(F_v (c)\). The intuition is that if most of \(v\)’s followers have lots of ratings in category \(c\), and they all trust \(v\), it is a good indication that \(v\) is an expert in category \(c\).

(公式图片缺失)

(3)信任分割

从(1)(2)我们看到,\(u\)对\(v\)的信任值是领域独立的。也就是说,我们考虑\(u\)在领域\(c\)信任\(v\)时,只考虑\(c\)领域。但是领域之间独立真的好吗?看一个例子:如果\(u\)和\(v\)都在\(c_1, c_2\)有过 ratings,并且\(v\)在\(c_1\)的 rating 数量远多于其在\(c_2\)中的数量,那么\(u\)对\(v\)的信任似乎在\(c_1\)领域应该多于在\(c_2\)领域。可是我们之前的两种方案都没有考虑这一点,这就会造成两个领域的信任值与实际不符。因此这里提出第三种方案:
\[ S_{u,v}^{(c)}=\left\{\begin{matrix} \frac{N_v^{(c)}}{\sum_c N_v^{(c)}},v\in \mathcal{C}_u^c\\ 0,otherwise \end{matrix}\right. \tag{6} \]
上面公式的意思是,不仅仅只考虑当前领域\(c\),还要对共同领域做一个横向比较,以此来确定相对大小。当然同样的,接下来还是要做归一化:
\[ S_{u,v}^{(c)*}=\frac{S_{u,v}^{(c)}}{\sum_v S_{u,v}^{(c)}}\tag{7} \]
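把 2.3.2 节的三种信任值设定方式放在一起,用一小段 Python 对照实现(自行补充的示意,\(N\)、\(C\) 都是虚构的玩具数据):

```python
import numpy as np

# N[v, c]:用户 v 在领域 c 的评分数量;C[u, v] = 1 表示 v 在 u(于领域 c)的信任圈里
N = np.array([[10, 1],
              [ 2, 8],
              [ 5, 5]], dtype=float)
C = np.array([[0, 1, 1],
              [1, 0, 1],
              [0, 0, 0]], dtype=float)

def normalize_rows(S):
    s = S.sum(axis=1, keepdims=True)
    return np.divide(S, s, out=np.zeros_like(S), where=s > 0)

def equal_trust(C):
    return normalize_rows(C)                           # 公式 (3):圈内均分

def expert_trust(C, N, c):
    return normalize_rows(C * N[:, c])                 # 公式 (4)(5):按领域 c 内的评分数加权

def trust_splitting(C, N, c):
    share = N[:, c] / np.maximum(N.sum(axis=1), 1e-12)
    return normalize_rows(C * share)                   # 公式 (6)(7):先做领域间分割再归一化

print(np.round(equal_trust(C), 3))
print(np.round(expert_trust(C, N, 0), 3))
print(np.round(trust_splitting(C, N, 0), 3))
```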

2.3.3 模型训练

单个领域模型训练

参见2.2节的公式2,我们现在要用这个公式分别对各个领域\(c\)进行训练,因此有:

(公式图片缺失)

采用梯度下降法进行参数估计。

全局模型训练

考虑到数据的稀疏性,我们希望把上面的训练公式中第一行扩展到全局。即:

(公式图片缺失)

转载于:https://www.cnblogs.com/xiangguosun/p/6785401.html
