http://blog.csdn.net/pipisorry/article/details/39507699
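Non-personalized recommendation in recommender systems mainly covers aggregated opinion recommenders, product association recommendation, and time-sensitive recommendation.
For personalized recommendation, see [Recommender Systems: Personalized Recommendation - Collaborative Filtering] and [Mining Massive Datasets (MMDS) week 4: Recommendation System]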
Aggregated opinion recommenders
The Story of Zagat
。。。
Drawback of Zagat
Some early Zagat fans argue the guide has been getting worse. Why?
All the restaurants have similar scores
Too many mediocre restaurants with good scores
Too many excellent restaurants with mediocre scores
Reasons
Self-selection bias
Ratings from recent users
Users of B&B
Increased diversity of raters for fancy hotels and restaurants
No free water, free spaghetti at fancy restaurants
Free WIFI, free breakfast, free parking at fancy hotels
French meals at Guide Michelin
How to Compare
Item A: {5, 5, 4, 5, 5, 5}
Item B: {5}
Ranking Scores: Damped Means
Problem: few ratings
E.g., only one 5-star rating
Solution
Assume that everything is average without evidence
Ratings are evidence of non-averageness
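score = (sum of the item's ratings + k * mu) / (n + k)
k: a damping parameter (a larger k pulls scores more strongly toward mu)
mu: the average rating of all the items in the system
n: the number of ratings the item has received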
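As a minimal sketch of the damped mean, assuming the usual form (sum of ratings + k * mu) / (n + k): the values mu = 3.5 and k = 5 below are made up for illustration and do not come from the lecture. Items A and B are the two items from "How to Compare".

```python
def damped_mean(ratings, mu, k=5):
    """Damped mean: pull the item's score toward the global mean mu.

    With no ratings the score equals mu; each rating moves it away.
    """
    n = len(ratings)
    return (sum(ratings) + k * mu) / (n + k)


# Hypothetical global mean and damping strength (assumptions, not from the source).
mu = 3.5
item_a = [5, 5, 4, 5, 5, 5]
item_b = [5]

score_a = damped_mean(item_a, mu)  # about 4.23
score_b = damped_mean(item_b, mu)  # 3.75
print(score_a, score_b)
```

A single 5-star rating no longer beats six consistently high ratings, because Item B has too little evidence to move far from the average.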
Ranking Scores: Confidence Interval
Assume a certain distribution for the ratings
Lower and upper bound of statistical confidence interval (95%)
Choice of bounds
Lower bound: conservative recommendations
Upper bound: riskier, but can surface amazing recommendations
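The notes do not say which distribution to assume. One common choice, used here purely as an assumption, is to binarize the ratings (4 stars and above count as positive) and compute the Wilson score interval for a Bernoulli proportion. The threshold of 4 and the function name are illustrative.

```python
import math


def wilson_interval(positive, total, z=1.96):
    """Wilson score interval for a proportion; z = 1.96 gives roughly 95%."""
    if total == 0:
        return (0.0, 1.0)  # no evidence at all: the widest possible interval
    p = positive / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (center - margin, center + margin)


# Assumption: a rating of 4 or 5 stars counts as a positive vote.
item_a = [5, 5, 4, 5, 5, 5]
item_b = [5]

lower_a, upper_a = wilson_interval(sum(r >= 4 for r in item_a), len(item_a))
lower_b, upper_b = wilson_interval(sum(r >= 4 for r in item_b), len(item_b))
print(lower_a, upper_a)
print(lower_b, upper_b)
```

Ranking by the lower bound puts Item A well ahead of Item B, while ranking by the upper bound treats them as nearly equal, which is the conservative versus risky trade-off described above.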
Product association recommendation
{A simple form of context-aware non-personalized recommendation; in essence it is a data mining method, similar to frequent itemset mining}
Transaction database
Computing associations
X: the treatment is taken; !X: the treatment is not taken
Y: the outcome occurs; !Y: the outcome does not occur
How do we measure the effect of the treatment?
Two approaches:
Causal effect: lift
P(Y|X)-P(Y|!X) or P(Y|X)/P(Y|!X)
Adjust by looking at whether X makes Y more likely than not X (!X)
Focus on the increase in Y associated with X
For example, with X = "anchovy paste" and Y = "banana", the lift (ratio form) is small, close to 1; with X = "anchovy paste" and Y = "garlic paste", the lift is much greater than 1.
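The two lift variants above can be sketched over a toy transaction database. The baskets below are invented for illustration, chosen so that banana is unrelated to anchovy paste while garlic paste is strongly associated with it.

```python
def lift(transactions, x, y):
    """Return (difference, ratio) of P(Y|X) versus P(Y|!X)."""
    with_x = [t for t in transactions if x in t]
    without_x = [t for t in transactions if x not in t]
    if not with_x or not without_x:
        raise ValueError("need transactions both with and without x")
    p_y_given_x = sum(y in t for t in with_x) / len(with_x)
    p_y_given_not_x = sum(y in t for t in without_x) / len(without_x)
    diff = p_y_given_x - p_y_given_not_x
    ratio = p_y_given_x / p_y_given_not_x if p_y_given_not_x > 0 else float("inf")
    return diff, ratio


# Hypothetical baskets (not real data).
baskets = [
    {"anchovy paste", "garlic paste", "banana"},
    {"anchovy paste", "garlic paste"},
    {"banana", "milk"},
    {"milk"},
    {"banana", "garlic paste"},
    {"bread"},
]

print(lift(baskets, "anchovy paste", "banana"))        # (0.0, 1.0)
print(lift(baskets, "anchovy paste", "garlic paste"))  # (0.75, 4.0)
```

Banana is bought by half of the customers whether or not they buy anchovy paste, so the ratio is 1; garlic paste is four times as likely in baskets that contain anchovy paste.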
Association
Note: this is presumably mutual information.
Discussion
Context-aware recommendation
Simple context: conditioned on one item (product association recommendation is one example)
Complex context: conditioned on a set of items, or considering the sequence of these items
Non-personalized versus personalized recommendation
Non-personalized: computing P(Y|X) and P(Y|!X) over all the data
Personalized: computing P(Y|X) and P(Y|!X) over the data from the users who have similar tastes to a target user (in the lift computation, if the data come only from users similar to the current user, the non-personalized recommendation becomes a personalized one)
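Time-sensitive ranking for non-personalized recommendation
{Especially effective for news recommendation}
Time-shifting
News aggregator: old news is not interesting, even if it has many net upvotes.
Time-shifted ranking formulas
Ad-hoc methods
Reddit news ranking
Reddit is a news recommendation site; its ranking formula is as follows
U: #upvotes; D: #downvotes; t_post: the time the news item was posted
The second part of the formula handles time in a simple way. It also buries items with negative net votes.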
[Reddit]
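As a rough sketch, assuming the commonly cited form of Reddit's "hot" ranking, score = log10(max(1, |U - D|)) + sign(U - D) * t_post / 45000: the 45000-second constant and the convention that t_post is measured in seconds since a fixed epoch come from that commonly cited version, not from these notes.

```python
import math


def reddit_hot(ups, downs, t_post, time_scale=45000):
    """Vote term plus time term, following the commonly cited Reddit hot formula.

    t_post: seconds since a fixed epoch (assumption); later posts have larger values.
    """
    net = ups - downs
    order = math.log10(max(abs(net), 1))   # first part: log-scaled net votes
    sign = (net > 0) - (net < 0)           # +1, 0, or -1
    return order + sign * t_post / time_scale  # second part: time, signed by the vote


# Two posts with the same votes: the newer one ranks higher.
print(reddit_hot(100, 10, t_post=90000))
print(reddit_hot(100, 10, t_post=45000))
# A post with negative net votes is pushed down further the newer it is.
print(reddit_hot(0, 10, t_post=90000))
```

Because the time term is multiplied by the sign of the net vote, items with more downvotes than upvotes sink instead of rising with recency.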
Hacker News ranking
U: #upvotes; D: #downvotes; t_now: the current time; t_post: the time the news item was posted;
α = 0.8; γ = 1.8; P: an item-specific penalty term
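A minimal sketch using the symbols defined above, assuming the commonly cited form score = (U - D - 1)^α / (t_now - t_post)^γ * P. Measuring age in hours and clamping negative vote counts to zero are assumptions made here so that the example runs; they are not stated in these notes.

```python
def hn_score(ups, downs, t_now, t_post, alpha=0.8, gamma=1.8, penalty=1.0):
    """Votes raised to alpha, divided by age raised to gamma, times a penalty."""
    age = t_now - t_post  # assumed to be in hours
    if age <= 0:
        raise ValueError("t_now must be later than t_post")
    points = max(ups - downs - 1, 0)  # clamp so the power stays real-valued
    return (points ** alpha) / (age ** gamma) * penalty


# Same votes, different ages: the older item scores lower.
print(hn_score(50, 5, t_now=3, t_post=0))
print(hn_score(50, 5, t_now=24, t_post=0))
# A penalty below 1 demotes a specific item.
print(hn_score(50, 5, t_now=3, t_post=0, penalty=0.5))
```

With γ larger than α, age eventually outweighs any vote count, so old items decay even if they keep collecting upvotes.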
Personally, I find the Hacker News ranking algorithm more reliable.
ref: [ICT Luoping's recsys lessons, summer 2016]*