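Summary of the paper "Empirical Analysis of Predictive Algorithms for Collaborative Filtering"
1. Abstract
The paper analyzes only user-based collaborative filtering algorithms (Memory-Based and Model-Based Algorithms); the experiments cover both explicit and implicit data (Explicit and Implicit Data).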
Memory-based algorithms operate over the entire user database to make predictions. Model-based collaborative filtering, in contrast, uses the user database to estimate or learn a model, which is then used for predictions.
2. Memory-based algorithms
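2.1 Basic Definitions
Basic formula
Why subtract each user's mean vote? What is the coefficient for? What method is used to compute the weights between users?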
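The questions above can be explored with a minimal sketch of the basic prediction as I read it from the paper: the predicted vote is the active user's mean vote plus a normalized, weighted sum of the other users' deviations from their own means. Subtracting each user's mean removes that user's personal rating offset, and the coefficient normalizes the sum so that the absolute values of the weights add up to one. The name `predict_vote`, the dict-based vote storage, and the fallback to the mean when no weights are available are assumptions for illustration, not the paper's code.

```python
from typing import Callable, Dict

UserVotes = Dict[str, float]        # item -> vote, for one user
VoteDB = Dict[str, UserVotes]       # user -> that user's votes


def mean_vote(votes: UserVotes) -> float:
    """Average of the votes a single user has cast."""
    return sum(votes.values()) / len(votes)


def predict_vote(active: str, item: str, db: VoteDB,
                 weight: Callable[[UserVotes, UserVotes], float]) -> float:
    """Mean vote of the active user plus normalized weighted deviations."""
    base = mean_vote(db[active])
    weighted_sum = 0.0
    norm = 0.0  # sum of |w|; its inverse plays the role of the coefficient
    for user, votes in db.items():
        if user == active or item not in votes:
            continue  # only other users who voted on the item contribute
        w = weight(db[active], votes)
        weighted_sum += w * (votes[item] - mean_vote(votes))
        norm += abs(w)
    if norm == 0.0:
        return base  # assumption: fall back to the user's mean
    return base + weighted_sum / norm


db = {
    "a": {"x": 4, "y": 2},
    "b": {"x": 5, "y": 3, "z": 5},
    "c": {"x": 1, "y": 5, "z": 2},
}
equal_weights = predict_vote("a", "z", db, lambda va, vi: 1.0)
```

The `weight` argument is left as a parameter because the weighting method (correlation, vector similarity, and so on) is exactly what the following sections vary.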
2.2 Correlation
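A minimal sketch of the correlation weight, assuming the Pearson form in which the sums run over the items both users voted on and each user's mean is taken over all of that user's votes. The function name `pearson_weight` and the choice to return 0.0 when there is no overlap or no variance are assumptions for illustration.

```python
import math
from typing import Dict


def pearson_weight(va: Dict[str, float], vi: Dict[str, float]) -> float:
    """Pearson correlation between two users over their co-voted items."""
    common = set(va) & set(vi)
    if not common:
        return 0.0  # assumption: no shared items means no evidence
    mean_a = sum(va.values()) / len(va)
    mean_i = sum(vi.values()) / len(vi)
    num = sum((va[j] - mean_a) * (vi[j] - mean_i) for j in common)
    var_a = sum((va[j] - mean_a) ** 2 for j in common)
    var_i = sum((vi[j] - mean_i) ** 2 for j in common)
    den = math.sqrt(var_a * var_i)
    if den == 0.0:
        return 0.0  # assumption: a constant voter gets zero weight
    return num / den


w_same = pearson_weight({"x": 1, "y": 2, "z": 3}, {"x": 2, "y": 4, "z": 6})
w_opposite = pearson_weight({"x": 1, "y": 2, "z": 3}, {"x": 3, "y": 2, "z": 1})
```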
2.3 Vector Similarity
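A minimal sketch of the vector similarity weight, assuming the cosine form: each user is treated as a vector of votes, the numerator sums the products of votes on co-voted items, and each vector is normalized by the length computed over all items that user voted on. The name `vector_similarity` and the zero return for empty vectors are assumptions for illustration.

```python
import math
from typing import Dict


def vector_similarity(va: Dict[str, float], vi: Dict[str, float]) -> float:
    """Cosine of the angle between two users' vote vectors."""
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_i = math.sqrt(sum(v * v for v in vi.values()))
    if norm_a == 0.0 or norm_i == 0.0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    common = set(va) & set(vi)
    # Items missing for either user contribute zero to the dot product.
    dot = sum(va[j] * vi[j] for j in common)
    return dot / (norm_a * norm_i)


w_identical = vector_similarity({"x": 3, "y": 4}, {"x": 3, "y": 4})
w_swapped = vector_similarity({"x": 3, "y": 4}, {"x": 4, "y": 3})
```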
2.4 Extension to Memory-based algorithms
2.4.1 Default Voting
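Note: d is the default vote value; n is the number of items in the union of the items voted on by user a and user i; k is the number of additional items assumed to carry the default vote.
Why is this idea applied only as an improvement to formula (2)?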
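A minimal sketch of default voting, under the assumption that the extended weight is the Pearson correlation in its computational form, evaluated over the union of the two users' items with d filled in for missing votes, plus k extra items on which both users are assumed to vote d. The function name `default_voting_weight` and the zero return for a zero denominator are assumptions for illustration; the symbols d, n, and k follow the note above.

```python
import math
from typing import Dict


def default_voting_weight(va: Dict[str, float], vi: Dict[str, float],
                          d: float, k: int) -> float:
    """Correlation over the union of items, with default vote d and k extras."""
    union = set(va) | set(vi)
    n = len(union)
    a = [va.get(j, d) for j in union]  # fill missing votes with d
    b = [vi.get(j, d) for j in union]
    total = n + k
    # The k extra items contribute d to each sum and d*d to each product sum.
    s_ab = sum(x * y for x, y in zip(a, b)) + k * d * d
    s_a = sum(a) + k * d
    s_b = sum(b) + k * d
    s_aa = sum(x * x for x in a) + k * d * d
    s_bb = sum(y * y for y in b) + k * d * d
    num = total * s_ab - s_a * s_b
    den = math.sqrt((total * s_aa - s_a ** 2) * (total * s_bb - s_b ** 2))
    if den == 0.0:
        return 0.0  # assumption: no variance means zero weight
    return num / den


w_overlap_one = default_voting_weight({"x": 5, "y": 1}, {"y": 1, "z": 5}, d=3, k=0)
```

With plain correlation these two users share only item y, so there is almost nothing to correlate; filling the union with a default vote lets the non-overlapping votes count as evidence as well.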
2.4.2 Inverse User Frequency
Borrows the idea of inverse document frequency from information retrieval [1]
The idea is to reduce weights for commonly occurring words, capturing the intuition that they are not as useful in identifying the topic of a document, while words that occur less frequently are more indicative of topic.
Note: why is this idea applied only to formula (2)?
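A minimal sketch of inverse user frequency, assuming f_j = log(n / n_j), where n is the total number of users and n_j is the number of users who voted on item j, so an item everyone voted on gets weight zero. The second function shows one way to fold f_j into the correlation by treating it as a frequency weight on each co-voted item; that weighted form is my reading of the modification to formula (2), and both function names are assumptions for illustration.

```python
import math
from typing import Dict


def inverse_user_frequency(db: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """f_j = log(n / n_j) for every item that appears in the database."""
    n = len(db)
    counts: Dict[str, int] = {}
    for votes in db.values():
        for item in votes:
            counts[item] = counts.get(item, 0) + 1
    return {item: math.log(n / n_j) for item, n_j in counts.items()}


def iuf_correlation(va: Dict[str, float], vi: Dict[str, float],
                    f: Dict[str, float]) -> float:
    """Correlation over co-voted items, each weighted by its f_j."""
    common = set(va) & set(vi)
    w = sum(f[j] for j in common)
    s_ab = sum(f[j] * va[j] * vi[j] for j in common)
    s_a = sum(f[j] * va[j] for j in common)
    s_b = sum(f[j] * vi[j] for j in common)
    s_aa = sum(f[j] * va[j] ** 2 for j in common)
    s_bb = sum(f[j] * vi[j] ** 2 for j in common)
    den_sq = (w * s_aa - s_a ** 2) * (w * s_bb - s_b ** 2)
    if den_sq <= 0.0:
        return 0.0  # assumption: degenerate case gets zero weight
    return (w * s_ab - s_a * s_b) / math.sqrt(den_sq)


db = {
    "u1": {"x": 1, "y": 2, "z": 3},
    "u2": {"x": 2, "y": 4},
    "u3": {"x": 5},
    "u4": {"x": 3},
}
freq = inverse_user_frequency(db)
```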
3. Model-Based Algorithms
3.1 Basic Definitions
Note: user votes are integers ranging from 0 to m.
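A minimal sketch of the probabilistic view behind the model-based methods, assuming the prediction is the expected value of the vote: each possible vote i from 0 to m is multiplied by the probability, given the active user's known votes, that the user casts vote i. How those probabilities are estimated is the job of the model (cluster or Bayesian network) and is not shown here; the name `expected_vote` and the input check are assumptions for illustration.

```python
from typing import Sequence


def expected_vote(probs: Sequence[float]) -> float:
    """Expected vote, where probs[i] is the probability of vote value i."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Votes are the integers 0..m, so the list index is the vote value.
    return sum(i * p for i, p in enumerate(probs))


# Votes range over 0, 1, 2 (m = 2) with a hypothetical distribution.
prediction = expected_vote([0.1, 0.2, 0.7])
```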
3.2 Cluster Model
TODO
Not understood yet, 2017:10:20:20:29
3.3 Bayesian Network Model
TODO
Not understood yet, 2017:10:20:20:29
Experiment
TODO
Since 3.2 and 3.3 are not understood yet, this section will be read after those are clear.
References
[1] Introduction to Modern Information Retrieval, Salton and McGill, 1983