Lecture 7: Scalable Matrix Factorisation for Collaborative Filtering in Recommender Systems
1. Two classes of RecSys
- Content-based
- Collaborative filtering
2. Challenges of collaborative filtering
- Cold start
- Sparsity
- Scalability
3. Matrix factorisation for CF
Characterise items and users by vectors of latent factors learned from the user x item rating matrix
A strong match between a user's factors and an item's factors (a large inner product) -> good recommendation
Steps:
- Factorise the rating matrix with SVD
- Reduce the factor matrices to a low dimension k by keeping only the largest singular values
- The features that remain are the useful ones: they capture most of the structure in the ratings
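
The steps above can be sketched with NumPy on a small, fully observed rating matrix. This is only an illustration under stated assumptions: the matrix values are made up, the function name truncated_svd is hypothetical, and splitting the singular values evenly between user and item factors is one common convention, not something the lecture prescribes.

```python
import numpy as np

# Toy, fully observed rating matrix (rows = users, columns = items).
# The values are invented for illustration.
R = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])

def truncated_svd(R, k):
    """Return (user_factors, item_factors) built from the top-k singular triplets."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    # Assumption: share each singular value equally between the two factor matrices.
    scale = np.sqrt(s[:k])
    user_factors = U[:, :k] * scale       # shape (n_users, k)
    item_factors = Vt[:k, :].T * scale    # shape (n_items, k)
    return user_factors, item_factors

user_factors, item_factors = truncated_svd(R, k=2)

# Predicted rating for (user, item) is the inner product of their factor vectors.
R_hat = user_factors @ item_factors.T
print(np.round(R_hat, 2))
```

Keeping k = 2 of the 4 singular values gives the best rank-2 approximation of R in the least-squares sense; the discarded singular values account for the reconstruction error.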
4. How to handle missing values
- Impute them with the average ratings of the user and the item: expensive and inaccurate
- Model only the observed ratings directly (with regularisation to avoid overfitting)
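
A minimal sketch of the second option, fitting factors to the observed ratings only. Stochastic gradient descent is used here as one possible optimiser; the lecture does not say which one is intended. The function name sgd_mf, the toy ratings, and all hyperparameters (k, lr, reg, epochs) are assumptions chosen for illustration.

```python
import numpy as np

def sgd_mf(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=1000, seed=0):
    """Fit user/item factors by SGD over observed (user, item, rating) triples only."""
    rng = np.random.default_rng(seed)
    user_f = 0.1 * rng.standard_normal((n_users, k))
    item_f = 0.1 * rng.standard_normal((n_items, k))
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - user_f[u] @ item_f[i]       # error on this observed rating
            u_old = user_f[u].copy()              # keep the pre-update user vector
            # Gradient step on squared error plus L2 regularisation.
            user_f[u] += lr * (err * item_f[i] - reg * user_f[u])
            item_f[i] += lr * (err * u_old - reg * item_f[i])
    return user_f, item_f

# Observed ratings only; missing (user, item) pairs are simply absent.
ratings = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 1.0), (2, 2, 5.0)]

user_f, item_f = sgd_mf(ratings, n_users=3, n_items=3)

# Training error is measured on the observed entries only.
train_rmse = np.sqrt(np.mean([(r - user_f[u] @ item_f[i]) ** 2 for u, i, r in ratings]))
print("train RMSE:", round(float(train_rmse), 3))

# A missing entry can still be predicted from the learned factors.
print("predicted rating for user 0, item 2:", round(float(user_f[0] @ item_f[2]), 2))
```

The cost of one epoch is proportional to the number of observed ratings, not to the full size of the user x item matrix, which is why this approach suits sparse data.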
5. Keys to scalable ML
- Computation and storage should scale linearly with the data -> low-cost computation
- Perform parallel, in-memory computation -> many workers + reduced disk I/O
- Minimise network communication -> less parallelisation overhead; more parallelism is not always better
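6. Group users and items into blocks
- Reduce communication: send only one copy of each user's factor vector to each item block that needs it
- Precompute information from the rating matrix: the out-links of each user (which item blocks it contributes to) and the in-links of each item (which user factor vectors it depends on)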
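
The precomputation can be sketched in plain Python. This is a simplified, single-machine illustration, not the implementation of any particular library: the function name build_links, the modulo-based block assignment, and the toy ratings are all assumptions.

```python
from collections import defaultdict

def build_links(ratings, num_item_blocks):
    """Precompute routing information from (user, item, rating) triples.

    out_links[user] = set of item blocks that need this user's factor vector
    in_links[item]  = list of (user, rating) pairs this item depends on
    """
    out_links = defaultdict(set)
    in_links = defaultdict(list)
    for user, item, rating in ratings:
        # Assumption: items are assigned to blocks by item id modulo block count.
        out_links[user].add(item % num_item_blocks)
        in_links[item].append((user, rating))
    return dict(out_links), dict(in_links)

ratings = [(0, 0, 5.0), (0, 2, 3.0), (1, 1, 4.0), (2, 0, 1.0), (2, 3, 2.0)]
out_links, in_links = build_links(ratings, num_item_blocks=2)

# With blocking, a user's vector is sent once per item block that needs it,
# rather than once per rating.
messages_blocked = sum(len(blocks) for blocks in out_links.values())
messages_naive = len(ratings)
print(out_links)
print(messages_blocked, "messages with blocking vs", messages_naive, "without")
```

The saving grows with the number of ratings a user has inside the same item block, since all of those ratings share a single copy of the user's vector.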
7. Implicit feedback modelling
Implicit feedback: views, clicks, purchases, likes, shares
- Rating r = strength of the user's actions -> confidence level in the user's preference
- Construct a binary preference matrix P: p = 1 if r > 0, p = 0 if r = 0
- Factorise P -> latent factors that predict a user's preference for an item
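
A minimal sketch of these steps using confidence-weighted alternating least squares. The confidence formula c = 1 + alpha * r follows the common formulation for implicit-feedback factorisation; the lecture only states that r acts as a confidence level, so the formula, the function name implicit_als, the toy counts, and the hyperparameters (k, alpha, reg, iters) are assumptions. The dense loops are for readability and would not scale to real data.

```python
import numpy as np

def implicit_als(R, k=2, alpha=40.0, reg=0.1, iters=20, seed=0):
    """Factorise the binary preference matrix P, weighting errors by confidence."""
    P = (R > 0).astype(float)      # preference: 1 if r > 0, else 0
    C = 1.0 + alpha * R            # assumed confidence: grows with action strength
    n_users, n_items = R.shape
    rng = np.random.default_rng(seed)
    X = 0.1 * rng.standard_normal((n_users, k))   # user factors
    Y = 0.1 * rng.standard_normal((n_items, k))   # item factors
    lam = reg * np.eye(k)
    for _ in range(iters):
        # Fix item factors, solve a weighted least-squares problem per user.
        for u in range(n_users):
            Cu = np.diag(C[u])
            X[u] = np.linalg.solve(Y.T @ Cu @ Y + lam, Y.T @ Cu @ P[u])
        # Fix user factors, solve a weighted least-squares problem per item.
        for i in range(n_items):
            Ci = np.diag(C[:, i])
            Y[i] = np.linalg.solve(X.T @ Ci @ X + lam, X.T @ Ci @ P[:, i])
    return P, X, Y

# Toy interaction counts (e.g. number of clicks); 0 means no observed action.
R = np.array([
    [3.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])

P, X, Y = implicit_als(R)
scores = X @ Y.T                   # predicted preference of each user for each item
print(np.round(scores, 2))
```

Unlike the explicit-rating case, every (user, item) cell contributes to the loss here: unobserved cells count as weak evidence of no preference (confidence 1), while observed cells are weighted more heavily the stronger the user's actions.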