In this class, we'll talk about the following two things:
- Vectorization of the collaborative filtering algorithm
- Finding related products: when a user has recently been looking at one product, how to find other products related to it so that you can recommend them to that user
Vectorization
Let's work out an alternative way of writing out the predictions of the collaborative filtering algorithm. Figure-1 above shows our data set. I'll take all the ratings by all the users and group them into a matrix $Y$. We have five movies ($n_m = 5$) and four users ($n_u = 4$), so $Y$ is a $5 \times 4$ matrix. The $(i, j)$ element of this matrix, $y^{(i,j)}$, is the rating given to movie $i$ by user $j$.
Given the matrix of all the ratings, there is an alternative way of writing out all the predicted ratings of the algorithm, shown in figure-2. In particular, the rating that user $j$ is predicted to give movie $i$ is $(\theta^{(j)})^T x^{(i)}$. For example, $(\theta^{(1)})^T x^{(1)}$ is the predicted rating of movie one by user one, and so on. Figure-2 also shows each pair of real rating and predicted rating in a different color.
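Now, given this matrix of predicted ratings, there is a simpler, vectorized way of writing it out.
As can be seen in figure-3, define the matrix $X$ by taking the feature vectors of all the movies and stacking them in rows, so that row $i$ is $(x^{(i)})^T$. Likewise, define the matrix $\Theta$ by taking each per-user parameter vector and stacking them in rows, so that row $j$ is $(\theta^{(j)})^T$. Then the whole matrix of predicted ratings is:
$$X \Theta^T$$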
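Because the predictions factor into a product of two matrices whose rank is at most the number of features, the collaborative filtering algorithm also goes by another name: low rank matrix factorization.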
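To make the vectorization concrete, here is a minimal NumPy sketch that compares the per-pair loop with the single matrix product. The sizes match the five movies and four users above; the number of features (3) and the random values are assumptions for illustration only, not learned parameters.

```python
import numpy as np

# Sizes from the text: 5 movies, 4 users. The feature count is an assumption.
n_movies, n_users, n_features = 5, 4, 3

rng = np.random.default_rng(0)
X = rng.normal(size=(n_movies, n_features))     # row i holds the features of movie i
Theta = rng.normal(size=(n_users, n_features))  # row j holds the parameters of user j

# Loop version: one inner product per (movie, user) pair.
pred_loop = np.zeros((n_movies, n_users))
for i in range(n_movies):
    for j in range(n_users):
        pred_loop[i, j] = Theta[j] @ X[i]

# Vectorized version: the same predictions from a single matrix product.
pred_vec = X @ Theta.T

# The rank of the prediction matrix cannot exceed the number of features.
rank = np.linalg.matrix_rank(pred_vec)
```

Both versions produce the same movies-by-users matrix of predictions. The rank check illustrates why the name "low rank" applies: the product of a movies-by-features matrix and a features-by-users matrix has rank at most the number of features.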
Product recommendation
Having run the collaborative filtering algorithm (low rank matrix factorization) above, you can use the learned features to find related movies or products. Specifically, for each product or movie $i$, we have learned a feature vector $x^{(i)}$. Maybe you end up with features like those in figure-4.
Note that after you have learned the features, it's actually pretty difficult to go into them and come up with a human-understandable interpretation of what they really are. In practice, though, the algorithm usually learns features that are very meaningful for capturing the most important or most salient properties of a movie or product that cause users to like or dislike it.
Now, let's say we want to address the following problem:
Say you have some specific movie $i$. How do you find movies $j$ related to that movie?
For this, we can find a movie $j$ such that the distance between the feature vectors of movies $i$ and $j$ is small:
$$\| x^{(i)} - x^{(j)} \|$$
A small distance is a pretty strong indication that movies $i$ and $j$ are somehow similar, so someone who likes movie $i$ may like movie $j$ as well.
One specific example: if your user is looking at some movie $i$ and you want to recommend 5 new movies to them, find the 5 movies $j$ whose feature vectors have the smallest distance to the feature vector of movie $i$.
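The nearest-movie lookup can be sketched in a few lines of NumPy. The function name `most_similar_movies` and the toy feature values are hypothetical, chosen only for illustration, and Euclidean distance is assumed as the distance measure.

```python
import numpy as np

def most_similar_movies(X, i, k=5):
    """Return the indices of the k movies whose features are closest to movie i.

    X is a (number of movies) x (number of features) array of learned features.
    Hypothetical helper; Euclidean distance is an assumption.
    """
    X = np.asarray(X, dtype=float)
    distances = np.linalg.norm(X - X[i], axis=1)  # distance from movie i to every movie
    distances[i] = np.inf                         # never recommend the movie itself
    k = min(k, len(X) - 1)                        # cannot return more than the other movies
    return np.argsort(distances)[:k]

# Toy features (made up): column 0 might loosely mean "romance", column 1 "action".
X_toy = np.array([
    [0.90, 0.00],
    [1.00, 0.01],
    [0.99, 0.00],
    [0.10, 1.00],
    [0.00, 0.90],
    [0.95, 0.05],
])

closest = most_similar_movies(X_toy, 0, k=2)
```

Sorting all the distances is simple and fine for a small catalog. For a very large catalog, a partial sort or an approximate nearest-neighbor index would be a natural substitute.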