Recommender Systems - Collaborative filtering

In this class, we'll talk about an approach to building a recommender system called collaborative filtering. This algorithm has a very interesting property: it does what is called feature learning. By that I mean it can start to learn for itself what features to use.

Problem motivation

figure-1

Previously, as shown in figure-1, we assumed that for each movie someone had come and told us how romantic that movie was and how much action there was in it. But actually it can be very difficult and time-consuming to get someone to watch each movie and tell you this information. And often you'll need even more features than just these two. So where do you get these features from?

figure-2

So, as shown in figure-2, let's change the problem a bit and suppose we don't know the values of these features. Instead, let's say we've gone through each of our users, and each one has told us how much they like romantic movies and how much they like action-packed movies. For example, \theta ^{(1)}=\begin{bmatrix} 0\\ 5\\ 0 \end{bmatrix} means Alice told us she really likes romantic movies, so there is a "5" associated with feature "x_{1}", and she really doesn't like action movies, so there's a "0" for "x_{2}". Similarly for the other users Bob, Carol, and Dave. In general, if each user j tells us the value of \theta ^{(j)} for them, we can infer the values of the features x_{1} and x_{2} for each movie.

For example, consider the first movie, "Love at last", which is associated with feature vector x^{(1)}. Let's ignore its title and pretend we don't know what this movie is about. Because:

  1. Both Alice and Bob rated it 5 and they told us they like romantic movies
  2. Both Carol and Dave rated it 0 and they like action movies and hate romantic movies

We might reasonably conclude that this is probably a romantic movie. Thus it's possible that x^{(1)}_{1}=1.0 and x^{(1)}_{2}=0.

This example is mathematically simplified. What we're really asking is what feature vector should x^{(1)} be so that

(\theta ^{(1)})^{T}x^{(1)}\approx 5

Similarly:

(\theta ^{(2)})^{T}x^{(1)}\approx 5

(\theta ^{(3)})^{T}x^{(1)}\approx 0

(\theta ^{(4)})^{T}x^{(1)}\approx 0

From the above, we can infer that x^{(1)}=\begin{bmatrix} 1\\ 1.0\\ 0 \end{bmatrix} (the first entry is the intercept feature x_{0}=1).

Similarly, we can infer values for feature vectors of other movies.
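The inference above is just a small least-squares problem. Here is a minimal NumPy sketch of it; the \theta values and ratings are the assumed numbers from the figure-2 example, and fixing the intercept feature x_{0}=1 before solving is my own bookkeeping choice:

```python
import numpy as np

# Preference vectors theta^(j) for the four users (assumed values from
# the figure-2 example): [intercept, romance, action].
thetas = np.array([
    [0.0, 5.0, 0.0],   # Alice: loves romance, hates action
    [0.0, 5.0, 0.0],   # Bob
    [0.0, 0.0, 5.0],   # Carol: loves action, hates romance
    [0.0, 0.0, 5.0],   # Dave
])

# Ratings the four users gave "Love at last"
y = np.array([5.0, 5.0, 0.0, 0.0])

# The intercept feature x_0 is conventionally fixed to 1, so solve
# (theta^(j))^T x ~ y^(j) only for the remaining features x_1, x_2.
A = thetas[:, 1:]                                  # drop intercept column
feats, *_ = np.linalg.lstsq(A, y - thetas[:, 0] * 1.0, rcond=None)
x1 = np.concatenate(([1.0], feats))
print(np.round(x1, 2))                             # feature vector x^(1)
```

The solver recovers x^{(1)} = [1, 1, 0]: a strongly romantic, non-action movie, matching the reasoning above.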

Optimization algorithm

figure-3

Let's say our users have given us their preferences as \theta ^{(1)}, \theta ^{(2)},...,\theta ^{(n_{u})}. We can then pose an optimization problem to estimate x^{(i)}, the feature vector for movie i: sum over all the indices j for which we have a rating for movie i, and choose x^{(i)} to minimize the regularized cost function in figure-3. This is how we would learn the features for one specific movie.

figure-4

To learn the features for all the movies, we sum over all n_{m} movies and minimize the objective cost function in figure-4. You end up with a reasonable set of features for all the movies.
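As a concrete sketch of that objective, here is one way the figure-4 cost could be written in NumPy. The function name and argument layout are my own assumptions; the squared-error term only counts (i, j) pairs where a rating exists, and the regularization term penalizes the feature values:

```python
import numpy as np

def movie_features_cost(X, Theta, Y, R, lam):
    """Regularized cost for learning movie features X given fixed user
    preferences Theta (a sketch of the figure-4 objective).

    X:     (n_m, n) feature vector per movie
    Theta: (n_u, n) preference vector per user
    Y:     (n_m, n_u) ratings matrix
    R:     (n_m, n_u) indicator, R[i, j] = 1 if user j rated movie i
    lam:   regularization strength lambda
    """
    errors = (X @ Theta.T - Y) * R          # only count rated entries
    squared_error = 0.5 * np.sum(errors ** 2)
    reg = (lam / 2.0) * np.sum(X ** 2)      # penalize large feature values
    return squared_error + reg
```

With lam = 0 and features that reproduce every rating exactly, the cost is zero; turning lam up adds the regularization penalty on top.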

Collaborative filtering

figure-5

In the last class, we saw that if we have a set of movie ratings (r(i,j), y(i,j)), then given the features of the different movies (x^{(i)}, i=1,..,n_{m}), we can learn the parameters \theta ^{(j)}, j=1,..,n_{u}, which are the preferences of the different users. What we've shown earlier in this class is the reverse: if users are willing to give us the parameters \theta ^{(j)}, j=1,..,n_{u}, then we can estimate the features x^{(i)}, i=1,..,n_{m} for the different movies. So this is a kind of chicken-and-egg problem.

Then what we can do is:

  1. Randomly guess some initial values of \theta^{(j)}, j=1,2,...,n_{u}
  2. Estimate the features x^{(i)}, i=1,2,...,n_{m} for the different movies
  3. Based on these x^{(i)} from step 2, get an even better \theta ^{(j)}
  4. Based on the better \theta ^{(j)}, get a better x^{(i)} again
  5. Keep iterating like this, going back and forth.

It turns out that this actually works and will cause your algorithm to converge to a reasonable set of features for your movies and a reasonable set of parameters for your users. This is a basic collaborative filtering algorithm. We'll be able to improve it later to make it quite a bit more computationally efficient. But hopefully this gives you a sense of how you can formulate a problem where you simultaneously learn the parameters of different users and the features of different movies.
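The back-and-forth procedure above can be sketched with alternating gradient steps on X and \theta. This is a simplified toy version, not the efficient algorithm mentioned above; the function name, learning rate, iteration count, and regularization value are all assumed:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_collaborative(Y, R, n_features=2, lam=0.01, lr=0.02, n_iters=5000):
    """Alternate gradient steps on movie features X and user parameters
    Theta (a sketch of the iterate-back-and-forth procedure; all
    hyperparameters here are assumed values)."""
    n_m, n_u = Y.shape
    # Step 1: random initial guesses
    X = rng.normal(scale=0.1, size=(n_m, n_features))
    Theta = rng.normal(scale=0.1, size=(n_u, n_features))
    for _ in range(n_iters):
        # Steps 2-3: improve X with Theta held fixed...
        E = (X @ Theta.T - Y) * R            # errors on rated entries only
        X = X - lr * (E @ Theta + lam * X)
        # Step 4: ...then improve Theta with the better X held fixed
        E = (X @ Theta.T - Y) * R
        Theta = Theta - lr * (E.T @ X + lam * Theta)
    return X, Theta

# Toy ratings: 2 movies, 4 users, everyone has rated everything
Y = np.array([[5.0, 5.0, 0.0, 0.0],
              [0.0, 0.0, 5.0, 5.0]])
R = np.ones_like(Y)
X, Theta = fit_collaborative(Y, R)
```

After iterating, the reconstructed ratings X @ Theta.T come close to Y, even though both the features and the preferences started as random noise.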

The term collaborative filtering refers to the observation that when you run this algorithm with a large set of users, all of these users are effectively collaborating to get better movie recommendations for everyone. Every user who rates some subset of the movies is helping the algorithm learn better features. By rating a few movies myself, I help the system learn better features, and those features can then be used to make better predictions for everyone else. So there is a sense of collaboration in which every user is helping the system learn better features for the common good. This is collaborative filtering.

Next, we'll try to develop an even better technique for collaborative filtering.

