Netflix评分数据集提供了1亿条“用户X在2005年2月12日将电影Y评分为4.0”的记录。
The Netflix Prize data set gives 100 million records of the form “user X rated movie Y a 4.0 on 2/12/05”.
项目思路:
基于一位用户的历史评分行为,你能预测该用户对未来某部电影的评分吗?
Can you predict the rating a user will give on a movie from the movies that user has rated in the past, as well as the ratings similar users have given similar movies?
你能找出类似的电影或用户群体吗?
Can you discover clusters of similar movies or users?
你能预测2006年哪些用户给哪些电影的评分吗?
Can you predict which users rated which movies in 2006?
换言之,你的任务是预测2006年每一对被评分的可能性。
In other words, your task is to predict the probability that each pair was rated in 2006.
请注意,实际的评分是不相关的,我们只想知道2006年某个时候该用户是否对这部电影进行了评分。
Note that the actual rating is irrelevant, and we just want whether the movie was rated by that user sometime in 2006.
2006年用户给出评分的具体日期也无关紧要。
The date in 2006 when the rating was given is also irrelevant.
测试数据可以在以下网站找到:
The test data can be found at this website.
更多精彩文章请关注微信号: