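0x00 Download the MovieLens data
1) Download the data from http://grouplens.org/datasets/movielens/ (the u.item and u.data files used below are part of the MovieLens 100K dataset)
2) Each line of u.item is pipe-separated; its first two fields map a movie id to a movie title. u.data has four tab-separated columns: user id, movie id, rating, timestamp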
3) Read in the data
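movie_list = {}  # movie id -> title
# u.item in the 100K dataset is ISO-8859-1 encoded, so state the encoding explicitly
with open("./u.item", encoding="ISO-8859-1") as f:
    for line in f:
        # only the first two pipe-separated fields are needed
        (mid, title) = line.split('|')[0:2]
        movie_list[mid] = title
pref_by_people = {}  # user id -> {movie title: rating}
with open("./u.data") as f:
    for line in f:
        # the fourth column (timestamp) is ignored
        (uid, mid, rating) = line.split('\t')[0:3]
        if uid not in pref_by_people:
            pref_by_people[uid] = {}
        pref_by_people[uid][movie_list[mid]] = int(rating)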
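A quick way to sanity-check the loader is to run the same parsing logic on a few in-memory lines before pointing it at the real files. The sketch below is only an illustration: the helper name `load_prefs` and the sample rows are made up, and it assumes the pipe-separated u.item and tab-separated u.data layouts described in step 2.

```python
import io


def load_prefs(item_file, data_file):
    """Build {user id: {movie title: rating}} from two open text files."""
    movie_list = {}  # movie id -> title
    for line in item_file:
        mid, title = line.split('|')[0:2]
        movie_list[mid] = title

    prefs = {}
    for line in data_file:
        uid, mid, rating = line.split('\t')[0:3]
        # setdefault creates the inner dict the first time a user is seen
        prefs.setdefault(uid, {})[movie_list[mid]] = int(rating)
    return prefs


# Made-up sample rows in the same layout as u.item / u.data (not real data)
items = io.StringIO(
    "1|Toy Story (1995)|01-Jan-1995\n"
    "2|GoldenEye (1995)|01-Jan-1995\n"
)
data = io.StringIO(
    "196\t1\t3\t881250949\n"
    "196\t2\t5\t881250950\n"
    "22\t1\t4\t878887116\n"
)

prefs = load_prefs(items, data)
print(prefs)
```

User ids and movie ids stay strings here, just as in the loader above, so lookups such as prefs['196'] need string keys.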
4) Convert the data layout: {people:{movie:1}} -> {movie:{people:1}}
def TransfromPref(pref):
    re_pref = {}
    fo