Recommender Systems: An Analysis of the MovieLens 20M Dataset

Latest recommended article published on 2024-04-12 15:11:57

wishchin

Categories: Recommendation/Rank Systems, ReinforceLearning

Article link: https://blog.csdn.net/wishchin/article/details/78060662

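MovieLens is one of the longest-running recommender systems. It was created by the GroupLens research group in the Department of Computer Science and Engineering at the University of Minnesota, and it is a non-commercial, experimental site run for research purposes. MovieLens mainly combines Collaborative Filtering with Association Rules to recommend movies that users are likely to be interested in.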

References: the MovieLens entry on Baidu Baike; a summary of movie datasets

Dataset URL: https://grouplens.org/datasets/movielens/

This dataset (ml-20m) describes 5-star rating and free-text tagging activity from [MovieLens](http://movielens.org), a movie recommendation service. It contains 20000263 ratings and 465564 tag applications across 27278 movies. These data were created by 138493 users between January 09, 1995 and March 31, 2015. This dataset was generated on March 31, 2015, and updated on October 17, 2016 to update links.csv and add genome-* files.

Users were selected at random for inclusion. All selected users had rated at least 20 movies. No demographic information is included. Each user is represented by an id, and no other information is provided.

The data are contained in six files, `genome-scores.csv`, `genome-tags.csv`, `links.csv`, `movies.csv`, `ratings.csv` and `tags.csv`. More details about the contents and use of all these files follows.

This and other GroupLens data sets are publicly available for download at <http://grouplens.org/datasets/>.

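In short: the dataset records 5-star ratings and free-text tagging activity from MovieLens, and is meant for building user recommendations. It contains 20000263 ratings and 465564 tag applications on 27278 movies, made by 138493 users. The activity was collected between January 1995 and March 2015, and the dataset was updated on October 17, 2016 to refresh links.csv and add the genome-* files.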
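The figures quoted above can be checked against `ratings.csv`. Below is a minimal sketch using only the Python standard library. It assumes the file starts with the header `userId,movieId,rating,timestamp` (taken from the dataset's README, not stated in this post). The function name `summarize_ratings` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def summarize_ratings(fileobj):
    """Count ratings, users, and rated movies in a ratings.csv-style stream.

    Assumes a header row with at least the columns userId and movieId.
    """
    per_user = Counter()   # number of ratings contributed by each user
    movies = set()         # distinct movie ids that received a rating
    total = 0
    for row in csv.DictReader(fileobj):
        per_user[row["userId"]] += 1
        movies.add(row["movieId"])
        total += 1
    return {
        "ratings": total,
        "users": len(per_user),
        "rated_movies": len(movies),
        "min_ratings_per_user": min(per_user.values()) if per_user else 0,
    }


# Tiny in-memory sample standing in for the real file (values are made up).
sample = io.StringIO(
    "userId,movieId,rating,timestamp\n"
    "1,2,3.5,1112486027\n"
    "1,29,3.5,1112484676\n"
    "2,2,4.0,974820598\n"
)
summary = summarize_ratings(sample)
print(summary)
```

To run it on the real data, one could pass an open file instead of the sample, for example `open("ml-20m/ratings.csv", newline="", encoding="utf-8")`, where the path is an assumption about where the archive was unpacked. If the description above holds, `ratings` and `users` should match 20000263 and 138493, and `min_ratings_per_user` should be at least 20. The number of distinct `movieId` values in `ratings.csv` need not equal 27278, since a movie can be listed in `movies.csv` without having received any rating.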