ML / MF: Top-5 movie recommendation for users on the MovieLens movie-rating dataset using a matrix factorization algorithm (NMF)
# 1. Define the dataset
Sample rows of the ratings table:
userId | movieId | rating | timestamp |
1 | 1 | 4 | 964982703 |
1 | 3 | 4 | 964981247 |
1 | 6 | 4 | 964982224 |
1 | 47 | 5 | 964983815 |
1 | 50 | 5 | 964982931 |
1 | 70 | 3 | 964982400 |
1 | 101 | 5 | 964980868 |
1 | 110 | 4 | 964982176 |
1 | 151 | 5 | 964984041 |
1 | 157 | 5 | 964984100 |
Sample rows of the movies table:
movieId | title | genres |
1 | Toy Story (1995) | Adventure|Animation|Children|Comedy|Fantasy |
2 | Jumanji (1995) | Adventure|Children|Fantasy |
3 | Grumpier Old Men (1995) | Comedy|Romance |
4 | Waiting to Exhale (1995) | Comedy|Drama|Romance |
5 | Father of the Bride Part II (1995) | Comedy |
6 | Heat (1995) | Action|Crime|Thriller |
7 | Sabrina (1995) | Comedy|Romance |
8 | Tom and Huck (1995) | Adventure|Children |
9 | Sudden Death (1995) | Action |
10 | GoldenEye (1995) | Action|Adventure|Thriller |
11 | American President, The (1995) | Comedy|Drama|Romance |
The full ratings table as printed after loading:
userId movieId rating timestamp
0 1 1 4.0 964982703
1 1 3 4.0 964981247
2 1 6 4.0 964982224
3 1 47 5.0 964983815
4 1 50 5.0 964982931
... ... ... ... ...
100831 610 166534 4.0 1493848402
100832 610 168248 5.0 1493850091
100833 610 168250 5.0 1494273047
100834 610 168252 5.0 1493846352
100835 610 170875 3.0 1493846415
[100836 rows x 4 columns]
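The loading code is not shown here. As a minimal sketch, assuming the ratings live in a CSV file with the four columns printed above, the table could be read with pandas. The helper name `load_ratings` is hypothetical, and a few sample rows are read from an in-memory string so that the snippet runs on its own; with the real data one would pass the file path instead.

```python
import io

import pandas as pd

# A few rows copied from the sample above; with the real data, pass the CSV path instead.
SAMPLE_CSV = """userId,movieId,rating,timestamp
1,1,4.0,964982703
1,3,4.0,964981247
1,6,4.0,964982224
1,47,5.0,964983815
1,50,5.0,964982931
"""


def load_ratings(source):
    """Read a ratings table with columns userId, movieId, rating, timestamp."""
    return pd.read_csv(source)


ratings = load_ratings(io.StringIO(SAMPLE_CSV))
print(ratings.shape)  # (5, 4)
```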
# 2. Data preprocessing
# 2.1. Build the user-movie rating matrix
(610, 9724)
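The shape above corresponds to 610 users by 9724 rated movies. One way to get such a matrix from the long-format ratings table is a pandas pivot. This is a sketch rather than the original code: the toy values for users 2 and 3 are illustrative, and filling unrated cells with 0 is an assumption.

```python
import pandas as pd

# Toy ratings in the same long format as the ratings table (values are illustrative).
ratings = pd.DataFrame({
    "userId":  [1, 1, 2, 3],
    "movieId": [1, 3, 1, 6],
    "rating":  [4.0, 4.0, 5.0, 3.0],
})

# Rows become users, columns become movies; unrated cells are filled with 0.
rating_matrix = ratings.pivot(index="userId", columns="movieId", values="rating").fillna(0)
print(rating_matrix.shape)  # (3, 3)
```

Note that the columns are labelled with movieId values, while their positions run from 0 upward; the two are different and need to be mapped back when reading recommendations.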
# 3. Model training and inference
# 3.1. Build the model
user_vector
[[0.01526719 0. 0. ... 0.05609566 0.0136756 0. ]
[0. 0. 0.1684357 ... 0. 0. 0. ]
[0. 0. 0. ... 0. 0. 0.01672881]
...
[0.40784337 0. 0. ... 1.01080972 0. 0.95654618]
[0. 0. 0. ... 0. 0. 0. ]
[0. 0. 1.5909363 ... 0. 0. 3.87384557]]
item_vector
[[1.40945120e-01 1.27354584e-01 3.51527953e-01 ... 0.00000000e+00
0.00000000e+00 0.00000000e+00]
[1.34669831e-01 1.34021139e-01 0.00000000e+00 ... 0.00000000e+00
0.00000000e+00 0.00000000e+00]
[6.96397897e-01 8.60262712e-01 0.00000000e+00 ... 2.79393731e-02
2.79393731e-02 2.35939398e-02]
...
[1.91312235e+00 1.37028132e+00 4.63889253e-01 ... 0.00000000e+00
0.00000000e+00 0.00000000e+00]
[1.81885093e+00 1.20570539e+00 1.32260449e+00 ... 0.00000000e+00
0.00000000e+00 0.00000000e+00]
[1.28710844e-03 2.39310166e-01 0.00000000e+00 ... 0.00000000e+00
0.00000000e+00 0.00000000e+00]]
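NMF factors the non-negative rating matrix into a user-factor matrix (`user_vector`) and an item-factor matrix (`item_vector`), both non-negative, which matches the outputs above. The original model code is not shown; a library class such as scikit-learn's `sklearn.decomposition.NMF` is the usual choice. The following is a NumPy-only sketch of the classic multiplicative-update rule, given as an illustration and not as the method used here. The function name, the number of components, and the toy matrix are all assumptions.

```python
import numpy as np


def nmf_multiplicative(V, n_components, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (users x movies) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n_users, n_items = V.shape
    W = rng.random((n_users, n_components))  # user factors
    H = rng.random((n_components, n_items))  # item factors
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy rating matrix (4 users x 4 movies); 0 stands for "not rated".
V = np.array([[5.0, 3.0, 0.0, 1.0],
              [4.0, 0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0, 5.0],
              [0.0, 1.0, 5.0, 4.0]])
user_vector, item_vector = nmf_multiplicative(V, n_components=2)
print(user_vector.shape, item_vector.shape)  # (4, 2) (2, 4)
```

With this formulation, zeros are treated as observed ratings of 0 and not as missing values, which is one reason many predicted ratings stay close to 0.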
# 3.2. Model training
pred_ratings
[[2.62855445e+00 1.01779693e+00 8.05678172e-01 ... 0.00000000e+00
0.00000000e+00 4.33188277e-03]
[2.57480353e-01 1.44898956e-01 0.00000000e+00 ... 4.70598799e-03
4.70598799e-03 7.12138324e-03]
[8.40818766e-02 3.81618949e-02 2.94026610e-02 ... 0.00000000e+00
0.00000000e+00 0.00000000e+00]
...
[4.18288177e+00 2.47847625e+00 1.21541771e+00 ... 0.00000000e+00
0.00000000e+00 1.17176752e-02]
[8.63399333e-01 7.02331092e-01 3.40531982e-01 ... 0.00000000e+00
0.00000000e+00 9.68909148e-04]
[1.35691562e+00 2.29567381e+00 0.00000000e+00 ... 4.44497630e-02
4.44497630e-02 4.30147711e-02]]
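A predicted rating table like `pred_ratings` is typically obtained by multiplying the two factor matrices. A small sketch with hypothetical factors (3 users, 2 latent features, 4 movies):

```python
import numpy as np

# Small hypothetical factors: 3 users x 2 latent features, 2 latent features x 4 movies.
user_vector = np.array([[1.0, 0.0],
                        [0.5, 2.0],
                        [0.0, 1.0]])
item_vector = np.array([[2.0, 0.0, 1.0, 0.0],
                        [0.0, 1.5, 0.0, 2.0]])

# Each predicted rating is the dot product of a user's row and a movie's column.
pred_ratings = np.dot(user_vector, item_vector)
print(pred_ratings.shape)  # (3, 4)
```

For the second user, for example, the predictions are `0.5 * [2, 0, 1, 0] + 2 * [0, 1.5, 0, 2] = [1, 3, 0.5, 4]`.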
# 3.3. Model inference: recommend the 5 highest-rated movies to each user from the predicted rating table
# 3.3.1. Batch prediction for all users
user_id 1: [ 507 901 793 2077 1210]
user_id 2: [1938 2224 4791 3633 4131]
user_id 3: [897 224 910 938 901]
user_id 4: [908 950 945 659 705]
user_id 5: [314 418 510 138 134]
user_id 6: [260 0 546 35 385]
user_id 7: [1938 2224 277 257 1502]
user_id 8: [138 307 134 506 507]
user_id 9: [1938 2224 277 224 314]
user_id 10: [4421 5156 3569 7355 4070]
user_id 11: [257 509 508 506 322]
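The lists above appear to be column positions in the predicted rating matrix and not movieId values (a 0 shows up for user 6, while the movie IDs in the sample table start at 1). A sketch of how such per-user top-k positions can be computed with `np.argsort`; the function name and the toy scores are hypothetical.

```python
import numpy as np


def top_k_columns(pred_ratings, k=5):
    """Return, per user row, the column positions of the k largest predicted ratings."""
    # argsort is ascending, so reverse each row and keep the first k positions.
    return np.argsort(pred_ratings, axis=1)[:, ::-1][:, :k]


pred = np.array([[0.1, 0.9, 0.4, 0.7],
                 [0.8, 0.2, 0.6, 0.3]])
top2 = top_k_columns(pred, k=2)
for row, cols in enumerate(top2):
    print(f"user row {row}: {cols}")
```

This version ranks all movies, including ones the user has already rated; whether the original output excludes rated movies cannot be determined from the printed results.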
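# 3.3.2. Predict for a specified user, using that user's predicted ratings for movies they have not yet rated
The movie recommended to user 11 is: 257, ['Just Cause (1995)']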
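For a single user, the usual steps are: take that user's row of predicted ratings, mask out movies the user has already rated, pick the highest remaining scores, and map column positions back to movieId and title. The sketch below follows those steps under stated assumptions: `recommend_for_user` is a hypothetical helper, the rating matrix is a DataFrame indexed by userId with movieId columns, and the ratings for user 2 and all predicted scores are made up for illustration.

```python
import numpy as np
import pandas as pd


def recommend_for_user(user_id, rating_matrix, pred_ratings, movies, k=5):
    """Top-k unrated movies for one user, returned as (movieId, title) pairs."""
    row = rating_matrix.index.get_loc(user_id)            # row position of this user
    scores = pred_ratings[row].astype(float).copy()
    rated = rating_matrix.iloc[row].to_numpy() > 0
    scores[rated] = -np.inf                               # skip movies already rated
    top_cols = np.argsort(scores)[::-1][:k]
    top_cols = top_cols[np.isfinite(scores[top_cols])]    # drop masked entries if k is large
    movie_ids = rating_matrix.columns[top_cols]           # column position -> movieId
    titles = movies.set_index("movieId")["title"]
    return [(int(m), titles[m]) for m in movie_ids]


movies = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Toy Story (1995)", "Jumanji (1995)", "Grumpier Old Men (1995)"],
})
rating_matrix = pd.DataFrame([[4.0, 0.0, 4.0],
                              [0.0, 0.0, 5.0]], index=[1, 2], columns=[1, 2, 3])
pred = np.array([[3.9, 2.5, 3.8],
                 [4.2, 1.0, 4.9]])
print(recommend_for_user(2, rating_matrix, pred, movies, k=1))
```

Looking titles up by the movieId taken from the matrix columns, not by the raw column position, avoids mixing the two numbering schemes.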