This isn't really a "MapReduce" feature, but it can give you some significant speedups without any hassle.
I would actually use numpy to "vectorize" the operation and make your life easier. From there you just have to loop through the dictionary and apply the vectorized function, comparing each item against all the others.
import numpy as np

bnb_items = list(bnb.values())
for num in range(len(bnb_items) - 1):
    # compare this user against every user that comes after it
    sims = cosSim(bnb_items[num], bnb_items[num + 1:])
def cosSim(User, OUsers):
    """ Determines the cosine similarity between 1 user and all others.
    Returns an array the size of OUsers with the similarity measures.
    User is a single array of the items purchased by a user.
    OUsers is a LIST of such arrays (all the same length), one per other user.
    """
    # dot product between this user and every other user
    num = np.dot(OUsers, User)
    # product of the magnitudes (L2 norms) of this user and all the others
    denom = np.linalg.norm(OUsers, axis=1) * np.linalg.norm(User)
    return num / denom
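As a quick sanity check, with some made-up toy vectors (this assumes every user's purchase vector has the same length, one slot per item):

a = np.array([1, 0, 1, 1])
others = np.array([[1, 0, 1, 1],   # identical vector  -> similarity 1.0
                   [0, 1, 0, 0]])  # no items in common -> similarity 0.0
print(cosSim(a, others))           # prints [1. 0.]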
I haven't tested this code, so there may be some silly mistakes, but the idea should get you 90% of the way there.
This should give you a very significant speedup. If you still need it to be faster, there is a wonderful blog post implementing a "Slope One" recommender system here.
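To give a rough idea of what Slope One involves, here is a minimal sketch of the weighted variant; it is not the code from that post, and the {user: {item: rating}} dictionary format is just an assumption for illustration.

from collections import defaultdict

def slope_one_deviations(ratings):
    """ratings: {user: {item: rating}} (hypothetical format).
    Returns (dev, count): dev[i][j] is the average difference
    rating(i) - rating(j), count[i][j] is how many users rated both.
    """
    dev = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, ri in user_ratings.items():
            for j, rj in user_ratings.items():
                if i != j:
                    dev[i][j] += ri - rj
                    count[i][j] += 1
    for i in dev:
        for j in dev[i]:
            dev[i][j] /= count[i][j]
    return dev, count

def slope_one_predict(user_ratings, item, dev, count):
    """Weighted Slope One prediction of `item` for one user."""
    num, den = 0.0, 0
    for j, rj in user_ratings.items():
        if j != item and item in dev and j in dev[item]:
            num += (dev[item][j] + rj) * count[item][j]
            den += count[item][j]
    return num / den if den else None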
Hope that helps, Will