#Paper Reading# Amazon.com Recommendations: Item-to-Item Collaborative Filtering

Latest recommended article published on 2022-04-11 16:58:53

抖腿大刘

Article link: https://blog.csdn.net/u010496169/article/details/89254274

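This paper is essentially the basic item-based CF algorithm, much as textbooks present it, so there is little to add about the implementation. These notes only record points from the introduction that were new to me.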

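First, the challenges that e-commerce recommendation faces. The ones listed are fairly general; in practice each business has its own specific challenges: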

Large data volume: A large retailer might have huge amounts of data, tens of millions of customers and millions of distinct catalog items.
Real-time requirement: Many applications require the results set to be returned in realtime, in no more than half a second, while still producing high-quality recommendations.
Cold start for new users: New customers typically have extremely limited information, based on only a few purchases or product ratings.
Recommending to active users (it seems there are simply too many candidates, so this is really a ranking problem): Older customers can have a glut of information, based on thousands of purchases and ratings.
Volatile user behavior: Customer data is volatile: Each interaction provides valuable customer data, and the algorithm must respond immediately to new information.

Next, the traditional recommendation algorithms:

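user-cf: when a user arrives, find the users most similar to them, then recommend the top N items those users bought (strictly speaking there is a ranking step, but I think this description is fine). Advantage: the recommendations are fairly accurate. Drawback: very little can be computed offline (I am curious about this point), so it is hard to apply in real-time scenarios.
cluster models: use a clustering algorithm to find which cluster a user belongs to, then recommend the top items purchased jointly by the users in that cluster. Advantage: relatively more efficient. Drawback: the recommendations are less accurate.
search-based models: for example, if you bought a 林宥嘉 CD, the system extracts keywords such as 林宥嘉 or CD and recommends other 林宥嘉 merchandise or other CDs. Advantage: fast, and it can run in real time. Drawback: the recommendations are either too general (for example, any CD) or too narrow (for example, only 林宥嘉).
item-cf: the advantage is that the similarities can be computed offline, then simply loaded and used online.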
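
The offline/online split described for item-cf can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the paper's exact procedure: the input format (a dict of user id to purchased item ids), the function names `build_item_similarity` and `recommend`, and the toy data are all hypothetical, and cosine similarity over binary purchase vectors is assumed as the similarity measure.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def build_item_similarity(purchases):
    """Offline step: cosine similarity between items that share a buyer.

    purchases: dict mapping user id -> set of item ids (hypothetical format).
    """
    buyers = defaultdict(int)     # item -> number of distinct buyers
    co_bought = defaultdict(int)  # (item_a, item_b) -> number of shared buyers
    for items in purchases.values():
        for item in items:
            buyers[item] += 1
        for a, b in combinations(sorted(items), 2):
            co_bought[(a, b)] += 1

    # Cosine of two binary purchase vectors: shared / sqrt(|A| * |B|).
    similar = defaultdict(dict)
    for (a, b), shared in co_bought.items():
        score = shared / sqrt(buyers[a] * buyers[b])
        similar[a][b] = score
        similar[b][a] = score
    return similar


def recommend(similar, owned, top_n=3):
    """Online step: sum precomputed similarities to the items the user owns."""
    scores = defaultdict(float)
    for item in owned:
        for other, score in similar.get(item, {}).items():
            if other not in owned:
                scores[other] += score
    # Highest score first; ties broken by item id for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Toy data, invented purely for illustration.
purchases = {
    "u1": {"cd_a", "cd_b"},
    "u2": {"cd_a", "cd_b", "book_x"},
    "u3": {"cd_a", "book_x"},
    "u4": {"book_x", "book_y"},
}
similar = build_item_similarity(purchases)
print(recommend(similar, {"cd_b"}))  # ['cd_a', 'book_x']
```

In this sketch the expensive pairwise counting lives entirely in `build_item_similarity`, which could run as a batch job, while `recommend` only does lookups proportional to the number of items the user owns. That is the property that makes the approach suitable for the half-second real-time requirement mentioned above.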

A well-written blog post about reading this paper: