Recommend Amazon Movie: A Collaborative Approach

Item-based and user-based collaborative filtering for a recommendation system, with simple code.

Overview of Recommendation Systems

There are many recommendation methods, and each serves a different purpose. My previous article covered simple and content-based recommendation. Those are non-personalised recommenders, but that does not make them less useful than the others. They are very popular for tasks such as recommending the top music of the week or music of a similar genre.

This article focuses on collaborative filtering. This method compares your taste with that of similar people or items. It then recommends a list of items based on consumption similarity, suggesting what you are probably interested in. The method relies only on ratings.

There are two main variants of this method: item-based filtering and user-based filtering. Item-based filtering suggests items that are similar to what you have already liked. User-based filtering suggests items that people similar to you have liked but that you have not yet consumed.

Using the Amazon movie data, we will apply item-based and user-based filtering to find similar items to recommend and to identify users who have similar taste.

Analysis Overview

For both item-based and user-based filtering, we need to clean the data and arrange it into a matrix so that it can be used for analysis. All ratings need to be numeric and normalized, and cosine similarity is used to measure how similar two items or two users are.
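To make the similarity step concrete, here is a minimal sketch of pairwise cosine similarity written with NumPy only. The helper `cosine_sim_matrix` and the toy `ratings` array are hypothetical and not part of the original analysis, which uses scikit-learn's `cosine_similarity` instead. Rows are treated as items and columns as users, so transposing the matrix switches from item similarity to user similarity.

```python
import numpy as np

def cosine_sim_matrix(m):
    """Pairwise cosine similarity between the rows of m (hypothetical helper)."""
    m = np.asarray(m, dtype=float)
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # all-zero rows stay zero instead of dividing by 0
    unit = m / norms         # scale every row to unit length
    return unit @ unit.T     # dot products of unit vectors are cosines

# Hypothetical toy ratings: rows are items, columns are users.
ratings = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
])

item_sim = cosine_sim_matrix(ratings)    # item-to-item similarity
user_sim = cosine_sim_matrix(ratings.T)  # user-to-user similarity
print(item_sim)
print(user_sim)
```

In this toy data the first two items were rated by the same users, so they come out as maximally similar, while the third item shares no raters with them.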

Data Overview

The dataset contains 4,848 users and a total of 206 movies.

Implementation

Now, let's import the tools we will use for the analysis, load the data into a DataFrame, and clean it.

# Data handling, numerics, sparse matrices, and cosine similarity
import pandas as pd
import numpy as np
import scipy as sp
from scipy import sparse
from sklearn.metrics.pairwise import cosine_similarity

# Load the ratings and inspect their structure
amazon = pd.read_csv('.../Amazon.csv')
amazon.info()
amazon.head()

Then, we rearrange the data into matrix format, with user_id as the row index and name as the column index.

# Wide to long: one row per (user_id, name, rating)
amazon = amazon.melt(id_vars=['user_id'], var_name='name', value_name='rating')
# Long to matrix: users as rows, movies as columns
amazon_pivot = amazon.pivot_table(index=['user_id'], columns=['name'], values='rating')
amazon_pivot.head()
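The reshaping step can be tried on a tiny table first. The sketch below assumes the CSV is in wide format (one column per movie), which the `melt` call suggests but the article does not state. The frame `wide` and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical wide table: one row per user, one column per movie.
wide = pd.DataFrame({
    'user_id': ['u1', 'u2', 'u3'],
    'Movie1': [5.0, None, 3.0],
    'Movie2': [None, 4.0, 1.0],
})

# Wide -> long: one row per (user, movie) pair, including unrated pairs.
long_df = wide.melt(id_vars=['user_id'], var_name='name', value_name='rating')

# Long -> matrix: users as rows, movies as columns, NaN where unrated.
matrix = long_df.pivot_table(index=['user_id'], columns=['name'], values='rating')
print(long_df)
print(matrix)
```

Because `pivot_table` aggregates with the mean by default, duplicate ratings for the same user and movie would be averaged rather than raising an error.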

From here, we normalize the ratings so that each user's values are rescaled to the range 0 to 1 and become comparable across users. Then we turn the NaN values into 0 and keep only the users who still have at least one non-zero rating.

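# Min-max scale each user's ratings (axis=1 applies the function row by row)
amazon_normalized = amazon_pivot.apply(lambda x: (x-np.min(x))/(np.max(x)-np.min(x)), axis=1)
amazon_normalized.fillna(0, inplace=True)
# Transpose so that movies are rows and users are columns, then drop all-zero users
amazon_normalized = amazon_normalized.T
amazon_normalized = amazon_normalized.loc[:, (amazon_normalized !=0).any(axis=0)]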
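A small worked example shows what the normalization and filtering do. The frame `pivot` and the helper `min_max` are hypothetical stand-ins for `amazon_pivot` and the lambda above. One side effect worth noticing, under these assumptions: a user who gave every movie the same rating ends up with all zeros (0 divided by 0 becomes NaN, then 0) and is filtered out together with users who rated nothing.

```python
import numpy as np
import pandas as pd

# Hypothetical user-by-movie ratings; NaN marks an unrated movie.
pivot = pd.DataFrame(
    {
        'Movie1': [5.0, 4.0, np.nan],
        'Movie2': [1.0, 4.0, np.nan],
        'Movie3': [3.0, np.nan, np.nan],
    },
    index=['u1', 'u2', 'u3'],
)

def min_max(row):
    """Scale one user's ratings to [0, 1]; NaN entries are skipped by min/max."""
    return (row - row.min()) / (row.max() - row.min())

normalized = pivot.apply(min_max, axis=1).fillna(0)

# Movies as rows, users as columns; keep users with at least one non-zero value.
transposed = normalized.T
kept = transposed.loc[:, (transposed != 0).any(axis=0)]
print(normalized)
print(kept)
```

Here only `u1` survives: `u2` rated two movies identically and `u3` rated none.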

We are nearly there. Now we need to put the data into a sparse matrix.

amazon_sparse = sp.sparse.csr_matrix(amazon_normalized.values)
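The sparse format pays off because most users rate only a few movies, so most entries are 0. As a rough illustration on made-up data, the sketch below shows that a CSR matrix stores only the non-zero values and that `cosine_similarity` returns the same result for sparse and dense input. The array `dense` is hypothetical.

```python
import numpy as np
from scipy import sparse
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical normalized ratings: rows are movies, columns are users.
dense = np.array([
    [1.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.5],
])

sparse_m = sparse.csr_matrix(dense)  # keeps only the non-zero entries
print(sparse_m.nnz, 'stored values out of', dense.size)

sim_dense = cosine_similarity(dense)
sim_sparse = cosine_similarity(sparse_m)  # sparse input is accepted directly
print(np.allclose(sim_dense, sim_sparse))
```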

Let's look at item-based filtering recommendation.

item_similarity = cosine_similarity(amazon_sparse)
item_sim_df = pd.DataFrame(item_similarity, index=amazon_normalized.index, columns=amazon_normalized.index)
item_sim_df.head()

Every row and every column now corresponds to a movie, so the matrix is ready for the recommendation calculation.

def top_movie(movie_name):
    # Sort movies by similarity to movie_name; position 0 is normally the movie itself
    for item in item_sim_df.sort_values(by = movie_name, ascending = False).index[1:11]:
        print('Similar movie:{}'.format(item))

top_movie("Movie102")

These are the movies that are most similar to Movie102.
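Skipping the first position assumes that the queried movie always sorts first. If another movie has an identical rating vector, its similarity is also 1.0 and the order of the tie is not guaranteed. One possible variant, sketched here with a hypothetical helper `similar_items` and a made-up similarity table, removes the queried movie by label and returns the scores instead of printing them.

```python
import pandas as pd

def similar_items(sim_df, name, n=10):
    """Return the n items most similar to `name`, excluding `name` itself.

    Hypothetical helper; `sim_df` is a square item-by-item similarity table.
    """
    scores = sim_df[name].drop(labels=name)  # remove the self-similarity entry
    return scores.sort_values(ascending=False).head(n)

# Hypothetical similarity values for three items.
sim = pd.DataFrame(
    [[1.0, 0.9, 0.1],
     [0.9, 1.0, 0.4],
     [0.1, 0.4, 1.0]],
    index=['A', 'B', 'C'],
    columns=['A', 'B', 'C'],
)

top = similar_items(sim, 'A', n=2)
print(top)
```

With the article's data, the equivalent call would be `similar_items(item_sim_df, "Movie102")`.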

Let's look at user-based filtering recommendation. Who has similar taste to me?

user_similarity = cosine_similarity(amazon_sparse.T)
user_sim_df = pd.DataFrame(user_similarity, index = amazon_normalized.columns, columns = amazon_normalized.columns)
user_sim_df.head()

def top_users(user):
    # Sort once by similarity to the given user; position 0 is normally the user itself
    ranked = user_sim_df.sort_values(by=user, ascending=False)
    sim_values = ranked.loc[:, user].tolist()[1:11]
    sim_users = ranked.index[1:11]
    for sim_user, sim in zip(sim_users, sim_values):
        print('User #{0}, Similarity value: {1:.2f}'.format(sim_user, sim))

top_users('A140XH16IKR4B0')
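The code above stops at listing similar users. One possible way to turn that list into suggestions, in line with the description of user-based filtering earlier, is to score each movie the target user has not rated by the similarity-weighted ratings of the nearest users. This is only a sketch under assumptions: the function `recommend_for_user`, its parameters `k` and `n`, and the toy tables are hypothetical and not from the original article.

```python
import pandas as pd

def recommend_for_user(ratings, user_sim, user, k=2, n=3):
    """Score unseen items by similarity-weighted ratings of the k nearest users.

    ratings:  users x items table, 0 meaning "not rated" (hypothetical layout)
    user_sim: users x users similarity table
    """
    # The k most similar users, excluding the target user
    neighbours = user_sim[user].drop(labels=user).sort_values(ascending=False).head(k)
    # Weight each neighbour's ratings by similarity, then sum per item
    scores = ratings.loc[neighbours.index].mul(neighbours, axis=0).sum(axis=0)
    # Keep only items the target user has not rated
    unseen = ratings.loc[user] == 0
    return scores[unseen].sort_values(ascending=False).head(n)

# Hypothetical normalized ratings and user similarities.
ratings = pd.DataFrame(
    [[1.0, 0.0, 0.0],
     [1.0, 0.8, 0.0],
     [0.0, 0.0, 1.0]],
    index=['u1', 'u2', 'u3'],
    columns=['M1', 'M2', 'M3'],
)
user_sim = pd.DataFrame(
    [[1.0, 0.78, 0.0],
     [0.78, 1.0, 0.0],
     [0.0, 0.0, 1.0]],
    index=ratings.index,
    columns=ratings.index,
)

recs = recommend_for_user(ratings, user_sim, 'u1')
print(recs)
```

In this toy case `M2` ranks first for `u1`, because the most similar user `u2` rated it highly, while `M3` was only rated by a user with zero similarity.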

These examples show how to implement item-based and user-based filtering recommendation systems. Some of the code is from https://www.kaggle.com/ajmichelutti/collaborative-filtering-on-anime-data

Hope you enjoy!

Translated from: https://medium.com/analytics-vidhya/recommend-amazon-movie-a-collaborative-approach-9b3db8f48ad6
