Content-Based Recommender Systems

Introduction

Over time, we rely more and more heavily on online platforms and applications such as Netflix, Amazon and Spotify, and we find ourselves constantly having to choose from a wide range of options.

One may think that having many options is a good thing, as opposed to having very few, but an excess of options can lead to what is known as “decision paralysis”. As Barry Schwartz writes in The Paradox of Choice:

“A large array of options may discourage consumers because it forces an increase in the effort that goes into making a decision. So consumers decide not to decide, and don’t buy the product. Or if they do, the effort that the decision requires detracts from the enjoyment derived from the results.”

This also results in another, more subtle, negative effect:

“A large array of options may diminish the attractiveness of what people actually choose, the reason being that thinking about the attractions of some of the unchosen options detracts from the pleasure derived from the chosen one.”

An obvious consequence of this is that we end up making no effort to scrutinise multiple options unless it is made easier for us; in other words, unless the options are filtered according to our preferences.

This is why recommender systems have become a crucial component of platforms such as those mentioned above, in which users have a myriad of options available. Their success depends heavily on their ability to narrow down the set of options, making it easier for us to make a choice.

A major driver in the field is Netflix, which continuously advances the state of the art through its own research and which sponsored the Netflix Prize between 2006 and 2009, hugely energising research in the field.

In addition, Netflix’s recommender has a huge presence in the platform. When we search for a movie, we immediately get a selection of similar movies that we are likely to enjoy too:

[Image: Netflix - personal account]

Outline

This post starts by laying out the different paradigms in recommender systems and then takes a hands-on approach to a content-based recommender. I’ll be using the well-known MovieLens dataset to show how we could recommend new movies based on their features.

This is the first in a series of two posts (perhaps more) on recommender systems; the upcoming one will be on collaborative filtering.

Find a Jupyter notebook version of this post with all the code here.

Types of recommender systems

Most recommender systems make use of collaborative filtering, content-based filtering, or both; current recommender systems typically combine several approaches into a hybrid system. Below is a general overview of these methods:

  • Collaborative filtering: The main idea behind these methods is to use other users’ preferences and tastes to recommend new items to a user. The usual procedure is to find similar users (or items) and recommend new items that were liked by those users, and that presumably will also be liked by the user receiving the recommendation.

  • Content-based: Content-based recommenders instead use data exclusively about the items. For this we need at least a minimal understanding of the user’s preferences, so that we can then recommend new items with tags/keywords similar to those specified (or inferred) by the user.

  • Hybrid methods: As the name suggests, these combine collaborative filtering, content-based and other possible approaches. Nowadays most recommender systems are hybrid, as is the case with factorization machines.
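
To make the difference between the first two paradigms concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy data (`ITEM_TAGS`, `USER_LIKES`), the function names, and the choice of Jaccard overlap as the similarity measure are not taken from the MovieLens dataset or from the notebook that accompanies this post. The content-based function only compares item tags with the tags of items the user already liked, while the collaborative function only looks at what other users liked.

```python
# Toy data, invented for illustration only (not taken from MovieLens).
ITEM_TAGS = {
    "Alien": {"sci-fi", "horror"},
    "Aliens": {"sci-fi", "action", "horror"},
    "Heat": {"crime", "action"},
    "Toy Story": {"animation", "comedy"},
}
USER_LIKES = {
    "ana": {"Alien", "Heat"},
    "ben": {"Alien", "Aliens"},
    "cat": {"Toy Story"},
}


def jaccard(a, b):
    """Overlap of two sets: |a & b| / |a | b|, or 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def content_based(user, k=2):
    """Rank unseen items by tag overlap with the tags of items the user liked."""
    liked = USER_LIKES[user]
    # The user's "profile" is the union of the tags of everything they liked.
    profile = set().union(*(ITEM_TAGS[i] for i in liked))
    scores = {
        item: jaccard(profile, tags)
        for item, tags in ITEM_TAGS.items()
        if item not in liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


def collaborative(user, k=2):
    """Rank unseen items by the summed similarity of the other users who liked them."""
    liked = USER_LIKES[user]
    scores = {}
    for other, other_liked in USER_LIKES.items():
        if other == user:
            continue
        # Two users are similar when the sets of items they liked overlap.
        sim = jaccard(liked, other_liked)
        for item in other_liked - liked:
            scores[item] = scores.get(item, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)[:k]


print(content_based("ana", k=1))
print(collaborative("ana", k=1))
```

On this toy data both approaches happen to suggest the same movie for "ana", but for different reasons: the content-based score comes from shared tags, while the collaborative score comes from "ben" having liked one of the same movies. A hybrid system would combine signals of both kinds.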
