University of Minnesota Introduction to Recommender Systems: Notes, Week 1

TTransposition

Article link: https://blog.csdn.net/ttransposition/article/details/50812433

1. Introduction to Recommender Systems

Understand what a recommender system is
Some history and background

A Bit of History

Ants, Cavemen, and Early Recommender Systems
– The emergence of critics, whom people then simply follow
Information Retrieval and Filtering
Manual Collaborative Filtering
Automated Collaborative Filtering
The Commercial Era

Ants, Cavemen, and Early Recommender Systems

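Recommendation means helping people choose. Social navigation, i.e. following the crowd, is everywhere: it helps ants find food (marks left by others), helps cavemen work out which foods are edible (others' results: dead or not), and helps people in a stadium find the exit (follow the crowd, i.e. others' choices). These are all typical cases of using information from others. In recommender systems that information usually comes from critics.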

Information Retrieval

Like a search engine
- Static content base
– Invest time in indexing content
For example, a book index
- Dynamic information need
– Queries presented in “real time”
Queries are dynamic
- Common approach: TFIDF
– Rank documents by term overlap
– Rank terms by frequency
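
The TFIDF idea above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not how any particular search engine works: the toy corpus, the `tokenize` helper, and the exact weighting (raw term frequency times log(N/df)) are all chosen here for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus (the static content base).
docs = {
    "d1": "the ants follow the trail to food",
    "d2": "critics review the new movie",
    "d3": "ants and cavemen share food tips",
}

def tokenize(text):
    return text.lower().split()

# Indexing step: done once, ahead of any query.
term_counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
# Document frequency: in how many documents each term appears.
doc_freq = Counter(term for counts in term_counts.values() for term in counts)

def idf(term):
    # Rare terms get a higher weight; terms absent from the corpus get 0.
    df = doc_freq.get(term, 0)
    return math.log(len(docs) / df) if df else 0.0

def rank(query):
    # Score each document by summing tf * idf over the query terms.
    scores = {}
    for doc_id, counts in term_counts.items():
        score = sum(counts[t] * idf(t) for t in tokenize(query))
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)

print(rank("ants trail"))
```

The index is built once while queries arrive later, which mirrors the "static content, dynamic need" split described in this section.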

Information Filtering

Reverse assumptions from IR
– Static information need
– Dynamic content base
In IR the content is static and the query is dynamic; here the content is dynamic and the user's taste/preference is static
Invest effort in modeling user need
– Hand-created “profile”
For example, the tags a user follows (science fiction, biography)?
– Machine learned profile
– Feedback/updates
Pass new content through filters
Filter out the content the user is likely to enjoy and push it to them, or, inversely, filter out the content they do not want (spam email)
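
A rough sketch of the filtering setup: a fixed user profile, with new content passed through it as it arrives. The keyword weights, the threshold, and the function names are hypothetical; a real system would more likely learn the profile from feedback.

```python
# Hypothetical hand-created profile: keyword -> interest weight.
# A negative weight marks content the user wants filtered out.
profile = {"science": 1.0, "fiction": 1.0, "biography": 0.5, "lottery": -2.0}

def score(text, profile):
    # Sum the profile weights of the keywords that appear in the item.
    words = set(text.lower().split())
    return sum(weight for word, weight in profile.items() if word in words)

def filter_stream(items, profile, threshold=1.0):
    # The content base is dynamic: each arriving item is kept or dropped.
    return [item for item in items if score(item, profile) >= threshold]

incoming = [
    "new science fiction novel released",
    "you won the lottery claim now",
    "biography of a famous chef",
]
print(filter_stream(incoming, profile))
```

The same function covers both directions mentioned above: a high score means push it to the user, a low score means drop it, as a spam filter would.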

Manual Collaborative Filtering

Premise
– Information needs more complex than keywords or topics: quality and taste
complex: for example, I like science fiction, but it must not be too implausible
taste: sometimes hard to describe, for example what kind of girl I like
It’s easy to figure out if something’s about a topic, but it’s harder to figure out if it matches your taste.
Small Community: Manual
– Tapestry – database of content & comments
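Filter for the content you want through other people's comments, e.g. look for items that someone has commented on as "interesting". For example, picky as I am, when I shop on Taobao I only pick items whose reviews contain "no odor".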
– Active CF – easy mechanisms for forwarding content to relevant readers
For example, we often share information with friends over WeChat; that is filtering information for the friend (the content matches his or her taste)

Automated CF

The GroupLens Project (CSCW ’94)
Predict which articles you might like to read, based on a personalized match between you and other people who share your taste.

ACF for Usenet News
users rate items
users are correlated with other users
personal predictions for unrated items
Nearest-Neighbor Approach
find people with history of agreement
assume stable tastes
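
The "users are correlated with other users" step can be sketched with a Pearson correlation over co-rated articles. The ratings and user names below are made up, and the helper names `pearson` and `neighbors` are mine; this is a sketch of the general nearest-neighbor idea, not the actual GroupLens code.

```python
import math

# Hypothetical ratings: user -> {article: rating on a 1-5 scale}.
ratings = {
    "alice": {"a1": 5, "a2": 1, "a3": 4},
    "bob":   {"a1": 4, "a2": 2, "a3": 5, "a4": 5},
    "carol": {"a1": 1, "a2": 5, "a3": 2, "a4": 1},
}

def pearson(u, v):
    # Correlate two users over the articles both have rated.
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0
    xs = [ratings[u][i] for i in common]
    ys = [ratings[v][i] for i in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def neighbors(user):
    # Other users, sorted from most to least correlated with `user`.
    others = [v for v in ratings if v != user]
    return sorted(others, key=lambda v: pearson(user, v), reverse=True)

print(neighbors("alice"))
```

Here bob has a history of agreement with alice and carol a history of disagreement, so bob's rating of the unread article a4 would count for more when predicting for alice.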

It Works Meaningfully Well!
Personalized prediction was better at predicting whether somebody would like an article than simply looking at the average of what everybody said.

It is not clear to me how this result was obtained

Usenet trial: rating/prediction correlation
• rec.humor: 0.62 (personalized) vs. 0.49 (avg.)
• comp.os.linux.system: 0.55 (pers.) vs. 0.41 (avg.)
• rec.food.recipes: 0.33 (pers.) vs. 0.05 (avg.)
Individual tastes generally differ a lot
Significantly more accurate than predicting average or modal rating.
Higher accuracy when partitioned by newsgroup
???
Relationship with User Behavior
Twice as likely to read 4/5 articles as 1/2/3 articles
Users Like GroupLens
Some users stayed 12 months after the trial!

The Commercial Era

This is the era we are in now: NetEase Cloud Music, Amazon, and so on
[Figure: 2016-02-28_161210.png]
Weight the ratings of users with similar taste
[Figure: 2016-02-28_161432.png]

User-User Collaborative Filtering

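[Figure: 2016-02-28_162034.png]
First measure the distance between the target and everyone else; the shaded region in the figure marks the similar users. Then weight the similar users' ratings {2, 3} to get the target's score of 3.

When the user base explodes, this becomes very slow!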
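
The weighting step can be written as a similarity-weighted average. A minimal sketch: the similarity values are invented for illustration (the figure does not give them), and the weights are assumed to be non-negative.

```python
def predict(neighbor_ratings):
    # neighbor_ratings: list of (similarity, rating) pairs for one item,
    # one pair per similar user who rated that item.
    total_weight = sum(sim for sim, _ in neighbor_ratings)
    if total_weight == 0:
        return None  # no usable neighbors, so no prediction
    return sum(sim * r for sim, r in neighbor_ratings) / total_weight

# Hypothetical similarities; the two neighbors rated the item 2 and 3.
print(predict([(0.2, 2), (0.8, 3)]))
```

With these made-up weights the prediction is about 2.8, closer to the rating of the more similar neighbor. The cost problem is also visible: finding the neighbors requires comparing the target with every other user, so the work grows with the size of the user base.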

Recommenders

Tools to help identify worthwhile stuff

Filtering interfaces
E-mail filters, clipping services
Recommendation interfaces
Suggestion lists, “top-n,” offers and promotions
Prediction interfaces
Evaluate candidates, predicted ratings

A Little Vocabulary

Rating – expression of preference
– Explicit rating (direct from the user)
– Implicit rating (inferred from user activity)
Prediction – estimate of preference
Recommendation – selected items for user
Content – attributes, text, etc.
Collaborative – using data from other users

Historical Challenges

Collecting Opinion and Experience Data
Finding the Relevant Data for a Purpose
Computing the Recommendations
Presenting the Data in a Useful Way

Your First Assignment

We are building a class ratings dataset using the MovieLens infrastructure
– This will be used for several of the assignments
Your assignment is to rate movies through our interface:
– http://mooc.grouplens.org/ratemovies/

Welcome to the Course!

This section can be skipped...

Software Environment

easy….

Taxonomy of Recommender Systems (part 1 of 2)

Learning Objectives

To understand the different types of recommender systems
– A framework for analyzing recommender systems in general
– A specific overview of different recommendation algorithms
To acquire a roadmap for the rest of the course, based on the algorithms studied

Analytical Framework

Dimensions of Analysis

Domain
Purpose
Recommendation Context
Whose Opinions
Personalization Level
Privacy and Trustworthiness
Interfaces
Recommendation Algorithms

Domains of Recommendation

Content to Commerce and Beyond
– News, information, “text”
– Products, vendors, bundles (promotional bundles)
– Matchmaking (other people, e.g. blind dates?)
– Sequences (e.g., music playlists)
One particularly interesting property
– New items (e.g., movies, books, …)
– Re-recommend old ones (e.g., groceries, music)

Google can also be viewed as a recommender system for the web

Purposes of Recommendation

The recommendations themselves
– Sales
– Information
Education of user/customer
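Software command recommendation (in this case the metric should not be how readily users adopt the recommended commands; I think this is because whether a command gets used also depends on how well the command itself is designed)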
Build a community of users/customers around products or content
TripAdvisor

Feels just like Dianping...
ReferralWeb

Find technical expertise using keywords: it would mine the network of collaborators that you had.
Then, if you were looking for an expert in something, perhaps recommender systems, it would find experts who were close to you in your social graph.

There does not seem to be much recommendation technology here, but exploiting the social network is kind of interesting
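
To make "experts close to you in your social graph" concrete, here is a breadth-first search over a collaboration graph. This is my own illustration of the idea, not ReferralWeb's actual algorithm; the graph, the names, and the `expertise` table are all hypothetical.

```python
from collections import deque

# Hypothetical collaboration graph and expertise keywords.
collaborators = {
    "me": ["ann", "bo"],
    "ann": ["me", "cy"],
    "bo": ["me"],
    "cy": ["ann", "di"],
    "di": ["cy"],
}
expertise = {
    "cy": {"recommender systems"},
    "di": {"recommender systems"},
    "bo": {"databases"},
}

def find_experts(start, topic):
    # Breadth-first search, so experts come back nearest-first,
    # each with the number of hops from `start`.
    seen, queue, found = {start}, deque([(start, 0)]), []
    while queue:
        person, dist = queue.popleft()
        if person != start and topic in expertise.get(person, set()):
            found.append((person, dist))
        for friend in collaborators.get(person, []):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, dist + 1))
    return found

print(find_experts("me", "recommender systems"))
```

The closest expert is the one most likely to be reachable through an introduction, which is the appeal of ranking by graph distance rather than by keyword match alone.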

Recommendation Context

What is the User doing at the time of recommendation?
– Shopping
– Listening to Music
If I skip songs often, the RS may recommend more unfamiliar songs
– Hanging out with other people
In this case it makes sense to recommend things suited to a group rather than to one person
How does the context constrain the recommender?
– Groups, automatic consumption (vs. suggestion), level of attention, level of interruption (does this mean not recommending too often, or users get annoyed?)

Whose Opinion?

“Experts”
Ordinary “phoaks”
Everyone
People like you

???

Personalization Level

Generic / Non-Personalized
– Everyone receives same recommendations

I am a guy, and this is what it recommends to me...
Demographic
– Matches a target group
For example, men and women differ

I fit into the casual men’s demographic.
Ephemeral
– Matches current activity
For example, I want to buy a book right now

Enter a singer's name, and it recommends based on the books other people bought
Persistent
– Matches long-term interests
It had a model of his favorite artists

Privacy and Trustworthiness

Who knows what about me?
– Personal information revealed
– Identity
– Deniability of preferences
Is the recommendation honest?
– Biases built-in by operator
“business rules”
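For example, only recommend items that are still in stock; NetEase Cloud Music likewise now recommends only songs it holds the rights to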
– Vulnerability to external manipulation
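The instructor described an odd phenomenon in MovieLens: when a movie has just been released, its ratings tend to be very high. Some people suspected manipulation: attackers creating large numbers of MovieLens accounts and flooding a newly released movie with positive ratings, in the hope of boosting its box office.
The instructor said this is not actually what happens. Ratings start out high because the people who go to see a movie as soon as it is released are generally those who were looking forward to it and like it, e.g. die-hard Transformers fans, and they tend to give high ratings.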
– Transparency of “recommenders”; Reputation
Taking the raters' credibility into account can partly mitigate the recommendation-honesty problem

Interfaces

Types of Output
- Predictions
  score
- Recommendations
  set of items
- Filtering
- Organic vs. explicit presentation
  Agent/Discussion Interface
Types of Input
- Explicit
  e.g. ratings
- Implicit
  e.g. whether you ended up buying, or how often you return to look at a page

Recommendation Algorithms

Non-Personalized Summary Statistics
Content-Based Filtering
– Information Filtering
– Knowledge-Based
Collaborative Filtering
– User-User
– Item-Item
– Dimensionality Reduction
Others
– Critique / Interview Based Recommendations
– Hybrid Techniques

Taxonomy of Recommender Systems (part 2 of 2)

Linking these together

2016-03-01_220941.png-26.4kB

Non-Personalized Summary Stats

External Community Data
– Best-seller; Most popular; Trending Hot

What does "External" mean here?
Summary of Community Ratings
– Best-liked
Examples
– Zagat restaurant ratings
– Billboard music rankings
– TripAdvisor hotel ratings
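
A small sketch of two non-personalized summary statistics, "most popular" and "best-liked". The data and the `min_ratings` cutoff are assumptions for illustration; none of the sites listed above necessarily compute their rankings this way.

```python
from collections import defaultdict

# Hypothetical (user, item, rating) triples from a community.
ratings = [
    ("u1", "cafe", 5), ("u2", "cafe", 4), ("u3", "cafe", 5),
    ("u1", "diner", 3), ("u2", "diner", 2),
    ("u3", "bistro", 5),
]

by_item = defaultdict(list)
for _user, item, rating in ratings:
    by_item[item].append(rating)

def most_popular(n):
    # "Most popular": rank by how many people rated the item.
    return sorted(by_item, key=lambda i: len(by_item[i]), reverse=True)[:n]

def best_liked(n, min_ratings=2):
    # "Best-liked": rank by mean rating, skipping items with too few ratings.
    means = {i: sum(rs) / len(rs)
             for i, rs in by_item.items() if len(rs) >= min_ratings}
    return sorted(means, key=means.get, reverse=True)[:n]

print(most_popular(2), best_liked(2))
```

Everyone gets the same list, since nothing in the computation depends on who is asking. The `min_ratings` cutoff keeps an item with a single 5-star rating from topping the best-liked list.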

Content-Based Filtering

User Ratings x Item Attributes => Model
Item attributes: for example, if your ratings show that you like action movies, then popular action movies get recommended to you
Model applied to new items via attributes

The score is the dot product of the user's preferences and the item's attributes; see lecture 1 of NTU's Machine Learning Foundations course
Alternative: knowledge-based
– Item attributes form model of item space
e.g. news topics
Users navigate/browse that space
Examples
– Personalized news feeds
– Artist or Genre music feeds
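
The "User Ratings x Item Attributes => Model" line and the dot-product scoring can be sketched as follows. The genre vectors, the neutral rating of 3, and the function names are assumptions made for this example; it is one simple way to build such a model, not the only one.

```python
# Hypothetical item attributes: indicator vectors over (action, comedy, drama).
items = {
    "m1": [1, 0, 0],
    "m2": [0, 1, 0],
    "m3": [1, 0, 1],
    "m4": [0, 1, 1],
}
user_ratings = {"m1": 5, "m2": 1}  # the user has rated two items so far

def build_profile(user_ratings, items, neutral=3):
    # Ratings x attributes => model: add up each rated item's attributes,
    # weighted by how far the rating is above or below a neutral score.
    n_attrs = len(next(iter(items.values())))
    profile = [0.0] * n_attrs
    for item, rating in user_ratings.items():
        for k, attr in enumerate(items[item]):
            profile[k] += (rating - neutral) * attr
    return profile

def score(profile, attrs):
    # Dot product of user preferences and item attributes.
    return sum(p * a for p, a in zip(profile, attrs))

profile = build_profile(user_ratings, items)
unseen = [i for i in items if i not in user_ratings]
ranked = sorted(unseen, key=lambda i: score(profile, items[i]), reverse=True)
print(profile, ranked)
```

The model is applied to items the user has never rated purely through their attributes, so a brand-new item can be scored as soon as its attributes are known.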

Personalized Collaborative Filtering

Use opinions of others to predict/recommend
User model – set of ratings
Item model – set of ratings
Common core: sparse matrix of ratings
If the matrix were not sparse, we would not need recommendation at all...
– Fill in missing values (predict)
– Select promising cells (recommend)
Several different techniques

Techniques

User-user
– Select neighborhood of similar-taste people
Variant: select people you know/trust
– Use their opinions
Item-item
– Pre-compute similarity among items via ratings
– Use own ratings to triangulate for recommendations
Dimensionality reduction
– Intuition: taste yields a lower-dimensionality matrix
– Compress and use a taste representation

Matrix compression sounds magical
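
To take some of the magic out of it, here is a toy sketch of dimensionality reduction: each user and each item is compressed to a single "taste" number, and a missing rating is predicted as their product. The rank-1 model, the alternating least-squares updates, and the data are my own assumptions for illustration; the toy matrix was built to be exactly rank 1, and real systems use more dimensions and regularization.

```python
# Hypothetical ratings matrix (rows: users, columns: items); None = not rated.
R = [
    [4.0, 2.0, None],
    [2.0, None, 1.0],
    [5.0, 2.5, 2.5],
]

def factorize_rank1(R, iterations=100):
    # Approximate R[i][j] by u[i] * v[j], alternating least-squares updates
    # of u and v and using only the observed cells.
    # Assumes every row and every column has at least one rating.
    n_users, n_items = len(R), len(R[0])
    u, v = [1.0] * n_users, [1.0] * n_items
    for _ in range(iterations):
        for i in range(n_users):
            obs = [j for j in range(n_items) if R[i][j] is not None]
            u[i] = sum(R[i][j] * v[j] for j in obs) / sum(v[j] ** 2 for j in obs)
        for j in range(n_items):
            obs = [i for i in range(n_users) if R[i][j] is not None]
            v[j] = sum(R[i][j] * u[i] for i in obs) / sum(u[i] ** 2 for i in obs)
    return u, v

u, v = factorize_rank1(R)
# Fill in the two missing cells from the compressed representation.
print(round(u[0] * v[2], 2), round(u[1] * v[1], 2))
```

Nine cells are summarized by six numbers here; with thousands of users and items the saving is what makes the compressed taste representation attractive.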

Note on Evaluation

To properly understand relative merits of each approach, we will spend significant time on evaluation
– Accuracy of predictions
– Usefulness of recommendations
Correctness
Non-obviousness
Diversity
– Computational performance

Other Approaches

Interactive recommenders
– Critique-based, dialog-based
Hybrids of various techniques

A Tour of Amazon.com

Learning Objectives

To explore a wide range of recommender systems in the context of a large, professional site
To understand how to review a recommender-enabled site
Not logged in

Searching for a specific product

The focus is on the current context; the item-item product association is quite interesting
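
An item-item product association of the "bought together" kind can be sketched by counting co-occurrences in purchase baskets. This is a guess at the general idea, with made-up baskets and a hypothetical `also_bought` helper; it is not a description of Amazon's actual system.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, one set of items per order.
baskets = [
    {"camera", "sd_card", "tripod"},
    {"camera", "sd_card"},
    {"camera", "bag"},
    {"phone", "sd_card"},
]

# Count how often each ordered pair of items appears in the same basket.
co_counts = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        co_counts[(a, b)] += 1
        co_counts[(b, a)] += 1

def also_bought(item, n=2):
    # Items most often bought together with `item` (ties broken by name).
    pairs = [(other, c) for (first, other), c in co_counts.items() if first == item]
    pairs.sort(key=lambda p: (-p[1], p[0]))
    return [other for other, _ in pairs[:n]]

print(also_bought("camera"))
```

Nothing here needs to know who the visitor is, only which product is on screen, which is why this kind of recommendation works even when not logged in.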
With purchase history / browsing history