Blog Notes 4: [Airbnb] Adapting word embeddings into listing embeddings that capture similarity for recommender systems

Original title: Listing Embeddings for Similar Listing Recommendations and Real-time Personalization in Search Ranking
By Mihajlo Grbovic, Haibin Cheng, Qing Zhang, Lynn Yang, Phillippe Siclait and Matt Jones
https://medium.com/airbnb-engineering/listing-embeddings-for-similar-listing-recommendations-and-real-time-personalization-in-search-601172f7603e

Summary:

In this blog post we describe a Listing Embedding technique we developed and deployed at Airbnb for the purpose of improving Similar Listing Recommendations and Real-Time Personalization in Search Ranking. The embeddings are vector representations of Airbnb homes learned from search sessions that allow us to measure similarities between listings. They effectively encode many listing features, such as location, price, listing type, architecture and listing style, all using only 32 float numbers. We believe that the embedding approach for personalization and recommendation is very powerful and useful for any type of online marketplace on the Web.
* What is a listing: a home offered on Airbnb. Listing types include Entire Home, Private Room, and Shared Room.

  1. The idea comes from word embeddings in NLP (The networks are trained by directly taking into account the word order and their co-occurrence, based on the assumption that words frequently appearing together in the sentences also share more statistical dependence.), and it has already been extended to many non-NLP settings, such as items that were clicked or purchased, or queries and ads that were clicked.
  2. Embedding dimensionality: d = 32.
  3. The listing embeddings are trained with negative sampling, the same approach used for word vectors. https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
  4. Cold-start embeddings use the 3 nearest neighbors: To create embeddings for a new listing we find 3 geographically closest listings that do have embeddings, and are of same listing type and price range as the new listing, and calculate their mean vector.
  5. Evaluating embedding quality:
    • k-means clustering checks whether the embeddings encode geographic information. First, to evaluate if geographical similarity is encoded we performed k-means clustering on learned embeddings.
    • Cosine similarity checks that listings of similar price range and type do end up with similar embeddings. The authors confirmed that cosine similarities between listings of same type and price ranges are much higher compared to similarities between listings of different type and price ranges.
  6. Offline test of the listing embeddings:
    Compare the top-ranked recommendations, generated from the user's most recent click, with the listing the user actually went on to book. One way of evaluating trained embeddings is to test how good they are in recommending listings that the user would book, based on their most recent click.
  7. A/B test: The A/B test showed that the embedding-based solution led to a 21% increase in Similar Listing carousel CTR and 4.9% more guests discovering the listing they ended up booking in the Similar Listing carousel.
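
The cold-start rule in item 4 can be sketched in a few lines. This is only an illustration of the idea, not Airbnb's implementation: the dictionary keys (`id`, `type`, `price_bucket`, `lat`, `lon`), the function names, and the use of haversine distance for "geographically closest" are all assumptions made for this example.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def cold_start_embedding(new_listing, listings, embeddings, k=3):
    """Average the embeddings of the k closest comparable listings.

    `listings` is a list of dicts and `embeddings` maps listing id -> vector;
    both layouts are hypothetical, chosen only for this sketch.
    """
    # Keep listings that already have an embedding and match type and price range.
    candidates = [
        l for l in listings
        if l["id"] in embeddings
        and l["type"] == new_listing["type"]
        and l["price_bucket"] == new_listing["price_bucket"]
    ]
    if not candidates:
        raise ValueError("no comparable listing with an embedding")

    # Sort the comparable listings by distance to the new listing.
    candidates.sort(
        key=lambda l: haversine_km(
            new_listing["lat"], new_listing["lon"], l["lat"], l["lon"]
        )
    )

    # Mean vector of the k nearest (or fewer, if not enough candidates exist).
    vectors = [embeddings[l["id"]] for l in candidates[:k]]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]
```

Once the new listing has collected enough sessions of its own, the averaged vector would presumably be replaced by a trained one; the notes above do not say how that hand-off is done.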
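Thoughts:
  • Listing embeddings are used to calculate similarities between listings, which can then be applied in recommendation applications.
  • I have not studied word embeddings carefully yet, so I did not get as much out of this post as I could have. I understood the general idea, but I am not sure how common this kind of application is.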
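
To make the "similarity between listings" idea concrete, here is a minimal sketch of how a similar-listing lookup could work once the embeddings exist. It assumes the embeddings are held in a plain dict from listing id to vector, and the function names are made up for this example. A production system would more likely use a vectorized or approximate nearest-neighbor search than this brute-force loop.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def most_similar(listing_id, embeddings, top_k=5):
    """Return the top_k (listing id, similarity) pairs for one listing."""
    query = embeddings[listing_id]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in embeddings.items()
        if other_id != listing_id  # never recommend the listing itself
    ]
    # Highest cosine similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

For a listing the user is currently viewing, `most_similar(current_id, embeddings)` would give the candidates for a Similar Listing carousel.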