20 Popular Machine Learning Metrics, Part 2: Ranking & Statistical Metrics

Introduction

In the first part of this post, I provided an introduction to 10 metrics used for evaluating classification and regression models. In this part, I am going to introduce the metrics used for evaluating models developed for ranking (AKA learning to rank), as well as metrics for statistical models. In particular, I will cover the following 5 metrics:

  • Mean reciprocal rank (MRR)

  • Precision at k

  • DCG and NDCG (normalized discounted cumulative gain)

  • Pearson correlation coefficient

  • Coefficient of determination (R²)

Ranking Related Metrics

Ranking is a fundamental problem in machine learning: the goal is to order a list of items based on their relevance to a particular task (e.g. ranking pages on Google by their relevance to a given query). It has a wide range of applications in e-commerce and search engines, such as:

  • Movie recommendation (as in Netflix and YouTube),

  • Page ranking on Google,

  • Ranking E-commerce products on Amazon,

  • Query auto-completion,

  • Image search on vimeo,

  • Hotel search on Expedia/Booking.

In the learning-to-rank problem, the model tries to predict the rank (or relative order) of a list of items for a given task¹. The algorithms for the ranking problem can be grouped into:

  • Point-wise models: which try to predict a (matching) score for each query-document pair in the dataset, and use it for ranking the items (see the short sketch after this list).

  • Pair-wise models: which try to learn a binary classifier that, given a pair of documents, can tell which one is more relevant to a query.

  • List-wise models: which try to directly optimize the value of one of the evaluation measures listed below, averaged over all queries in the training data.
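
To make the distinction concrete, here is a minimal sketch of how a point-wise and a pair-wise ranker could be used at inference time. It is purely illustrative and not from the original post: `score_model` and `prefer_model` stand for hypothetical trained models.

```python
import functools

def rank_pointwise(query, docs, score_model):
    """Point-wise: score each (query, doc) pair independently, then sort by score."""
    return sorted(docs, key=lambda d: score_model(query, d), reverse=True)

def rank_pairwise(query, docs, prefer_model):
    """Pair-wise: prefer_model(query, a, b) is assumed to return True when
    document `a` is judged more relevant than `b`; use it as a comparator."""
    def cmp(a, b):
        return -1 if prefer_model(query, a, b) else 1
    return sorted(docs, key=functools.cmp_to_key(cmp))
```

A list-wise model, in contrast, would be trained to directly optimize a ranking metric (such as the NDCG introduced below) over entire result lists rather than over individual documents or pairs.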

During evaluation, given the ground-truth order of the list of items for several queries, we want to know how good the predicted order of those lists of items is.

There are various metrics proposed for evaluating ranking problems, such as:

  • MRR

  • Precision@K

  • DCG & NDCG

  • MAP

  • Kendall’s tau

  • Spearman’s rho

In this post, we focus on the first 3 metrics above, which are the most popular metrics for the ranking problem.
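
Since these are the metrics the rest of this post relies on, here is a minimal, self-contained sketch of how they are typically computed. It assumes binary (0/1) relevance labels for MRR and Precision@k, and graded relevance scores for NDCG; it is illustrative rather than code from the original post or any particular library.

```python
import math

def mean_reciprocal_rank(ranked_label_lists):
    """MRR: average of 1/rank of the first relevant item, one list of 0/1
    labels (in predicted order) per query."""
    reciprocal_ranks = []
    for labels in ranked_label_lists:
        rank = next((i + 1 for i, rel in enumerate(labels) if rel), None)
        reciprocal_ranks.append(1.0 / rank if rank else 0.0)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)

def precision_at_k(labels, k):
    """Precision@k: fraction of the top-k predicted items that are relevant."""
    return sum(labels[:k]) / k

def dcg_at_k(gains, k):
    """DCG@k with the standard log2(position + 1) discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(gains, k):
    """NDCG@k: DCG normalized by the ideal DCG (items sorted by true gain)."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0

# Toy example: relevance of items listed in the order the model ranked them.
print(mean_reciprocal_rank([[0, 1, 0], [1, 0, 0]]))  # (1/2 + 1/1) / 2 = 0.75
print(precision_at_k([1, 0, 1, 0], k=2))             # 0.5
print(ndcg_at_k([3, 2, 0, 1], k=4))                  # ~0.985
```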

Some of these metrics may be very trivial, but I decided to cover them here for the sake of completeness.
