Performance Measures and Evaluation of IR Systems

All common measures assume a ground-truth notion of relevance: every document is known to be either relevant or non-relevant to a particular query.

1. Precision and Recall

Precision is the fraction of the documents retrieved that are relevant to the user’s information need.

Recall is the fraction of the documents relevant to the query that are successfully retrieved.

https://img-blog.csdn.net/20141101210900756?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQveXVkZjIwMTA=/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center

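The two sets involved are {retrieved documents}, the documents returned for the query, and {relevant documents}, the documents relevant to it.

So, we will have

$$\text{precision} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{retrieved documents}\}|}$$

$$\text{recall} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{relevant documents}\}|}$$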
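As a minimal sketch of the two definitions above, precision and recall can be computed from sets of document IDs. The function name `precision_recall` and the IDs `d1` to `d5` are made up for illustration, and returning 0.0 for an empty set is an assumed convention rather than part of the definitions.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were retrieved
    # Assumed convention: report 0.0 when a denominator would be zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 documents retrieved, 3 documents relevant, 2 in common.
p, r = precision_recall(
    retrieved={"d1", "d2", "d3", "d4"},
    relevant={"d2", "d4", "d5"},
)
print(f"precision={p:.3f} recall={r:.3f}")
```

Here two of the four retrieved documents are relevant, and two of the three relevant documents were found.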



2. Fall-out

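Fall-out is the proportion of non-relevant documents that are retrieved, out of all non-relevant documents available:

$$\text{fall-out} = \frac{|\{\text{non-relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{non-relevant documents}\}|}$$

It can be interpreted as the probability that a non-relevant document is retrieved by a query.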
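Fall-out needs the whole collection, because the non-relevant documents are everything outside the relevant set. The sketch below assumes a small hypothetical collection of ten documents; `fall_out` and the document IDs are illustrative names only.

```python
def fall_out(retrieved, relevant, collection):
    """Fraction of the non-relevant documents in the collection that were retrieved."""
    retrieved, relevant, collection = set(retrieved), set(relevant), set(collection)
    non_relevant = collection - relevant
    if not non_relevant:
        return 0.0  # assumed convention when every document is relevant
    return len(retrieved & non_relevant) / len(non_relevant)


# Hypothetical collection d1..d10 with three relevant documents.
collection = {f"d{i}" for i in range(1, 11)}
value = fall_out(
    retrieved={"d1", "d2", "d3", "d4"},
    relevant={"d2", "d4", "d5"},
    collection=collection,
)
print(f"fall-out={value:.3f}")
```

Seven documents are non-relevant here, and two of them (`d1`, `d3`) were retrieved.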


3. F-measure

F-measure or F-score is the weighted harmonic mean of precision and recall.

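The traditional F-measure or balanced F-score is:

$$F = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$

The general formula for non-negative real $\beta$ is:

$$F_\beta = \frac{(1 + \beta^2) \cdot \text{precision} \cdot \text{recall}}{\beta^2 \cdot \text{precision} + \text{recall}}$$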
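A short sketch of the weighted F-measure, assuming the usual definition F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), where beta = 1 gives the balanced score. The function name `f_beta` and the input values are illustrative.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (beta=1 is the balanced F-score)."""
    if beta < 0:
        raise ValueError("beta must be non-negative")
    denominator = beta**2 * precision + recall
    if denominator == 0:
        return 0.0  # assumed convention when the score is undefined
    return (1 + beta**2) * precision * recall / denominator


f1 = f_beta(0.5, 0.8)            # balanced: precision and recall weighted equally
f2 = f_beta(0.5, 0.8, beta=2.0)  # beta > 1 weights recall more heavily
print(f"F1={f1:.3f} F2={f2:.3f}")
```

Because recall is the higher of the two inputs in this example, the recall-weighted score `f2` comes out above `f1`.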



4. Average Precision

By computing precision and recall at every position in the ranked sequence of documents, one can plot a precision-recall curve, with precision $p(r)$ as a function of recall $r$.

Average precision computes the average value of $p(r)$ over the interval from $r = 0$ to $r = 1$:

$$\text{AveP} = \int_0^1 p(r)\,dr$$

In practice, this integral is replaced with a finite sum over every position in the ranked sequence of documents:

$$\text{AveP} = \sum_{k=1}^{n} P(k)\,\Delta r(k)$$

where $k$ is the rank in the sequence of retrieved documents, $n$ is the number of retrieved documents, $P(k)$ is the precision at cut-off $k$ in the list, and $\Delta r(k)$ is the change in recall from item $k-1$ to item $k$.
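The finite sum can be sketched as follows. The ranked result list is represented as a list of booleans (True where the document at that rank is relevant), which is an assumed encoding for illustration. The change in recall is 1 / (number of relevant documents) at each relevant rank and 0 elsewhere, so only relevant positions contribute.

```python
def average_precision(ranked_relevance, total_relevant=None):
    """Sum of P(k) * delta_r(k) over the ranked list.

    ranked_relevance: list of bools, True where the document at that rank is relevant.
    total_relevant: number of relevant documents for the query; defaults to the
    number found in the list (an assumption that all of them were retrieved).
    """
    if total_relevant is None:
        total_relevant = sum(ranked_relevance)
    if total_relevant == 0:
        return 0.0
    hits = 0
    ap = 0.0
    for k, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_at_k = hits / k
            delta_recall = 1 / total_relevant
            ap += precision_at_k * delta_recall
    return ap


# Hypothetical ranking: relevant documents at ranks 1 and 3.
ap = average_precision([True, False, True, False, False])
print(f"AP={ap:.4f}")
```

For this ranking the contributions are P(1) = 1 and P(3) = 2/3, each weighted by 1/2.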


5. R-Precision

R-precision is the precision at position R in the ranking of results for a query that has R relevant documents.
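A minimal sketch of R-precision, using the same assumed encoding of a ranking as a list of booleans; `r_precision` is an illustrative name.

```python
def r_precision(ranked_relevance, num_relevant):
    """Precision over the top R results, where R is the number of relevant documents."""
    if num_relevant <= 0:
        return 0.0
    top_r = ranked_relevance[:num_relevant]
    # Dividing by R (not len(top_r)) penalizes lists shorter than R.
    return sum(top_r) / num_relevant


# Hypothetical query with R = 3 relevant documents; two appear in the top 3.
rp = r_precision([True, False, True, True, False], num_relevant=3)
print(f"R-precision={rp:.3f}")
```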

6. Mean average precision

Mean average precision for a set of queries is the mean of the average precision scores for each query:

$$\text{MAP} = \frac{\sum_{q=1}^{Q} \text{AveP}(q)}{Q}$$

where $Q$ is the number of queries.
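The sketch below averages per-query average precision over a small set of hypothetical queries. It assumes every relevant document appears somewhere in each ranking, so the number of relevant documents is taken from the list itself.

```python
def average_precision(ranked_relevance):
    """Average precision of one ranking given as a list of bools."""
    total_relevant = sum(ranked_relevance)
    if total_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / total_relevant


def mean_average_precision(rankings):
    """Mean of the average precision scores, one ranking per query."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Two hypothetical queries (Q = 2).
rankings = [
    [True, False, True],  # AP = (1 + 2/3) / 2
    [False, True],        # AP = 1/2
]
map_score = mean_average_precision(rankings)
print(f"MAP={map_score:.4f}")
```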


7. Discounted cumulative gain

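DCG uses a graded relevance scale of documents from the result set to evaluate the usefulness, or gain, of a document based on its position in the result list.

The DCG accumulated at a particular rank position $p$ is commonly defined as:

$$\text{DCG}_p = \sum_{i=1}^{p} \frac{rel_i}{\log_2(i+1)}$$

where $rel_i$ is the graded relevance of the result at position $i$. Other variants of the discount exist.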
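Several variants of the DCG discount are in use. The sketch below assumes one common form, in which the gain at rank i is divided by log2(i + 1); the graded relevance scores in the example are hypothetical.

```python
import math


def dcg(relevances, p=None):
    """Discounted cumulative gain at rank p, assuming the log2(i + 1) discount."""
    if p is None:
        p = len(relevances)
    return sum(
        rel / math.log2(i + 1)
        for i, rel in enumerate(relevances[:p], start=1)
    )


# Hypothetical graded relevance scores (0 = not relevant, 3 = highly relevant).
scores = [3, 2, 3, 0, 1, 2]
dcg_6 = dcg(scores)
dcg_3 = dcg(scores, p=3)
print(f"DCG@6={dcg_6:.3f} DCG@3={dcg_3:.3f}")
```

The first result is not discounted (log2(2) = 1), and each later position contributes less for the same relevance grade.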


Precision and Recall

1. Information Retrieval

  • Precision is defined as the number of relevant documents retrieved by a search divided by the total number of documents retrieved by that search.
  • Recall is defined as the number of relevant documents retrieved by a search divided by the total number of existing relevant documents.

2. Classification task

  • Precision is defined as the number of true positives divided by the total number of elements labeled as belonging to the positive class (i.e., the sum of true positives and false positives). Precision is also called positive predictive value (PPV).
  • Recall is defined as the number of true positives divided by the total number of elements that actually belong to the positive class (i.e., the sum of true positives and false negatives). Recall is also called sensitivity or true positive rate.
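For the classification case, the two definitions can be sketched by counting true positives, false positives, and false negatives from label lists. The labels below are made-up data, and the function name is illustrative.

```python
def classification_precision_recall(y_true, y_pred, positive=1):
    """Precision (PPV) and recall (sensitivity) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Assumed convention: report 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1]
precision, recall = classification_precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```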

3. Relationship

Often, there is an inverse relationship between precision and recall. Usually, precision and recall scores are not discussed in isolation. Instead, either the values of one measure are compared at a fixed level of the other, or both are combined into a single measure (such as the F-measure).

 

Confusion Matrix (contingency table)

Each column of the matrix represents the instances in a predicted class, while each row represents the instances in an actual class.

A confusion matrix allows more detailed analysis than accuracy alone. Accuracy is not a reliable metric for the real performance of a classifier, because it yields misleading results if the data set is unbalanced (that is, when the number of samples in different classes varies greatly).
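The effect of class imbalance can be illustrated with a small sketch. It builds a confusion matrix with rows as actual classes and columns as predicted classes, for a hypothetical classifier that always predicts the majority class; the data is made up.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(actual, predicted)] for predicted in labels] for actual in labels]


# Hypothetical unbalanced data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # a classifier that always predicts the majority class

matrix = confusion_matrix(y_true, y_pred, labels=[0, 1])
accuracy = sum(matrix[i][i] for i in range(len(matrix))) / len(y_true)
positive_recall = matrix[1][1] / sum(matrix[1])

print(matrix)
print(f"accuracy={accuracy:.2f} recall(positive)={positive_recall:.2f}")
```

Accuracy is high here even though no positive instance is ever detected, which the second row of the matrix makes visible.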



