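For example, suppose there are two topics: topic 1 has 4 relevant web pages and topic 2 has 5. For topic 1, a system retrieves 4 relevant pages, at ranks 1, 2, 4, and 7; for topic 2, it retrieves 3 relevant pages, at ranks 1, 3, and 5. The average precision for topic 1 is (1/1 + 2/2 + 3/4 + 4/7)/4 = 0.83. For topic 2, the two relevant pages that were never retrieved contribute 0, so the average precision is (1/1 + 2/3 + 3/5 + 0 + 0)/5 = 0.45. Therefore MAP = (0.83 + 0.45)/2 = 0.64.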
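The arithmetic in this example can be reproduced with a short Python sketch. The function names `average_precision` and `mean_average_precision` are illustrative, not taken from any particular library, and the input format (a list of 1-based ranks plus the total number of relevant pages) is an assumption chosen to mirror the example.

```python
def average_precision(relevant_ranks, num_relevant):
    """AP from the 1-based ranks at which relevant pages were retrieved.

    Relevant pages that were never retrieved contribute 0, so the sum is
    divided by the total number of relevant pages, not the number found.
    """
    ranks = sorted(relevant_ranks)
    # The i-th relevant hit (counting from 1) found at rank r has precision i / r.
    return sum(i / r for i, r in enumerate(ranks, start=1)) / num_relevant


def mean_average_precision(topics):
    """Mean of per-topic AP; topics is a list of (relevant_ranks, num_relevant)."""
    return sum(average_precision(r, n) for r, n in topics) / len(topics)


ap1 = average_precision([1, 2, 4, 7], 4)  # topic 1: all 4 relevant pages found
ap2 = average_precision([1, 3, 5], 5)     # topic 2: 3 of 5 relevant pages found
map_score = mean_average_precision([([1, 2, 4, 7], 4), ([1, 3, 5], 5)])
print(round(ap1, 2), round(ap2, 2), round(map_score, 2))  # prints: 0.83 0.45 0.64
```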
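MRR (mean reciprocal rank) takes the reciprocal of the rank at which the reference answer appears in the evaluated system's results as the score for that question, then averages the scores over all questions.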
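A minimal Python sketch of MRR follows. The function name `mean_reciprocal_rank` and the sample ranks are made up for illustration. Scoring a question as 0 when the reference answer does not appear in the results at all is a common convention, assumed here rather than stated in the text.

```python
def mean_reciprocal_rank(answer_ranks):
    """answer_ranks: 1-based rank of the reference answer for each question,
    or None when the system did not return it (scored as 0 by assumption)."""
    scores = [0.0 if rank is None else 1.0 / rank for rank in answer_ranks]
    return sum(scores) / len(scores)


# Hypothetical ranks for four questions; the third answer was never returned.
mrr = mean_reciprocal_rank([1, 3, None, 2])
print(mrr)  # (1 + 1/3 + 0 + 1/2) / 4
```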
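MAP can be understood through its three components: P (precision), AP (average precision), and MAP (mean average precision).
Precision considers only how many relevant documents appear in the returned results, not the order in which they appear. For a search engine or recommender system, the returned results are necessarily ordered, and the more relevant a document is, the closer to the top it should be; this motivates AP. For an ordered list, AP is computed by first finding the precision at each position and then averaging those values over the relevant documents. If the document at a position is not relevant, that position contributes 0.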
Precision
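Precision is the fraction of the documents retrieved that are relevant to the user’s information need:
$$\text{precision} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{retrieved documents}\}|}$$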
In binary classification, precision is analogous to positive predictive value. Precision takes all retrieved documents into account. It can also be evaluated at a given cut-off rank, considering only the topmost results returned by the system. This measure is called precision at n or P@n.
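Precision at a cut-off can be sketched in a few lines of Python. The function name `precision_at_n` and the document ids are hypothetical. Dividing by n even when fewer than n results were returned is one common convention and is an assumption here.

```python
def precision_at_n(ranked_ids, relevant_ids, n):
    """Fraction of the top-n returned documents that are relevant (P@n)."""
    top_n = ranked_ids[:n]
    hits = sum(1 for doc_id in top_n if doc_id in relevant_ids)
    # Assumed convention: divide by n, not by the number of documents returned.
    return hits / n


# Hypothetical ranked result list and relevance judgments.
ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d5"}

p_at_1 = precision_at_n(ranked, relevant, 1)  # d3 is not relevant
p_at_2 = precision_at_n(ranked, relevant, 2)  # d1 is relevant
p_at_5 = precision_at_n(ranked, relevant, 5)  # d1 and d2 are relevant
print(p_at_1, p_at_2, p_at_5)  # prints: 0.0 0.5 0.4
```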
Note that the meaning and usage of “precision” in the field of information retrieval differs from the definition of accuracy and precision within other branches of science and statistics.
Recall
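Recall is the fraction of the documents relevant to the query that are successfully retrieved:
$$\text{recall} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{relevant documents}\}|}$$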
In binary classification, recall is often called sensitivity. So it can be looked at as the probability that a relevant document is retrieved by the query.
Recall of 100% is trivially achieved by returning all documents in response to any query. Recall alone is therefore not enough; one also needs to account for the non-relevant documents returned, for example by computing precision.
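The point about trivially perfect recall can be illustrated with a small set-based sketch in Python. The ten-document collection, the relevance judgments, and the function names `precision` and `recall` are all invented for this illustration.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant (0 if nothing retrieved)."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def recall(retrieved, relevant):
    """Fraction of relevant documents that were retrieved (0 if none are relevant)."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


# Hypothetical collection of 10 documents, 3 of which are relevant.
collection = {f"d{i}" for i in range(1, 11)}
relevant = {"d1", "d2", "d5"}

# Returning the whole collection finds every relevant document...
recall_all = recall(collection, relevant)
# ...but most of what is returned is not relevant.
precision_all = precision(collection, relevant)
print(recall_all, precision_all)  # prints: 1.0 0.3
```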
Average precision
Precision and recall are single-value metrics based on the whole list of documents returned by the system. For systems that return a ranked sequence of documents, it is desirable to also consider the order in which the returned documents are presented. By computing precision and recall at every position in the ranked sequence of documents, one can plot a precision-recall curve, plotting precision $p(r)$ as a function of recall $r$. Average precision computes the average value of $p(r)$ over the interval from $r=0$ to $r=1$:
$$\operatorname{AveP} = \int_0^1 p(r)\,dr$$
That is the area under the precision-recall curve. This integral is in practice replaced with a finite sum over every position in the ranked sequence of documents:
$$\operatorname{AveP} = \sum_{k=1}^{n} P(k)\,\Delta r(k)$$
where $k$ is the rank in the sequence of retrieved documents, $n$ is the number of retrieved documents, $P(k)$ is the precision at cut-off $k$ in the list, and $\Delta r(k)$ is the change in recall from items $k-1$ to $k$.
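This finite sum is equivalent to:
$$\operatorname{AveP} = \frac{\sum_{k=1}^{n} P(k)\,\operatorname{rel}(k)}{\text{number of relevant documents}}$$
where $\operatorname{rel}(k)$ is an indicator function equal to 1 if the item at rank $k$ is a relevant document and 0 otherwise.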
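The equivalence can be checked numerically. The sketch below computes average precision twice: once as the recall-weighted sum of P(k) times the change in recall, and once by summing P(k) only at ranks that hold a relevant document and dividing by the number of relevant documents. The function names and the sample relevance list are assumptions made for this illustration.

```python
import math


def ap_delta_recall(rels, num_relevant):
    """AveP as the sum over ranks k of P(k) times the change in recall at k."""
    ap, hits, prev_recall = 0.0, 0, 0.0
    for k, rel in enumerate(rels, start=1):
        hits += rel
        precision_k = hits / k
        recall_k = hits / num_relevant
        ap += precision_k * (recall_k - prev_recall)
        prev_recall = recall_k
    return ap


def ap_indicator(rels, num_relevant):
    """AveP as the sum of P(k) * rel(k), divided by the number of relevant documents."""
    total, hits = 0.0, 0
    for k, rel in enumerate(rels, start=1):
        hits += rel
        total += (hits / k) * rel
    return total / num_relevant


# Hypothetical ranking: 1 marks a relevant document, at ranks 1, 2, 4 and 7.
rels = [1, 1, 0, 1, 0, 0, 1]
a = ap_delta_recall(rels, num_relevant=4)
b = ap_indicator(rels, num_relevant=4)
print(round(a, 4), round(b, 4), math.isclose(a, b))
```

Recall only changes at ranks where a relevant document appears, and there it changes by one over the number of relevant documents, which is why the two sums agree.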
Mean average precision
Mean average precision for a set of queries is the mean of the average precision scores for each query:
$$\operatorname{MAP} = \frac{\sum_{q=1}^{Q} \operatorname{AveP}(q)}{Q}$$
where $Q$ is the number of queries.