- AUC
The key to computing AUC is counting, among all positive-negative sample pairs, the pairs in which the positive sample's predicted score is higher than the negative sample's. Suppose the recall model returns a top-k = 4 list A, B, C, D, and that B and D are positives in the ground truth (a label of 1 means the user clicked the item, 0 means not clicked). AUC can then be computed with the rank-based formula
AUC = (sum(rank_i) - m * (m + 1) / 2) / (m * (N - m))
where N is the total number of samples, m is the number of positives, and rank_i is the rank of the i-th positive when all samples are sorted by predicted score in ascending order (the lowest score gets rank 1). For the samples in the table below:
| Sample | Label | Predicted score | Position |
| --- | --- | --- | --- |
| A | 0 | 0.8 | 1 |
| B | 1 | 0.7 | 2 |
| C | 0 | 0.6 | 3 |
| D | 1 | 0.5 | 4 |
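Applying the formula to this table: B sits at position 2 and D at position 4, so their ascending-score ranks are 3 and 1, giving AUC = ((3 + 1) - 2 * (2 + 1) / 2) / (2 * (4 - 2)) = 1 / 4 = 0.25. Counting pairs directly agrees: of the four positive-negative pairs, only (B, C) has the positive scored above the negative.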
This can be implemented in code as follows:
```python
# AUC over the top-k recall list.
# recall_items must be sorted by predicted score in descending order;
# true_items are the items the user actually clicked.
def calculate_auc(recall_items: list, true_items: list):
    N = len(recall_items)
    if N == 0:
        return 0
    # hit_item = set(recall_items) & set(true_items)  # variant that ignores repeated clicks
    hit_item = [item for item in true_items if item in recall_items]
    m = len(hit_item)
    if m == 0 or m == N:  # all-negative or all-positive: the pair-based AUC is undefined
        return 0
    # rank 1 = lowest predicted score, rank N = highest
    rank_i = [N - recall_items.index(i) for i in hit_item]
    return (sum(rank_i) - (m + 1) * m / 2) / (m * (N - m))
```
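A quick check against the table above (the item IDs are just the letters from the table, and the recall list is assumed to be ordered by descending score):

```python
recall_items = ['A', 'B', 'C', 'D']  # top-4 recall list, highest score first
true_items = ['B', 'D']              # items the user clicked
print(calculate_auc(recall_items, true_items))  # 0.25
```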
- HR
HR is straightforward: count how many of the user's clicked items appear in the top-k recall list and divide by the number of clicked items.
```python
# HR over the top-k recall list
def calculate_HR(recall_items: list, true_items: list):
    N = len(recall_items)
    M = len(true_items)
    if N == 0 or M == 0:
        return 0
    # count the clicked items that appear in the recall list
    hit_num = 0
    for item in true_items:
        if item in recall_items:
            hit_num += 1
    return hit_num / M
```
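For example, reusing the recall list from the AUC table and assuming the user also clicked a hypothetical item 'E' that was not recalled:

```python
# 'E' is an assumed clicked item outside the recall list
print(calculate_HR(['A', 'B', 'C', 'D'], ['B', 'D', 'E']))  # 2 / 3 ≈ 0.667
```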
- Precision
Precision: of the K recalled items, n were clicked, so Precision = n / K.
```python
# Precision over the top-k recall list
def calculate_Precision(recall_items: list, true_items: list):
    N = len(recall_items)
    M = len(true_items)
    if N == 0 or M == 0:
        return 0
    hit_items = set(recall_items) & set(true_items)  # ignore repeated clicks
    return len(hit_items) / N
```
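With the same assumed data as above, two of the four recalled items were clicked:

```python
print(calculate_Precision(['A', 'B', 'C', 'D'], ['B', 'D', 'E']))  # 2 / 4 = 0.5
```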
- Recall
Recall: of the M items the user clicked, k appear in the recall list, so Recall = k / M.
```python
# Recall over the top-k recall list
def calculate_Recall(recall_items: list, true_items: list):
    N = len(recall_items)
    M = len(true_items)
    if N == 0 or M == 0:
        return 0
    hit_items = [item for item in recall_items if item in true_items]
    return len(hit_items) / M
```
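With the same assumed data, two of the three clicked items (B and D, but not the hypothetical E) were recalled:

```python
print(calculate_Recall(['A', 'B', 'C', 'D'], ['B', 'D', 'E']))  # 2 / 3 ≈ 0.667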
- F1
F1 = 2 * Precision * Recall / (Precision + Recall)
```python
# F1 over the top-k recall list
def calculate_F1(recall_items: list, true_items: list):
    Recall = calculate_Recall(recall_items, true_items)
    Precision = calculate_Precision(recall_items, true_items)
    if Recall == 0 and Precision == 0:  # avoid division by zero when both are 0
        return 0
    return 2 * Precision * Recall / (Recall + Precision)
```
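Continuing with the same assumed data (Precision = 0.5, Recall = 2/3):

```python
print(calculate_F1(['A', 'B', 'C', 'D'], ['B', 'D', 'E']))  # 4 / 7 ≈ 0.571
```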