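Tagging Tasks and Their Evaluation Metrics

Tagging tasks are also common, for example extracting several keywords from a text (keyword extraction) to use as its tags.
A tagging task can be approached with supervised learning or with unsupervised learning.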

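1. Evaluation

Tagging tasks are evaluated with the same metrics as classification tasks; the common ones are accuracy, precision, and recall.

1.1 Per-sample metrics

Let the target (ground-truth) tag set be S1 and the predicted tag set be S2: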

  • precision
    $precision=\frac{|S1 \cap S2|}{|S2|}$
  • recall
    $recall=\frac{|S1 \cap S2|}{|S1|}$
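
As a minimal sketch of the two definitions above, the per-sample values can be computed directly with Python set operations. The function name `precision_recall` and the sample tags are made up for illustration, and returning 0.0 for an empty denominator is an assumption, not something the definitions prescribe.

```python
from typing import Set, Tuple


def precision_recall(target: Set[str], predict: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one sample's target and predicted tag sets."""
    hits = len(target & predict)  # |S1 ∩ S2|
    # Assumption: an empty denominator yields 0.0 instead of raising ZeroDivisionError.
    precision = hits / len(predict) if predict else 0.0
    recall = hits / len(target) if target else 0.0
    return precision, recall


s1 = {"pencil", "automatic", "eraser"}  # target tags (illustrative)
s2 = {"balloon", "automatic"}           # predicted tags (illustrative)
p, r = precision_recall(s1, s2)
print(p, r)
```

Here one of the two predicted tags is correct, so precision is 1/2, and one of the three target tags is found, so recall is 1/3.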

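1.2 Aggregate metrics

As with multi-class classification, the per-sample results are combined by macro-averaging or micro-averaging. Macro-averaging computes the metric for each sample and then takes the mean; micro-averaging sums the numerators and the denominators over all samples before dividing.

2. Evaluation code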
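
Written out for precision, the two averages look as follows. This is a sketch in the notation of section 1.1; the sample index $i$ and the sample count $N$ are introduced here for illustration. Recall is analogous, with $|S1_i|$ in the denominators.

$$precision_{macro}=\frac{1}{N}\sum_{i=1}^{N}\frac{|S1_i \cap S2_i|}{|S2_i|}$$

$$precision_{micro}=\frac{\sum_{i=1}^{N}|S1_i \cap S2_i|}{\sum_{i=1}^{N}|S2_i|}$$

For example, with two samples where the first has 0 correct tags out of 1 predicted and the second has 1 correct tag out of 2 predicted, the macro average is $(0/1 + 1/2)/2 = 0.25$, while the micro average is $(0+1)/(1+2) \approx 0.333$. The micro average gives more weight to samples with more predicted tags.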

import numpy as np
from typing import List,Dict,Set

class EvaluateResult:
    def __init__(self):
        self.sample_number=0
        self.macro_average_precision=0
        self.macro_average_recall=0

        self.micro_average_precision=0
        self.micro_average_recall=0

    def __str__(self):
        return str(self.__dict__)


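def evaluate(answer_dict:Dict[int,Set[str]],predict_dict:Dict[int,Set[str]]):
    # per-sample numerators and denominators for precision and recall
    precision_numerator=[]
    precision_denominator=[]

    recall_numerator=[]
    recall_denominator=[]

    for content_id in predict_dict.keys():
        if content_id not in answer_dict:
            # no ground truth for this sample: skip it
            continue
        elif len(predict_dict[content_id]) ==0 or len(answer_dict[content_id])==0:
            # an empty prediction or answer set would give a zero denominator: skip it
            pass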
        else:
            s1=answer_dict[content_id]
            s2=predict_dict[content_id]

            precision_numerator.append(len(s1&s2))
            precision_denominator.append(len(s2))

            recall_numerator.append(len(s1&s2))
            recall_denominator.append(len(s1))
    precision_numerator=np.array(precision_numerator)
    precision_denominator=np.array(precision_denominator)

    recall_numerator=np.array(recall_numerator)
    recall_denominator=np.array(recall_denominator)
    
    result=EvaluateResult()
    result.sample_number=len(precision_numerator)
    if result.sample_number==0:
        # no scorable samples: keep the zero defaults instead of dividing by zero
        return result
    # macro average: mean of the per-sample ratios
    result.macro_average_precision=float((precision_numerator/precision_denominator).mean())
    result.macro_average_recall=float((recall_numerator/recall_denominator).mean())

    # micro average: pooled numerators over pooled denominators
    result.micro_average_precision=float(precision_numerator.sum()/precision_denominator.sum())
    result.micro_average_recall=float(recall_numerator.sum()/recall_denominator.sum())
    return result


answer_dict={1:set(['你好,小米']),2:set(['铅笔','自动'])}
predict_dict={1:set(['小米']),2:set(['气球','自动'])}

print(evaluate(answer_dict,predict_dict))

"""
{'sample_number': 2, 'macro_average_precision': 0.25, 'macro_average_recall': 0.25, 'micro_average_precision': 0.3333333333333333, 'micro_average_recall': 0.3333333333333333}
"""

References

  1. My other blog post, "Classification Overview: Evaluation Metrics"
  2. Stack Exchange, "macro average vs. micro average"