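Tagging tasks are also very common, for example extracting several keywords from a text (keyword extraction) to use as its tags.
A tagging task can be solved with supervised learning or with unsupervised learning.
1. Evaluation
Tagging tasks are evaluated with the same metrics as classification tasks; the common ones are accuracy, precision, and recall.
1.1 Metrics for a single sample
Let target = S1 be the set of gold tags and predict = S2 the set of predicted tags for one sample. Then: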
- precision
$$precision=\frac{|S1 \cap S2|}{|S2|}$$
- recall
$$recall=\frac{|S1 \cap S2|}{|S1|}$$
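The two ratios can be checked on a single sample with plain Python sets. This is only a minimal sketch: the helper name `precision_recall` and the sample tags are made up for illustration, and both sets are assumed to be non-empty.

```python
from typing import Set, Tuple


def precision_recall(target: Set[str], predict: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one sample; both sets must be non-empty."""
    hit = len(target & predict)  # |S1 ∩ S2|
    return hit / len(predict), hit / len(target)


# Hypothetical sample: 3 gold tags, 2 predicted tags, 1 tag in common.
s1 = {"pencil", "automatic", "eraser"}
s2 = {"balloon", "automatic"}
p, r = precision_recall(s1, s2)
print(p, r)  # 0.5 0.3333333333333333
```

Precision is divided by the size of the predicted set and recall by the size of the gold set, so the two only coincide when both sets have the same size.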
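1.2 Aggregate metrics
As in multi-class classification, the per-sample values can be aggregated by macro averaging or by micro averaging.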
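As a reminder of how the two averages differ, here is a sketch that assumes the usual definitions. The sample index $i$ and the sample count $N$ are notation introduced here, not taken from the original text. Macro averaging takes the mean of the per-sample ratios, while micro averaging pools the numerators and the denominators before dividing:

$$precision_{macro}=\frac{1}{N}\sum_{i=1}^{N}\frac{|S1_i \cap S2_i|}{|S2_i|},\qquad precision_{micro}=\frac{\sum_{i=1}^{N}|S1_i \cap S2_i|}{\sum_{i=1}^{N}|S2_i|}$$

Recall is analogous, with $|S1_i|$ in the denominators. For instance, with two samples whose overlaps are 0 and 1 and whose predicted sets have sizes 1 and 2, the macro average is $(0/1+1/2)/2=0.25$ and the micro average is $(0+1)/(1+2)\approx 0.33$.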
2. Evaluation code
import numpy as np
from typing import Dict, Set


class EvaluateResult:
    """Container for the aggregated tagging metrics."""

    def __init__(self):
        self.sample_number = 0
        self.macro_average_precision = 0
        self.macro_average_recall = 0
        self.micro_average_precision = 0
        self.micro_average_recall = 0

    def __str__(self):
        return str(self.__dict__)


def evaluate(answer_dict: Dict[int, Set[str]], predict_dict: Dict[int, Set[str]]) -> EvaluateResult:
    # Per-sample numerators and denominators of precision and recall.
    precision_numerator = []
    precision_denominator = []
    recall_numerator = []
    recall_denominator = []
    for content_id in predict_dict.keys():
        if content_id not in answer_dict:
            # No gold tags for this sample: skip it.
            continue
        s1 = answer_dict[content_id]   # target
        s2 = predict_dict[content_id]  # predict
        if len(s1) == 0 or len(s2) == 0:
            # An empty set would put a zero in a denominator: skip the sample.
            continue
        hit = len(s1 & s2)
        precision_numerator.append(hit)
        precision_denominator.append(len(s2))
        recall_numerator.append(hit)
        recall_denominator.append(len(s1))

    result = EvaluateResult()
    result.sample_number = len(precision_numerator)
    if result.sample_number == 0:
        # Nothing to average: keep the zero defaults instead of dividing by zero.
        return result

    precision_numerator = np.array(precision_numerator)
    precision_denominator = np.array(precision_denominator)
    recall_numerator = np.array(recall_numerator)
    recall_denominator = np.array(recall_denominator)

    # Macro average: mean of the per-sample ratios.
    result.macro_average_precision = float((precision_numerator / precision_denominator).mean())
    result.macro_average_recall = float((recall_numerator / recall_denominator).mean())
    # Micro average: pool the counts over all samples, then divide once.
    result.micro_average_precision = float(precision_numerator.sum() / precision_denominator.sum())
    result.micro_average_recall = float(recall_numerator.sum() / recall_denominator.sum())
    return result


answer_dict = {1: {'你好,小米'}, 2: {'铅笔', '自动'}}
predict_dict = {1: {'小米'}, 2: {'气球', '自动'}}
print(evaluate(answer_dict, predict_dict))
"""
{'sample_number': 2, 'macro_average_precision': 0.25, 'macro_average_recall': 0.25, 'micro_average_precision': 0.3333333333333333, 'micro_average_recall': 0.3333333333333333}
"""
References
- My other blog post, "A brief overview of classification: evaluation metrics"
- Stack Exchange, "macro average vs. micro average"