Computing Various Metrics

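This article shows how to use CER and WER to evaluate the accuracy of predicted Chinese text, and how to use the find_best_f1_and_threshold and find_best_accuracy_and_threshold functions to find the best F1 score and the best accuracy together with their decision thresholds. It also touches on sklearn.metrics.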

Character error rate (CER)

Suited to languages such as Chinese that do not separate words with spaces.

import evaluate

metric = evaluate.load("cer")
print(metric.compute(predictions=['你吃了吗', '今天我要去打篮球'], references=["我吃了么a" , '明天我要去打篮球']))
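To make the reported number easier to interpret, here is a minimal pure-Python sketch of a character error rate. It assumes the pooled definition: the total character-level edit distance divided by the total number of reference characters. The helper names edit_distance and char_error_rate are illustrative and are not part of the evaluate library.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def char_error_rate(predictions, references):
    """Pooled CER: total edits over total reference characters (an assumed definition)."""
    errors = sum(edit_distance(list(r), list(p))
                 for p, r in zip(predictions, references))
    total = sum(len(r) for r in references)
    return errors / total


cer = char_error_rate(['你吃了吗', '今天我要去打篮球'],
                      ["我吃了么a", '明天我要去打篮球'])
print(cer)  # 4 edits over 13 reference characters
```

The first pair needs three edits (two substitutions and one insertion) and the second needs one substitution, against 5 + 8 reference characters.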

Word error rate (WER)

import evaluate

metric = evaluate.load("wer")
print(metric.compute(predictions=['a b c', 'd e f'], references=['a b c', '1 2 3']))
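WER counts errors over whitespace-separated tokens, so an unsegmented Chinese sentence collapses into a single "word". The sketch below illustrates this with the second sentence pair from the CER example. It uses a position-wise mismatch count, which matches the edit distance for this pair only because both sequences have the same length and differ in exactly one position. mismatch_rate is a hypothetical helper, not a library function.

```python
def mismatch_rate(ref_tokens, hyp_tokens):
    """Share of reference positions whose token differs (equal-length sequences only)."""
    assert len(ref_tokens) == len(hyp_tokens)
    wrong = sum(r != h for r, h in zip(ref_tokens, hyp_tokens))
    return wrong / len(ref_tokens)


ref = '明天我要去打篮球'
hyp = '今天我要去打篮球'

word_level = mismatch_rate(ref.split(), hyp.split())  # one token per sentence
char_level = mismatch_rate(list(ref), list(hyp))      # eight tokens per sentence

print(word_level)  # 1.0
print(char_level)  # 0.125
```

A single wrong character makes the whole sentence count as wrong at the word level, which is why the character-level metric is the more informative one here.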

Finding the best F1

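import numpy as np


def find_best_f1_and_threshold(scores, labels):
    """
    Find the threshold that maximizes F1 for binary labels.

    :param scores: predicted probability of the positive class
    :param labels: ground-truth labels (1 = positive, 0 = negative)
    :return: (best_f1, best_precision, best_recall, threshold)
    """
    assert len(scores) == len(labels)

    scores = np.asarray(scores)
    labels = np.asarray(labels)

    # Rank samples from the highest score to the lowest.
    rows = sorted(zip(scores, labels), key=lambda x: x[0], reverse=True)

    best_f1 = best_precision = best_recall = 0
    threshold = 0
    nextract = 0  # samples predicted positive so far
    ncorrect = 0  # true positives among them
    total_num_duplicates = sum(labels)  # total number of positive samples

    # Treat the top i + 1 samples as positive and score that split.
    for i in range(len(rows) - 1):
        score, label = rows[i]
        nextract += 1

        if label == 1:
            ncorrect += 1

        if ncorrect > 0:
            precision = ncorrect / nextract
            recall = ncorrect / total_num_duplicates
            f1 = 2 * precision * recall / (precision + recall)
            if f1 > best_f1:
                best_f1 = f1
                best_precision = precision
                best_recall = recall
                # Place the threshold midway between this score and the next.
                threshold = (rows[i][0] + rows[i + 1][0]) / 2

    return best_f1, best_precision, best_recall, threshold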
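For small inputs, the result can be sanity-checked against a brute-force search. The sketch below is not the function above: it simply evaluates F1 at every midpoint between consecutive sorted scores, assuming a sample is predicted positive when its score is at or above the threshold. The names f1_at_threshold and brute_force_best_f1 and the toy data are made up for illustration.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 when every score >= threshold is predicted positive."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def brute_force_best_f1(scores, labels):
    """Try each midpoint between neighbouring sorted scores as the threshold."""
    ranked = sorted(scores, reverse=True)
    candidates = [(a + b) / 2 for a, b in zip(ranked, ranked[1:])]
    best_f1, best_th = 0.0, 0.0
    for th in candidates:
        f1 = f1_at_threshold(scores, labels, th)
        if f1 > best_f1:
            best_f1, best_th = f1, th
    return best_f1, best_th


# Toy data, invented for this example.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
best_f1, best_th = brute_force_best_f1(scores, labels)
print(best_f1, best_th)
```

On this toy data the best split keeps the four highest scores (precision 0.75, recall 1.0), so the threshold falls midway between 0.4 and 0.3.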

Finding the best accuracy

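from sklearn.metrics import accuracy_score


def find_best_accuracy_and_threshold(scores, labels):
    """
    Find the threshold that maximizes accuracy for binary labels.

    :param scores: predicted probability of the positive class
    :param labels: ground-truth labels (1 = positive, 0 = negative)
    :return: (best_accuracy, best_th)
    """
    best_accuracy = 0
    best_th = 0
    # Try every observed score as the threshold.
    for th in scores:
        # A sample is predicted positive when its score reaches the threshold.
        pre_label = [1 if score >= th else 0 for score in scores]
        accuracy = accuracy_score(labels, pre_label)
        if accuracy > best_accuracy:
            best_accuracy = accuracy
            best_th = th

    return best_accuracy, best_th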
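The loop above recomputes accuracy over all samples for every candidate threshold, which is quadratic in the number of samples. One possible alternative, sketched below in pure Python, sorts once and sweeps the ranking while keeping running counts. best_accuracy_sorted is a hypothetical name, and when several thresholds tie on accuracy it may return a different one than the function above.

```python
def best_accuracy_sorted(scores, labels):
    """Best accuracy over thresholds taken from the scores (score >= th is positive)."""
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    n = len(pairs)
    total_neg = sum(1 for _, y in pairs if y == 0)

    best_acc, best_th = 0.0, 0.0
    tp = 0
    for k, (score, label) in enumerate(pairs, start=1):
        if label == 1:
            tp += 1
        # Equal scores fall on the same side of any threshold,
        # so only evaluate at the last item of a tie group.
        if k < n and pairs[k][0] == score:
            continue
        fp = k - tp                # negatives ranked in the top k
        tn = total_neg - fp        # negatives left below the threshold
        acc = (tp + tn) / n
        if acc > best_acc:
            best_acc, best_th = acc, score
    return best_acc, best_th


# Toy data, invented for this example.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
best_acc, best_th = best_accuracy_sorted(scores, labels)
print(best_acc, best_th)
```

On this toy data, thresholds 0.8 and 0.4 both classify five of the six samples correctly; the sweep keeps the first one it meets, which is 0.8.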
