Common Evaluation Metrics in Large Model Training, Explained

Latest recommended article published on 2024-07-15 17:50:53

酱紫肘胃？

Tags: natural language processing, artificial intelligence, pytorch

Article link: https://blog.csdn.net/qq_46326065/article/details/139350218

After training a large model on a dataset, we need to evaluate the results, which requires some familiarity with the common evaluation metrics.

1. Accuracy

Accuracy is the proportion of samples the model predicts correctly out of all samples. For classification tasks it is the most basic evaluation metric.

$$\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}$$
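As a minimal sketch of the accuracy formula, the following plain-Python function counts matching labels. The function name `accuracy` and the example labels are assumptions made for illustration; they do not come from the original article.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the prediction equals the true label.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(y_true, y_pred))  # 6 of 8 correct -> 0.75
```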

2. Precision

Precision is the proportion of samples predicted as positive that are actually positive. It matters most in scenarios where false positives are the main concern.

$$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$$
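The precision formula can be sketched in plain Python as below. The function name `precision`, the `positive` parameter, the example labels, and the choice to return 0.0 when the model predicts no positives are all assumptions made for this illustration.

```python
def precision(y_true, y_pred, positive=1):
    """TP / (TP + FP) for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    if tp + fp == 0:
        # Assumed convention: no positive predictions -> precision of 0.0.
        return 0.0
    return tp / (tp + fp)


# Hypothetical labels: the model flags 4 samples as positive, 2 of them wrongly.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]
print(precision(y_true, y_pred))  # TP=2, FP=2 -> 0.5
```

On this example accuracy would be 0.7, yet half of the positive predictions are wrong, which is the situation precision is meant to expose.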

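3. Recall

Recall is the proportion of actual positive samples that the model correctly predicts as positive. It matters most in scenarios where false negatives are the main concern.

$$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}$$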
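A comparable sketch for recall follows. As before, the function name `recall`, the `positive` parameter, the example labels, and the 0.0 fallback when there are no actual positives are assumptions for illustration.

```python
def recall(y_true, y_pred, positive=1):
    """TP / (TP + FN) for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0:
        # Assumed convention: no actual positives -> recall of 0.0.
        return 0.0
    return tp / (tp + fn)


# Hypothetical labels: 4 actual positives, the model finds only 2 of them.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 1]
print(recall(y_true, y_pred))  # TP=2, FN=2 -> 0.5
```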

4. F1 Score

The F1 score is the harmonic mean of precision and recall, combining the two into a single metric.

$$\text{F1 Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
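To make the harmonic-mean behaviour concrete, here is a small sketch that computes F1 from a precision and a recall value. The function name `f1_score` and the sample values are hypothetical, and returning 0.0 when both inputs are zero is an assumed convention.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Assumed convention: avoid division by zero by returning 0.0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values: perfect recall but only 50% precision.
print(round(f1_score(0.5, 1.0), 4))  # 0.6667, below the arithmetic mean of 0.75
print(f1_score(0.5, 0.5))            # 0.5, equal inputs give the same value back
```

The first call shows why the harmonic mean is used: an imbalance between the two inputs pulls F1 toward the lower one.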

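5. ROC-AUC

The ROC curve (receiver operating characteristic curve) plots the model's true positive rate (TPR) against its false positive rate (FPR) as the decision threshold varies. AUC, the area under that curve, measures how well the model separates positive from negative samples.

$$\text{AUC} = \text{Area under the ROC curve} = \int_0^1 \text{TPR} \, d(\text{FPR})$$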
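The sketch below computes AUC without drawing the curve, using the pairwise-ranking formulation: the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as half. This quantity equals the area under the ROC curve. The function name `roc_auc`, the labels, and the scores are assumptions made for illustration, and the quadratic loop is meant for small examples rather than large evaluation sets.

```python
def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, scores))  # 3 of 4 pairs ranked correctly -> 0.75
```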