Computing a Confusion Matrix with NumPy in Python

mynaiskey

Last modified: 2023-05-10 09:55:06


First published: 2022-04-18 19:13:37

Article link: https://blog.csdn.net/mynaiskey/article/details/124256925


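This post shows how to compute a confusion matrix in Python with NumPy and how to normalize it. A concrete example explains what each element of the matrix means and how to build the matrix while testing a model. It also mentions the TensorFlow function that computes a confusion matrix directly.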



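In a confusion matrix, each row represents the true class and each column represents the predicted class.

Suppose a model has three classes to predict, A, B, and C, and running it on the test set gives the following results: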

              Predicted A   Predicted B   Predicted C
True A             30             7             3
True B             15            22             3
True C              5             1            14

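Let's read the table column by column, starting with the first column: 30, 15, 5.

The test set contains exactly three classes, A, B, and C. The true classes A, B, and C therefore cover the whole test set, and the prediction for any sample can only be one of these three.

A sample predicted as A is not necessarily labeled A, but its true label must be one of A, B, or C. Among the samples predicted as A, 30 are truly A, 15 are truly B, and 5 are truly C.

The table above can be written in NumPy as follows: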

import numpy as np

# Confusion matrix: rows are true classes, columns are predicted classes
c_matrix = np.array([[30,  7,  3],
                     [15, 22,  3],
                     [ 5,  1, 14]])
print(c_matrix.shape)  # (3, 3)
print(c_matrix[0, 1])  # 7 (true A, predicted B)
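
To check the column-by-column reading described earlier, here is a minimal sketch that slices out the first column and compares row sums with column sums. The variable names `predicted_as_a`, `true_counts`, and `pred_counts` are illustrative and do not come from the original post.

```python
import numpy as np

# Rows are true classes, columns are predicted classes (A, B, C).
c_matrix = np.array([[30,  7,  3],
                     [15, 22,  3],
                     [ 5,  1, 14]])

# First column: samples predicted as A, split by their true class.
predicted_as_a = c_matrix[:, 0]
print(predicted_as_a)        # [30 15  5]

# Row sums count the true samples per class;
# column sums count how often each class was predicted.
true_counts = c_matrix.sum(axis=1)
pred_counts = c_matrix.sum(axis=0)
print(true_counts)           # [40 40 20]
print(pred_counts)           # [50 30 20]

# Both views cover the same test set, so the totals agree.
print(true_counts.sum(), pred_counts.sum())  # 100 100
```

Row sums and column sums generally differ per class, which is why it matters whether a later normalization is applied along rows or along columns.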

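The confusion matrix can also be normalized so that its values lie between 0 and 1:

# Normalize the confusion matrix (L1 normalization, applied to each row)
print(c_matrix.sum(axis=1))                 # number of true samples per class
print(c_matrix.sum(axis=1)[:, np.newaxis])  # same sums as a column vector, for broadcasting
c_matrix = c_matrix / c_matrix.sum(axis=1)[:, np.newaxis]
print(c_matrix)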

[40 40 20]
[[40]
 [40]
 [20]]
[[0.75  0.175 0.075]
 [0.375 0.55  0.075]
 [0.25  0.05  0.7  ]]
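
One caveat with the division above: if a class has no samples in the test set, its row sum is 0 and the division yields NaN. The sketch below is one possible way to guard against that. The helper name `normalize_rows` is hypothetical, and leaving empty rows as all zeros is an assumption about the desired behavior, not something the original post specifies.

```python
import numpy as np

def normalize_rows(matrix):
    """Divide each row by its sum; rows that sum to 0 stay all-zero (assumed behavior)."""
    matrix = np.asarray(matrix, dtype=float)
    # keepdims=True gives shape (n, 1), equivalent to [:, np.newaxis].
    row_sums = matrix.sum(axis=1, keepdims=True)
    out = np.zeros_like(matrix)
    # Only divide where the row sum is non-zero.
    np.divide(matrix, row_sums, out=out, where=row_sums != 0)
    return out

c_matrix = np.array([[30,  7,  3],
                     [15, 22,  3],
                     [ 5,  1, 14]])
normalized = normalize_rows(c_matrix)
print(normalized)
print(normalized.sum(axis=1))  # every non-empty row sums to 1

# A class with no test samples stays all-zero instead of becoming NaN.
print(normalize_rows([[2, 0], [0, 0]]))
```

With row normalization, each diagonal entry is the fraction of that true class which was predicted correctly.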

Computing the matrix while testing a model

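import numpy as np
import tensorflow as tf

# class_names, test_ds, total_batch and model are assumed to be defined earlier in the training script
c_matrix = np.zeros((len(class_names), len(class_names)))  # confusion matrix
for images, labels in test_ds.take(total_batch):
    labels = labels.numpy()
    predictions = model.predict(images)
    score = tf.nn.softmax(predictions)  # softmax does not change which class has the highest score
    for index, elem in enumerate(score):
        # row = true label, column = predicted class
        row, col = int(labels[index]), int(np.argmax(elem))
        c_matrix[row, col] += 1
# Normalize each row so that it sums to 1
c_matrix = c_matrix / c_matrix.sum(axis=1)[:, np.newaxis]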
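
The per-sample loop above can also be written as one vectorized update per batch. The following is a sketch using only NumPy. The `batches` list of made-up (scores, labels) pairs is a stand-in for `test_ds` and `model.predict`, not the author's data.

```python
import numpy as np

num_classes = 3
c_matrix = np.zeros((num_classes, num_classes), dtype=np.int64)

# Hypothetical batches of (model scores, true labels); stand-in for test_ds / model.predict.
batches = [
    (np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.3, 0.6, 0.1]]), np.array([0, 1, 2])),
    (np.array([[0.9, 0.05, 0.05], [0.1, 0.2, 0.7], [0.2, 0.5, 0.3]]), np.array([0, 1, 2])),
]

for scores, labels in batches:
    preds = np.argmax(scores, axis=1)  # predicted class for each sample in the batch
    # np.add.at accumulates correctly even when the same (row, col) pair
    # appears several times in one batch; plain fancy-index += would not.
    np.add.at(c_matrix, (labels, preds), 1)

print(c_matrix)
```

Keeping the counts as integers until all batches are processed and normalizing once at the end avoids mixing raw counts with ratios.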

Of course, TensorFlow also provides a function that computes the confusion matrix directly:

import tensorflow as tf

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 1, 1, 0, 2, 1]

c_matrix = tf.math.confusion_matrix(labels=y_true, predictions=y_pred, num_classes=3)
print(c_matrix)
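
For comparison, the same counts can be reproduced without TensorFlow. This sketch uses `np.bincount` on a flattened (true, predicted) index; the names `flat_index` and `c_matrix_np` are illustrative. It follows the same convention as the rest of the post: rows are true labels, columns are predictions.

```python
import numpy as np

y_true = np.array([0, 1, 2, 0, 1, 2])
y_pred = np.array([0, 1, 1, 0, 2, 1])
num_classes = 3

# Encode each (true, predicted) pair as a single index: true * num_classes + predicted.
flat_index = y_true * num_classes + y_pred
# minlength ensures pairs that never occur still get a zero count.
counts = np.bincount(flat_index, minlength=num_classes * num_classes)
c_matrix_np = counts.reshape(num_classes, num_classes)
print(c_matrix_np)
# [[2 0 0]
#  [0 1 1]
#  [0 2 0]]
```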