The softmax Function in Classification Models

Article link: https://blog.csdn.net/KIKI3666/article/details/145851725

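In classification models, the softmax function is typically used to convert the model's raw outputs into a probability distribution. It guarantees that every class receives a value between 0 and 1 and that the values over all classes sum to 1, so the output can be read as per-class probabilities.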

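1. Definition of the `softmax` function

For a vector of raw scores z = (z_1, ..., z_K) over K classes, the softmax function is defined as:

$$\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \quad i = 1, \dots, K$$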
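As a minimal sketch, softmax exponentiates each score and divides by the sum of the exponentials, which takes only a few lines of plain Python. The helper name `softmax` and the max-subtraction step are illustrative choices rather than part of the original article; subtracting the largest score leaves the result unchanged but keeps `exp` from overflowing on large inputs.

```python
import math

def softmax(logits):
    """Return the softmax of a list of floats as a list of probabilities."""
    # Subtracting the maximum does not change the result,
    # but it prevents overflow in math.exp for large scores.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print([round(p, 4) for p in probs])  # [0.09, 0.2447, 0.6652]
```

The same helper also handles very large scores such as `softmax([1000.0, 1000.0])`, where a naive `math.exp(1000.0)` would raise an overflow error.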

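2. Use in classification models

Suppose you have a classification model whose output layer is a fully connected layer with one output per class. For example, with 10 classes the output layer has dimension 10. The model's output can be written as a vector z, where each element z_i is the model's raw score (logit) for class i.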
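To make the notion of raw scores concrete, here is a small sketch of what a fully connected output layer computes, z = W x + b, using plain Python lists. The function name `linear_logits` and the toy weights are hypothetical values chosen only for illustration.

```python
def linear_logits(x, weights, bias):
    """Compute z = W x + b, giving one raw score (logit) per class."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

# Hypothetical toy layer: 2 input features, 3 classes.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
b = [0.0, 0.5, -4.0]
x = [2.0, 1.0]

z = linear_logits(x, W, b)
print(z)  # [2.0, 1.5, -1.0]
```

The scores are unbounded and can be negative, which is why they cannot be read as probabilities until softmax is applied.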

3. Using the `softmax` function

In PyTorch, you can apply softmax with torch.nn.functional.softmax. Here is a simple example:

import torch
import torch.nn.functional as F

# Assume the model output is a tensor of shape (batch_size, num_classes)
output = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Apply softmax along the class dimension
probabilities = F.softmax(output, dim=1)

print(probabilities)

Explanation of the output

Suppose output is the raw model output:

output = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

After softmax is applied, each row becomes a probability distribution:

tensor([[0.0900, 0.2447, 0.6652],
        [0.0900, 0.2447, 0.6652]])

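Each row is the class probability distribution for one sample. For example, the first sample belongs to classes 0, 1, and 2 with probabilities 0.0900, 0.2447, and 0.6652. The two rows are identical because the second row of output equals the first row plus the constant 3, and softmax does not change when the same constant is added to every score.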
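The identical rows in the output can be checked directly. The sketch below, using a hypothetical pure-Python `softmax` helper rather than PyTorch, shows that adding the same constant to every score leaves the probabilities unchanged, since the factor e^c cancels between numerator and denominator.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

row1 = [1.0, 2.0, 3.0]
row2 = [4.0, 5.0, 6.0]  # row1 with 3.0 added to every entry

p1 = softmax(row1)
p2 = softmax(row2)
print(all(math.isclose(a, b) for a, b in zip(p1, p2)))  # True
```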

4. Getting the predicted class

Usually the class with the highest probability is taken as the prediction. In PyTorch, torch.argmax returns the predicted class for each sample:

# Get the predicted class for each sample
predicted_classes = torch.argmax(probabilities, dim=1)

print(predicted_classes)

Explanation of the output

tensor([2, 2])

This means that both the first and the second sample are predicted as class 2.
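Because softmax preserves the ordering of the scores, the predicted class is the same whether argmax is taken over the probabilities or over the raw scores. The following sketch checks this on a few hand-picked vectors; `softmax` and `argmax` here are hypothetical pure-Python helpers, not PyTorch functions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

examples = [[1.0, 2.0, 3.0], [0.3, -1.2, 0.1], [-5.0, -2.0, -9.0]]
same = [argmax(z) == argmax(softmax(z)) for z in examples]
print(same)                     # [True, True, True]
print(argmax([1.0, 2.0, 3.0]))  # 2
```

If only the predicted label is needed, and not the probabilities themselves, the softmax step can therefore be skipped at inference time.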

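5. Use in training and evaluation

During training and evaluation, softmax underlies both the loss function (such as cross-entropy loss) and evaluation metrics (such as accuracy). Note that PyTorch's nn.CrossEntropyLoss applies log-softmax internally, so the model should pass raw scores to it rather than softmax outputs.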
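To show how softmax enters the loss, here is a rough pure-Python sketch of cross-entropy computed directly from raw scores: the loss for one sample is the negative log of the softmax probability of its true class. The function names are hypothetical, and the batch mean mirrors the default `reduction='mean'` of nn.CrossEntropyLoss.

```python
import math

def cross_entropy_from_logits(logits, target):
    """Negative log-probability of the target class, from raw scores."""
    m = max(logits)
    # log(sum(exp(z))) with the maximum factored out for stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log(softmax(z)[target]) = log_sum_exp - z[target]
    return log_sum_exp - logits[target]

def mean_cross_entropy(batch_logits, targets):
    """Average the per-sample losses over the batch."""
    losses = [cross_entropy_from_logits(z, t)
              for z, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)

batch = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
labels = [2, 2]
print(round(mean_cross_entropy(batch, labels), 4))  # 0.4076
```

The value 0.4076 is -log(0.6652), the negative log of the probability that softmax assigns to class 2 in both rows.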

Training

import torch
import torch.nn as nn
import torch.optim as optim

# Assume you have a simple classification model
class SimpleClassifier(nn.Module):
    def __init__(self, input_size, num_classes):
        super().__init__()
        self.fc = nn.Linear(input_size, num_classes)

    def forward(self, x):
        # Return raw scores (logits); CrossEntropyLoss applies log-softmax itself
        return self.fc(x)

# Initialize the model, loss function, and optimizer
model = SimpleClassifier(input_size=10, num_classes=3)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Assume you have some input data and labels
input_data = torch.randn(2, 10)
labels = torch.tensor([2, 2])

# Forward pass
output = model(input_data)

# Compute the loss from the raw scores
loss = criterion(output, labels)

# Backward pass and optimization step
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Print the loss
print(f'Loss: {loss.item()}')

Evaluation

import torch.nn.functional as F

# Assume you have some test data
test_data = torch.randn(2, 10)
test_labels = torch.tensor([2, 2])

# Forward pass in evaluation mode, without gradient tracking
model.eval()
with torch.no_grad():
    test_output = model(test_data)

# Apply softmax to get class probabilities
test_probabilities = F.softmax(test_output, dim=1)

# Get the predicted classes
test_predicted_classes = torch.argmax(test_probabilities, dim=1)

# Compute the accuracy
accuracy = (test_predicted_classes == test_labels).float().mean()
print(f'Accuracy: {accuracy.item()}')
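When the test set is split into several batches, one common approach is to accumulate the number of correct predictions and the number of samples, then divide once at the end. The sketch below uses plain Python lists and hypothetical helper names to illustrate the idea; averaging per-batch accuracies instead would weight a small final batch too heavily.

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(batches):
    """batches: iterable of (batch_logits, batch_labels) pairs."""
    correct = 0
    total = 0
    for batch_logits, batch_labels in batches:
        preds = [argmax(z) for z in batch_logits]
        correct += sum(int(p == y) for p, y in zip(preds, batch_labels))
        total += len(batch_labels)
    return correct / total

batches = [
    # 3 samples, 2 predicted correctly
    ([[0.1, 0.2, 3.0], [2.0, 0.0, 0.0], [0.0, 1.0, 0.5]], [2, 0, 2]),
    # 1 sample, predicted correctly
    ([[0.0, 0.0, 1.0]], [2]),
]
print(accuracy(batches))  # 0.75
```

Here 3 of 4 samples are correct, giving 0.75, whereas the mean of the two per-batch accuracies (2/3 and 1) would be about 0.83.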

With these steps, you should have a better understanding of how softmax is used in classification models.

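Use in object detection

1. The YOLO family

Use of Sigmoid: in the YOLO family (YOLOv3 and many later versions), each anchor box may correspond to more than one class. Sigmoid is therefore a better fit than softmax, because it allows one object to belong to several classes. Each class is predicted independently, and the output value is the probability that this class is present.
Advantages:
- Multi-label support: one object may belong to several classes.
- Flexibility: suits multi-label settings, such as detection datasets where an object carries several class labels.
Disadvantages:
- Independent predictions: the class confidences do not compete with one another, so several classes can score high at once, especially when objects overlap or have similar features, which easily leads to false detections.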
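The difference between independent and competing predictions is easy to see on a single score vector. In the sketch below, the class scores, the 0.5 threshold, and the helper names are all hypothetical values chosen for illustration.

```python
import math

def sigmoid(z):
    """Logistic function, applied to one score at a time."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.5, -1.0]  # hypothetical class scores for one anchor box

sig = [sigmoid(z) for z in logits]
soft = softmax(logits)
print([round(s, 3) for s in sig])   # [0.881, 0.818, 0.269]
print([round(s, 3) for s in soft])  # [0.604, 0.366, 0.03]

# With Sigmoid, every class above the threshold counts as present.
threshold = 0.5
multi_labels = [i for i, s in enumerate(sig) if s > threshold]
print(multi_labels)  # [0, 1]
```

The Sigmoid scores do not sum to 1, and two classes pass the threshold at once, while softmax forces the classes to share a total probability of 1.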

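2. False detections

Problem: with Sigmoid, each class is predicted independently, so the class confidences do not compete with one another. Several predictions can then have high confidence at once, especially when objects overlap or have similar features, which causes false detections.
Solutions:
- Non-maximum suppression (NMS): filter out overlapping detection boxes and keep the one with the highest confidence.
- Multi-scale detection: detect at several scales to handle objects of different sizes and reduce false detections.
- Feature enhancement: strengthen the feature extraction network so that the model separates the features of different classes more clearly.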

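Example code

The following simple example shows class prediction with Sigmoid and with Softmax on a YOLO-style output tensor, followed by a basic NMS implementation: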

import torch
import torch.nn.functional as F

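# Assume we have a model output of shape (batch_size, num_anchors, num_classes)
output = torch.randn(1, 10, 20)  # 1 batch, 10 anchor boxes, 20 classes

# Multi-label classification with Sigmoid (each class scored independently)
sigmoid_output = torch.sigmoid(output)
print("Sigmoid Output:", sigmoid_output)

# Single-label classification with Softmax (classes compete along the last dimension)
softmax_output = F.softmax(output, dim=-1)
print("Softmax Output:", softmax_output)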

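# Non-maximum suppression (NMS) example
def non_max_suppression(boxes, scores, iou_threshold=0.5):
    # boxes: (num_boxes, 4) coordinates of each box
    # scores: (num_boxes,) confidence of each box
    # Visit boxes from highest to lowest confidence
    indices = torch.argsort(scores, descending=True)
    keep = []
    while indices.size(0) > 0:
        current_idx = indices[0]
        # Store a plain Python int rather than a 0-dim tensor
        keep.append(current_idx.item())
        if indices.size(0) == 1:
            break
        current_box = boxes[current_idx]
        ious = box_iou(current_box, boxes[indices[1:]])
        # Drop the remaining boxes that overlap the kept box too much
        indices = indices[1:][ious < iou_threshold]
    return keep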

def box_iou(box1, box2):
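    # Compute the IoU between one box and a set of boxes
    # box1: (4,) coordinates of a single box as (x1, y1, x2, y2)
    # box2: (num_boxes, 4) coordinates of several boxes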
    x1, y1, x2, y2 = box1
    x1s, y1s, x2s, y2s = box2.unbind(dim=1)
    inter_x1 = torch.max(x1, x1s)
    inter_y1 = torch.max(y1, y1s)
    inter_x2 = torch.min(x2, x2s)
    inter_y2 = torch.min(y2, y2s)
    inter_area = (inter_x2 - inter_x1).clamp(0) * (inter_y2 - inter_y1).clamp(0)
    box1_area = (x2 - x1) * (y2 - y1)
    box2_area = (x2s - x1s) * (y2s - y1s)
    union_area = box1_area + box2_area - inter_area
    return inter_area / union_area

# Assume we have some detection boxes and their confidence scores
boxes = torch.tensor([[10, 10, 50, 50], [15, 15, 55, 55], [60, 60, 100, 100]], dtype=torch.float32)
scores = torch.tensor([0.9, 0.8, 0.7])

# Apply NMS
keep_indices = non_max_suppression(boxes, scores)
print("Kept Indices:", keep_indices)
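One possible way to combine Sigmoid class scores with NMS is to threshold the scores per class and then run NMS separately within each class, so that boxes of different classes never suppress each other. The sketch below is written in plain Python under that assumption; the function names, thresholds, and toy scores are hypothetical, and it is not the procedure of any particular YOLO release.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep a box only if it overlaps no higher-scored kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep

def per_class_nms(boxes, class_scores, score_threshold=0.5, iou_threshold=0.5):
    """Return (box_index, class_index, score) triples that survive NMS."""
    results = []
    num_classes = len(class_scores[0])
    for c in range(num_classes):
        # Candidate boxes whose Sigmoid score for class c passes the threshold
        idx = [i for i in range(len(boxes)) if class_scores[i][c] > score_threshold]
        kept = nms([boxes[i] for i in idx],
                   [class_scores[i][c] for i in idx],
                   iou_threshold)
        results.extend((idx[k], c, class_scores[idx[k]][c]) for k in kept)
    return results

boxes = [(10, 10, 50, 50), (15, 15, 55, 55), (60, 60, 100, 100)]
# Hypothetical Sigmoid scores: one row per box, one column per class
class_scores = [[0.9, 0.6], [0.8, 0.1], [0.2, 0.7]]

detections = per_class_nms(boxes, class_scores)
print(detections)  # [(0, 0, 0.9), (2, 1, 0.7), (0, 1, 0.6)]
```

Box 1 is suppressed for class 0 because it overlaps box 0 with an IoU of about 0.62, while box 0 keeps both of its labels, which is the multi-label behavior that Sigmoid allows.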