Case 4: Iris Classification (PyTorch)

Latest recommended article published on 2025-04-14 11:18:11

shadowtalon

Column: Machine Learning in Practice: 100 Cases. Tags: python, machine learning, artificial intelligence, deep learning, pytorch

Article link: https://blog.csdn.net/shadowtalon/article/details/146476428

I. Introduction

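Iris classification is a classic machine learning example, often used to demonstrate how classification algorithms work and how they are applied. The dataset contains 150 samples from three species, each described by four measurements. This case uses PyTorch to build a simple neural network that classifies the flowers. Along the way it covers data processing, model construction, training, and evaluation in PyTorch, as well as visual analysis of the classification results.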

II. Environment Setup and Data Loading

(1) Imports

import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.decomposition import PCA
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

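torch: the PyTorch deep learning framework, used to build and train the neural network.
torch.nn: the layers and loss functions needed to build a neural network.
torch.optim: optimization algorithms such as Adam and SGD.
sklearn.datasets.load_iris: loads the iris dataset.
sklearn.model_selection.train_test_split: splits the dataset into training and test sets.
sklearn.preprocessing.StandardScaler: standardizes the features.
sklearn.metrics.confusion_matrix and sklearn.metrics.accuracy_score: evaluate model performance.
sklearn.decomposition.PCA: reduces dimensionality with principal component analysis (PCA).
numpy: numerical computing.
matplotlib.pyplot: plotting.
seaborn: a visualization library built on matplotlib that provides more polished chart styles.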

(2) Loading the Data

iris = load_iris()
X = iris.data
y = iris.target

load_iris loads the iris dataset. The feature matrix is stored in X (150 samples by 4 features) and the integer class labels in y.
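
To see what load_iris actually returns, the following minimal sketch prints the shape of the data, the feature names, and the number of samples per species. The variable class_counts is introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)             # (n_samples, n_features)
print(iris.feature_names)  # names of the four measurements

# Count how many samples belong to each species
class_counts = dict(zip(iris.target_names.tolist(), np.bincount(y).tolist()))
print(class_counts)
```

The classes are perfectly balanced, which is one reason plain accuracy is a reasonable metric for this dataset.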

III. Data Preprocessing

(1) Splitting into Training and Test Sets

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

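train_test_split splits the dataset into training and test sets, holding out 20% for testing. stratify=y enables stratified sampling, so each class appears in the training and test sets in the same proportion as in the full dataset. random_state=42 makes the split reproducible.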
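
A quick way to confirm the effect of stratify=y is to count the labels in each split. This is a small check, not part of the original pipeline; train_counts and test_counts are names chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Per-class sample counts in each split
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)
print("train:", train_counts, "test:", test_counts)
```

With 50 samples per class and a 20% test share, each class contributes 40 training samples and 10 test samples.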

(2) Standardization

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

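StandardScaler standardizes each feature to zero mean and unit variance. fit_transform computes the mean and standard deviation from the training set and standardizes it. transform then applies those same training-set statistics to the test set, so no information from the test set leaks into preprocessing.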
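
The sketch below checks these properties numerically: the scaled training features have mean 0 and standard deviation 1, and transform is equivalent to subtracting scaler.mean_ and dividing by scaler.scale_. The names train_mean, train_std, and manual are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, _, _ = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

train_mean = X_train_scaled.mean(axis=0)  # approximately 0 for every feature
train_std = X_train_scaled.std(axis=0)    # approximately 1 for every feature

# transform() applies the statistics learned from the training set
manual = (X_test - scaler.mean_) / scaler.scale_
print(train_mean.round(6), train_std.round(6))
```

The test set's own mean and standard deviation will generally not be exactly 0 and 1, which is expected.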

(3) Converting to PyTorch Tensors

X_train_tensor = torch.FloatTensor(X_train_scaled)
y_train_tensor = torch.LongTensor(y_train)
X_test_tensor = torch.FloatTensor(X_test_scaled)
y_test_tensor = torch.LongTensor(y_test)

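The standardized features and the labels are converted to PyTorch tensors so they can be fed to the network. Features become float32 tensors, and labels become int64 (long) tensors, the type nn.CrossEntropyLoss expects for class indices.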
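
As an alternative to the legacy torch.FloatTensor and torch.LongTensor constructors, torch.tensor with an explicit dtype does the same conversion. The sketch uses small stand-in arrays (features and labels are made-up values, not iris data) so the dtypes are easy to inspect.

```python
import numpy as np
import torch

# Stand-in arrays: scikit-learn typically yields float64 features and integer labels
features = np.array([[0.5, -1.2, 0.3, 0.0], [1.1, 0.4, -0.7, 2.0]], dtype=np.float64)
labels = np.array([0, 2])

# Explicit dtypes: float32 for nn.Linear inputs, int64 for class indices
X_tensor = torch.tensor(features, dtype=torch.float32)
y_tensor = torch.tensor(labels, dtype=torch.long)
print(X_tensor.dtype, y_tensor.dtype)
```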

IV. Building the Model

class IrisClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(4, 10)
        self.layer2 = nn.Linear(10, 10)
        self.output = nn.Linear(10, 3)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.layer1(x))
        x = self.relu(self.layer2(x))
        x = self.output(x)
        return x

model = IrisClassifier()

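IrisClassifier is a neural network that inherits from nn.Module. It has three fully connected layers (4 -> 10 -> 10 -> 3) and a ReLU activation. The forward method defines the forward pass: the input goes through the first two layers, each followed by ReLU, and then through the output layer, which returns raw scores (logits) for the three classes without any activation.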
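
To make the size of this network concrete, the sketch below repeats the class definition, counts its trainable parameters, and passes a random batch through it. The batch of 5 random samples (dummy) is an arbitrary choice for the shape check.

```python
import torch
import torch.nn as nn

class IrisClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(4, 10)
        self.layer2 = nn.Linear(10, 10)
        self.output = nn.Linear(10, 3)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.layer1(x))
        x = self.relu(self.layer2(x))
        return self.output(x)  # raw logits, no softmax

model = IrisClassifier()

# Weights plus biases: (4*10 + 10) + (10*10 + 10) + (10*3 + 3)
n_params = sum(p.numel() for p in model.parameters())

dummy = torch.randn(5, 4)  # 5 samples, 4 features
logits = model(dummy)
print(n_params, tuple(logits.shape))
```

The three layers contribute 50, 110, and 33 parameters respectively, 193 in total, so the model is tiny compared with the 120 training samples times 4 features it sees.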

V. Defining the Loss Function and Optimizer

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

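The cross-entropy loss nn.CrossEntropyLoss() serves as the loss function, measuring the discrepancy between the model's predictions and the true labels. It takes raw logits and applies log-softmax internally, which is why the model has no softmax layer. The Adam optimizer optim.Adam() updates the model parameters with a learning rate of 0.01.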
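
The relationship between cross-entropy and log-softmax can be checked directly. This sketch uses made-up logits and targets and compares nn.CrossEntropyLoss with the explicit combination of F.log_softmax and F.nll_loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up logits for 2 samples and 3 classes, with their true class indices
logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

ce = nn.CrossEntropyLoss()(logits, targets)

# Same quantity computed in two explicit steps
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(ce.item(), manual.item())
```

Both values agree, so adding a softmax layer before this loss would apply softmax twice and distort the gradients.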

VI. Training the Model

epochs = 1000
losses = []

for epoch in range(epochs):
    optimizer.zero_grad()
    outputs = model(X_train_tensor)
    loss = criterion(outputs, y_train_tensor)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
    if (epoch + 1) % 100 == 0:
        print(f"Epoch {epoch+1}/{epochs}, Loss: {loss.item():.4f}")

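Training runs for 1000 epochs. Because the dataset is small, each epoch is a single full-batch update over the entire training set: zero the optimizer's gradients, run a forward pass to get the predictions, compute the loss, backpropagate to compute the gradients, and let the optimizer update the parameters. The loss of every epoch is appended to the losses list, and it is printed every 100 epochs.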
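
For larger datasets the same loop is usually run over mini-batches. The following is a minimal sketch of that variant, not the article's method: it assumes a batch size of 16 and 30 epochs, fixes the seed with torch.manual_seed, uses an equivalent nn.Sequential model, and for brevity standardizes and trains on the whole dataset without a train/test split.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

torch.manual_seed(0)  # reproducible initialization and shuffling

X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)  # whole dataset, for brevity only
dataset = TensorDataset(
    torch.tensor(X, dtype=torch.float32), torch.tensor(y, dtype=torch.long)
)
loader = DataLoader(dataset, batch_size=16, shuffle=True)

# Same 4 -> 10 -> 10 -> 3 architecture, written with nn.Sequential
model = nn.Sequential(
    nn.Linear(4, 10), nn.ReLU(), nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 3)
)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(30):
    running = 0.0
    for xb, yb in loader:  # one parameter update per mini-batch
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
        running += loss.item() * xb.size(0)
    epoch_losses.append(running / len(dataset))  # sample-weighted mean loss

print(f"first epoch: {epoch_losses[0]:.4f}, last epoch: {epoch_losses[-1]:.4f}")
```

On a dataset this small, full-batch training as in the article is simpler and fast enough; mini-batches mainly matter when the data no longer fits in one batch.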

VII. Visualizing the Training Loss Curve

plt.figure(figsize=(8, 5))
plt.plot(losses)
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.title("Training Loss Curve")
plt.show()

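matplotlib plots the training loss curve, with the epoch on the horizontal axis and the loss on the vertical axis. A curve that falls and then flattens indicates that training has converged.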
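
Judging convergence by eye can be complemented with a simple numeric check. The helper below, has_converged, is hypothetical and not part of any library: it compares the mean loss of the last window epochs with the mean of the window before it, and the window and tol defaults are arbitrary assumptions.

```python
def has_converged(losses, window=50, tol=1e-3):
    """Return True if the mean loss over the last `window` epochs differs from
    the mean over the preceding `window` epochs by at most `tol` (relative)."""
    if len(losses) < 2 * window:
        return False  # not enough history to compare two windows
    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window
    return abs(previous - recent) <= tol * max(abs(previous), 1e-12)

# Synthetic curves for illustration: one that has flattened, one still falling
flat_curve = [1.0 / (1 + e) for e in range(20)] + [0.05] * 200
falling_curve = [1.0 / (1 + e) for e in range(100)]

print(has_converged(flat_curve), has_converged(falling_curve))
```

A check like this could be applied to the losses list after training, or inside the loop as a basis for early stopping.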

VIII. Evaluating the Model

with torch.no_grad():
    y_pred = model(X_test_tensor)
    predicted_labels = torch.argmax(y_pred, dim=1)
    accuracy = accuracy_score(y_test, predicted_labels.numpy())
    print(f"Test Accuracy: {accuracy:.2%}")

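The model is evaluated on the test set inside the torch.no_grad() context manager, which disables gradient tracking and reduces memory use. A forward pass produces the logits, torch.argmax() picks the index of the highest-scoring class for each sample, and accuracy_score() computes the test accuracy, which is then printed.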
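
Accuracy is a single number; per-class precision and recall show where the errors fall. The sketch below uses scikit-learn's classification_report on a small hand-made example (y_true and y_pred are invented labels, not results from the model above).

```python
from sklearn.metrics import accuracy_score, classification_report

# Hand-made labels: one versicolor sample is misclassified as virginica
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
names = ["setosa", "versicolor", "virginica"]

accuracy = accuracy_score(y_true, y_pred)
report = classification_report(y_true, y_pred, target_names=names, output_dict=True)

print(f"accuracy: {accuracy:.3f}")
print(classification_report(y_true, y_pred, target_names=names))
```

In this toy example versicolor has a recall of 0.5 and virginica a precision of 2/3, which the overall accuracy of 5/6 alone would not reveal. The same call works with y_test and the predicted labels from the evaluation step.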

IX. Visualizing the Confusion Matrix

cm = confusion_matrix(y_test, predicted_labels.numpy())
plt.figure(figsize=(8, 6))
sns.heatmap(
    cm,
    annot=True,
    fmt="d",
    cmap="Blues",
    xticklabels=iris.target_names,
    yticklabels=iris.target_names,
)
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.title("Confusion Matrix")
plt.show()

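confusion_matrix() computes the confusion matrix on the test set, which shows how the samples of each class were classified. seaborn's heatmap() draws it as a heatmap, with predicted labels on the horizontal axis and true labels on the vertical axis.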
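
Raw counts can be hard to compare across classes, so a row-normalized matrix is a common companion. The sketch below uses the normalize="true" option of confusion_matrix on invented labels (y_true and y_pred are illustrative, not model output).

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Invented labels: one class-1 sample is predicted as class 2
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

cm = confusion_matrix(y_true, y_pred)                         # raw counts
cm_norm = confusion_matrix(y_true, y_pred, normalize="true")  # each row sums to 1

# The diagonal of the row-normalized matrix is the recall of each class
per_class_recall = np.diag(cm_norm)
print(cm)
print(per_class_recall)
```

To plot the normalized version with the heatmap call above, a float format such as fmt=".2f" would be needed instead of fmt="d".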

X. PCA Visualization

(1) PCA Dimensionality Reduction

pca = PCA(n_components=2)
X_train_pca = pca.fit_transform(X_train_scaled)
X_test_pca = pca.transform(X_test_scaled)

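Principal component analysis (PCA) reduces the features to two dimensions for visualization; n_components=2 sets the number of components to keep. The PCA is fitted on the training set only and then applied to the test set.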
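
How much information the 2-D view retains can be read from explained_variance_ratio_. This is a minimal sketch that, for brevity, fits the scaler and the PCA on the whole dataset rather than on the training split used in the article.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)  # whole dataset, for brevity only

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# Fraction of the total variance captured by each principal component
ratio = pca.explained_variance_ratio_
print(ratio, ratio.sum())
```

On standardized iris data the first two components capture well over 90% of the variance, so the 2-D plot is a fairly faithful picture of the 4-D data.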

(2) Generating the Decision-Boundary Grid

h = 0.02  # grid step size
x_min, x_max = X_train_pca[:, 0].min() - 1, X_train_pca[:, 0].max() + 1
y_min, y_max = X_train_pca[:, 1].min() - 1, X_train_pca[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

# Map the grid points back to the standardized feature space with the inverse PCA transform
mesh_points = np.c_[xx.ravel(), yy.ravel()]
X_inverse = pca.inverse_transform(mesh_points)
X_inverse_tensor = torch.FloatTensor(X_inverse)

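A 2-D grid is generated over the PCA plane for drawing the decision boundary. The grid points are mapped back to the 4-D standardized feature space, the space the model was trained in, with the inverse PCA transform and then converted to a PyTorch tensor.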
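
It helps to be clear about what inverse_transform does here: it places each 2-D point on the plane spanned by the two principal components in 4-D space. The sketch below, fitted on the whole dataset for brevity and using three arbitrary points (grid_2d), shows that 2-D points survive the round trip exactly, while real 4-D samples lose the variance of the dropped components.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)  # whole dataset, for brevity only
pca = PCA(n_components=2).fit(X_scaled)

# Three arbitrary points in the PCA plane
grid_2d = np.array([[0.0, 0.0], [1.5, -0.5], [-2.0, 1.0]])
grid_4d = pca.inverse_transform(grid_2d)  # shape (3, 4), lies on the PCA plane
round_trip = pca.transform(grid_4d)       # recovers the 2-D coordinates

# Real samples are only approximated after projecting down and back up
X_reconstructed = pca.inverse_transform(pca.transform(X_scaled))
reconstruction_error = np.mean((X_scaled - X_reconstructed) ** 2)
print(grid_4d.shape, reconstruction_error)
```

The plotted decision regions are therefore a 2-D slice through the model's 4-D decision function, so a data point that sits off the PCA plane may be classified differently from the region its projection falls in.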

(3) Predicting Classes

with torch.no_grad():
    outputs = model(X_inverse_tensor)
    Z = torch.argmax(outputs, dim=1).numpy()
    Z = Z.reshape(xx.shape)

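The model predicts a class label for every grid point, and the labels are reshaped to the grid's shape so they can be plotted.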
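
Since the model outputs logits, torch.argmax is enough to get class labels. If class probabilities are wanted as well, for example to shade regions by confidence, torch.softmax converts the logits. The sketch uses made-up logits rather than real model output.

```python
import torch

# Made-up logits for 3 points and 3 classes
logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0], [0.0, 1.0, 0.9]])

probs = torch.softmax(logits, dim=1)          # each row sums to 1
pred_from_logits = torch.argmax(logits, dim=1)
pred_from_probs = torch.argmax(probs, dim=1)  # softmax preserves the ordering
confidence = probs.max(dim=1).values          # probability of the predicted class

print(pred_from_logits.tolist(), confidence)
```

The third point shows why confidence can be informative: its top two logits are close, so the predicted class wins with a much lower probability than in the first two rows.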

(4) Plotting the Decision Regions and Data Points

plt.figure(figsize=(10, 6))
plt.contourf(xx, yy, Z, alpha=0.3, cmap="Paired")
scatter = plt.scatter(
    X_train_pca[:, 0],
    X_train_pca[:, 1],
    c=y_train,
    cmap="Paired",
    edgecolors="k",
    label="Train",
)
plt.scatter(
    X_test_pca[:, 0],
    X_test_pca[:, 1],
    c=y_test,
    cmap="Paired",
    marker="x",
    s=100,
    linewidth=1,
    label="Test",
)
plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.legend(
    handles=scatter.legend_elements()[0],
    labels=iris.target_names.tolist(),
    title="Species",
)
plt.title("Decision Regions Visualized with PCA")
plt.show()