Hands-On Deep Learning: Classifying Mushrooms

Latest recommended article published on 2024-11-06 20:42:39

AKIKZ

Column: Hands-On Deep Learning. Tags: deep learning, classification, artificial intelligence

Article link: https://blog.csdn.net/mmd666/article/details/140692699

Dataset: https://pan.quark.cn/s/4d3526600c0c

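Overview

This note walks through building a convolutional neural network (CNN) in Python with deep learning libraries (such as TensorFlow and Keras) to tell edible mushrooms from poisonous ones. It covers data preparation, model construction, training, and evaluation.

1. Import the required libraries

os: file path operations.
cv2: the OpenCV library, used for image processing.
numpy: numerical computing.
train_test_split and LabelEncoder: from sklearn, used to split the dataset and encode the labels.
Sequential, Conv2D, MaxPooling2D, Flatten, Dense, Dropout: from tensorflow.keras, used to build the neural network.
to_categorical: one-hot encoding.
ImageDataGenerator: data augmentation.

2. Set the data paths and image size

Set the folder paths for the edible and poisonous mushroom images, and define the image size as 64x64 pixels.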

edible_path = 'path_to_edible_mushrooms'
poisonous_path = 'path_to_poisonous_mushrooms'
img_size = 64

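3. Read the images and add labels

Iterate over each image folder, read every image, resize it, and append it to the data list, assigning the edible or poisonous label as appropriate.

X = []
y = []

# Read the edible mushroom images and add their labels
# ...

# Read the poisonous mushroom images and add their labels
# ...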

4. Convert the data to NumPy arrays

Convert the image list and the label list to NumPy arrays for later processing.

X = np.array(X)
y = np.array(y)
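
To make the resulting shapes concrete, here is a minimal sketch that stacks a few placeholder images. The arrays in `images` are hypothetical stand-ins for what cv2.imread followed by cv2.resize would return, and the final scaling line is an optional step that the article's code does not perform.

```python
import numpy as np

img_size = 64  # same value as in the article

# Hypothetical stand-ins for resized images: each is (img_size, img_size, 3), uint8.
images = [np.full((img_size, img_size, 3), fill, dtype=np.uint8)
          for fill in (0, 64, 128, 255)]
labels = ['Edible', 'Edible', 'Poisonous', 'Poisonous']

X_demo = np.array(images)   # equally sized images stack into one 4-D array
y_demo = np.array(labels)

print(X_demo.shape, X_demo.dtype)  # (4, 64, 64, 3) uint8
print(y_demo.shape)                # (4,)

# Optional and not part of the article's code: scale pixel values to [0, 1].
X_scaled = X_demo.astype('float32') / 255.0
```

The leading axis is the sample count, which is the layout that Conv2D expects when input_shape is (img_size, img_size, 3).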

5. Label encoding and one-hot encoding

Use LabelEncoder to convert the string labels to integers, then one-hot encode them with to_categorical.

le = LabelEncoder()
y = le.fit_transform(y)
y = to_categorical(y, 2)
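
The two encoding steps can be illustrated with NumPy alone. This is a sketch of what the transformations produce, not the scikit-learn or Keras implementation. The `labels` array is made up for the demonstration.

```python
import numpy as np

labels = np.array(['Poisonous', 'Edible', 'Edible', 'Poisonous'])

# np.unique sorts the class names and returns each label's index into that
# sorted list, which mirrors the result of LabelEncoder.fit_transform.
classes, encoded = np.unique(labels, return_inverse=True)

# Picking rows of an identity matrix yields one-hot vectors,
# which mirrors the result of to_categorical(encoded, 2).
one_hot = np.eye(len(classes), dtype='float32')[encoded]

print(classes)  # ['Edible' 'Poisonous']
print(encoded)  # [1 0 0 1]
print(one_hot)
```

Because the classes are sorted alphabetically, index 0 corresponds to Edible and index 1 to Poisonous. That is the order le.classes_ reports later in the classification report.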

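6. Split the dataset

Use train_test_split to split the dataset into a training set and a test set, holding out 20% of the samples for testing.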

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
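
For intuition about what the split does, the following NumPy sketch shuffles the sample indices and holds out a fraction of them. The function `shuffle_split` is a hypothetical helper written for this illustration. It is not scikit-learn's implementation, and it will not select the same samples as train_test_split for the same seed.

```python
import numpy as np

def shuffle_split(X, y, test_size=0.2, seed=42):
    """Shuffle the samples and hold out a fraction of them as a test set."""
    n = len(X)
    n_test = int(np.ceil(n * test_size))
    order = np.random.default_rng(seed).permutation(n)
    test_idx, train_idx = order[:n_test], order[n_test:]
    # The same indices are applied to X and y so images and labels stay aligned.
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

X_demo = np.arange(10).reshape(10, 1)
y_demo = np.array([0, 1] * 5)
X_tr, X_te, y_tr, y_te = shuffle_split(X_demo, y_demo)
print(len(X_tr), len(X_te))  # 8 2
```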

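7. Data augmentation

Use ImageDataGenerator for data augmentation: random rotations, horizontal and vertical shifts, and horizontal flips. The datagen.fit call is only required when feature-wise options (featurewise_center, featurewise_std_normalization, or zca_whitening) are enabled, so it is harmless but not needed with the settings below.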

datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True
)
datagen.fit(X_train)
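
To show what one of these augmentations does to the pixel data, here is a minimal NumPy sketch of a random horizontal flip. The helper `random_horizontal_flip` is hypothetical and only illustrates the idea. ImageDataGenerator applies its own implementation internally.

```python
import numpy as np

def random_horizontal_flip(batch, rng):
    """Mirror each image in an (N, H, W, C) batch left-to-right with probability 0.5."""
    out = batch.copy()
    flip = rng.random(len(batch)) < 0.5          # one coin toss per image
    out[flip] = out[flip][:, :, ::-1, :]         # reverse the width axis
    return out

rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(8, 64, 64, 3), dtype=np.uint8)
augmented = random_horizontal_flip(batch, rng)
print(augmented.shape)  # (8, 64, 64, 3)
```

The labels are unchanged by the flip, which is why augmentation can enlarge the effective training set without any new annotation.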

8. Build the CNN model

Build the CNN from a Sequential model and a stack of layers (such as Conv2D, MaxPooling2D, Flatten, and Dense).

model = Sequential([
    # ...
    Dense(2, activation='softmax')
])
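
The layers elided above are listed in full in section 13. As a rough check of that architecture, the sketch below traces the feature-map size and parameter count by hand. It assumes the Keras defaults that the listing relies on: 'valid' padding with stride 1 for Conv2D and non-overlapping 2x2 pooling.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool

size, channels, total_params = 64, 3, 0
for filters in (32, 64, 128):
    params = 3 * 3 * channels * filters + filters  # kernel weights + biases
    size = pool_out(conv_out(size))
    total_params += params
    channels = filters
    print(f'Conv2D({filters}) + MaxPooling2D -> {size}x{size}x{channels}, {params} params')

flat = size * size * channels        # length of the Flatten output
dense_params = flat * 512 + 512      # Dense(512)
out_params = 512 * 2 + 2             # Dense(2)
total_params += dense_params + out_params
print(f'Flatten -> {flat}; total trainable params = {total_params}')
```

The spatial size goes 64, 31, 14, 6, so Flatten produces 4608 values and the Dense(512) layer holds most of the model's weights.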

9. Compile the model

Compile the model with the adam optimizer and the categorical_crossentropy loss function.

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
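
For reference, categorical cross-entropy on one-hot targets is the mean over samples of -sum(y_true * log(y_pred)). The NumPy sketch below computes it for two made-up predictions. The clipping constant `eps` is an assumption added to avoid log(0), and the function is an illustration, not Keras's implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

y_true = np.array([[1.0, 0.0], [0.0, 1.0]])   # one-hot targets
y_pred = np.array([[0.9, 0.1], [0.2, 0.8]])   # softmax-style outputs
loss = categorical_crossentropy(y_true, y_pred)
print(round(loss, 4))  # 0.1643
```

Only the probability assigned to the true class contributes to each sample's loss, so confident wrong predictions are penalized heavily.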

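10. Train the model

Train the model for 25 epochs on augmented batches of the training data. The test set doubles as the validation data here, so the validation metrics reported during training are computed on the same images used for the final evaluation.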

history = model.fit(datagen.flow(X_train, y_train, batch_size=32), epochs=25, validation_data=(X_test, y_test))
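
The object returned by model.fit exposes a history dictionary with one list per metric, one value per epoch. With metrics=['accuracy'] and validation data, the keys are expected to include 'accuracy' and 'val_accuracy'. The sketch below finds the best epoch in such a dictionary. The helper `best_epoch` and the numbers in `fake_history` are hypothetical and are not results from the article's model.

```python
def best_epoch(history_dict, metric='val_accuracy'):
    """Return (1-based epoch, value) where the given metric is highest."""
    values = history_dict[metric]
    best = max(range(len(values)), key=values.__getitem__)
    return best + 1, values[best]

# Made-up numbers standing in for history.history
fake_history = {
    'accuracy': [0.60, 0.72, 0.80],
    'val_accuracy': [0.58, 0.75, 0.73],
}
epoch, value = best_epoch(fake_history)
print(epoch, value)  # 2 0.75
```

With a real run, the call would be best_epoch(history.history).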

11. Evaluate the model

Evaluate the model's performance on the test set.

loss, accuracy = model.evaluate(X_test, y_test)
print(f'Test accuracy: {accuracy:.2f}')

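12. Print the classification report

Use classification_report to print a detailed per-class performance report.

from sklearn.metrics import classification_report

y_pred = model.predict(X_test)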
y_pred_classes = np.argmax(y_pred, axis=1)
y_true = np.argmax(y_test, axis=1)
print(classification_report(y_true, y_pred_classes, target_names=le.classes_))
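
The precision and recall columns in that report can be reproduced from a confusion matrix. The following NumPy sketch shows the arithmetic on made-up labels. The helper `per_class_metrics` is hypothetical and is not part of scikit-learn.

```python
import numpy as np

def per_class_metrics(y_true, y_pred, n_classes=2):
    """Return the confusion matrix plus per-class precision and recall."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                      # rows: true class, columns: predicted class
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)             # how often each class was predicted
    actual = cm.sum(axis=1)                # how often each class truly occurs
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    return cm, precision, recall

# Made-up labels: 0 = Edible, 1 = Poisonous
cm, precision, recall = per_class_metrics([0, 0, 0, 1, 1], [0, 0, 1, 1, 0])
print(cm)
print(precision, recall)
```

Recall for the Poisonous class is the share of poisonous mushrooms that the model actually flagged, which is worth reading separately from overall accuracy.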

13. Complete code

# Operating-system utilities such as file path handling
import os
# Python interface to OpenCV, used for image processing
import cv2
import numpy as np
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
# Converts string labels to integer codes
from sklearn.preprocessing import LabelEncoder
# Linear stack of layers for building a Keras model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
# Converts integer class labels to a binary matrix (one-hot encoding)
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Data paths (os.path.join keeps them portable across operating systems)
edible_path = os.path.join('Mushroom dataset', 'Edible')
poisonous_path = os.path.join('Mushroom dataset', 'Poisonous')
# Image size
img_size = 64
# Initialize the data and label lists
X = []
y = []
# Read the images from each folder and add the matching label
for folder, label in ((edible_path, 'Edible'), (poisonous_path, 'Poisonous')):
    for filename in os.listdir(folder):
        image = cv2.imread(os.path.join(folder, filename))
        # cv2.imread returns None for files it cannot decode; skip those
        if image is not None:
            X.append(cv2.resize(image, (img_size, img_size)))
            y.append(label)
# Convert to NumPy arrays
X = np.array(X)
y = np.array(y)
# Encode the labels as integers, then one-hot encode them
le = LabelEncoder()
y = le.fit_transform(y)
y = to_categorical(y, 2)
# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Data augmentation
datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True
)
datagen.fit(X_train)
# Build the CNN model
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(img_size, img_size, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(512, activation='relu'),
    Dropout(0.5),
    Dense(2, activation='softmax')
])
# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Train the model
history = model.fit(datagen.flow(X_train, y_train, batch_size=32), epochs=25, validation_data=(X_test, y_test))
# Evaluate the model
loss, accuracy = model.evaluate(X_test, y_test)
print(f'Test accuracy: {accuracy:.2f}')
# Print the classification report
y_pred = model.predict(X_test)
y_pred_classes = np.argmax(y_pred, axis=1)
y_true = np.argmax(y_test, axis=1)
print(classification_report(y_true, y_pred_classes, target_names=le.classes_))
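
Once the script has run, the trained model could be applied to a single new image along the following lines. This is a sketch under assumptions: `predict_label` is a hypothetical helper, and `StubModel` is a stand-in with a fixed output so the example runs without TensorFlow. With the real objects, the arguments would be the trained model, le.classes_, and an image prepared exactly as in training (cv2.imread followed by cv2.resize to 64x64).

```python
import numpy as np

def predict_label(model, class_names, image):
    """Return (class name, probability) for one already-resized image array."""
    batch = np.expand_dims(image, axis=0)        # (H, W, 3) -> (1, H, W, 3)
    probs = np.asarray(model.predict(batch))[0]  # softmax output for the single image
    idx = int(np.argmax(probs))
    return class_names[idx], float(probs[idx])

class StubModel:
    """Hypothetical stand-in for the trained Keras model, used only for this demo."""
    def predict(self, batch):
        return np.tile([0.2, 0.8], (len(batch), 1))

image = np.zeros((64, 64, 3), dtype=np.uint8)    # placeholder for a resized photo
result = predict_label(StubModel(), ['Edible', 'Poisonous'], image)
print(result)  # ('Poisonous', 0.8)
```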