Pokémon Image Classification with Deep Learning in TensorFlow

TheMatrixs

Article link: https://blog.csdn.net/jameschen9051/article/details/119515204

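This article shows how to classify Pokémon images with deep learning, covering data preprocessing, building a simple convolutional neural network, model training, prediction, and analysis and optimization. The dataset contains 1168 images, split into training, validation, and test sets. The model reaches about 98% accuracy on the training set, but only 87.98% on the validation set and 85.04% on the test set. Possible future improvements include enlarging the dataset, adjusting the model structure and hyperparameters, and trying transfer learning.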

I. Dataset Overview

The Pokémon dataset contains 1168 images in five classes: bulbasaur (234), charmander (238), mewtwo (239), pikachu (234), and squirtle (223).

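II. Data Preprocessing

The script pokemon.py collects the image paths in bulk, derives each image's label from the directory it sits in, shuffles the paths, and stores the path/label pairs in a csv file.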

import os, glob
import random, csv
import tensorflow as tf

def load_csv(root, filename, name2label):
    # root: dataset root directory
    # filename: name of the csv file
    # name2label: mapping from class name to integer label
    if not os.path.exists(os.path.join(root, filename)):
        # First run: scan every class folder and write a shuffled csv
        images = []
        for name in name2label.keys():
            images += glob.glob(os.path.join(root, name, '*.png'))
            images += glob.glob(os.path.join(root, name, '*.jpg'))
            images += glob.glob(os.path.join(root, name, '*.jpeg'))
        print(len(images), images)
        random.shuffle(images)
        with open(os.path.join(root, filename), mode='w', newline='') as f:
            writer = csv.writer(f)
            for img in images:
                # The parent folder name is the class name
                name = img.split(os.sep)[-2]
                label = name2label[name]
                writer.writerow([img, label])
            print('written into csv file:', filename)

    # Read the csv back, so later runs reuse the same shuffled order
    images, labels = [], []
    with open(os.path.join(root, filename)) as f:
        reader = csv.reader(f)
        for row in reader:
            img, label = row
            label = int(label)
            images.append(img)
            labels.append(label)

    assert len(images) == len(labels)
    return images, labels



def load_pokemon(root, mode='train'):
    # Build the class-name-to-integer table
    name2label = {}  # "sq...":0
    for name in sorted(os.listdir(os.path.join(root))):
        if not os.path.isdir(os.path.join(root, name)):
            continue
        # Assign the next free integer to each class
        name2label[name] = len(name2label.keys())

    # Load the paths and labels
    # [file1,file2,], [3,1]
    images, labels = load_csv(root, 'images.csv', name2label)

    if mode == 'train':  # 60%
        images = images[:int(0.6 * len(images))]
        labels = labels[:int(0.6 * len(labels))]
    elif mode == 'val':  # 20% = 60%->80%
        images = images[int(0.6 * len(images)):int(0.8 * len(images))]
        labels = labels[int(0.6 * len(labels)):int(0.8 * len(labels))]
    else:  # 20% = 80%->100%
        images = images[int(0.8 * len(images)):]
        labels = labels[int(0.8 * len(labels)):]

    return images, labels, name2label


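# Per-channel RGB mean and std (the commonly used ImageNet statistics)
img_mean = tf.constant([0.485, 0.456, 0.406])
img_std = tf.constant([0.229, 0.224, 0.225])


def normalize(x, mean=img_mean, std=img_std):
    # x: image tensor scaled to [0, 1], channels last
    x = (x - mean)/std
    return x


def denormalize(x, mean=img_mean, std=img_std):
    # Inverse of normalize, useful for displaying images
    x = x * std + mean
    return x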


def main():
    # Quick check: load the training split and print what was found
    images, labels, table = load_pokemon('pokemon', 'train')
    print('images', len(images), images)
    print('labels', len(labels), labels)
    print(table)


if __name__ == '__main__':
    main()
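The label table and the 60/20/20 split can be checked without touching the image files. The sketch below assumes the five class folders are named as in the dataset overview and that exactly 1168 images are found; `split_bounds` is a helper introduced here for illustration and is not part of the original script.

```python
# Class folder names as listed in the dataset overview (order does not matter).
class_names = ["pikachu", "mewtwo", "squirtle", "bulbasaur", "charmander"]

# Same rule as load_pokemon: sort the names, then number them in order.
name2label = {}
for name in sorted(class_names):
    name2label[name] = len(name2label)


def split_bounds(n):
    """Return the slice boundaries used for the train/val/test split."""
    return int(0.6 * n), int(0.8 * n)


n_images = 1168  # total image count stated in the dataset overview
train_end, val_end = split_bounds(n_images)
sizes = {
    "train": train_end,
    "val": val_end - train_end,
    "test": n_images - val_end,
}

print(name2label)
print(sizes)
```

Because the csv is written once and then reused, the three splits stay disjoint across runs as long as images.csv is not deleted.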

III. Building the Convolutional Neural Network

A simple convolutional neural network is built with keras.Sequential.

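from tensorflow import keras
from tensorflow.keras import layers

network = keras.Sequential([
    layers.Conv2D(16,5,3),   # 16 filters, 5x5 kernel, stride 3
    layers.MaxPool2D(3,3),   # 3x3 pooling window, stride 3
    layers.ReLU(),
    layers.Conv2D(64,5,3),   # 64 filters, 5x5 kernel, stride 3
    layers.MaxPool2D(2,2),   # 2x2 pooling window, stride 2
    layers.ReLU(),
    layers.Flatten(),
    layers.Dense(64),
    layers.ReLU(),
    layers.Dense(5)          # one logit per class, no softmax here
])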
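To see how quickly this network shrinks the image, the spatial size after each layer can be worked out by hand. The sketch below assumes a 224x224 input (the crop size used in preprocessing) and the Keras default of 'valid' padding for both Conv2D and MaxPool2D; `out_size` is a helper written for this illustration.

```python
def out_size(size, kernel, stride):
    """Spatial output size of a conv or pooling layer with 'valid' padding."""
    return (size - kernel) // stride + 1


size = 224                   # input height and width
size = out_size(size, 5, 3)  # Conv2D(16, 5, 3)
after_conv1 = size
size = out_size(size, 3, 3)  # MaxPool2D(3, 3)
after_pool1 = size
size = out_size(size, 5, 3)  # Conv2D(64, 5, 3)
after_conv2 = size
size = out_size(size, 2, 2)  # MaxPool2D(2, 2)
after_pool2 = size

# Flatten sees a (size x size x 64) feature map
flat_features = after_pool2 * after_pool2 * 64

print(after_conv1, after_pool1, after_conv2, after_pool2, flat_features)
```

Under these assumptions the feature map goes 224 → 74 → 24 → 7 → 3, so the first Dense layer receives 576 features.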

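IV. Model Training

1. Load the training data. Choose the batch size according to the available memory or GPU memory. The preprocess function passed to map is defined in step 4.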

batchsz = 256
images, labels, table = load_pokemon('pokemon',mode='train')
db_train = tf.data.Dataset.from_tensor_slices((images, labels))
db_train = db_train.shuffle(1000).map(preprocess).batch(batchsz)

2. Load the validation data

images2, labels2, table = load_pokemon('pokemon',mode='val')
db_val = tf.data.Dataset.from_tensor_slices((images2, labels2))
db_val = db_val.map(preprocess).batch(batchsz)

3. Load the test data

images3, labels3, table = load_pokemon('pokemon',mode='test')
db_test = tf.data.Dataset.from_tensor_slices((images3, labels3))
db_test = db_test.map(preprocess).batch(100)
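The batch sizes above determine how many steps each pass over a split takes. As a rough check, assuming all 1168 images are found and split 60/20/20 as in load_pokemon, the step counts can be computed like this (`num_batches` is a helper introduced for the example):

```python
import math


def num_batches(n_samples, batch_size):
    """Number of batches when the last, smaller batch is kept."""
    return math.ceil(n_samples / batch_size)


n_images = 1168
n_train = int(0.6 * n_images)
n_val = int(0.8 * n_images) - n_train
n_test = n_images - int(0.8 * n_images)

steps = {
    "train": num_batches(n_train, 256),
    "val": num_batches(n_val, 256),
    "test": num_batches(n_test, 100),
}
print(steps)
```

Three steps for training and three for testing agree with the `3/3` counters in the training log.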

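4. Data preprocessing

def preprocess(x,y):
    # x: path to the image file, y: integer label of the image
    x = tf.io.read_file(x)
    x = tf.image.decode_jpeg(x, channels=3)
    x = tf.image.resize(x, [244, 244])

    # Data augmentation: random horizontal flip and random 224x224 crop.
    # Note that the validation and test pipelines reuse this function,
    # so they are augmented as well.
    x = tf.image.random_flip_left_right(x)
    x = tf.image.random_crop(x, [224,224,3])

    # Scale to [0, 1], then standardize each channel
    x = tf.cast(x, dtype=tf.float32) / 255.
    x = normalize(x)
    # One-hot encode the label for the 5 classes
    y = tf.convert_to_tensor(y)
    y = tf.one_hot(y, depth=5)

    return x, y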
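The scaling and normalization steps in preprocess can be followed on a single pixel. The sketch below is plain Python rather than TensorFlow and uses a made-up RGB value; it only illustrates that dividing by 255 and then standardizing each channel is undone by the inverse formula in denormalize.

```python
IMG_MEAN = (0.485, 0.456, 0.406)
IMG_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """(x - mean) / std, applied per channel."""
    return tuple((c - m) / s for c, m, s in zip(rgb, IMG_MEAN, IMG_STD))


def denormalize_pixel(rgb):
    """x * std + mean, the inverse of normalize_pixel."""
    return tuple(c * s + m for c, m, s in zip(rgb, IMG_MEAN, IMG_STD))


raw = (255, 128, 0)                     # hypothetical 8-bit RGB pixel
scaled = tuple(v / 255.0 for v in raw)  # same as the "/ 255." step
normed = normalize_pixel(scaled)
restored = denormalize_pixel(normed)

print(normed)
print(restored)
```

After normalization the values are no longer confined to [0, 1]: bright channels become positive and dark channels negative, which is what the network sees during training.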

5. Train the model. The loss is cross-entropy, and early stopping is used to guard against overfitting.

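from tensorflow.keras import optimizers, losses
from tensorflow.keras.callbacks import EarlyStopping

network.build(input_shape=(4, 224, 224, 3))
network.summary()

# Stop when val_accuracy has not improved by more than 0.001 for 5 epochs
early_stopping = EarlyStopping(
    monitor='val_accuracy',
    min_delta=0.001,
    patience=5
)

# from_logits=True because the last Dense layer has no softmax
network.compile(optimizer=optimizers.Adam(learning_rate=1e-3),
               loss=losses.CategoricalCrossentropy(from_logits=True),
               metrics=['accuracy'])
network.fit(db_train, validation_data=db_val, validation_freq=1, epochs=100,
           callbacks=[early_stopping])
network.evaluate(db_test)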
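The effect of min_delta and patience is easier to see on a toy history. The following is a simplified model of the stopping rule, not the Keras implementation, and the validation accuracies are invented for the example rather than taken from this run.

```python
def early_stop_epoch(val_accuracies, min_delta=0.001, patience=5):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("-inf")
    wait = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc - min_delta > best:
            # Improvement larger than min_delta: remember it, reset the counter
            best = acc
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical validation accuracies, one value per epoch
history = [0.50, 0.70, 0.80, 0.85, 0.86, 0.86, 0.855, 0.86, 0.8605, 0.859]
print(early_stop_epoch(history))
```

In this toy history the best value is reached at epoch 5, the next five epochs fail to beat it by more than 0.001, and training stops at epoch 10. The same mechanism explains why a run configured for 100 epochs can end much earlier.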

Model structure:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              multiple                  1216
_________________________________________________________________
max_pooling2d (MaxPooling2D) multiple                  0
_________________________________________________________________
re_lu (ReLU)                 multiple                  0
_________________________________________________________________
conv2d_1 (Conv2D)            multiple                  25664
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 multiple                  0
_________________________________________________________________
re_lu_1 (ReLU)               multiple                  0
_________________________________________________________________
flatten (Flatten)            multiple                  0
_________________________________________________________________
dense (Dense)                multiple                  36928
_________________________________________________________________
re_lu_2 (ReLU)               multiple                  0
_________________________________________________________________
dense_1 (Dense)              multiple                  325
=================================================================
Total params: 64,133
Trainable params: 64,133
Non-trainable params: 0
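The parameter counts in the summary can be reproduced by hand. The sketch below assumes RGB input (3 channels) and a flattened feature map of 3x3x64 = 576 values, which is what a 224x224 input gives with the default 'valid' padding; the two helper functions are written for this illustration.

```python
def conv2d_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_features, out_features):
    """Weights plus one bias per output unit."""
    return in_features * out_features + out_features


params = {
    "conv2d": conv2d_params(5, 3, 16),
    "conv2d_1": conv2d_params(5, 16, 64),
    "dense": dense_params(3 * 3 * 64, 64),
    "dense_1": dense_params(64, 5),
}
total = sum(params.values())

print(params)
print(total)
```

Pooling, ReLU, and Flatten layers have no trainable weights, so these four layers account for the whole total.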

Training results (the last training epoch, followed by the output of network.evaluate on the test set):

Epoch 16/100
1/3 [=========>....................] - ETA: 6s - loss: 0.1232 - accuracy: 0.9805
2/3 [===================>..........] - ETA: 3s - loss: 0.1455 - accuracy: 0.9785
3/3 [==============================] - 11s 4s/step - loss: 0.1241 - accuracy: 0.9793 - val_loss: 0.3912 - val_accuracy: 0.8798

1/3 [=========>....................] - ETA: 2s - loss: 0.4005 - accuracy: 0.8700
2/3 [===================>..........] - ETA: 1s - loss: 0.4779 - accuracy: 0.8450
3/3 [==============================] - 3s 899ms/step - loss: 0.4673 - accuracy: 0.8504

6. Save the model

network.save('model.h5')

V. Prediction

1. Read and preprocess the image

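# Same per-channel statistics that normalize() uses during training
img_mean = tf.constant([0.485, 0.456, 0.406])
img_std = tf.constant([0.229, 0.224, 0.225])

def preprocess(img):
    # img: path to the image file
    img = tf.io.read_file(img)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [244, 244])

    # Kept from the training pipeline to get a 224x224 input;
    # both steps are random, so repeated runs can differ slightly
    img = tf.image.random_flip_left_right(img)
    img = tf.image.random_crop(img, [224,224,3])

    img = tf.cast(img, dtype=tf.float32) / 255.
    # Normalize as in training, otherwise the model sees a different input range
    img = (img - img_mean) / img_std

    return img


img = '3.jpg'
x = preprocess(img)
# Add the batch dimension expected by the model
x = tf.reshape(x, [1, 224, 224, 3])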

2. Load the trained model

network = tf.keras.models.load_model('model.h5')

3. Predict the class and its probability. softmax converts the output logits into a probability for each class.

import numpy as np

logits = network.predict(x)
prob = tf.nn.softmax(logits, axis=1)
print(prob)
prob = prob.numpy()
# Index of the most probable class and its probability
max_index = np.argmax(prob, axis=-1)[0]
max_prob = prob[0][max_index]
# Order matches the sorted folder names used to build name2label
name = ['bulbasaur', 'charmander', 'mewtwo', 'pikachu', 'squirtle']
print(name[max_index] + ':' + str(max_prob))
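For readers who want to see what the softmax step does to the raw outputs, here is a small plain-Python version. The logits are invented for the example and do not come from the trained model; the class order is the sorted folder order assumed throughout.

```python
import math


def softmax(logits):
    """Convert logits to probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


names = ['bulbasaur', 'charmander', 'mewtwo', 'pikachu', 'squirtle']
logits = [1.2, -0.3, 0.5, 4.1, 0.0]  # hypothetical model output for one image

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)
print(f"{names[best]}: {probs[best]:.4f}")
```

Softmax preserves the ordering of the logits, so the predicted class is the same whether argmax is taken over the logits or over the probabilities; the conversion only matters when a probability needs to be reported.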

Test image: