(4-3) TensorFlow Convolutional Neural Networks in Practice: Building CNN Recognizers

码农三叔

Published 2024-06-12 14:21:14


Article link: https://blog.csdn.net/asd343442/article/details/139625516

4.3 CNN Recognizers in Practice

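In practice, convolutional neural networks (CNNs) are commonly used for object recognition. This section walks through two concrete examples of building recognizers with TensorFlow.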

4.3.1 Building a CNN Object Recognition Model

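The example file main.py below builds a CNN object recognition model on the CIFAR-10 dataset that classifies images as airplane, automobile, bird, cat, deer, dog, frog, horse, ship, or truck. CIFAR-10 contains 60,000 color images of 32×32 pixels in 10 classes, with 6,000 images per class. Of these, 50,000 are used for training, split into five training batches of 10,000 images each; the remaining 10,000 form a single test batch. The test batch contains exactly 1,000 randomly selected images from each class. The training batches hold the remaining images in random order, so an individual training batch may contain more images of one class than another, but taken together the training batches contain exactly 5,000 images per class. Figure 4-9 shows the 10 classes with 10 random images from each.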
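The counts quoted above can be cross-checked with a few lines of arithmetic. This is only a sanity check of the numbers in the text, not part of main.py, and all variable names are illustrative.

```python
# Cross-check the CIFAR-10 counts quoted in the text (names are illustrative).
num_classes = 10
images_per_class = 6000
total_images = num_classes * images_per_class

train_batches = 5
images_per_batch = 10000
train_total = train_batches * images_per_batch   # five training batches
test_total = images_per_batch                    # one test batch

test_per_class = 1000
train_per_class = images_per_class - test_per_class

# Each image is 32x32 pixels with 3 color channels.
values_per_image = 32 * 32 * 3

print("total images:", total_images)
print("train + test:", train_total + test_total)
print("training images per class:", train_per_class)
print("values per image:", values_per_image)
```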

Figure 4-9 Images in the CIFAR-10 dataset

The complete code of main.py is as follows.

import tensorflow as tf

from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Normalize pixel values to the range 0 to 1
train_images, test_images = train_images / 255.0, test_images / 255.0

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    # CIFAR labels are stored as one-element arrays, hence the extra index
    plt.xlabel(class_names[train_labels[i][0]])
plt.show()

# Build the convolutional base
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
# Add Dense layers on top
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))

model.summary()

# Compile and train the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))

# Plot training and validation accuracy, then evaluate the model
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')
plt.show()

test_loss, test_acc = model.evaluate(test_images,  test_labels, verbose=2)
print(test_acc)

Running the script prints the model summary followed by the training progress:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 30, 30, 32)        896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 15, 15, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 13, 13, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 64)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4, 4, 64)          36928     
_________________________________________________________________
flatten (Flatten)            (None, 1024)              0         
_________________________________________________________________
dense (Dense)                (None, 64)                65600     
_________________________________________________________________
dense_1 (Dense)              (None, 10)                650       
=================================================================
Total params: 122,570
Trainable params: 122,570
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
1563/1563 [==============================] - 249s 160ms/step - loss: 1.5445 - accuracy: 0.4354 - val_loss: 1.2372 - val_accuracy: 0.5517
Epoch 2/10
1563/1563 [==============================] - 257s 165ms/step - loss: 1.1574 - accuracy: 0.5893 - val_loss: 1.1760 - val_accuracy: 0.5857
Epoch 3/10
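The output shapes and parameter counts in the summary above can be reproduced by hand. The sketch below assumes the Keras defaults used by main.py (3x3 kernels, stride 1, no padding for Conv2D; 2x2 windows for MaxPooling2D); the helper names are made up for illustration and are not part of any library.

```python
def conv2d_valid(size, channels_in, filters, kernel=3):
    """Stride-1 convolution without padding: returns (output size, parameter count)."""
    out_size = size - kernel + 1
    # Each filter has kernel*kernel*channels_in weights plus one bias.
    params = (kernel * kernel * channels_in + 1) * filters
    return out_size, params

def max_pool(size, pool=2):
    """Non-overlapping pooling rounds the output size down."""
    return size // pool

def dense_params(units_in, units_out):
    """Fully connected layer: one weight per input per unit, plus one bias per unit."""
    return (units_in + 1) * units_out

size, channels = 32, 3          # CIFAR-10 input: 32x32 RGB
layer_params = []

# (filters, whether a pooling layer follows), matching the model in main.py
for filters, pool_after in [(32, True), (64, True), (64, False)]:
    size, params = conv2d_valid(size, channels, filters)
    channels = filters
    layer_params.append(params)
    print("conv2d ->", (size, size, channels), params)
    if pool_after:
        size = max_pool(size)
        print("max_pooling2d ->", (size, size, channels), 0)

flat = size * size * channels
print("flatten ->", flat)

layer_params.append(dense_params(flat, 64))
layer_params.append(dense_params(64, 10))
total_params = sum(layer_params)
print("total params:", total_params)
```

Most of the parameters sit in the first Dense layer rather than in the convolutional layers, which is typical when a Flatten layer feeds directly into a fully connected layer.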

4.3.2 CNN Clothing Recognizer

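The example file clothing.py below builds a clothing recognizer on the Fashion-MNIST dataset that classifies images as T-shirt/top, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag, or ankle boot. Note that the model defined in this file is a small fully connected network (a Flatten layer followed by two Dense layers) and contains no convolutional layers. The implementation proceeds as follows.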

(1) Load the dataset and train the model. The code is as follows:

import tensorflow as tf

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt

print(tf.__version__)

# Load the Fashion-MNIST dataset
fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# Class names, indexed by label
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

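# Print the shape of the training set (60,000 images of 28x28 pixels each)
print(train_images.shape)
print(len(train_labels))
print(train_labels)  # each label is an integer from 0 to 9 that indexes into class_names

# Print the shape of the test set (10,000 images of 28x28 pixels each)
print(test_images.shape)
print(len(test_labels))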

# Inspect the first training image before preprocessing (pixel values range from 0 to 255)
plt.figure()
plt.imshow(train_images[0])
plt.colorbar()
plt.grid(False)
plt.show()

# Scale pixel values to the range 0 to 1
train_images = train_images / 255.0
test_images = test_images / 255.0

# Display the first 25 training images with their labels
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[train_labels[i]])
plt.show()

# Define the network layers
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),  # flattens each 2D 28x28 image into a 1D array of 784 values
    tf.keras.layers.Dense(128, activation='relu'),  # fully connected hidden layer with 128 units
    tf.keras.layers.Dense(10)                       # output layer: returns 10 logits, one score per class
])

# Compile the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# Train the model by fitting it to the training images and labels
model.fit(train_images, train_labels, epochs=10)

# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(test_images,  test_labels, verbose=2)
print('\nTest accuracy:', test_acc)

# Attach a softmax layer to convert logits to probabilities, then predict
probability_model = tf.keras.Sequential([model,
                                         tf.keras.layers.Softmax()])
predictions = probability_model.predict(test_images)

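# Print the first prediction (each value is the model's confidence that the image belongs to that class)
print(predictions[0])
print(np.argmax(predictions[0]))  # most likely class
print(test_labels[0])             # true class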

Running the script prints the training progress below and displays the first 25 training images with their labels, as shown in Figure 4-10.

Epoch 1/10
1875/1875 [==============================] - 15s 8ms/step - loss: 0.4982 - accuracy: 0.8244
Epoch 2/10
1875/1875 [==============================] - 16s 9ms/step - loss: 0.3750 - accuracy: 0.8661
Epoch 3/10
1875/1875 [==============================] - 16s 8ms/step - loss: 0.3355 - accuracy: 0.8779
Epoch 4/10
1875/1875 [==============================] - 16s 8ms/step - loss: 0.3142 - accuracy: 0.8844
Epoch 5/10
1875/1875 [==============================] - 17s 9ms/step - loss: 0.2955 - accuracy: 0.8908
Epoch 6/10
1875/1875 [==============================] - 16s 9ms/step - loss: 0.2799 - accuracy: 0.8968
Epoch 7/10
1875/1875 [==============================] - 13s 7ms/step - loss: 0.2681 - accuracy: 0.9010
Epoch 8/10
1875/1875 [==============================] - 10s 5ms/step - loss: 0.2551 - accuracy: 0.9045
Epoch 9/10
1875/1875 [==============================] - 14s 7ms/step - loss: 0.2468 - accuracy: 0.9080
Epoch 10/10
1875/1875 [==============================] - 13s 7ms/step - loss: 0.2379 - accuracy: 0.9119
313/313 - 2s - loss: 0.3420 - accuracy: 0.8783

Test accuracy: 0.8783000111579895
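The step counts in the log above (1875 per training epoch, 313 for evaluation) follow from the batch size. Neither script passes batch_size, so the sketch below assumes the Keras default of 32; the function name is illustrative. The same calculation explains the 1563 steps per epoch in the CIFAR-10 log of section 4.3.1.

```python
import math

def steps_per_epoch(num_samples, batch_size=32):
    """Number of batches needed to see every sample once; the last batch may be partial."""
    return math.ceil(num_samples / batch_size)

fashion_train_steps = steps_per_epoch(60000)  # Fashion-MNIST training set
fashion_test_steps = steps_per_epoch(10000)   # Fashion-MNIST test set
cifar_train_steps = steps_per_epoch(50000)    # CIFAR-10 training set

print(fashion_train_steps, fashion_test_steps, cifar_train_steps)
```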

Figure 4-10 The first 25 images
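Because the last Dense layer returns raw logits, the loss is created with from_logits=True and a Softmax layer is attached only for prediction. The following pure-Python sketch shows the arithmetic behind both steps for a single example. It illustrates the formulas only, not how TensorFlow implements them, and the logit values are made up.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(logits, label):
    """Cross-entropy for one example whose label is an integer class index."""
    return -math.log(softmax(logits)[label])

# Hypothetical logits for one image, one score per Fashion-MNIST class.
logits = [-1.2, 0.3, -0.5, 0.1, -2.0, 1.5, -0.7, 2.4, 0.0, 4.1]

probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)  # same role as np.argmax

loss_if_correct = sparse_categorical_crossentropy(logits, predicted)
loss_if_wrong = sparse_categorical_crossentropy(logits, 0)

print("predicted class index:", predicted)
print("loss when the label matches the prediction:", loss_if_correct)
print("loss when the label is class 0:", loss_if_wrong)
```

The loss is small when the true class receives most of the probability and grows as that probability shrinks, which is why the logged loss falls as accuracy rises.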

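(2) Write the helper functions plot_image() and plot_value_array(), then plot the predictions. Correct predictions are shown in blue and incorrect ones in red. The code is as follows: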

# Plot an image with its predicted label, confidence, and true label
def plot_image(i, predictions_array, true_label, img):
  true_label, img = true_label[i], img[i]
  plt.grid(False)
  plt.xticks([])
  plt.yticks([])

  plt.imshow(img, cmap=plt.cm.binary)

  predicted_label = np.argmax(predictions_array)
  if predicted_label == true_label:
    color = 'blue'
  else:
    color = 'red'

  plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                100*np.max(predictions_array),
                                class_names[true_label]),
                                color=color)

def plot_value_array(i, predictions_array, true_label):
  true_label = true_label[i]
  plt.grid(False)
  plt.xticks(range(10))
  plt.yticks([])
  thisplot = plt.bar(range(10), predictions_array, color="#777777")
  plt.ylim([0, 1])
  predicted_label = np.argmax(predictions_array)

  thisplot[predicted_label].set_color('red')
  thisplot[true_label].set_color('blue')

# Check individual predictions (blue = correct, red = incorrect)
i = 0
plt.figure(figsize=(6,3))
plt.subplot(1,2,1)
plot_image(i, predictions[i], test_labels, test_images)
plt.subplot(1,2,2)
plot_value_array(i, predictions[i],  test_labels)
plt.show()

i = 12
plt.figure(figsize=(6,3))
plt.subplot(1,2,1)
plot_image(i, predictions[i], test_labels, test_images)
plt.subplot(1,2,2)
plot_value_array(i, predictions[i],  test_labels)
plt.show()

# Plot the first num_images test images with their predicted and true labels.
# Correct predictions are shown in blue, incorrect ones in red.
num_rows = 5
num_cols = 3
num_images = num_rows*num_cols
plt.figure(figsize=(2*2*num_cols, 2*num_rows))
for i in range(num_images):
  plt.subplot(num_rows, 2*num_cols, 2*i+1)
  plot_image(i, predictions[i], test_labels, test_images)
  plt.subplot(num_rows, 2*num_cols, 2*i+2)
  plot_value_array(i, predictions[i], test_labels)
plt.tight_layout()
plt.show()

# Take one image from the test set
img = test_images[1]
print(img.shape)

# Add the image to a batch in which it is the only member
img = (np.expand_dims(img,0))
print(img.shape)

# Predict the class of the single image
predictions_single = probability_model.predict(img)
print(predictions_single)
print(np.argmax(predictions_single[0]))
print(test_labels[1])

plot_value_array(1, predictions_single[0], test_labels)
_ = plt.xticks(range(10), class_names, rotation=45)
plt.show()

The prediction results are shown in Figure 4-11.
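The model in clothing.py uses only Flatten and Dense layers. For readers who want a convolutional version, the sketch below adapts the layer stack from section 4.3.1 to 28x28 grayscale input. It is a hypothetical variant, not the author's code: the function name build_fashion_cnn and the layer sizes are assumptions, and no accuracy is claimed for it.

```python
import tensorflow as tf

def build_fashion_cnn():
    """Hypothetical convolutional variant for 28x28 single-channel images."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),            # explicit channel axis for Conv2D
        tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(10),                    # logits, one per class
    ])

cnn_model = build_fashion_cnn()
cnn_model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
cnn_model.summary()
```

Conv2D expects a channel axis, so the Fashion-MNIST arrays would need to be reshaped from (N, 28, 28) to (N, 28, 28, 1) before training, for example with train_images[..., np.newaxis].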