MNIST handwritten digit recognition: the focus here is on understanding image preprocessing

一只菜鸡...):

Last modified: 2023-10-15 00:49:43

Tags: python numpy deep learning tensorflow

First published: 2023-10-12 01:07:08

Article link: https://blog.csdn.net/weixin_52658812/article/details/133781825

import tensorflow as tf

from tensorflow.keras import datasets,models,layers

# Load the data
(train_images,train_labels),(test_images,test_labels) = datasets.mnist.load_data()

type(train_images)

numpy.ndarray

train_images.shape

(60000, 28, 28)

# Normalize the data: scale the pixel values to the range 0 to 1.
train_images, test_images = train_images /255, test_images/255

train_images.shape,test_images.shape,train_labels.shape,test_labels.shape

((60000, 28, 28), (10000, 28, 28), (60000,), (10000,))
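
As a quick sanity check on the scaling step, here is a minimal sketch that uses a small synthetic uint8 array (a stand-in for the MNIST pixels, not the real data; the name fake_images is made up for illustration). It shows what dividing by 255 does to the dtype and value range, and what happens if the same division is applied a second time.

```python
import numpy as np

# Synthetic stand-in for a few MNIST images: uint8 values in [0, 255].
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
fake_images[0, 0, 0] = 255  # make sure both extremes are present
fake_images[0, 0, 1] = 0

scaled = fake_images / 255      # true division promotes uint8 to float64
print(scaled.dtype, scaled.min(), scaled.max())

scaled_twice = scaled / 255     # a second division shrinks the range to [0, 1/255]
print(scaled_twice.max())
```

Printing min() and max() right after each preprocessing cell is a cheap way to confirm that training data and later test inputs end up in the same range.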

# Visualize an image
import matplotlib.pyplot as plt

plt.imshow(train_images[1], cmap = plt.cm.binary)

<matplotlib.image.AxesImage at 0x22e2688f908>

plt.figure(figsize=(20,10))
for i in range(20):
    plt.subplot(5,10,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(train_labels[i])
plt.show()

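1. plt.figure(figsize=(20,10))

This code creates a figure of size (20, 10) in which the images are drawn.

Specifically, plt.figure() is a Matplotlib function that creates a new figure and returns a Figure instance. It accepts several parameters, including:

figsize: the size of the canvas, given as a pair (width, height) in inches. Here (20, 10) means 20 inches wide and 10 inches high.
dpi: the resolution in dots per inch; the default is 100.
facecolor: the background color of the canvas; the default is white ('w').

Calling plt.figure() with the desired parameters creates a new figure of the specified size for data visualization.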

2. plt.subplot(5,10,i+1)

This code lays out subplots on a grid of 5 rows by 10 columns and selects subplot number i+1 to draw on.

3. plt.grid(False)

This code turns off (hides) the grid lines in the current subplot.
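
To make the plt.subplot(5,10,i+1) indexing from item 2 concrete: subplot numbers start at 1 and run row by row. The sketch below (plain Python, with a helper name grid_position made up for illustration) maps each loop index to its row and column on the 5x10 grid.

```python
n_rows, n_cols = 5, 10

def grid_position(i):
    """Return (row, col), both 0-based, of the image drawn at subplot index i + 1."""
    return divmod(i, n_cols)

positions = [grid_position(i) for i in range(20)]
print(positions[0], positions[9], positions[10], positions[19])

# Which rows of the grid actually receive an image?
rows_used = sorted({row for row, _ in positions})
print(rows_used)
```

With 20 images only the first two rows are filled, so the remaining three rows of the 5x10 grid stay empty.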

type(train_images)

numpy.ndarray

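# Reshape the images: add a channel dimension
train_images = train_images.reshape((60000, 28, 28, 1))
test_images = test_images.reshape((10000, 28, 28, 1))
# Note: the images were already divided by 255 above, so if both cells run in order,
# this second division leaves the pixel values in [0, 1/255] rather than [0, 1].
# Any image passed to model.predict() later has to be scaled the same way.
train_images = train_images / 255.0
test_images = test_images / 255.0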
train_images.shape, test_images.shape,train_labels.shape,test_labels.shape

((60000, 28, 28, 1), (10000, 28, 28, 1), (60000,), (10000,))
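
The reshape above only adds a trailing channel axis of length 1. As a minimal sketch on a placeholder array (zeros standing in for real images), here are a few equivalent ways to do it that avoid hard-coding the number of samples:

```python
import numpy as np

batch = np.zeros((5, 28, 28), dtype=np.float32)  # placeholder for 5 grayscale images

a = batch.reshape((5, 28, 28, 1))     # explicit sizes, as in the post
b = batch.reshape((-1, 28, 28, 1))    # -1 lets NumPy infer the batch size
c = np.expand_dims(batch, axis=-1)    # append a length-1 axis
d = batch[..., np.newaxis]            # the same thing written as indexing

print(a.shape, b.shape, c.shape, d.shape)
```

None of these change the pixel values; they only change how the same data is indexed.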

plt.imshow(train_images[1])

<matplotlib.image.AxesImage at 0x22e0b8f7188>

# Build the CNN
model = models.Sequential([
    layers.Conv2D(32, (3,3),activation = 'relu',input_shape=(28,28,1)),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64,(3,3),activation = 'relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64,(3,3),activation = 'relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10)
])

model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_5 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 3, 3, 64)          36928     
_________________________________________________________________
flatten_2 (Flatten)          (None, 576)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 64)                36928     
_________________________________________________________________
dense_5 (Dense)              (None, 10)                650       
=================================================================
Total params: 93,322
Trainable params: 93,322
Non-trainable params: 0
_________________________________________________________________
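
The shapes and parameter counts in the summary can be reproduced by hand. The sketch below is plain arithmetic with helper names made up for illustration; it assumes the Keras defaults used in the model above ('valid' padding and stride 1 for Conv2D, non-overlapping windows for MaxPooling2D).

```python
def conv2d_params(kernel, in_channels, filters):
    # each filter has kernel*kernel*in_channels weights plus one bias
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(in_features, units):
    # one weight per input feature plus one bias, per unit
    return (in_features + 1) * units

def conv_out(size, kernel):
    # 'valid' padding with stride 1
    return size - kernel + 1

def pool_out(size, pool):
    # non-overlapping pooling window; incomplete windows are dropped
    return size // pool

s = 28
s = conv_out(s, 3)   # 26
s = pool_out(s, 2)   # 13
s = conv_out(s, 3)   # 11
s = pool_out(s, 2)   # 5
s = conv_out(s, 3)   # 3
flat = s * s * 64    # 576 features after Flatten

params = [
    conv2d_params(3, 1, 32),    # first Conv2D
    conv2d_params(3, 32, 64),   # second Conv2D
    conv2d_params(3, 64, 64),   # third Conv2D
    dense_params(flat, 64),     # Dense(64)
    dense_params(64, 10),       # Dense(10)
]
print(s, flat, params, sum(params))
```

The per-layer values match the Param # column, and their sum matches the reported total of 93,322.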

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=10, 
                    validation_data=(test_images, test_labels))

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
60000/60000 [==============================] - 37s 617us/sample - loss: 0.9271 - acc: 0.6747 - val_loss: 0.3575 - val_acc: 0.8848
Epoch 2/10
60000/60000 [==============================] - 37s 618us/sample - loss: 0.2710 - acc: 0.9166 - val_loss: 0.1777 - val_acc: 0.9456
Epoch 3/10
60000/60000 [==============================] - 37s 616us/sample - loss: 0.1718 - acc: 0.9478 - val_loss: 0.1414 - val_acc: 0.9559
Epoch 4/10
60000/60000 [==============================] - 38s 627us/sample - loss: 0.1298 - acc: 0.9600 - val_loss: 0.1052 - val_acc: 0.9656
Epoch 5/10
60000/60000 [==============================] - 38s 631us/sample - loss: 0.1074 - acc: 0.9678 - val_loss: 0.0949 - val_acc: 0.9715
Epoch 6/10
60000/60000 [==============================] - 39s 645us/sample - loss: 0.0922 - acc: 0.9709 - val_loss: 0.0787 - val_acc: 0.9751
Epoch 7/10
60000/60000 [==============================] - 39s 658us/sample - loss: 0.0805 - acc: 0.9750 - val_loss: 0.0650 - val_acc: 0.9797
Epoch 8/10
60000/60000 [==============================] - 41s 680us/sample - loss: 0.0716 - acc: 0.9778 - val_loss: 0.0661 - val_acc: 0.9788
Epoch 9/10
60000/60000 [==============================] - 39s 658us/sample - loss: 0.0643 - acc: 0.9804 - val_loss: 0.0650 - val_acc: 0.9783
Epoch 10/10
60000/60000 [==============================] - 37s 620us/sample - loss: 0.0593 - acc: 0.9818 - val_loss: 0.0576 - val_acc: 0.9822

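This code trains the model on the training set train_images with the labels train_labels, and evaluates it on the test set test_images with the labels test_labels.

model.fit() takes the training data and the corresponding target values as input. It trains the model with the optimizer and settings given to model.compile(), iteratively adjusting the model's parameters so that it fits the training data better.

Specifically, model.fit() runs the training process and returns a history object that records the training and validation metrics for each epoch.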

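Parameters:

train_images: the input features of the training set. A tensor (or NumPy array) holding many image samples.
train_labels: the labels of the training set. A tensor (or NumPy array) holding the target label of each training sample.
epochs: the number of training epochs. One epoch is one full pass over all samples in the training set.
validation_data: a tuple with the input features and labels used for validation. At the end of each epoch the model computes the validation metrics on this data.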

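When model.fit() runs, the model iterates over the training set for the given number of epochs, adjusting its weights to improve performance. At the end of each epoch it is also evaluated on the validation data, which makes it possible to monitor how it performs on data it was not trained on.

The metrics and loss values from training are stored in the history object and can be used to visualize and analyze the training process. For example, history.history gives access to the recorded training and validation metrics.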
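
As a minimal sketch of that kind of analysis, the block below plots loss and accuracy per epoch. To stay self-contained it uses a plain dict with the numbers copied from the training log above; with a real run you would pass history.history instead. The helper name plot_history is made up for illustration, and the metric key is looked up because it depends on the TensorFlow version ("acc" in the log above, "accuracy" in newer releases).

```python
import matplotlib.pyplot as plt

# Values copied from the training log above (10 epochs).
logged = {
    "loss": [0.9271, 0.2710, 0.1718, 0.1298, 0.1074, 0.0922, 0.0805, 0.0716, 0.0643, 0.0593],
    "acc": [0.6747, 0.9166, 0.9478, 0.9600, 0.9678, 0.9709, 0.9750, 0.9778, 0.9804, 0.9818],
    "val_loss": [0.3575, 0.1777, 0.1414, 0.1052, 0.0949, 0.0787, 0.0650, 0.0661, 0.0650, 0.0576],
    "val_acc": [0.8848, 0.9456, 0.9559, 0.9656, 0.9715, 0.9751, 0.9797, 0.9788, 0.9783, 0.9822],
}

def plot_history(hist):
    """Plot training/validation loss and accuracy side by side and return the figure."""
    acc_key = "acc" if "acc" in hist else "accuracy"
    epochs = range(1, len(hist["loss"]) + 1)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))
    ax_loss.plot(epochs, hist["loss"], label="train")
    ax_loss.plot(epochs, hist["val_loss"], label="validation")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("loss")
    ax_loss.legend()
    ax_acc.plot(epochs, hist[acc_key], label="train")
    ax_acc.plot(epochs, hist["val_" + acc_key], label="validation")
    ax_acc.set_xlabel("epoch")
    ax_acc.set_ylabel("accuracy")
    ax_acc.legend()
    return fig

fig = plot_history(logged)

# Epoch (1-based) with the highest validation accuracy.
val_acc = logged["val_acc"]
best_epoch = max(range(len(val_acc)), key=val_acc.__getitem__) + 1
print(best_epoch, val_acc[best_epoch - 1])
```

In a notebook the figure is displayed automatically; in a script, call plt.show() afterwards.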

plt.imshow(test_images[2])

<matplotlib.image.AxesImage at 0x22e0f6fa4c8>

img = test_images[2]

img.shape

(28, 28, 1)

type(img)

numpy.ndarray

img = img.reshape(1,28,28,1)

img.shape

(1, 28, 28, 1)

pre = model.predict(img)

pre

array([[-6.144885  ,  6.0871305 , -2.4974186 , -2.5596278 ,  0.04332877,
        -4.744676  , -4.4906645 , -3.1354234 , -2.2001367 , -4.293309  ]],
      dtype=float32)

import numpy as np

index = np.argmax(pre)
index
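
Because the last Dense layer has no activation and the loss was created with from_logits=True, the values returned by model.predict() are raw logits, not probabilities. The sketch below applies a softmax written in NumPy to the array printed above to turn it into probabilities; the softmax helper is my own illustration, not something the model code defines.

```python
import numpy as np

# The logits printed by model.predict(img) above.
logits = np.array([[-6.144885, 6.0871305, -2.4974186, -2.5596278, 0.04332877,
                    -4.744676, -4.4906645, -3.1354234, -2.2001367, -4.293309]],
                  dtype=np.float32)

def softmax(x, axis=-1):
    """Numerically stable softmax: subtract the max before exponentiating."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

probs = softmax(logits)
predicted = int(np.argmax(probs, axis=-1)[0])
print(predicted, float(probs[0, predicted]))
```

Softmax does not change which class is largest, so np.argmax gives the same answer on logits and on probabilities; the probabilities are only useful when you want a confidence value.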

plt.imshow(test_images[1])

<matplotlib.image.AxesImage at 0x22e261797c8>

img1 = test_images[1].reshape(1,28,28,1)

pre = model.predict(img1)

index = np.argmax(pre)
index

Testing

img3 = plt.imread("C:\\Users\\YL\\Desktop\\3.jpg")

img3.shape

(587, 702, 3)

type(img3)

numpy.ndarray

plt.imshow(img3)

<matplotlib.image.AxesImage at 0x22e1714b7c8>

from PIL import Image

img3 = Image.open("C:\\Users\\YL\\Desktop\\3.jpg")

img3 = img3.resize((28,28))

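torchvision.transforms.Resize() can also be used to change the size: given a PIL image it returns a PIL image, and given a tensor it returns a tensor. A PIL image can then be converted to a tensor with torchvision.transforms.ToTensor().

# Alternative with torchvision (not used in the steps below), so the result gets its own name
import torchvision

img3_pil = Image.open("C:\\Users\\YL\\Desktop\\3.jpg")
transform = torchvision.transforms.Compose([
    torchvision.transforms.Resize((28, 28)),
    torchvision.transforms.ToTensor(),
])
img3_tensor = transform(img3_pil)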

gray_image = image.convert('L'): this line uses the convert() method from the PIL library to convert an RGB image to a grayscale image.

gray_img3 = img3.convert('L')

gray_img3.save("C:\\Users\\YL\\Desktop\\3_gray.png")

gray_img31 = plt.imread("C:\\Users\\YL\\Desktop\\3_gray.png")

gray_img31.shape

(28, 28)

plt.imshow(gray_img31)

<matplotlib.image.AxesImage at 0x22e18bde388>

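# plt.imread() already returns PNG data as float32 values in [0, 1], so no scaling is needed here.
# The single division by 255.0 after the reshape below then gives the same [0, 1/255] range
# that the training images ended up with.
# gray_img31 = gray_img31 / 255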

#gray_img31.resize(28,28,1)

type(gray_img31)

numpy.ndarray

gray_img31.shape

(28, 28)

# img31 = gray_img3

plt.imshow(gray_img31)

<matplotlib.image.AxesImage at 0x22e17543288>

gray_img31 = gray_img31.reshape(1,28,28,1)

gray_img31.shape

(1, 28, 28, 1)

gray_img31 = gray_img31 / 255.0

pre = model.predict(gray_img31)

np.argmax(pre)

Recognizing the digit 9

img9 = plt.imread("C:\\Users\\YL\\Desktop\\9.jpg")

plt.imshow(img9)

<matplotlib.image.AxesImage at 0x22e1c65ad88>

img9.shape

(267, 305, 3)

Convert the image to grayscale and resize it to 28x28, using PIL.

from PIL import Image

img9 = Image.open("C:\\Users\\YL\\Desktop\\9.jpg")

type(img9)

PIL.JpegImagePlugin.JpegImageFile

img9 = img9.resize((28,28))

gray_img9 = img9.convert('L')

gray_img9.save("C:\\Users\\YL\\Desktop\\9_gray.png")

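PIL saves the converted grayscale image to disk, and plt then reads the saved grayscale image back in. An image read with plt is a NumPy array, and a NumPy array can be reshaped with reshape.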
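
The save-and-reload round trip is one way to get from a PIL image to a NumPy array. As an alternative sketch, np.asarray() converts a PIL image directly in memory. The helper name to_model_input and the synthetic demo image are assumptions made for illustration; a real photo would be opened with Image.open() as above.

```python
import numpy as np
from PIL import Image

def to_model_input(pil_image, size=(28, 28)):
    """Hypothetical helper: PIL image -> float32 array of shape (1, 28, 28, 1) in [0, 1]."""
    gray = pil_image.convert("L").resize(size)          # grayscale, then resample to 28x28
    arr = np.asarray(gray, dtype=np.float32) / 255.0    # uint8 [0, 255] -> float [0, 1]
    # PIL sizes are (width, height); array shapes are (height, width)
    return arr.reshape(1, size[1], size[0], 1)

# Synthetic stand-in for a photo: a white 300x260 RGB canvas with a black square.
demo = Image.new("RGB", (300, 260), color=(255, 255, 255))
demo.paste((0, 0, 0), (100, 80, 200, 180))

x = to_model_input(demo)
print(x.shape, x.dtype, x.min(), x.max())
```

Two caveats apply before feeding such an array to the model. The values must be brought to whatever range the training images actually had. And MNIST stores bright strokes on a dark background, so a photo of dark ink on light paper may need inverting (for example 1.0 - arr); whether that applies to the photos used in this post is not shown.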

test_img9 = plt.imread("C:\\Users\\YL\\Desktop\\9_gray.png")

plt.imshow(test_img9,cmap='gray')

<matplotlib.image.AxesImage at 0x22e1c6d9708>

test_img9.shape

(28, 28)

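#test_img9.resize(28,28)  This resize is NumPy's ndarray.resize (plt.imread returns a NumPy array), not an image-resampling function. It changes the shape in place by truncating or zero-padding the data, which makes the digit in the picture "disappear".

#test_img9 = test_img9/255.0  normalization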
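
To see why calling resize on a NumPy array loses the digit, here is a minimal sketch on a toy 6x6 array (a stand-in, not a real image). ndarray.resize keeps the first values in memory order and drops the rest; it does not resample. A crude subsampling is shown next to it for comparison.

```python
import numpy as np

image = np.arange(36, dtype=np.float32).reshape(6, 6)   # toy 6x6 "image"

truncated = image.copy()
# In place: keeps only the first 9 values in memory order (the top row and a half).
truncated.resize((3, 3), refcheck=False)
print(truncated)

# Crude resampling for comparison: keep every second pixel in each direction.
subsampled = image[::2, ::2]
print(subsampled)
```

The truncated array contains only values from the top of the original, while the subsampled one still covers the whole picture. Image.resize() in PIL resamples properly, which is why the post resizes with PIL before converting to an array.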

plt.imshow(test_img9)

<matplotlib.image.AxesImage at 0x22e1d884408>

#test_img9 = test_img9[:,:,0]  take channel 0 of the image

test_img9.shape

(28, 28)

test_img9 = test_img9.reshape(1,28,28,1)

test_img9 = test_img9/255.0

test_img9.shape

(1, 28, 28, 1)

pre = model.predict(test_img9)

np.argmax(pre)
