Deep Learning: Understanding FNN, CNN, and RNN by Comparison, Applied to Handwritten Digit Recognition

孻孻

Tags: deep learning, cnn, rnn

Article link: https://blog.csdn.net/weixin_44839559/article/details/135025694

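Deep learning is a branch of machine learning that carries out intelligent tasks with multi-layer neural networks, loosely modeled on the structure and learning behavior of the human brain. Below, we train three common deep learning networks on the MNIST dataset, one after another, to understand how they differ: the feedforward neural network (Feedforward Neural Network), the convolutional neural network (Convolutional Neural Network), and the recurrent neural network (Recurrent Neural Network).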

1. Feedforward Neural Network

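The feedforward neural network is the simplest form of neural network. It consists of an input layer, one or more hidden layers, and an output layer. Information flows through the network in one direction only and never forms a loop. In the fully connected form used here, every neuron is connected to all neurons in the next layer, and there are no backward connections. Each neuron receives the outputs of the previous layer, computes their weighted sum, applies an activation function, and passes the result on to the next layer. This structure lets feedforward networks handle many kinds of tasks, such as classification, regression, and clustering.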
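The weighted-sum-then-activation step described above can be sketched in plain Python. This is only an illustration: the helper names (`relu`, `softmax`, `dense`) and the toy weights are made up for this example and are not taken from the TensorFlow code in this post.

```python
import math

def relu(values):
    # ReLU keeps positive values and clamps negative ones to zero
    return [max(0.0, v) for v in values]

def softmax(values):
    # Subtract the maximum first for numerical stability
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases, activation):
    # weights[j] holds the weights from every input to output neuron j
    sums = [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]
    return activation(sums)

# Toy network with made-up weights: 3 inputs -> 2 hidden neurons -> 2 outputs
x = [1.0, 2.0, 3.0]
hidden = dense(x, [[0.5, 0.5, 0.0], [1.0, 0.0, -1.0]], [0.5, 1.0], relu)
output = dense(hidden, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], softmax)

print(hidden)  # activations of the hidden layer
print(output)  # class probabilities that sum to 1
```

Data only moves forward here: `hidden` depends on `x`, and `output` depends on `hidden`, with nothing fed back.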

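Application example: feedforward neural networks are widely used for image classification, text classification, speech recognition, and recommender systems. For instance, a feedforward network can classify images by mapping each input image to its category.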

Here we train a feedforward neural network on the MNIST dataset and evaluate the model. An earlier post covers the process in detail: https://blog.csdn.net/weixin_44839559/article/details/134384734

Code:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalize and reshape the input images
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

# Convert labels to one-hot encoding
y_train = tf.one_hot(y_train, 10)
y_test = tf.one_hot(y_test, 10)

# Create TensorFlow datasets for training and testing
train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)

# Define the Feedforward Neural Network model
class MyModel(keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc1 = layers.Dense(64, activation="relu")
        self.fc2 = layers.Dense(10, activation="softmax")

    def call(self, x):
        x = self.fc1(x)
        x = self.fc2(x)
        return x

# Create an instance of the model
model = MyModel()

# Define the loss function and optimizer
loss_object = keras.losses.CategoricalCrossentropy()
optimizer = keras.optimizers.Adam()

# Define the metrics for evaluation
train_loss = keras.metrics.Mean(name="train_loss")
train_accuracy = keras.metrics.CategoricalAccuracy(name="train_accuracy")
test_loss = keras.metrics.Mean(name="test_loss")
test_accuracy = keras.metrics.CategoricalAccuracy(name="test_accuracy")

# Define the training and testing steps
@tf.function
def train_step(images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images)
        loss = loss_object(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    train_loss(loss)
    train_accuracy(labels, predictions)

@tf.function
def test_step(images, labels):
    predictions = model(images)
    t_loss = loss_object(labels, predictions)

    test_loss(t_loss)
    test_accuracy(labels, predictions)

# Train the model
EPOCHS = 10

for epoch in range(EPOCHS):
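    # Reset the metrics at the start of every epoch
    # (reset_state() replaces the deprecated reset_states())
    train_loss.reset_state()
    train_accuracy.reset_state()
    test_loss.reset_state()
    test_accuracy.reset_state()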

    for images, labels in train_ds:
        train_step(images, labels)

    for test_images, test_labels in test_ds:
        test_step(test_images, test_labels)

    template = "Epoch {}, Loss: {}, Accuracy: {}, Test Loss: {}, Test Accuracy: {}"
    print(
        template.format(
            epoch + 1,
            train_loss.result(),
            train_accuracy.result() * 100,
            test_loss.result(),
            test_accuracy.result() * 100,
        )
    )

Results:

2. Convolutional Neural Network

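The convolutional neural network is a deep learning model designed for data with a grid structure. It is widely used in image recognition and computer vision. A convolutional neural network typically contains convolutional layers, pooling layers, and fully connected layers. Convolutional layers extract features from the input by applying convolution operations. Pooling layers shrink the spatial size of the feature maps and make the network less sensitive to position. Fully connected layers turn the feature maps into the final output. The strength of convolutional networks is that they learn and extract image features automatically, which makes them efficient at image classification, object detection, image generation, and similar tasks.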
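To make the convolution and pooling steps concrete, here is a minimal pure-Python sketch. The function names `conv2d_valid` and `max_pool2d`, the toy image, and the edge-detecting kernel are assumptions chosen for illustration; a real layer such as `layers.Conv2D` learns its kernels during training. Like Keras, the sketch slides the kernel without flipping it and uses no padding, which is why a 3x3 kernel turns a 28x28 MNIST image into a 26x26 feature map.

```python
def conv2d_valid(image, kernel):
    # Slide the kernel over every position where it fits entirely (no padding)
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool2d(feature_map, size=2):
    # Keep the largest value of each non-overlapping size x size window
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in cols
        ]
        for r in rows
    ]

# Toy 6x6 image: dark left half, bright right half (a vertical edge)
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-written kernel that responds to dark-to-bright transitions
kernel = [[-1, 0, 1] for _ in range(3)]

features = conv2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool2d(features)           # 2x2 after pooling

print(features)
print(pooled)
```

The feature map is largest where the kernel straddles the edge, and pooling halves each spatial dimension while keeping that strong response.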

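Application example: convolutional neural networks are used throughout computer vision, including image classification, object detection, face recognition, and image generation. For instance, a convolutional network can perform vehicle recognition, detecting the position and type of each vehicle in an input image.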

In the same way, we train a convolutional neural network on the MNIST dataset and report the results.

Code:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalize and reshape the input images
x_train = x_train.reshape(-1, 28, 28, 1).astype("float32") / 255.0
x_test = x_test.reshape(-1, 28, 28, 1).astype("float32") / 255.0

# Convert labels to one-hot encoding
y_train = tf.one_hot(y_train, 10)
y_test = tf.one_hot(y_test, 10)

# Create TensorFlow datasets for training and testing
train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)

# Define the CNN model
class MyModel(keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        # input_shape is not needed in a subclassed model; the shape is inferred on the first call
        self.conv1 = layers.Conv2D(32, (3, 3), activation="relu")
        self.flatten = layers.Flatten()
        self.fc1 = layers.Dense(100, activation="relu")
        self.fc2 = layers.Dense(10, activation="softmax")

    def call(self, x):
        x = self.conv1(x)
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.fc2(x)
        return x

# Create an instance of the model
model = MyModel()

# Define the loss function and optimizer
loss_object = keras.losses.CategoricalCrossentropy()
optimizer = keras.optimizers.Adam()

# Define the metrics for evaluation
train_loss = keras.metrics.Mean(name="train_loss")
train_accuracy = keras.metrics.CategoricalAccuracy(name="train_accuracy")
test_loss = keras.metrics.Mean(name="test_loss")
test_accuracy = keras.metrics.CategoricalAccuracy(name="test_accuracy")

# Define the training and testing steps
@tf.function
def train_step(images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images)
        loss = loss_object(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    train_loss(loss)
    train_accuracy(labels, predictions)

@tf.function
def test_step(images, labels):
    predictions = model(images)
    t_loss = loss_object(labels, predictions)

    test_loss(t_loss)
    test_accuracy(labels, predictions)

# Train the model
EPOCHS = 10

for epoch in range(EPOCHS):
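    # Reset the metrics at the start of every epoch
    # (reset_state() replaces the deprecated reset_states())
    train_loss.reset_state()
    train_accuracy.reset_state()
    test_loss.reset_state()
    test_accuracy.reset_state()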

    for images, labels in train_ds:
        train_step(images, labels)

    for test_images, test_labels in test_ds:
        test_step(test_images, test_labels)

    template = "Epoch {}, Loss: {}, Accuracy: {}, Test Loss: {}, Test Accuracy: {}"
    print(
        template.format(
            epoch + 1,
            train_loss.result(),
            train_accuracy.result() * 100,
            test_loss.result(),
            test_accuracy.result() * 100,
        )
    )

Results:

3. Recurrent Neural Network

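The recurrent neural network is a neural network with recurrent connections, suited to sequence data such as speech, text, and time series. Its neurons keep a state over time and pass that state as input to the next time step. This recurrent structure lets the network capture temporal dependencies in the data. Common recurrent models include the simple recurrent neural network (Simple RNN), long short-term memory (LSTM), and the gated recurrent unit (GRU). Recurrent networks are widely used for machine translation, language modeling, speech recognition, and sentiment analysis.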
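The idea of carrying a state from one time step to the next can be sketched with the Simple RNN update h_t = tanh(W_x x_t + W_h h_{t-1} + b). Note that this is the simplest recurrent cell, not the LSTM used in the code of this section, and that the function names and toy weights below are hypothetical. For MNIST, the code in this section reshapes each image to 28 x 28, so the network reads it as a sequence of 28 rows with 28 features each.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    # New state mixes the current input with the previous state
    return [
        math.tanh(
            sum(w * x for w, x in zip(w_x[k], x_t))
            + sum(w * h for w, h in zip(w_h[k], h_prev))
            + b[k]
        )
        for k in range(len(b))
    ]

def run_rnn(sequence, w_x, w_h, b):
    # Start from a zero state and feed the sequence one step at a time
    h = [0.0] * len(b)
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
    return h

# Toy setup with made-up weights: 2 features per step, 1 hidden unit
w_x = [[1.0, 0.0]]
w_h = [[0.5]]
b = [0.0]

# The same two inputs in a different order
seq_a = [[1.0, 0.0], [0.0, 0.0]]
seq_b = [[0.0, 0.0], [1.0, 0.0]]

h_a = run_rnn(seq_a, w_x, w_h, b)
h_b = run_rnn(seq_b, w_x, w_h, b)
print(h_a, h_b)
```

The two final states differ even though both sequences contain the same inputs, which is the order sensitivity that a feedforward layer applied to a flattened input does not build in.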

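Application example: recurrent neural networks perform well in language modeling, machine translation, speech recognition, and sentiment analysis. In machine translation, for instance, a recurrent network takes a sentence in the source language as input and generates its translation in the target language.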

In the same way, we train a recurrent neural network on the MNIST dataset and report the results.

Code:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalize and reshape the input images
x_train = x_train.reshape(-1, 28, 28).astype("float32") / 255.0
x_test = x_test.reshape(-1, 28, 28).astype("float32") / 255.0

# Convert labels to one-hot encoding
y_train = tf.one_hot(y_train, 10)
y_test = tf.one_hot(y_test, 10)

# Create TensorFlow datasets for training and testing
train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)

# Define the RNN model
class MyModel(keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.rnn = layers.LSTM(32)
        self.fc = layers.Dense(10, activation="softmax")

    def call(self, x):
        x = self.rnn(x)
        x = self.fc(x)
        return x

# Create an instance of the model
model = MyModel()

# Define the loss function and optimizer
loss_object = keras.losses.CategoricalCrossentropy()
optimizer = keras.optimizers.Adam()

# Define the metrics for evaluation
train_loss = keras.metrics.Mean(name="train_loss")
train_accuracy = keras.metrics.CategoricalAccuracy(name="train_accuracy")
test_loss = keras.metrics.Mean(name="test_loss")
test_accuracy = keras.metrics.CategoricalAccuracy(name="test_accuracy")

# Define the training and testing steps
@tf.function
def train_step(images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images)
        loss = loss_object(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    train_loss(loss)
    train_accuracy(labels, predictions)

@tf.function
def test_step(images, labels):
    predictions = model(images)
    t_loss = loss_object(labels, predictions)

    test_loss(t_loss)
    test_accuracy(labels, predictions)

# Train the model
EPOCHS = 10

for epoch in range(EPOCHS):
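    # Reset the metrics at the start of every epoch
    # (reset_state() replaces the deprecated reset_states())
    train_loss.reset_state()
    train_accuracy.reset_state()
    test_loss.reset_state()
    test_accuracy.reset_state()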

    for images, labels in train_ds:
        train_step(images, labels)

    for test_images, test_labels in test_ds:
        test_step(test_images, test_labels)

    template = "Epoch {}, Loss: {}, Accuracy: {}, Test Loss: {}, Test Accuracy: {}"
    print(
        template.format(
            epoch + 1,
            train_loss.result(),
            train_accuracy.result() * 100,
            test_loss.result(),
            test_accuracy.result() * 100,
        )
    )

Results:

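From the results above we can see that, on the MNIST image dataset, the convolutional neural network reaches the highest training and test accuracy, followed by the feedforward neural network and then the recurrent neural network. This matches the application areas each model is designed for. In training time, however, the convolutional neural network takes the longest, followed by the recurrent neural network, with the feedforward neural network the fastest. When choosing a model, we therefore need to weigh not only its accuracy but also its cost in memory and time.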
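One rough way to reason about memory cost is to count trainable parameters. The sketch below does this by hand for the three models defined in this post, using the layer sizes from the code (784-64-10 for the FNN; a 3x3 convolution with 32 filters followed by 100 and 10 dense units for the CNN; an LSTM with 32 units followed by 10 dense units for the RNN). The helper names are made up for this example, and the LSTM formula assumes the standard four-gate cell with a single bias vector per gate.

```python
def dense_params(n_in, n_out):
    # One weight per input-output pair plus one bias per output
    return n_in * n_out + n_out

def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One kernel per filter plus one bias per filter
    return kernel_h * kernel_w * in_channels * filters + filters

def lstm_params(n_in, units):
    # Four gates, each with input weights, recurrent weights, and a bias
    return 4 * ((n_in + units) * units + units)

# FNN: 784 -> 64 -> 10
fnn_total = dense_params(784, 64) + dense_params(64, 10)

# CNN: a 3x3 convolution without padding maps 28x28x1 to 26x26x32 before flattening
flattened = 26 * 26 * 32
cnn_total = (
    conv2d_params(3, 3, 1, 32)
    + dense_params(flattened, 100)
    + dense_params(100, 10)
)

# RNN: LSTM(32) over 28 features per time step, then 10 output units
rnn_total = lstm_params(28, 32) + dense_params(32, 10)

print("FNN:", fnn_total)
print("CNN:", cnn_total)
print("RNN:", rnn_total)
```

Under these assumptions the CNN is by far the largest model, mostly because of the dense layer that follows the flattened feature map. Parameter count is only a proxy for training time, though: the LSTM has the fewest parameters but has to process 28 time steps one after another, which may be part of why it trains more slowly than the feedforward network.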