Applying LeNet to Handwritten Digit Recognition (Paddle)

A note before we begin: I am 「虐猫人薛定谔i」, a post-00s developer who is not content with the status quo and has dreams and ambitions.
This blog records and shares what I have learned; follow it to get new posts as soon as they are published.
Stay true to the original aspiration.



LeNet Overview

LeNet is one of the earliest convolutional neural networks. The network contains three convolutional layers, two pooling layers, and two fully connected layers. By stacking convolution and pooling layers, LeNet extracts image features, and the fully connected layers then map those features to class predictions.
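As a sanity check on the layer hyperparameters used below, we can trace the feature-map sizes for a 28x28 MNIST input. The helper here is just illustrative arithmetic, not part of the Paddle code:

```python
def out_size(in_size, kernel, stride=1):
    """Spatial output size of a conv/pool layer with no padding."""
    return (in_size - kernel) // stride + 1

size = 28                      # MNIST input: 1 x 28 x 28
size = out_size(size, 5)       # conv1, 5x5 kernel -> 24
size = out_size(size, 2, 2)    # pool1, 2x2 stride 2 -> 12
size = out_size(size, 5)       # conv2, 5x5 kernel -> 8
size = out_size(size, 2, 2)    # pool2, 2x2 stride 2 -> 4
size = out_size(size, 4)       # conv3, 4x4 kernel -> 1
print(size)                    # 1
```

The spatial size collapses to 1x1 after conv3, which is why conv3 uses a 4x4 kernel and why fc1 below takes input_dim=120 (the 120 channels of conv3).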

Code

The implementation of the LeNet network is as follows:

class LeNet(fluid.dygraph.Layer):
    def __init__(self, name_scope, num_classes=1):
        super(LeNet, self).__init__(name_scope)

        # Feature extractor: for a 1x28x28 input the shapes are
        # conv1 -> 6x24x24, pool1 -> 6x12x12, conv2 -> 16x8x8,
        # pool2 -> 16x4x4, conv3 -> 120x1x1.
        self.conv1 = Conv2D(num_channels=1,
                            num_filters=6,
                            filter_size=5,
                            act='sigmoid')
        self.pool1 = Pool2D(pool_size=2, pool_stride=2, pool_type='max')
        self.conv2 = Conv2D(num_channels=6,
                            num_filters=16,
                            filter_size=5,
                            act='sigmoid')
        self.pool2 = Pool2D(pool_size=2, pool_stride=2, pool_type='max')
        self.conv3 = Conv2D(num_channels=16,
                            num_filters=120,
                            filter_size=4,
                            act='sigmoid')
        # Classifier head on the flattened 120-dim feature vector.
        self.fc1 = Linear(input_dim=120, output_dim=64, act='sigmoid')
        self.fc2 = Linear(input_dim=64, output_dim=num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.pool2(x)
        x = self.conv3(x)
        # Flatten [N, 120, 1, 1] -> [N, 120] before the fully connected layers.
        x = fluid.layers.reshape(x, [x.shape[0], -1])
        x = self.fc1(x)
        x = self.fc2(x)
        return x
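The reshape call in forward flattens the [N, 120, 1, 1] output of conv3 into [N, 120] so it can feed fc1. The same operation in plain NumPy, shown here only to illustrate the shape change:

```python
import numpy as np

# conv3 emits a 120-channel feature map with 1x1 spatial size per sample.
x = np.zeros((10, 120, 1, 1), dtype='float32')  # a batch of 10 samples

# Equivalent of fluid.layers.reshape(x, [x.shape[0], -1]):
flat = x.reshape(x.shape[0], -1)
print(flat.shape)  # (10, 120)
```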

The complete code is as follows:

import paddle
import paddle.fluid as fluid
import numpy as np
from paddle.fluid.dygraph.nn import Conv2D, Pool2D, Linear
"""
LeNet在手写数字识别上的应用
"""


class LeNet(fluid.dygraph.Layer):
    def __init__(self, name_scope, num_classes=1):
        super(LeNet, self).__init__(name_scope)

        self.conv1 = Conv2D(num_channels=1,
                            num_filters=6,
                            filter_size=5,
                            act='sigmoid')
        self.pool1 = Pool2D(pool_size=2, pool_stride=2, pool_type='max')
        self.conv2 = Conv2D(num_channels=6,
                            num_filters=16,
                            filter_size=5,
                            act='sigmoid')
        self.pool2 = Pool2D(pool_size=2, pool_stride=2, pool_type='max')
        self.conv3 = Conv2D(num_channels=16,
                            num_filters=120,
                            filter_size=4,
                            act='sigmoid')
        self.fc1 = Linear(input_dim=120, output_dim=64, act='sigmoid')
        self.fc2 = Linear(input_dim=64, output_dim=num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.pool2(x)
        x = self.conv3(x)
        x = fluid.layers.reshape(x, [x.shape[0], -1])
        x = self.fc1(x)
        x = self.fc2(x)
        return x


def train(model):
    print("start training...")
    model.train()
    epoch_num = 5
    opt = fluid.optimizer.Momentum(learning_rate=0.001,
                                   momentum=0.9,
                                   parameter_list=model.parameters())
    train_loader = paddle.batch(paddle.dataset.mnist.train(), batch_size=10)
    valid_loader = paddle.batch(paddle.dataset.mnist.test(), batch_size=10)
    for epoch in range(epoch_num):
        for batch_id, data in enumerate(train_loader()):
            x_data = np.array([item[0] for item in data],
                              dtype='float32').reshape(-1, 1, 28, 28)
            y_data = np.array([item[1] for item in data],
                              dtype='int64').reshape(-1, 1)
            img = fluid.dygraph.to_variable(x_data)
            label = fluid.dygraph.to_variable(y_data)

            logits = model(img)
            loss = fluid.layers.softmax_with_cross_entropy(logits, label)
            avg_loss = fluid.layers.mean(loss)
            if batch_id % 1000 == 0:
                print("epoch: {}, bath_id: {}, loss is: {}".format(
                    epoch, batch_id, avg_loss.numpy()))
            avg_loss.backward()
            opt.minimize(avg_loss)
            model.clear_gradients()
        model.eval()
        accuracies = []
        losses = []
        for batch_id, data in enumerate(valid_loader()):
            x_data = np.array([item[0] for item in data],
                              dtype='float32').reshape(-1, 1, 28, 28)
            y_data = np.array([item[1] for item in data],
                              dtype='int64').reshape(-1, 1)
            img = fluid.dygraph.to_variable(x_data)
            label = fluid.dygraph.to_variable(y_data)
            logits = model(img)
            pred = fluid.layers.softmax(logits)
            loss = fluid.layers.softmax_with_cross_entropy(logits, label)
            acc = fluid.layers.accuracy(pred, label)
            accuracies.append(acc.numpy())
            losses.append(loss.numpy())
        print("[validation accuracy/loss: {}/{}]".format(
            np.mean(accuracies), np.mean(losses)))
        model.train()
    fluid.save_dygraph(model.state_dict(), './result/hwdrByLeNet')


if __name__ == '__main__':
    with fluid.dygraph.guard():
        model = LeNet('LeNet', num_classes=10)
        train(model)
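For reference, fluid.layers.softmax_with_cross_entropy used in the training loop combines the softmax and cross-entropy steps in one numerically stable op. The NumPy sketch below reproduces the same per-sample computation (the function name and variable names here are mine, not Paddle's):

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Per-sample cross-entropy of softmax(logits) against integer labels.

    logits: [N, C] float array; labels: [N, 1] int array, as in the
    training loop above.
    """
    # Subtract the row max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick out -log p(correct class) for each sample -> shape [N, 1].
    return -np.take_along_axis(log_probs, labels, axis=1)

logits = np.array([[2.0, 0.5, 0.1], [0.2, 3.0, 0.3]])
labels = np.array([[0], [1]])
loss = softmax_cross_entropy(logits, labels)
print(loss.mean())  # average loss over the batch, as fluid.layers.mean does
```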


Results

start training...
epoch: 0, batch_id: 0, loss is: [2.2495162]
epoch: 0, batch_id: 1000, loss is: [2.2928371]
epoch: 0, batch_id: 2000, loss is: [2.3267434]
epoch: 0, batch_id: 3000, loss is: [2.2698295]
epoch: 0, batch_id: 4000, loss is: [2.2489858]
epoch: 0, batch_id: 5000, loss is: [2.312758]
[validation accuracy/loss: 0.45590001344680786/2.215536117553711]
epoch: 1, batch_id: 0, loss is: [2.1956322]
epoch: 1, batch_id: 1000, loss is: [2.063491]
epoch: 1, batch_id: 2000, loss is: [1.9574039]
epoch: 1, batch_id: 3000, loss is: [1.420162]
epoch: 1, batch_id: 4000, loss is: [0.98229045]
epoch: 1, batch_id: 5000, loss is: [1.2404814]
[validation accuracy/loss: 0.776199996471405/0.8473402261734009]
epoch: 2, batch_id: 0, loss is: [0.62948656]
epoch: 2, batch_id: 1000, loss is: [0.49548474]
epoch: 2, batch_id: 2000, loss is: [0.5145985]
epoch: 2, batch_id: 3000, loss is: [0.2760195]
epoch: 2, batch_id: 4000, loss is: [0.36493483]
epoch: 2, batch_id: 5000, loss is: [0.5631878]
[validation accuracy/loss: 0.8793999552726746/0.4475659728050232]
epoch: 3, batch_id: 0, loss is: [0.30772734]
epoch: 3, batch_id: 1000, loss is: [0.2511763]
epoch: 3, batch_id: 2000, loss is: [0.32035473]
epoch: 3, batch_id: 3000, loss is: [0.12164386]
epoch: 3, batch_id: 4000, loss is: [0.20446599]
epoch: 3, batch_id: 5000, loss is: [0.27960077]
[validation accuracy/loss: 0.9111999869346619/0.3133259415626526]
epoch: 4, batch_id: 0, loss is: [0.16361086]
epoch: 4, batch_id: 1000, loss is: [0.15575354]
epoch: 4, batch_id: 2000, loss is: [0.24734934]
epoch: 4, batch_id: 3000, loss is: [0.07145926]
epoch: 4, batch_id: 4000, loss is: [0.14044744]
epoch: 4, batch_id: 5000, loss is: [0.16796467]
[validation accuracy/loss: 0.9281999468803406/0.24646225571632385]

Summary

From the final results, we can see that the handwritten digit recognition accuracy reaches 92.8%.

Writing blog posts is not easy, and given my limited ability and the haste with which this was written, errors and shortcomings are inevitable; corrections from readers are sincerely welcome.
If you repost this article, please credit the author and include a link to the original; I would be very grateful.
Author: 虐猫人薛定谔i
Blog: https://blog.csdn.net/Deep___Learning

©️2019 CSDN