A Deep Learning Application Based on PaddlePaddle

Introduction


  • This article uses a license plate character classification dataset from a traffic system as an example to walk through an image classification application built on the PaddlePaddle framework.
  • Image classification, which assigns images to categories based on their semantic content, is a fundamental problem in computer vision.
  • It has wide applications: traffic scene recognition in transportation, content-based image retrieval and automatic album organization on the web, and image recognition in medicine, among others.

Environment

  1. OS: Windows 10 Pro
  2. Processor: x86_64 (x64) architecture
  3. Python and pip versions: Python 3.7.6, pip 20.0.6
  4. PaddlePaddle version: 1.7.1

I. Loading the Python Libraries

import numpy as np
import paddle
import paddle.fluid as fluid
from PIL import Image
import cv2
import matplotlib.pyplot as plt
import os
from multiprocessing import cpu_count
from paddle.fluid.dygraph import Pool2D,Conv2D
from paddle.fluid.dygraph import Linear

II. Data Processing

1. Importing the Data

  • Import the license plate image archive from the AI Studio platform, generate a list of character images, and pair each entry in the list with its label:
# Generate the list of license plate character images
data_path = '/home/aistudio/data'
character_folders = os.listdir(data_path)
label = 0
LABEL_temp = {}
if(os.path.exists('./train_data.list')):
    os.remove('./train_data.list')
if(os.path.exists('./test_data.list')):
    os.remove('./test_data.list')
for character_folder in character_folders:
    with open('./train_data.list', 'a') as f_train:
        with open('./test_data.list', 'a') as f_test:
            if character_folder == '.DS_Store' or character_folder == '.ipynb_checkpoints' or character_folder == 'data23617':
                continue
            print(character_folder + " " + str(label))
            LABEL_temp[str(label)] = character_folder # record the label-to-folder mapping
            character_imgs = os.listdir(os.path.join(data_path, character_folder))
            for i in range(len(character_imgs)):
                if i%10 == 0: 
                    f_test.write(os.path.join(os.path.join(data_path, character_folder), character_imgs[i]) + "\t" + str(label) + '\n')
                else:
                    f_train.write(os.path.join(os.path.join(data_path, character_folder), character_imgs[i]) + "\t" + str(label) + '\n')
    label = label + 1
print('Note: image list generated')

Note: image list generated
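As a quick sanity check on the split rule in the listing above, the `i % 10 == 0` test routes every tenth image to the test list, giving roughly a 90/10 train/test split (the folder size below is hypothetical):

```python
# Every 10th image (indices 0, 10, 20, ...) goes to the test list;
# the rest go to the training list.
n_images = 105  # hypothetical number of images in one character folder
test = [i for i in range(n_images) if i % 10 == 0]
train = [i for i in range(n_images) if i % 10 != 0]
print(len(train), len(test))  # 94 11
```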

2. Splitting the Dataset

  • Using the image lists, define the training and test sets of plate characters. train_reader is the data provider for training; test_reader is the data provider for testing.
# Define readers for the training and test sets from the image lists generated above
def data_mapper(sample):
    img, label = sample
    img = paddle.dataset.image.load_image(file=img, is_color=False)
    img = img.flatten().astype('float32') / 255.0
    return img, label
def data_reader(data_list_path):
    def reader():
        with open(data_list_path, 'r') as f:
            lines = f.readlines()
            for line in lines:
                img, label = line.split('\t')
                yield img, int(label)
    return paddle.reader.xmap_readers(data_mapper, reader, cpu_count(), 1024)

# data provider for training
train_reader = paddle.batch(reader=paddle.reader.shuffle(reader=data_reader('./train_data.list'), buf_size=512), batch_size=128)
# data provider for testing
test_reader = paddle.batch(reader=data_reader('./test_data.list'), batch_size=128)
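Conceptually, `paddle.batch` just groups consecutive samples from a reader into fixed-size lists. A minimal pure-Python sketch of that idea (not Paddle's actual implementation):

```python
def batch(reader, batch_size):
    # Group consecutive samples from `reader` into lists of `batch_size`.
    def batched():
        buf = []
        for sample in reader():
            buf.append(sample)
            if len(buf) == batch_size:
                yield buf
                buf = []
        if buf:  # trailing partial batch
            yield buf
    return batched

toy_reader = lambda: iter((i, i % 3) for i in range(10))
print([len(b) for b in batch(toy_reader, 4)()])  # [4, 4, 2]
```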

III. Building, Training, and Testing the CNN

1. Building the LeNet Network

  • A basic LeNet-style model serves as the convolutional image classifier. It contains three convolutional layers, two pooling layers, and one linear output layer.
# define the network
class MyLeNet(fluid.dygraph.Layer):
    def __init__(self):
        super(MyLeNet,self).__init__()
        self.hidden1_1 = Conv2D(1,28,5,1)
        self.hidden1_2 = Pool2D(pool_size=2,pool_type='max',pool_stride=1)
        self.hidden2_1 = Conv2D(28,32,3,1)
        self.hidden2_2 = Pool2D(pool_size=2,pool_type='max',pool_stride=1)
        self.hidden3 = Conv2D(32,32,3,1)
        self.hidden4 = Linear(32*10*10,65,act='softmax')
    def forward(self,input):
        x=self.hidden1_1(input)
        x=self.hidden1_2(x)
        x=self.hidden2_1(x)
        x=self.hidden2_2(x)
        x=self.hidden3(x)
        x=fluid.layers.reshape(x,shape=[-1,32*10*10])
        y=self.hidden4(x)
        return y
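The `32*10*10` flatten size in `hidden4` follows from the layer arithmetic: starting from 20x20 character images with no padding, each layer shrinks the feature map per the standard conv/pool output-size formula:

```python
def out_size(size, kernel, stride=1, padding=0):
    # Output size of a conv/pool layer: (W + 2P - K) // S + 1
    return (size + 2 * padding - kernel) // stride + 1

s = 20                 # 20x20 input character image
s = out_size(s, 5)     # hidden1_1: 5x5 conv          -> 16
s = out_size(s, 2)     # hidden1_2: 2x2 pool, stride 1 -> 15
s = out_size(s, 3)     # hidden2_1: 3x3 conv          -> 13
s = out_size(s, 2)     # hidden2_2: 2x2 pool, stride 1 -> 12
s = out_size(s, 3)     # hidden3:   3x3 conv          -> 10
print(s, 32 * s * s)   # 10 3200
```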

2. Training and Testing the Model

  • Train for 25 epochs with learning_rate=0.001, using SGD (stochastic gradient descent) as the optimizer.
with fluid.dygraph.guard():
    model=MyLeNet() # instantiate the model
    model.train() # training mode
    opt=fluid.optimizer.SGDOptimizer(learning_rate=0.001, parameter_list=model.parameters()) # SGD optimizer, learning rate 0.001
    epochs_num=25 # number of epochs
    
    for pass_num in range(epochs_num):
        
        for batch_id,data in enumerate(train_reader()):
            images=np.array([x[0].reshape(1,20,20) for x in data],np.float32)
            labels = np.array([x[1] for x in data]).astype('int64')
            labels = labels[:, np.newaxis]
            image=fluid.dygraph.to_variable(images)
            label=fluid.dygraph.to_variable(labels)
            
            predict=model(image) # forward pass
            
            loss=fluid.layers.cross_entropy(predict,label)
            avg_loss=fluid.layers.mean(loss) # average loss
            
            acc=fluid.layers.accuracy(predict,label) # compute accuracy
            
            if batch_id!=0 and batch_id%50==0:
                print("train_pass:{},batch_id:{},train_loss:{},train_acc:{}".format(pass_num,batch_id,avg_loss.numpy(),acc.numpy()))
            
            avg_loss.backward()
            opt.minimize(avg_loss)
            model.clear_gradients()            
            
    fluid.save_dygraph(model.state_dict(),'MyLeNet') # save the trained model

train_pass:0,batch_id:50,train_loss:[3.0882254],train_acc:[0.2421875]
train_pass:0,batch_id:100,train_loss:[4.052675],train_acc:[0.0546875]
train_pass:1,batch_id:50,train_loss:[2.579272],train_acc:[0.453125]
train_pass:1,batch_id:100,train_loss:[2.555184],train_acc:[0.4453125]
train_pass:2,batch_id:50,train_loss:[2.1244614],train_acc:[0.5390625]
train_pass:2,batch_id:100,train_loss:[1.4230838],train_acc:[0.5703125]
train_pass:3,batch_id:50,train_loss:[1.8108985],train_acc:[0.5625]
train_pass:3,batch_id:100,train_loss:[1.1870338],train_acc:[0.671875]
train_pass:4,batch_id:50,train_loss:[1.3705134],train_acc:[0.6796875]
train_pass:4,batch_id:100,train_loss:[0.91087323],train_acc:[0.7421875]
train_pass:5,batch_id:50,train_loss:[1.1405623],train_acc:[0.671875]
train_pass:5,batch_id:100,train_loss:[0.74630266],train_acc:[0.796875]
train_pass:6,batch_id:50,train_loss:[1.1114655],train_acc:[0.6875]
train_pass:6,batch_id:100,train_loss:[0.57304454],train_acc:[0.875]
train_pass:7,batch_id:50,train_loss:[0.9558362],train_acc:[0.75]
train_pass:7,batch_id:100,train_loss:[0.50046384],train_acc:[0.8984375]
train_pass:8,batch_id:50,train_loss:[0.89706266],train_acc:[0.734375]
train_pass:8,batch_id:100,train_loss:[0.59521234],train_acc:[0.8828125]
train_pass:9,batch_id:50,train_loss:[0.8451707],train_acc:[0.703125]
train_pass:9,batch_id:100,train_loss:[0.36055827],train_acc:[0.90625]
train_pass:10,batch_id:50,train_loss:[0.73698777],train_acc:[0.7578125]
train_pass:10,batch_id:100,train_loss:[0.3520307],train_acc:[0.921875]
train_pass:11,batch_id:50,train_loss:[0.7635164],train_acc:[0.75]
train_pass:11,batch_id:100,train_loss:[0.36311096],train_acc:[0.8984375]
train_pass:12,batch_id:50,train_loss:[0.6539128],train_acc:[0.765625]
train_pass:12,batch_id:100,train_loss:[0.2312974],train_acc:[0.953125]
train_pass:13,batch_id:50,train_loss:[0.72637016],train_acc:[0.7421875]
train_pass:13,batch_id:100,train_loss:[0.3795592],train_acc:[0.9140625]
train_pass:14,batch_id:50,train_loss:[0.50914115],train_acc:[0.890625]
train_pass:14,batch_id:100,train_loss:[0.24175346],train_acc:[0.9609375]
train_pass:15,batch_id:50,train_loss:[0.58999205],train_acc:[0.8125]
train_pass:15,batch_id:100,train_loss:[0.28724134],train_acc:[0.9375]
train_pass:16,batch_id:50,train_loss:[0.49204314],train_acc:[0.875]
train_pass:16,batch_id:100,train_loss:[0.1449625],train_acc:[0.984375]
train_pass:17,batch_id:50,train_loss:[0.45823067],train_acc:[0.8515625]
train_pass:17,batch_id:100,train_loss:[0.16351454],train_acc:[0.953125]
train_pass:18,batch_id:50,train_loss:[0.49262056],train_acc:[0.8984375]
train_pass:18,batch_id:100,train_loss:[0.15715218],train_acc:[0.9609375]
train_pass:19,batch_id:50,train_loss:[0.6008049],train_acc:[0.7890625]
train_pass:19,batch_id:100,train_loss:[0.27827048],train_acc:[0.9453125]
train_pass:20,batch_id:50,train_loss:[0.49254173],train_acc:[0.828125]
train_pass:20,batch_id:100,train_loss:[0.10245153],train_acc:[0.984375]
train_pass:21,batch_id:50,train_loss:[0.5114829],train_acc:[0.859375]
train_pass:21,batch_id:100,train_loss:[0.13853994],train_acc:[0.9765625]
train_pass:22,batch_id:50,train_loss:[0.41968846],train_acc:[0.890625]
train_pass:22,batch_id:100,train_loss:[0.15172973],train_acc:[0.9765625]
train_pass:23,batch_id:50,train_loss:[0.31932884],train_acc:[0.9140625]
train_pass:23,batch_id:100,train_loss:[0.14889],train_acc:[0.96875]
train_pass:24,batch_id:50,train_loss:[0.4187737],train_acc:[0.828125]
train_pass:24,batch_id:100,train_loss:[0.2674595],train_acc:[0.9296875]
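The loss and accuracy columns in the log above come from cross-entropy over the softmax outputs and top-1 accuracy. A NumPy sketch on a hypothetical 2-sample, 3-class batch (illustrating the math, not the Paddle ops themselves):

```python
import numpy as np

predict = np.array([[0.7, 0.2, 0.1],    # hypothetical softmax outputs
                    [0.1, 0.3, 0.6]])
label = np.array([0, 2])                # ground-truth class indices

loss = -np.log(predict[np.arange(len(label)), label])  # per-sample cross-entropy
avg_loss = loss.mean()
acc = (predict.argmax(axis=1) == label).mean()         # top-1 accuracy
print(f"{avg_loss:.3f}", acc)  # 0.434 1.0
```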

3. Validating the Model

  • Validate the trained LeNet model by computing its accuracy on the test set:
# model validation
with fluid.dygraph.guard():
    accs = []
    model=MyLeNet() # instantiate the model
    model_dict,_=fluid.load_dygraph('MyLeNet')
    model.load_dict(model_dict) # load the saved parameters
    model.eval() # evaluation mode
    for batch_id,data in enumerate(test_reader()): # iterate over the test set
        images=np.array([x[0].reshape(1,20,20) for x in data],np.float32)
        labels = np.array([x[1] for x in data]).astype('int64')
        labels = labels[:, np.newaxis]
            
        image=fluid.dygraph.to_variable(images)
        label=fluid.dygraph.to_variable(labels)
            
        predict=model(image) # forward pass
        acc=fluid.layers.accuracy(predict,label)
        accs.append(acc.numpy()[0])
        avg_acc = np.mean(accs)
    print(avg_acc)

0.81890106
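The printed 0.8189 is simply the plain mean of the per-batch accuracies collected in `accs`, e.g. with three hypothetical batch values:

```python
accs = [0.84, 0.80, 0.81]        # hypothetical per-batch accuracies
avg_acc = sum(accs) / len(accs)  # what np.mean(accs) computes
print(round(avg_acc, 4))  # 0.8167
```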

IV. Prediction on a Real Image

1. Image Processing

  • Process a license plate photo, segment out each character, and save the character images:
# Process the plate image, segment out each character, and save it
license_plate = cv2.imread('./车牌.png')
gray_plate = cv2.cvtColor(license_plate, cv2.COLOR_RGB2GRAY)
ret, binary_plate = cv2.threshold(gray_plate, 175, 255, cv2.THRESH_BINARY)
result = []
for col in range(binary_plate.shape[1]):
    result.append(0)
    for row in range(binary_plate.shape[0]):
        result[col] = result[col] + binary_plate[row][col]/255
character_dict = {}
num = 0
i = 0
while i < len(result):
    if result[i] == 0:
        i += 1
    else:
        index = i + 1
        while index < len(result) and result[index] != 0:
            index += 1
        character_dict[num] = [i, index-1]
        num += 1
        i = index

for i in range(8):
    if i==2:
        continue
    padding = (170 - (character_dict[i][1] - character_dict[i][0])) / 2
    ndarray = np.pad(binary_plate[:,character_dict[i][0]:character_dict[i][1]], ((0,0), (int(padding), int(padding))), 'constant', constant_values=(0,0))
    ndarray = cv2.resize(ndarray, (20,20))
    cv2.imwrite('./' + str(i) + '.png', ndarray)
    
def load_image(path):
    img = paddle.dataset.image.load_image(file=path, is_color=False)
    img = img.astype('float32')
    img = img[np.newaxis, ] / 255.0
    return img
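The segmentation loop above is a vertical-projection method: sum each column of the binarized plate and treat maximal runs of non-zero columns as characters. A toy sketch on a hypothetical 3x8 "plate" (with an explicit bounds check on the run scan):

```python
binary = [
    [0, 255, 255, 0, 0, 255, 0, 0],   # toy binarized plate, 3 rows x 8 cols
    [0, 255,   0, 0, 0, 255, 0, 0],
    [0,   0, 255, 0, 0, 255, 0, 0],
]
cols = len(binary[0])
# Column projection: number of white pixels per column.
result = [sum(row[c] // 255 for row in binary) for c in range(cols)]
segments, i = [], 0
while i < cols:
    if result[i] == 0:
        i += 1
    else:
        j = i + 1
        while j < cols and result[j] != 0:   # bounds-checked run scan
            j += 1
        segments.append((i, j - 1))          # [start, end] columns of one character
        i = j
print(segments)  # [(1, 2), (5, 5)]
```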

2. Label Conversion

  • Convert the numeric label indices into the characters they represent:
# convert the labels to display characters
print('Label:',LABEL_temp)
match = {'A':'A','B':'B','C':'C','D':'D','E':'E','F':'F','G':'G','H':'H','I':'I','J':'J','K':'K','L':'L','M':'M','N':'N',
        'O':'O','P':'P','Q':'Q','R':'R','S':'S','T':'T','U':'U','V':'V','W':'W','X':'X','Y':'Y','Z':'Z',
        'yun':'云','cuan':'川','hei':'黑','zhe':'浙','ning':'宁','jin':'津','gan':'赣','hu':'沪','liao':'辽','jl':'吉','qing':'青','zang':'藏',
        'e1':'鄂','meng':'蒙','gan1':'甘','qiong':'琼','shan':'陕','min':'闽','su':'苏','xin':'新','wan':'皖','jing':'京','xiang':'湘','gui':'贵',
        'yu1':'渝','yu':'豫','ji':'冀','yue':'粤','gui1':'桂','sx':'晋','lu':'鲁',
        '0':'0','1':'1','2':'2','3':'3','4':'4','5':'5','6':'6','7':'7','8':'8','9':'9'}
L = 0
LABEL ={}

for V in LABEL_temp.values():
    LABEL[str(L)] = match[V]
    L += 1
print(LABEL)

- Label information -

Label: {‘0’: ‘gan’, ‘1’: ‘6’, ‘2’: ‘A’, ‘3’: ‘P’, ‘4’: ‘gan1’, ‘5’: ‘yue’, ‘6’: ‘F’, ‘7’: ‘C’, ‘8’: ‘jing’, ‘9’: ‘yu’, ‘10’: ‘ji’, ‘11’: ‘cuan’, ‘12’: ‘gui1’, ‘13’: ‘V’, ‘14’: ‘zang’, ‘15’: ‘sx’, ‘16’: ‘zhe’, ‘17’: ‘S’, ‘18’: ‘yun’, ‘19’: ‘e1’, ‘20’: ‘min’, ‘21’: ‘jin’, ‘22’: ‘9’, ‘23’: ‘H’, ‘24’: ‘liao’, ‘25’: ‘jl’, ‘26’: ‘xiang’, ‘27’: ‘D’, ‘28’: ‘3’, ‘29’: ‘M’, ‘30’: ‘yu1’, ‘31’: ‘hei’, ‘32’: ‘gui’, ‘33’: ‘J’, ‘34’: ‘0’, ‘35’: ‘shan’, ‘36’: ‘wan’, ‘37’: ‘hu’, ‘38’: ‘1’, ‘39’: ‘ning’, ‘40’: ‘Q’, ‘41’: ‘Z’, ‘42’: ‘E’, ‘43’: ‘T’, ‘44’: ‘5’, ‘45’: ‘U’, ‘46’: ‘qing’, ‘47’: ‘7’, ‘48’: ‘Y’, ‘49’: ‘W’, ‘50’: ‘lu’, ‘51’: ‘X’, ‘52’: ‘N’, ‘53’: ‘8’, ‘54’: ‘K’, ‘55’: ‘qiong’, ‘56’: ‘R’, ‘57’: ‘meng’, ‘58’: ‘su’, ‘59’: ‘4’, ‘60’: ‘G’, ‘61’: ‘L’, ‘62’: ‘xin’, ‘63’: ‘B’, ‘64’: ‘2’}
{‘0’: ‘赣’, ‘1’: ‘6’, ‘2’: ‘A’, ‘3’: ‘P’, ‘4’: ‘甘’, ‘5’: ‘粤’, ‘6’: ‘F’, ‘7’: ‘C’, ‘8’: ‘京’, ‘9’: ‘豫’, ‘10’: ‘冀’, ‘11’: ‘川’, ‘12’: ‘桂’, ‘13’: ‘V’, ‘14’: ‘藏’, ‘15’: ‘晋’, ‘16’: ‘浙’, ‘17’: ‘S’, ‘18’: ‘云’, ‘19’: ‘鄂’, ‘20’: ‘闽’, ‘21’: ‘津’, ‘22’: ‘9’, ‘23’: ‘H’, ‘24’: ‘辽’, ‘25’: ‘吉’, ‘26’: ‘湘’, ‘27’: ‘D’, ‘28’: ‘3’, ‘29’: ‘M’, ‘30’: ‘渝’, ‘31’: ‘黑’, ‘32’: ‘贵’, ‘33’: ‘J’, ‘34’: ‘0’, ‘35’: ‘陕’, ‘36’: ‘皖’, ‘37’: ‘沪’, ‘38’: ‘1’, ‘39’: ‘宁’, ‘40’: ‘Q’, ‘41’: ‘Z’, ‘42’: ‘E’, ‘43’: ‘T’, ‘44’: ‘5’, ‘45’: ‘U’, ‘46’: ‘青’, ‘47’: ‘7’, ‘48’: ‘Y’, ‘49’: ‘W’, ‘50’: ‘鲁’, ‘51’: ‘X’, ‘52’: ‘N’, ‘53’: ‘8’, ‘54’: ‘K’, ‘55’: ‘琼’, ‘56’: ‘R’, ‘57’: ‘蒙’, ‘58’: ‘苏’, ‘59’: ‘4’, ‘60’: ‘G’, ‘61’: ‘L’, ‘62’: ‘新’, ‘63’: ‘B’, ‘64’: ‘2’}
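The conversion above simply composes `LABEL_temp` (index → folder name) with `match` (folder name → display character). A minimal sketch with a hypothetical 3-entry subset:

```python
match = {'lu': '鲁', 'A': 'A', '6': '6'}        # folder name -> display character
LABEL_temp = {'0': 'lu', '1': 'A', '2': '6'}   # index -> folder name
LABEL = {k: match[v] for k, v in LABEL_temp.items()}
print(LABEL)  # {'0': '鲁', '1': 'A', '2': '6'}
```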

3. Outputting the Prediction

  • Load the saved parameters into a model instance and output the final prediction:
# build the prediction pass in dygraph mode
with fluid.dygraph.guard():
    model=MyLeNet() # instantiate the model
    model_dict,_=fluid.load_dygraph('MyLeNet')
    model.load_dict(model_dict) # load the saved parameters
    model.eval() # evaluation mode
    lab=[]
    for i in range(8):
        if i==2:
            continue
        infer_imgs = []
        infer_imgs.append(load_image('./' + str(i) + '.png'))
        infer_imgs = np.array(infer_imgs)
        infer_imgs = fluid.dygraph.to_variable(infer_imgs)
        result=model(infer_imgs)
        lab.append(np.argmax(result.numpy()))
# print(lab)

display(Image.open('./车牌.png'))
print('\nLicense plate recognition result: ', end='')
for i in range(len(lab)):
    print(LABEL[str(lab[i])],end='')


License plate recognition result: 鲁A686EJ

V. Reflections

  • Through this "Deep Learning Camp - CV Pandemic Special", I gained a much deeper understanding of Baidu's deep learning platform, AI Studio.
  • Deep learning can solve complex tasks and is widely applied in everyday life. Before this course I only knew that deep neural networks are called deep learning; I now know that applications span CV, NLP, recommendation, and other fields, and I learned about the history of the field and its established network architectures.
  • I also find PaddlePaddle's documentation very friendly. It is available in Chinese, and any network model can be found by searching the PaddlePaddle site, which gives the corresponding API with detailed parameter descriptions.
  • Overall, the Python-based AI Studio platform combined with the PaddlePaddle framework is an excellent toolchain for deep learning, and I recommend it to other beginners.
  • With this, the "Deep Learning - CV Pandemic Special" comes to an end. I look forward to Baidu's upcoming advanced CV and NLP camps!
