CNN Airplane Recognition with the PaddlePaddle High-Level API (with a Detailed Code Walkthrough)

Latest recommended article published on 2025-03-06 11:49:20

edward_zcl

Original link: https://blog.csdn.net/lzx159951/article/details/104932536/

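This article walks through a hands-on airplane-recognition project built with PaddlePaddle, covering data preprocessing, the CNN structure, the data reader, the optimizer, and the training and testing process. It uses the high-level APIs Ploter, Trainer and Inferencer, which make the code shorter and more compact. During training, an event handler monitors the training loss in real time and saves the model parameters. Finally, the model is evaluated and its accuracy on the training and test data is printed.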

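Foreword

After working through handwritten digit recognition with PaddlePaddle, I moved on to the next hands-on project: airplane recognition.

Unlike the previous project, this one uses a higher-level framework API. The code is more compact, but that raises a problem: the course never explains these APIs.

I spent a little over two hours reading the source code before I finally understood them. This post shares what I found.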

Network structure of the airplane-recognition CNN

[Figure: network structure diagram of the CNN]

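Dataset and project description

The project contains 7,897 images in total: 5,897 in the training set and 2,000 in the test set. Each image is 32*32 pixels with 3 channels, so its array shape is (32,32,3).

The project is Day 2 of Baidu's official video course 七天入门深度学习 (Seven Days to Get Started with Deep Learning).
Course link: Day 2 hands-on: airplane recognition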

Code walkthrough

1. Load the data

import numpy as np  # needed for np.load and the array operations below

testdata_orgin = np.load('data/plane/testdata.npy')
testlabel_orgin = np.load('data/plane/testlabel.npy')
traindata_orgin = np.load('data/plane/traindata.npy')
trainlabel_orgin = np.load('data/plane/trainlabel.npy')

Shapes of the loaded arrays:
testdata_orgin.shape == (2000, 32, 32, 3)
testlabel_orgin.shape == (2000, 1)
traindata_orgin.shape == (5897, 32, 32, 3)
trainlabel_orgin.shape == (5897, 1)

2. Convert the data format

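# The images are stored channel-last as (N, 32, 32, 3). A plain reshape to (N, 3, 32, 32)
# keeps the flat element order and scrambles the pixels, so move the channel axis with transpose.
testdata = np.array(testdata_orgin).transpose(0, 3, 1, 2).astype(np.float32)
traindata = np.array(traindata_orgin).transpose(0, 3, 1, 2).astype(np.float32)
testlabel = np.array(testlabel_orgin).reshape(2000,1).astype(np.float32)
trainlabel_orgin = np.array(trainlabel_orgin).reshape(5897,1).astype(np.float32)

The model expects each image in channel-first layout, (channels, height, width), so the channel-last raw data has to be converted here.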
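
To see why the axis order matters when going from channel-last to channel-first, here is a small sketch on a made-up 2*2 image (the array values are purely illustrative, not taken from the dataset). It compares moving the channel axis with transpose against simply reshaping the same buffer.

```python
import numpy as np

# One tiny hypothetical "image": height 2, width 2, 3 channels (channel-last).
# Each pixel holds three consecutive values standing in for R, G, B.
img_hwc = np.array([[[0, 1, 2], [10, 11, 12]],
                    [[20, 21, 22], [30, 31, 32]]])

# transpose moves the channel axis to the front and keeps every pixel intact.
img_chw = img_hwc.transpose(2, 0, 1)

# reshape only reinterprets the flat buffer, so values from different
# channels end up in the same plane.
img_reshaped = img_hwc.reshape(3, 2, 2)

red_plane = img_chw[0]          # first channel of every pixel
mixed_plane = img_reshaped[0]   # first four values of the flat buffer

print(red_plane)
print(mixed_plane)
```

The first plane produced by transpose contains only first-channel values, while the reshaped one mixes channels of neighbouring pixels.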

3. Normalize the data

testdata = 2*testdata/255.0-1.0
traindata = 2*traindata/255.0-1.0

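Normalization maps every pixel value into the interval [-1,1] (assuming the raw values lie in [0,255]). This is a standard preprocessing step for image classification; it usually helps training converge faster and can improve model accuracy.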
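
As a quick check of the formula, the sketch below applies it to a few made-up pixel values that cover the 8-bit range (the sample values are my own, chosen only for illustration).

```python
import numpy as np

# Hypothetical pixel values covering the full 8-bit range.
pixels = np.array([0, 64, 128, 255], dtype=np.float32)

# Same formula as above: scale to [0, 2], then shift to [-1, 1].
normalized = 2 * pixels / 255.0 - 1.0

print(normalized.min(), normalized.max())
```

The smallest input lands on the lower bound of the interval and the largest on the upper bound.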

4. Build the data reader

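def dataset(data,label,buf_size):# buf_size is the number of samples the reader yields
    def reader():
        for i in range(buf_size):
            # label has shape (N, 1); take the scalar before converting to int
            yield data[i,:],int(label[i,0])
    return reader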

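Because reader contains a yield statement, it is a generator function: calling it returns a generator (a kind of iterator) that produces one (image, label) pair at a time, and dataset simply returns that reader. The benefit of feeding data this way is lower memory use, since samples are handed out on demand.

For details, see this article:
I Finally Understand What a Python Iterator Is (with an Analysis of Part of the PaddlePaddle Source Code)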
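
Here is a minimal sketch of how such a reader behaves when it is consumed. The toy arrays are made up for the demonstration; only the dataset function follows the code above.

```python
import numpy as np

def dataset(data, label, buf_size):
    """Return a reader: a function whose call yields (sample, int_label) pairs."""
    def reader():
        for i in range(buf_size):
            yield data[i, :], int(label[i, 0])
    return reader

# Hypothetical toy data: 4 samples with 3 features each, labels of shape (4, 1).
toy_data = np.arange(12, dtype=np.float32).reshape(4, 3)
toy_label = np.array([[0], [1], [1], [0]], dtype=np.float32)

reader = dataset(toy_data, toy_label, buf_size=3)
gen = reader()                          # nothing has been read yet
first_sample, first_label = next(gen)   # pulls exactly one sample
remaining = list(gen)                   # drains the other two samples

print(first_sample, first_label, len(remaining))
```

Calling reader() again starts a fresh pass over the data, which is how an epoch-based training loop can reuse the same reader.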

5. Build the convolutional network

def CNN():
    img = fluid.layers.data(
        name='img', shape =[3,32,32],dtype = 'float32')
    hidden = fluid.nets.simple_img_conv_pool(
        input=img,
        num_filters=6,# number of convolution filters
        filter_size=5,# convolution kernel size
        pool_size=2,# pooling window size
        pool_stride=2,# pooling stride
        pool_padding=0,
        act='relu'
    )
    hidden1 = fluid.nets.simple_img_conv_pool(
        input=hidden,
        num_filters=16,
        filter_size=5,
        pool_size=2,
        pool_stride=2,
        pool_padding=0,
        act='relu'
    )
    # NOTE: softmax on a hidden layer is unusual; 'relu' is the more common choice here.
    flatten = fluid.layers.fc(input=hidden1,size=120,act='softmax')
    y_prediction = fluid.layers.fc(input=flatten,size=2,act='softmax')
    return y_prediction

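The network diagram at the top of the article shows the convolutional neural network used here, and this code builds it layer by layer. The function returns the prediction: two softmax probabilities per input image, so a batch produces an output of shape (batch_size, 2).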
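
To follow how the feature map shrinks through the two conv-pool blocks, the sketch below traces the spatial size by hand. It assumes the convolutions use stride 1 and no padding, which is my reading of the defaults since the code above does not set them; conv_out is a helper name I made up.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32                                    # input images are 32*32
size = conv_out(size, kernel=5)              # first 5*5 convolution
size = conv_out(size, kernel=2, stride=2)    # first 2*2 pooling, stride 2
after_block1 = size
size = conv_out(size, kernel=5)              # second 5*5 convolution
size = conv_out(size, kernel=2, stride=2)    # second 2*2 pooling, stride 2
after_block2 = size

# The second block has 16 filters, so this many values feed the first fc layer.
flat_features = 16 * after_block2 * after_block2

print(after_block1, after_block2, flat_features)
```

Under those assumptions the fully connected layer of size 120 receives a flattened vector of 16 feature maps per image.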

6. Build the full network

def train_func():
    y_label = fluid.layers.data(name='label',shape=[1],dtype='int64')
    # forward pass: get the predicted class probabilities
    prediction = CNN()
    # cross-entropy loss for each sample
    cost = fluid.layers.cross_entropy(input=prediction,label=y_label)
    # average the loss over the batch
    avg_cost = fluid.layers.mean(cost)
    return avg_cost

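This function calls CNN() to get the predictions, computes the cross-entropy loss between the predictions and the labels, and finally uses mean() to average the loss over the batch.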
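
The same computation can be sketched in NumPy to make the numbers concrete. The probabilities and labels below are invented for the example; for integer class labels, cross-entropy is the negative log of the probability assigned to the true class.

```python
import numpy as np

# Hypothetical softmax outputs for a batch of 3 images (2 classes each).
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.6, 0.4]])
# Hypothetical integer labels, one per image.
labels = np.array([0, 1, 1])

# Pick the probability of the true class for each row, then take -log.
per_sample = -np.log(probs[np.arange(len(labels)), labels])

# Average over the batch, like the mean() call above.
avg_cost = per_sample.mean()

print(per_sample, avg_cost)
```

The third sample contributes the largest loss because the model gave its true class the lowest probability.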

7. Build the optimizer

def optimizer_func():
    # create the optimizer
    optimizer = fluid.optimizer.Momentum(learning_rate=0.001,momentum=0.5)
    return optimizer

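This uses the Momentum optimizer: the first argument is the learning rate and the second is the momentum factor. Other optimizers, such as Adam or SGD (stochastic gradient descent), can be used instead.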
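
For intuition about what the two arguments do, here is a sketch of the classic momentum update rule on a one-dimensional toy problem. The function momentum_step and the toy objective are my own illustration, not Paddle code, and the learning rate is raised to 0.1 so the toy problem converges in a few steps.

```python
def momentum_step(param, grad, velocity, learning_rate=0.001, momentum=0.5):
    """One classic momentum update: accumulate velocity, then move the parameter."""
    velocity = momentum * velocity + grad
    param = param - learning_rate * velocity
    return param, velocity

# Toy objective f(w) = w**2, whose gradient is 2*w and whose minimum is at w = 0.
w, v = 1.0, 0.0
history = [w]
for _ in range(200):
    w, v = momentum_step(w, 2 * w, v, learning_rate=0.1, momentum=0.5)
    history.append(w)

print(history[1], w)
```

The velocity term carries part of the previous update into the next one, which is what distinguishes this rule from plain gradient descent.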

8. Visualize the training loss and save the parameters

params_dirname="model/plane-model"
train_title="Train cost"
test_title="Test cost"
plot_cost = Ploter(train_title,test_title)# register two curve titles; each one is the key used later when appending data
feed_order = ['img','label']# names of the input variables, in the order the reader yields them
step = 0
def event_handler_plot(event):
    global step# step is a module-level counter shared across calls
    if isinstance(event, EndStepEvent):# only react when a training step has finished
        if event.step % 2 == 0: # every 2 batches, record the training cost
            if event.metrics[0] < 10:# only plot losses below 10
                # append a point: curve key, x value (step), y value (loss)
                plot_cost.append(train_title, step, event.metrics[0])
        if event.step % 20 == 0: # every 20 batches, record the test cost
            # trainer.test returns the metrics on the test reader; element 0 is the loss
            test_metrics = trainer.test(
                reader=test_reader, feed_order=feed_order)
            if test_metrics[0] < 10:
                plot_cost.append(test_title, step, test_metrics[0])
                plot_cost.plot()
        # save the parameters so they can be used for prediction later
        if params_dirname is not None:
            trainer.save_params(params_dirname)
    step += 1

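In a notebook, this handler shows a plot that updates as training runs, which makes the whole training process much easier to follow.
[Figure: live plot of training and test cost]
In addition, trainer.save_params(params_dirname) saves the parameters under the given path.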

9. Create the data readers

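# the data actually used for training and testing
# NOTE: dataset(..., buf_size=209) yields only the first 209 training samples and
# dataset(..., buf_size=50) only the first 50 test samples; pass 5897 and 2000 to use every image.
BATCH_SIZE=16
train_reader = paddle.batch(
    paddle.reader.shuffle(dataset(traindata,trainlabel_orgin,buf_size=209),buf_size=50),
    batch_size=BATCH_SIZE
)
test_reader = paddle.batch(
    paddle.reader.shuffle(dataset(testdata,testlabel_orgin,buf_size=50),buf_size=20),
    batch_size=BATCH_SIZE
)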
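
To show what wrapping a reader in a batching step does, here is a plain-Python stand-in that I wrote for illustration. It is not Paddle's source; it only mimics the behaviour of grouping samples into lists of batch_size, and it assumes the last, smaller batch is kept.

```python
def batch(reader, batch_size):
    """Hypothetical stand-in: wrap a sample reader into a reader of mini-batches."""
    def batch_reader():
        current = []
        for sample in reader():
            current.append(sample)
            if len(current) == batch_size:
                yield current
                current = []
        if current:            # keep the final partial batch
            yield current
    return batch_reader

def toy_reader():
    """Hypothetical reader yielding 50 (value, label) pairs."""
    for i in range(50):
        yield i, i % 2

batches = list(batch(toy_reader, 16)())
sizes = [len(b) for b in batches]

print(sizes)
```

With 50 samples and a batch size of 16, the last batch is smaller than the others, which is why the test code counts labels per batch instead of assuming a fixed size.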

10. Create the trainer and start training

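# execution device
place =  fluid.CPUPlace()
# create the trainer
trainer = Trainer(
    train_func = train_func,# must be a function that returns the loss
    place=place,# execution device
    optimizer_func=optimizer_func# function that returns the optimizer
)
# start training
trainer.train(
    reader=train_reader,# training data
    num_epochs=30,# number of epochs; each epoch goes through all the data in the reader
    event_handler=event_handler_plot,# event handler
    feed_order=feed_order
)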

11. Create the inferencer

inferencer = Inferencer(
    infer_func =CNN ,# function that returns the prediction
    param_path=params_dirname,# path where the parameters were saved earlier
    place=place# execution device
)

12. Test the model

def right_ratio(right_counter,total):
    ratio = float(right_counter)/total
    return ratio
def evl(data_set):
    total=0
    right_counter=0
    pass_num=0
    for mini_batch in data_set():
        pass_num+=1
        test_x = np.array([data[0] for data in mini_batch]).astype("float32")
        test_y = np.array([data[1] for data in mini_batch]).astype("int64")
        mini_batch_result = inferencer.infer({'img':test_x})
        # probability of class 1 (last column) above 0.5 means predicted label 1, otherwise 0
        mini_batch_result = (mini_batch_result[0][:,-1]>0.5)+0
        label = np.array(test_y)
        label_len = len(label)
        total+=label_len
        for i in range(label_len):
            if mini_batch_result[i]==label[i]:
                right_counter+=1
    ratio = right_ratio(right_counter,total)
    return ratio
ratio = evl(train_reader)
print("Accuracy on the training data: %0.2f%%" %(ratio*100))
ratio = evl(test_reader)
print("Accuracy on the test data: %0.2f%%" %(ratio*100))
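
The thresholding and counting in evl can be tried out without a trained model. The sketch below uses made-up probabilities and labels, and replaces the inner for loop with a vectorized comparison; it is an illustration of the same idea, not the code used for the reported run.

```python
import numpy as np

# Hypothetical softmax outputs for 4 images: columns are P(class 0), P(class 1).
probs = np.array([[0.80, 0.20],
                  [0.30, 0.70],
                  [0.45, 0.55],
                  [0.90, 0.10]])
# Hypothetical ground-truth labels.
labels = np.array([0, 1, 0, 0])

# Last column above 0.5 means "class 1"; adding 0 turns booleans into 0/1 integers.
predicted = (probs[:, -1] > 0.5) + 0

# Fraction of positions where prediction and label agree.
accuracy = float((predicted == labels).sum()) / len(labels)

print(predicted, accuracy)
```

Here the third image is misclassified, so three of the four predictions match.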

The high-level APIs explained

Importing the high-level APIs

from paddle.utils.plot import Ploter
from paddle.fluid.contrib.trainer import EndStepEvent
from paddle.fluid.contrib.trainer import Trainer
from paddle.fluid.contrib.inferencer import Inferencer

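1. Ploter(): plotting
Ploter(train_title,test_title): the arguments are the curve titles. They serve as keys that identify which curve later data points belong to, and they also appear as the labels in the plot legend.

.append(train_title, step, event.metrics[0]): the first argument is the key of the curve to add data to, the second is the x value, and the third is the y value. The example uses the step count step as x and the loss event.metrics[0] as y.
.plot(): displays the figure. Note that it does not display in a local PyCharm environment; it does display in notebook-style environments. Its argument is a path: if a path is passed, the figure is saved to that path.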
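
To make the role of the titles as keys concrete, here is a tiny stand-in class I wrote for illustration. MiniPloter is hypothetical: it only mimics the append bookkeeping described above, draws nothing, and is not how Paddle implements Ploter.

```python
class MiniPloter:
    """Hypothetical stand-in that mimics how Ploter is used in this article."""

    def __init__(self, *titles):
        # One (x values, y values) pair of lists per curve title.
        self.series = {title: ([], []) for title in titles}

    def append(self, title, step, value):
        xs, ys = self.series[title]
        xs.append(step)
        ys.append(value)

    def summary(self):
        # Number of points collected for each curve.
        return {title: len(xs) for title, (xs, _) in self.series.items()}

plot_cost = MiniPloter("Train cost", "Test cost")
for step, loss in enumerate([0.9, 0.7, 0.6, 0.5]):   # made-up training losses
    plot_cost.append("Train cost", step, loss)
plot_cost.append("Test cost", 0, 0.8)                 # made-up test loss

counts = plot_cost.summary()
print(counts)
```

Each title selects its own pair of lists, so training and test points never get mixed even though they go through the same object.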

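2. Trainer(): the trainer

Trainer(
train_func = train_func,# must be a function that returns the loss
place=place,# execution device
optimizer_func=optimizer_func# function that returns the optimizer
)

Argument 1 is the function that builds our network. The requirement is that its return value must be the loss.

Argument 2 is the execution device place, i.e. whether to run on the GPU or the CPU.

Argument 3 is a function that returns the optimizer.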

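.train(
reader=train_reader,# training data
num_epochs=30,# number of epochs; each epoch goes through all the data in the reader
event_handler=event_handler_plot,# event handler
feed_order=feed_order
)

This is the training method of Trainer().

Argument 1 is the training data, given as a reader (a function that returns an iterator over batches); we built it earlier.

Argument 2 is the number of epochs, i.e. the number of passes of the outer loop.

Argument 3 is the event handler. Here it is event_handler_plot, the function that displays training progress. The handler must accept one parameter, which I named event. By checking the type of event we can tell where training currently is: for example, BeginStepEvent marks the start of a step and EndStepEvent marks its end, and each event type carries different data.

Argument 4 is a list of the names of the variables to feed. Here feed_order=['img','label']: the first is the name of the image input variable and the second is the name of the label variable.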
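
The event-handler pattern is easier to see in a toy loop. In the sketch below, the two event classes and toy_train are stand-ins I wrote for the demonstration: they borrow the names BeginStepEvent and EndStepEvent but are not Paddle's classes, and the "training" just shrinks a fake loss.

```python
class BeginStepEvent:
    """Stand-in event fired before a step (not Paddle's class)."""
    def __init__(self, epoch, step):
        self.epoch, self.step = epoch, step

class EndStepEvent:
    """Stand-in event fired after a step, carrying the step's metrics."""
    def __init__(self, epoch, step, metrics):
        self.epoch, self.step, self.metrics = epoch, step, metrics

def toy_train(num_epochs, steps_per_epoch, event_handler):
    """Hypothetical loop that calls the handler before and after every step."""
    loss = 1.0
    for epoch in range(num_epochs):
        for step in range(steps_per_epoch):
            event_handler(BeginStepEvent(epoch, step))
            loss *= 0.9                      # pretend one optimization step
            event_handler(EndStepEvent(epoch, step, [loss]))

recorded = []

def handler(event):
    # React only to finished steps, and only to every second one.
    if isinstance(event, EndStepEvent) and event.step % 2 == 0:
        recorded.append((event.epoch, event.step, event.metrics[0]))

toy_train(num_epochs=2, steps_per_epoch=4, event_handler=handler)
print(recorded)
```

The loop knows nothing about plotting or saving; all of that lives in the handler, which is why the same trainer can be reused with different handlers.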

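3. Inferencer(): the inferencer

Put simply, it takes the trained model and the test data and runs the model on them. It returns the predictions, i.e. the output of CNN().

Inferencer(
infer_func =CNN ,# function that returns the prediction
param_path=params_dirname,# path where the parameters were saved earlier
place=place# execution device
)

Argument 1 is a function that returns the prediction.
Argument 2 is the path of the saved parameters (reusing what training already cooked 😆).
Argument 3 is the execution device.

inferencer.infer({'img':test_x})
The argument must be a dictionary: each key is the name of an input variable and each value is the data to feed.
The return value holds the predictions; as used in the test code above, its first element is an array of shape (batch_size,2).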