A Complete Guide to C++ Inference Development with Paddle-Lite

费发肠Norman

Published 2025-06-06 09:03:56

Article link: https://blog.csdn.net/gitblog_01193/article/details/148464816

Paddle-Lite: PaddlePaddle High Performance Deep Learning Inference Engine for Mobile and Edge. Project address: https://gitcode.com/gh_mirrors/pa/Paddle-Lite

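Introduction

Paddle-Lite is a lightweight deep learning inference framework optimized for mobile and embedded devices. This article walks through model inference with the Paddle-Lite C++ API, covering the core workflow that a developer needs to build an on-device deep learning application.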

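Core Concepts and Workflow Overview

The Paddle-Lite C++ inference workflow consists of the following key steps:

Environment setup: prepare the model file and the prediction library
Predictor creation: load the optimized model
Input handling: prepare the model's input data
Inference execution: run the model computation
Output parsing: fetch and process the inference results

The sections below explain how each step is implemented, using a complete MobileNet image classification example.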

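Environment Preparation

1. Development environment requirements

A compiler with C++11 support
The CMake build tool
Android NDK (if you target Android)
The ADB debugging tool (for deploying to Android devices)

2. Obtaining the prediction library

Paddle-Lite provides a precompiled prediction library that contains the following key components:

Header file: include/paddle_api.h
Shared library: lib/libpaddle_light_api_shared.so
Static library: lib/libpaddle_api_light_bundled.a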

Implementing the Basic Inference Flow

1. Include the required headers

#include "paddle_api.h"  // Core Paddle-Lite header
using namespace paddle::lite_api;  // Bring the Paddle-Lite API namespace into scope

2. Create the predictor

// 1. Configure the mobile runtime parameters
MobileConfig config;
// 2. Set the path of the optimized model
config.set_model_from_file("mobilenet_v1_opt.nb");
// 3. Create the predictor object
std::shared_ptr<PaddlePredictor> predictor = 
    CreatePaddlePredictor<MobileConfig>(config);

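3. Prepare the input data

// Get the input Tensor
std::unique_ptr<Tensor> input_tensor(std::move(predictor->GetInput(0)));
// Set the input shape (batch_size, channels, height, width)
input_tensor->Resize({1, 3, 224, 224});
// Get the data pointer and fill in the data
auto* data = input_tensor->mutable_data<float>();
// ShapeProduction is a helper that returns the product of the dimensions;
// define it in your own source if your build does not already provide it.
for (int i = 0; i < ShapeProduction(input_tensor->shape()); ++i) {
    data[i] = 1.0f;  // This example uses an all-ones input
}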
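
The loop above depends on an element-count helper. A minimal sketch of such a helper follows, assuming that the shape type is a vector of 64-bit integers (the `shape_t` alias here is declared locally for illustration and is meant to mirror the type returned by `Tensor::shape()`):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumption: mirrors the shape type returned by Tensor::shape().
using shape_t = std::vector<int64_t>;

// Returns the number of elements in a tensor with the given shape.
// An empty shape yields 1; any zero-sized dimension yields 0.
int64_t ShapeProduction(const shape_t& shape) {
  int64_t count = 1;
  for (int64_t dim : shape) {
    assert(dim >= 0 && "dimensions must be non-negative");
    count *= dim;
  }
  return count;
}
```

For the shape {1, 3, 224, 224} used in this article, the helper returns 150528, which is the number of floats the input buffer must hold.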

4. Run inference

predictor->Run();  // Execute the model computation

5. Fetch the output

std::unique_ptr<const Tensor> output_tensor(
    std::move(predictor->GetOutput(0)));
// Get the output data pointer
auto output_data = output_tensor->data<float>();
// Process the output...
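
For a classifier, processing the output usually means picking the highest-scoring classes. The following is a sketch of one way to do that on a raw float buffer; the `TopK` function is a hypothetical helper written for this article, not part of the Paddle-Lite API, and it assumes the output holds one score per class:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Returns the k highest-scoring (class index, score) pairs, best first.
// `scores` points at `count` contiguous floats, such as the buffer
// obtained from output_tensor->data<float>().
std::vector<std::pair<int, float>> TopK(const float* scores, std::size_t count,
                                        std::size_t k) {
  assert(scores != nullptr || count == 0);
  k = std::min(k, count);
  std::vector<int> indices(count);
  std::iota(indices.begin(), indices.end(), 0);
  // Only the first k positions need to end up sorted.
  std::partial_sort(indices.begin(), indices.begin() + k, indices.end(),
                    [scores](int a, int b) { return scores[a] > scores[b]; });
  std::vector<std::pair<int, float>> result;
  result.reserve(k);
  for (std::size_t i = 0; i < k; ++i) {
    result.emplace_back(indices[i], scores[indices[i]]);
  }
  return result;
}
```

A call such as `TopK(output_data, element_count, 5)` would return the five best candidates, where `element_count` is the product of the output tensor's dimensions.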

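Hands-On Example: MobileNet Image Classification

1. Prepare the model

First convert the original model into the Paddle-Lite optimized format. With the naive_buffer output type, the tool writes the result to ./mobilenet_v1_opt.nb, which is the file the predictor loads:

paddle_lite_opt --model_dir=./mobilenet_v1 \
                --optimize_out_type=naive_buffer \
                --optimize_out=./mobilenet_v1_opt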
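
Because the predictor in this article loads a file with the .nb suffix, a program can reject an obviously wrong path before creating the predictor. A small sketch of such a check, where `HasNaiveBufferSuffix` is a hypothetical helper and not a Paddle-Lite function:

```cpp
#include <string>

// Returns true when `path` has a non-empty stem followed by ".nb".
bool HasNaiveBufferSuffix(const std::string& path) {
  const std::string suffix = ".nb";
  return path.size() > suffix.size() &&
         path.compare(path.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

This only inspects the file name; it does not verify that the file exists or that its contents are a valid model.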

2. Complete example code walkthrough

#include <cstdint>
#include <iostream>
#include "paddle_api.h"

using namespace paddle::lite_api;

int main(int argc, char** argv) {
    if (argc < 2) {
        std::cerr << "Usage: " << argv[0] << " <model_file>" << std::endl;
        return -1;
    }

    // 1. Create the predictor
    MobileConfig config;
    config.set_model_from_file(argv[1]);
    auto predictor = CreatePaddlePredictor<MobileConfig>(config);

    // 2. Prepare the input
    auto input_tensor = predictor->GetInput(0);
    input_tensor->Resize({1, 3, 224, 224});
    auto input_data = input_tensor->mutable_data<float>();

    // 3. Fill the input (all ones in this example). The element count is the
    //    product of the dimensions, computed here so that the program does
    //    not depend on an external ShapeProduction helper.
    int64_t input_size = 1;
    for (auto dim : input_tensor->shape()) {
        input_size *= dim;
    }
    for (int64_t i = 0; i < input_size; ++i) {
        input_data[i] = 1.0f;
    }

    // 4. Run inference
    predictor->Run();

    // 5. Fetch the output
    auto output_tensor = predictor->GetOutput(0);
    auto output_data = output_tensor->data<float>();
    
    // 6. Print the output shape as a sample result
    std::cout << "Output tensor shape: ";
    for (auto dim : output_tensor->shape()) {
        std::cout << dim << " ";
    }
    std::cout << std::endl;

    return 0;
}

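3. Build and run

Compile the example with the following command. This form builds natively on a host whose architecture matches the prediction library; for an Android target, use the cross-compiler from the Android NDK in place of g++:

g++ -std=c++11 mobilenet_demo.cc -o mobilenet_demo \
    -I/path/to/paddle_lite/include \
    -L/path/to/paddle_lite/lib \
    -lpaddle_light_api_shared

Run on an Android device (the pushed binary must have been built for Android):

adb push mobilenet_demo /data/local/tmp/
adb push mobilenet_v1_opt.nb /data/local/tmp/
adb push libpaddle_light_api_shared.so /data/local/tmp/
adb shell "cd /data/local/tmp && LD_LIBRARY_PATH=. ./mobilenet_demo mobilenet_v1_opt.nb"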

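Advanced Usage

1. Image preprocessing

In a real application, the input image must be preprocessed before it is copied into the Tensor:

// Example: image normalization (requires OpenCV, e.g. <opencv2/opencv.hpp>)
// Note: cv::imread returns pixels in BGR channel order.
cv::Mat image = cv::imread("test.jpg");
cv::resize(image, image, cv::Size(224, 224));
image.convertTo(image, CV_32FC3);
image = image / 255.0f;  // Scale pixel values to [0, 1]

// Convert from HWC to CHW layout and fill the Tensor
float* input_data = input_tensor->mutable_data<float>();
for (int c = 0; c < 3; ++c) {
    for (int h = 0; h < 224; ++h) {
        for (int w = 0; w < 224; ++w) {
            input_data[c * 224 * 224 + h * 224 + w] = 
                image.at<cv::Vec3f>(h, w)[c];
        }
    }
}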
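
The same layout conversion can be written without OpenCV on a raw pixel buffer. The sketch below also applies a per-channel mean and standard deviation. That normalization step is an assumption added for illustration, since the snippet above only divides by 255; the correct values and channel order depend on how the model was trained. `HwcToChwNormalized` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts an interleaved 8-bit image (HWC layout, 3 channels) into planar
// float data (CHW layout), computing (pixel / 255 - mean[c]) / stddev[c].
std::vector<float> HwcToChwNormalized(const std::vector<uint8_t>& pixels,
                                      int height, int width,
                                      const float mean[3],
                                      const float stddev[3]) {
  const std::size_t plane = static_cast<std::size_t>(height) * width;
  assert(pixels.size() == plane * 3);
  std::vector<float> chw(plane * 3);
  for (std::size_t i = 0; i < plane; ++i) {
    for (int c = 0; c < 3; ++c) {
      const float scaled = pixels[i * 3 + c] / 255.0f;
      // Channel c of pixel i moves into plane c at offset i.
      chw[c * plane + i] = (scaled - mean[c]) / stddev[c];
    }
  }
  return chw;
}
```

Passing a mean of 0 and a standard deviation of 1 for every channel reproduces the plain divide-by-255 scaling. The returned vector can then be copied into the buffer obtained from `mutable_data<float>()`.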

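2. Handling models with multiple inputs and outputs

For a model with several inputs or outputs, handle each Tensor separately:

// Handle multiple inputs
const int num_inputs = static_cast<int>(predictor->GetInputNames().size());
for (int i = 0; i < num_inputs; ++i) {
    auto input = predictor->GetInput(i);
    // Resize and fill each input according to its own requirements...
}

// Handle multiple outputs
const int num_outputs = static_cast<int>(predictor->GetOutputNames().size());
for (int i = 0; i < num_outputs; ++i) {
    auto output = predictor->GetOutput(i);
    // Process each output...
}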