Source code: https://github.com/Maratyszcza/NNPACK
Step 1: Installing and configuring NNPACK
NNPACK can be built on OS X and Linux.
Install ninja build system
sudo apt-get install ninja-build || brew install ninja
Install PeachPy assembler and confu configuration system
[sudo] pip install --upgrade git+https://github.com/Maratyszcza/PeachPy
[sudo] pip install --upgrade git+https://github.com/Maratyszcza/confu
Then clone NNPACK, install dependencies, configure, and build
git clone https://github.com/Maratyszcza/NNPACK.git
cd NNPACK
confu setup
python ./configure.py
ninja
If ninja reports a version mismatch, download the ninja source package, build it yourself, copy the resulting executable into the NNPACK directory, and run ./ninja; this produces the static libraries.
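Once the build finishes, a quick smoke test confirms the library is usable on the current CPU. This is a minimal sketch; the include and library paths are assumptions, adjust them to wherever the build placed the headers and libnnpack.a:

// smoke_test.cpp: verifies that NNPACK initializes on this machine.
// Build (paths are assumptions):
//   g++ smoke_test.cpp -o smoke_test -I./include -L./lib -lnnpack -lpthreadpool -lpthread
#include <cstdio>
#include "nnpack.h"

int main() {
    // nnp_initialize fails if the CPU lacks the SIMD extensions NNPACK requires
    if (nnp_initialize() != nnp_status_success) {
        printf("NNPACK initialization failed\n");
        return 1;
    }
    printf("NNPACK initialized successfully\n");
    nnp_deinitialize();
    return 0;
}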
Step 2: A working example on Linux
#include <iostream>
#include <cstdlib>
#include <vector>
#include "nnpack.h"
using namespace std;

float test_nnpack(){
    // Initialize NNPACK; this fails if the CPU lacks the SIMD support NNPACK requires
    enum nnp_status init_status = nnp_initialize();
    if (init_status != nnp_status_success) {
        return 0;
    }
    enum nnp_convolution_algorithm algorithm = nnp_convolution_algorithm_auto;
    enum nnp_convolution_transform_strategy strategy = nnp_convolution_transform_strategy_tuple_based;
    const size_t batch_size = 1;
    const size_t input_channels = 128;
    const size_t output_channels = 128;
    const struct nnp_padding input_padding = { 1, 1, 1, 1 };
    const struct nnp_size input_size = { 256, 256 };
    const struct nnp_size kernel_size = { 5, 5 };
    const struct nnp_size stride = { .width = 2, .height = 2 };
    const struct nnp_size output_size = {
        .width = (input_padding.left + input_size.width + input_padding.right - kernel_size.width) / stride.width + 1,
        .height = (input_padding.top + input_size.height + input_padding.bottom - kernel_size.height) / stride.height + 1
    };
    // With a 256x256 input, 5x5 kernel, padding 1 and stride 2: (1+256+1-5)/2+1 = 127, so the output is 127x127
    // Allocate memory for input, kernel, output, bias
    float* input = (float*)malloc(batch_size * input_channels * input_size.height * input_size.width * sizeof(float));
    float* kernel = (float*)malloc(input_channels * output_channels * kernel_size.height * kernel_size.width * sizeof(float));
    float* output = (float*)malloc(batch_size * output_channels * output_size.height * output_size.width * sizeof(float));
    float* bias = (float*)malloc(output_channels * sizeof(float));
    pthreadpool_t threadpool = NULL; // NULL runs the computation single-threaded
    struct nnp_profile computation_profile; // filled with per-phase timing on each call
    // Initialize input data
    for (size_t c = 0; c < input_channels; c++) {
        for (size_t i = 0; i < input_size.height; i++) {
            for (size_t j = 0; j < input_size.width; j++) {
                input[c * input_size.height * input_size.width + i * input_size.width + j] = (i * input_size.width + j) * 0.1f;
            }
        }
    }
    // Initialize kernel data
    for (size_t i = 0; i < output_channels; i++) {
        for (size_t j = 0; j < input_channels * kernel_size.height * kernel_size.width; j++) {
            kernel[i * input_channels * kernel_size.height * kernel_size.width + j] = 0.1f;
        }
    }
    // Initialize bias data
    for (size_t i = 0; i < output_channels; i++) {
        bias[i] = 1.0f;
    }
    // Execute the convolution 10 times; the argument list below matches the NNPACK
    // API of the time (newer revisions of nnp_convolution_inference add workspace
    // and activation parameters)
    for (int i = 0; i < 10; i++) {
        nnp_convolution_inference(algorithm,
                                  strategy,
                                  input_channels,
                                  output_channels,
                                  input_size,
                                  input_padding,
                                  kernel_size,
                                  stride,
                                  input,
                                  kernel,
                                  bias,
                                  output,
                                  threadpool,
                                  &computation_profile);
    }
    // Copy the result out
    std::vector<float> out;
    for (size_t i = 0; i < output_channels * output_size.height * output_size.width; i++) {
        out.push_back(output[i]);
    }
    // Release resources
    free(input);
    free(kernel);
    free(output);
    free(bias);
    nnp_deinitialize();
    return 1;
}

int main() {
    cout << test_nnpack() << endl;
    return 0;
}
Compile: g++ 1.cpp -o test -I./include -L./lib -lnnpack -lpthreadpool -lpthread
Run: ./test
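Because the example passes &computation_profile as the last argument, each call fills in a per-phase timing breakdown. A minimal sketch of reading it back (field names taken from the nnp_profile struct in nnpack.h; check them against your copy of the header):

#include <cstdio>
#include "nnpack.h"

// Prints the timing breakdown NNPACK records for one convolution call.
// All nnp_profile fields are in seconds.
void print_profile(const struct nnp_profile& p) {
    printf("total:                %.6f s\n", p.total);
    printf("input transform:      %.6f s\n", p.input_transform);
    printf("kernel transform:     %.6f s\n", p.kernel_transform);
    printf("output transform:     %.6f s\n", p.output_transform);
    printf("block multiplication: %.6f s\n", p.block_multiplication);
}

Calling print_profile(computation_profile) after the loop in the example above shows where the time goes for the chosen algorithm.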
Step 3: Running on Android via JNI
1. Configure the NDK path
Run sudo gedit /etc/profile, append the following at the end of the file, then save and exit:
export NDK_ROOT=<path to your NDK>
export PATH=$NDK_ROOT:$PATH
Reload it with: source /etc/profile
2. Change into the NNPACK directory
Run: ${NDK_ROOT}/ndk-build
3. Collect the static libraries
The build produces five static libraries under obj/local/armeabi-v7a. Together with the pthreadpool.h and nnpack.h headers, that is everything needed to use NNPACK in an Android project.
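To call the libraries from Java, they are linked into a small JNI shared library together with a wrapper function. Below is a minimal sketch of such a wrapper; the package, class, and method names (com.example.nnpack.NativeLib.initNNPACK) are hypothetical, and the Android.mk glue that links the five static libraries into the shared library is assumed to be in place:

// nnpack_jni.cpp: hypothetical JNI wrapper around NNPACK initialization.
#include <jni.h>
#include "nnpack.h"

extern "C" JNIEXPORT jboolean JNICALL
Java_com_example_nnpack_NativeLib_initNNPACK(JNIEnv* env, jobject thiz) {
    // Returns JNI_TRUE if NNPACK can run on this device
    // (on armeabi-v7a this requires NEON support)
    return nnp_initialize() == nnp_status_success ? JNI_TRUE : JNI_FALSE;
}

On the Java side, the matching declaration would be public native boolean initNNPACK(); in class com.example.nnpack.NativeLib, with the shared library loaded via System.loadLibrary.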