Halide Programming Guide (Series, Part 6)

Aoulun

Category: Deep Learning

Article link: https://blog.csdn.net/Aoulun/article/details/108524796

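This article is an installment of a serialized Halide programming guide and is also published on the author's WeChat official account.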

Chapter 10: Halide Compilation (AOT Compilation)


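// Part 1: Compiling with Halide
// This lesson demonstrates how to use Halide as a more traditional ahead-of-time (AOT) compiler.
// The lesson is split across two files. The first builds a Halide pipeline and compiles it to a static library and header. The second uses that static library to actually run the pipeline. This means that compiling this code is a multi-step process.
// On Linux, you can compile and run it like so:
// g++ lesson_10*generate.cpp -g -std=c++11 -I ../include -L ../bin -lHalide -lpthread -ldl -o lesson_10_generate
// LD_LIBRARY_PATH=../bin ./lesson_10_generate
// g++ lesson_10*run.cpp lesson_10_halide.a -std=c++11 -I ../include -lpthread -ldl -o lesson_10_run
// ./lesson_10_run
// On OS X:
// g++ lesson_10*generate.cpp -g -std=c++11 -I ../include -L ../bin -lHalide -o lesson_10_generate
// DYLD_LIBRARY_PATH=../bin ./lesson_10_generate
// g++ lesson_10*run.cpp lesson_10_halide.a -o lesson_10_run -I ../include
// ./lesson_10_run
// The benefits of this approach are that the final program:
// - Does not do any JIT compilation at runtime, so it is fast.
// - Does not depend on libHalide at all, so it is a small, easy-to-deploy binary.
// If you have the entire Halide source tree, you can also build it by running the following in a shell whose current directory is the top of the Halide source tree:
//    make tutorial_lesson_10_aot_compilation_run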
#include "Halide.h"
#include <stdio.h>
using namespace Halide;
int main(int argc, char **argv) {

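    // Define a simple one-stage pipeline:
    Func brighter;
    Var x, y;

    // The pipeline will depend on one scalar parameter.
    Param<uint8_t> offset;

    // And take one grayscale 8-bit input buffer. The first constructor
    // argument gives the type of a pixel, and the second specifies the
    // number of dimensions (not the number of channels!). For a grayscale
    // image this is two; for a color image it is three. Currently, four
    // dimensions is the maximum for inputs and outputs.
    ImageParam input(type_of<uint8_t>(), 2);

    // If we were JIT-compiling, these would just be an int and a Buffer,
    // but because we want to compile the pipeline once and have it work
    // for any value of the parameter, we need to make a Param object,
    // which can be used like an Expr, and an ImageParam object, which can
    // be used like a Buffer.

    // Define the Func.
    brighter(x, y) = input(x, y) + offset;

    // Schedule it.
    brighter.vectorize(x, 16).parallel(y);

    // This time, instead of calling brighter.realize(...), which would
    // compile and run the pipeline immediately, we call a method that
    // compiles the pipeline to a static library and header.
    // For AOT-compiled code, we need to explicitly declare the arguments
    // to the routine. This routine takes two. Arguments are usually
    // Params or ImageParams.
    brighter.compile_to_static_library("lesson_10_halide", {input, offset}, "brighter");

    printf("Halide pipeline compiled, but not yet run.\n");

    // To continue this lesson, look in the file lesson_10_aot_compilation_run.cpp

    return 0;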
}
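
To make it concrete what the generated `brighter` routine is expected to compute, here is a minimal plain C++ reference, written without Halide. The function name `brighter_reference` and the flat row-major `std::vector` layout are assumptions made for this sketch, not part of the tutorial. The one detail worth noticing is that Halide keeps a uint8 plus uint8 expression as uint8, so the sum wraps modulo 256; the cast in the sketch reproduces that.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Plain C++ reference for brighter(x, y) = input(x, y) + offset.
// Hypothetical helper for illustration; pixels are stored row by row.
std::vector<uint8_t> brighter_reference(const std::vector<uint8_t> &input,
                                        int width, int height, uint8_t offset) {
    assert(width >= 0 && height >= 0);
    assert(input.size() == static_cast<std::size_t>(width) * height);
    std::vector<uint8_t> output(input.size());
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            std::size_t i = static_cast<std::size_t>(y) * width + x;
            // C++ promotes both operands to int, so cast back to uint8_t
            // to get the same wrap-around at 256 as the 8-bit pipeline.
            output[i] = static_cast<uint8_t>(input[i] + offset);
        }
    }
    return output;
}
```

A reference like this is handy as an oracle when checking the AOT-compiled output, in the same spirit as the verification loop in part 2.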

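// Part 2
// Before reading this part, see lesson_10_aot_compilation_generate.cpp
// This is the code that actually uses the Halide pipeline we compiled. It does not depend on libHalide, so it does not include Halide.h. Instead, it depends on the header file that was generated when lesson_10_generate was run:
#include "lesson_10_halide.h"
// We want to keep using Halide::Buffer with the AOT-compiled code, so we include it explicitly. It is a header-only class and does not require libHalide.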
#include "HalideBuffer.h"
#include <stdio.h>
int main(int argc, char **argv) {
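    // Have a look in the header file above (it won't exist until you have
    // run lesson_10_generate). At the bottom is the signature of the
    // function we generated:
    // int brighter(halide_buffer_t *_input_buffer, uint8_t _offset, halide_buffer_t *_brighter_buffer);

    // The ImageParam input has become a pointer to a "halide_buffer_t"
    // struct. This is the struct that Halide uses to represent arrays of
    // data. Unless you are calling the Halide pipeline from pure C code,
    // you don't want to use it directly. Halide::Runtime::Buffer is a
    // simple wrapper around halide_buffer_t that will implicitly convert
    // to a halide_buffer_t *. We will pass Halide::Runtime::Buffer objects
    // in those slots.
    // The Halide::Buffer class used in JIT code is in fact just a shared
    // pointer to the simpler Halide::Runtime::Buffer class. They share the
    // same API.

    // Finally, the return value of "brighter" is an error code. It is
    // zero on success.

    // Make a buffer for the input and the output.
    Halide::Runtime::Buffer<uint8_t> input(640, 480), output(640, 480);

    // Halide::Runtime::Buffer also has constructors that wrap existing
    // data instead of allocating new memory. Use these if you have your
    // own image type that you want to use.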
    int offset = 5;
    int error = brighter(input, offset, output);

    if (error) {
        printf("Halide returned an error: %d\n", error);
        return -1;
    }

    // Now check that the filter performed as advertised. It was supposed
    // to add the offset to every input pixel.
    for (int y = 0; y < 480; y++) {
        for (int x = 0; x < 640; x++) {
            uint8_t input_val = input(x, y);
            uint8_t output_val = output(x, y);
            uint8_t correct_val = input_val + offset;
            if (output_val != correct_val) {
                printf("output(%d, %d) was %d instead of %d\n",
                       x, y, output_val, correct_val);
                return -1;
            }
        }
    }

    // Everything worked!
    printf("Success!\n");
    return 0;
}
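
The comments above note that the generated function returns an error code, with zero meaning success. In a larger C++ program it can be convenient to turn that code into an exception instead of checking it by hand at every call site. The sketch below shows one way to do that. Both `throw_on_error` and `fake_pipeline` are hypothetical names introduced for illustration; `fake_pipeline` only stands in for a generated routine such as `brighter` so that the snippet compiles without Halide.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Halide): convert a nonzero pipeline
// return code into a C++ exception that names the failing pipeline.
inline void throw_on_error(int error_code, const char *pipeline_name) {
    assert(pipeline_name != nullptr);
    if (error_code != 0) {
        throw std::runtime_error(std::string(pipeline_name) +
                                 " failed with error code " +
                                 std::to_string(error_code));
    }
}

// Stand-in for an AOT-generated pipeline. It only mimics the
// "zero means success" convention described in the text.
inline int fake_pipeline(bool succeed) {
    return succeed ? 0 : -1;
}
```

With the real generated header, the call site would presumably look like `throw_on_error(brighter(input, offset, output), "brighter");`.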

lesson_10_aot_compilation_generate.cpp

// Halide tutorial lesson 10: AOT compilation part 1

// This lesson demonstrates how to use Halide as a more traditional
// ahead-of-time (AOT) compiler.

// This lesson is split across two files. The first (this one), builds
// a Halide pipeline and compiles it to a static library and
// header. The second (lesson_10_aot_compilation_run.cpp), uses that
// static library to actually run the pipeline. This means that
// compiling this code is a multi-step process.

// On linux, you can compile and run it like so:
// g++ lesson_10*generate.cpp -g -std=c++11 -I ../include -L ../bin -lHalide -lpthread -ldl -o lesson_10_generate
// LD_LIBRARY_PATH=../bin ./lesson_10_generate
// g++ lesson_10*run.cpp lesson_10_halide.a -std=c++11 -I ../include -lpthread -ldl -o lesson_10_run
// ./lesson_10_run

// On os x:
// g++ lesson_10*generate.cpp -g -std=c++11 -I ../include -L ../bin -lHalide -o lesson_10_generate
// DYLD_LIBRARY_PATH=../bin ./lesson_10_generate
// g++ lesson_10*run.cpp lesson_10_halide.a -o lesson_10_run -I ../include
// ./lesson_10_run

// The benefits of this approach are that the final program:
// - Doesn't do any jit compilation at runtime, so it's fast.
// - Doesn't depend on libHalide at all, so it's a small, easy-to-deploy binary.

// If you have the entire Halide source tree, you can also build it by
// running:
//    make tutorial_lesson_10_aot_compilation_run
// in a shell with the current directory at the top of the halide
// source tree.

#include "Halide.h"
#include <stdio.h>
using namespace Halide;

int main(int argc, char **argv) {

    // We'll define a simple one-stage pipeline:
    Func brighter;
    Var x, y;

    // The pipeline will depend on one scalar parameter.
    Param<uint8_t> offset;

    // And take one grayscale 8-bit input buffer. The first
    // constructor argument gives the type of a pixel, and the second
    // specifies the number of dimensions (not the number of
    // channels!). For a grayscale image this is two; for a color
    // image it's three. Currently, four dimensions is the maximum for
    // inputs and outputs.
    ImageParam input(type_of<uint8_t>(), 2);

    // If we were jit-compiling, these would just be an int and a
    // Buffer, but because we want to compile the pipeline once and
    // have it work for any value of the parameter, we need to make a
    // Param object, which can be used like an Expr, and an ImageParam
    // object, which can be used like a Buffer.

    // Define the Func.
    brighter(x, y) = input(x, y) + offset;

    // Schedule it.
    brighter.vectorize(x, 16).parallel(y);

    // This time, instead of calling brighter.realize(...), which
    // would compile and run the pipeline immediately, we'll call a
    // method that compiles the pipeline to a static library and header.
    //
    // For AOT-compiled code, we need to explicitly declare the
    // arguments to the routine. This routine takes two. Arguments are
    // usually Params or ImageParams.
    brighter.compile_to_static_library("lesson_10_halide", {input, offset}, "brighter");

    printf("Halide pipeline compiled, but not yet run.\n");

    // To continue this lesson, look in the file lesson_10_aot_compilation_run.cpp

    return 0;
}
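
The schedule `brighter.vectorize(x, 16).parallel(y)` can be hard to picture from the one-liner alone. The sketch below is a sequential, plain C++ approximation of the loop nest it implies: rows are independent units of work (Halide distributes them over threads), and each row is processed in blocks of 16 pixels (Halide emits vector instructions for each block). The function name is hypothetical, and the way the last partial block is handled here, by shifting it inwards so it stays in bounds, is an assumption of this sketch rather than a statement about what Halide generates for this pipeline.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sequential sketch of the loop structure behind
//   brighter.vectorize(x, 16).parallel(y);
// input and output must not overlap, because the shifted tail block
// recomputes a few pixels from the input.
void brighter_scheduled_sketch(const uint8_t *input, uint8_t *output,
                               int width, int height, uint8_t offset) {
    const int kLanes = 16;
    assert(width >= kLanes);  // shifting inwards needs one full block
    const int blocks = (width + kLanes - 1) / kLanes;
    for (int y = 0; y < height; y++) {           // parallel(y): rows are independent
        for (int xo = 0; xo < blocks; xo++) {    // outer loop over 16-wide blocks
            int x_base = xo * kLanes;
            if (x_base + kLanes > width) {
                x_base = width - kLanes;         // assumed tail handling
            }
            for (int lane = 0; lane < kLanes; lane++) {  // vectorize(x, 16)
                std::size_t i = static_cast<std::size_t>(y) * width + x_base + lane;
                output[i] = static_cast<uint8_t>(input[i] + offset);
            }
        }
    }
}
```

Because the definition is a pure function of the input, recomputing a few pixels in the shifted block changes nothing in the result.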

lesson_10_aot_compilation_run.cpp

// Halide tutorial lesson 10: AOT compilation part 2

// Before reading this file, see lesson_10_aot_compilation_generate.cpp

// This is the code that actually uses the Halide pipeline we've
// compiled. It does not depend on libHalide, so we won't be including
// Halide.h.
//
// Instead, it depends on the header file that lesson_10_generate
// produced when we ran it:
#include "lesson_10_halide.h"

// We want to continue to use our Halide::Buffer with AOT-compiled
// code, so we explicitly include it. It's a header-only class, and
// doesn't require libHalide.
#include "HalideBuffer.h"

#include <stdio.h>

int main(int argc, char **argv) {
    // Have a look in the header file above (it won't exist until you've run
    // lesson_10_generate). At the bottom is the signature of the function we generated:

    // int brighter(halide_buffer_t *_input_buffer, uint8_t _offset, halide_buffer_t *_brighter_buffer);

    // The ImageParam inputs have become pointers to "halide_buffer_t"
    // structs. This is the struct that Halide uses to represent arrays of
    // data.  Unless you're calling the Halide pipeline from pure C
    // code, you don't want to use it
    // directly. Halide::Runtime::Buffer is a simple wrapper around
    // halide_buffer_t that will implicitly convert to a
    // halide_buffer_t *. We will pass Halide::Runtime::Buffer objects
    // in those slots.

    // The Halide::Buffer class we have been using in JIT code is in
    // fact just a shared pointer to the simpler
    // Halide::Runtime::Buffer class. They share the same API.

    // Finally, the return value of "brighter" is an error code. It's
    // zero on success.

    // Let's make a buffer for our input and output.
    Halide::Runtime::Buffer<uint8_t> input(640, 480), output(640, 480);

    // Halide::Runtime::Buffer also has constructors that wrap
    // existing data instead of allocating new memory. Use these if
    // you have your own Image type that you want to use.

    int offset = 5;
    int error = brighter(input, offset, output);

    if (error) {
        printf("Halide returned an error: %d\n", error);
        return -1;
    }

    // Now let's check the filter performed as advertised. It was
    // supposed to add the offset to every input pixel.
    for (int y = 0; y < 480; y++) {
        for (int x = 0; x < 640; x++) {
            uint8_t input_val = input(x, y);
            uint8_t output_val = output(x, y);
            uint8_t correct_val = input_val + offset;
            if (output_val != correct_val) {
                printf("output(%d, %d) was %d instead of %d\n",
                       x, y, output_val, correct_val);
                return -1;
            }
        }
    }

    // Everything worked!
    printf("Success!\n");
    return 0;
}
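
The run file mentions that `Halide::Runtime::Buffer` can wrap existing data instead of allocating new memory. What "wrapping" means is easiest to see with a tiny non-owning view over pixels you already have. The sketch below is Halide-free; `GrayView` and `wrap_pixels` are hypothetical names for illustration and are not Halide types. It assumes the dense layout used by a freshly allocated 640x480 buffer, where x is the innermost dimension and consecutive rows are `width` elements apart.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical non-owning 2D view over existing grayscale pixels.
// It stores a pointer and a shape, and never copies or frees the data.
struct GrayView {
    uint8_t *data;
    int width;
    int height;
    int stride_y;  // elements between the starts of consecutive rows

    uint8_t &operator()(int x, int y) const {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        return data[static_cast<std::size_t>(y) * stride_y + x];
    }
};

// Wrap a densely packed, row-major pixel vector without copying it.
inline GrayView wrap_pixels(std::vector<uint8_t> &pixels, int width, int height) {
    assert(pixels.size() == static_cast<std::size_t>(width) * height);
    return GrayView{pixels.data(), width, height, width};
}
```

Writes through the view land directly in the original storage, which is the property that makes wrapping useful when the pixels already live in your own image type. As far as I know, the real class offers the same idea through a constructor that takes a data pointer followed by the extents, but check HalideBuffer.h for the exact overloads.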
