OpenCL C++ Bindings

Introduction

For many large applications C++ is the language of choice and so it seems reasonable to define C++ bindings for OpenCL.

The interface is contained within a single C++ header file, opencl.hpp, and all definitions are contained within the namespace cl. There is no additional requirement to include cl.h in order to use either the C++ or the original C bindings; it is enough to simply include opencl.hpp.

The bindings themselves are lightweight and correspond closely to the underlying C API. Using the C++ bindings introduces no additional execution overhead.

There are numerous compatibility, portability and memory management fixes in the new header, as well as additional OpenCL 2.0 features. As a result, the header is not directly backward compatible, and for this reason it is released as opencl.hpp rather than as a new version of cl.hpp.

Compatibility

Due to the evolution of the underlying OpenCL API the 2.0 C++ bindings include an updated approach to defining supported feature versions and the range of valid underlying OpenCL runtime versions supported.

The combination of the preprocessor macros CL_HPP_TARGET_OPENCL_VERSION and CL_HPP_MINIMUM_OPENCL_VERSION controls this range. These are three-digit decimal values representing OpenCL runtime versions. The default for the target is 200, representing OpenCL 2.0, and the minimum is also defined as 200. These settings would use 2.0 API calls only. If backward compatibility with a 1.2 runtime is required, the minimum version may be set to 120.

Note that this is a compile-time setting, and so affects linking against a particular SDK version rather than the versioning of the loaded runtime.
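
As a rough illustration of the three-digit encoding, the sketch below decodes a value such as 120 or 200 into major and minor numbers and checks that a minimum/target pair forms a usable range. ClVersion, decodeVersion, isValidRange and requireValidRange are hypothetical helpers written for this illustration and are not part of opencl.hpp. The assumption that the hundreds digit is the major version and the tens digit is the minor version is inferred from 200 meaning 2.0 and 120 meaning 1.2.

```cpp
#include <cassert>

// Hypothetical helper type for this illustration; not part of opencl.hpp.
struct ClVersion {
    int majorVersion;
    int minorVersion;
};

// Decode a three-digit value such as 120 (OpenCL 1.2) or 200 (OpenCL 2.0).
// Assumption: hundreds digit = major version, tens digit = minor version.
constexpr ClVersion decodeVersion(int value)
{
    return ClVersion{value / 100, (value / 10) % 10};
}

// A minimum/target pair is a usable range only if minimum <= target.
constexpr bool isValidRange(int minimumVersion, int targetVersion)
{
    return minimumVersion <= targetVersion;
}

// Runtime variant for values that are only known at run time.
inline void requireValidRange(int minimumVersion, int targetVersion)
{
    assert(isValidRange(minimumVersion, targetVersion));
}

// Compile-time checks of the values mentioned in the text.
static_assert(decodeVersion(200).majorVersion == 2 &&
              decodeVersion(200).minorVersion == 0, "200 means 2.0");
static_assert(decodeVersion(120).majorVersion == 1 &&
              decodeVersion(120).minorVersion == 2, "120 means 1.2");
static_assert(isValidRange(120, 200), "1.2 minimum with a 2.0 target");
```

In real use the two macros are simply defined before opencl.hpp is included, as the example at the end of this page does for the target version.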

The earlier versions of the header included basic vector and string classes based loosely on STL versions. These were difficult to maintain and very rarely used. For the 2.0 header we now assume the presence of the standard library unless requested otherwise. We use std::array, std::vector, std::shared_ptr and std::string throughout to safely manage memory and reduce the chance of a recurrence of earlier memory management bugs.

These classes are used through typedefs in the cl namespace: cl::array, cl::vector, cl::pointer and cl::string. In addition, cl::allocate_pointer forwards to std::allocate_shared by default. In all cases these standard library classes can be replaced with custom interface-compatible versions using the CL_HPP_NO_STD_ARRAY, CL_HPP_NO_STD_VECTOR, CL_HPP_NO_STD_UNIQUE_PTR and CL_HPP_NO_STD_STRING macros.

The OpenCL 1.x versions of the C++ bindings included a size_t wrapper class to interface with kernel enqueue. This caused unpleasant interactions with the standard size_t declaration and led to namespacing bugs. In the 2.0 version we have replaced this with a std::array-based interface. However, the old behaviour can be regained for backward compatibility using the CL_HPP_ENABLE_SIZE_T_COMPATIBILITY macro.

Finally, the program construction interface used a clumsy vector-of-pairs design in the earlier versions. We have replaced that with a cleaner vector-of-vectors and vector-of-strings design. However, for backward compatibility old behaviour can be regained with the CL_HPP_ENABLE_PROGRAM_CONSTRUCTION_FROM_ARRAY_COMPATIBILITY macro.
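
To illustrate why a vector of strings is a convenient input, the sketch below flattens such a vector into the parallel pointer and length arrays that the C entry point clCreateProgramWithSource takes. SourceArrays and flattenSources are hypothetical names for this illustration; the conversion inside opencl.hpp may be written differently, and no OpenCL call is made here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical holder for the two parallel arrays used by the C API.
struct SourceArrays {
    std::vector<const char*> strings;
    std::vector<std::size_t> lengths;
};

// Build the pointer and length arrays from a vector of source strings.
// The pointers stay valid only while `sources` is alive and unmodified.
inline SourceArrays flattenSources(const std::vector<std::string>& sources)
{
    SourceArrays out;
    out.strings.reserve(sources.size());
    out.lengths.reserve(sources.size());
    for (const std::string& s : sources) {
        out.strings.push_back(s.data());
        out.lengths.push_back(s.size());
    }
    // Both arrays must describe the same number of sources.
    assert(out.strings.size() == out.lengths.size());
    return out;
}
```

With the bindings themselves none of this is needed by the caller: the vector of strings is passed directly to the cl::Program constructor, as in the example at the end of this page.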

In OpenCL 2.0, OpenCL C is not entirely backward compatible with earlier versions. As a result, a flag must be passed to the OpenCL C compiler to request OpenCL 2.0 compilation of kernels, with 1.2 as the default in the absence of the flag. In some cases the C++ bindings automatically compile code for ease of use. For those cases the compilation defaults to OpenCL C 2.0. If this is not wanted, the CL_HPP_CL_1_2_DEFAULT_BUILD macro may be specified to assume 1.2 compilation. If more fine-grained decisions on a per-kernel basis are required, then explicit build operations that take the flag should be used.
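
The default described here can be pictured as a compile-time switch on the build options string. In this sketch, defaultBuildOptions and buildOptionsFor are hypothetical helpers and not the header's own code; only the macro name CL_HPP_CL_1_2_DEFAULT_BUILD and the -cl-std=CL2.0 flag are taken from this page.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: picks the options for an automatic build.
// With CL_HPP_CL_1_2_DEFAULT_BUILD defined, no -cl-std flag is passed,
// so the OpenCL C compiler uses its 1.2 default.
inline std::string defaultBuildOptions()
{
#if defined(CL_HPP_CL_1_2_DEFAULT_BUILD)
    return "";
#else
    return "-cl-std=CL2.0";
#endif
}

// Hypothetical helper for an explicit, per-program choice:
// request OpenCL C 2.0 only when the caller asks for it.
inline std::string buildOptionsFor(bool wantOpenCLC20)
{
    return wantOpenCLC20 ? "-cl-std=CL2.0" : "";
}
```

An explicit build then passes the chosen string to the program's build call, in the same way the example at the end of this page passes "-cl-std=CL2.0".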

Parameters

This header may be parameterized by a set of preprocessor macros.

  • CL_HPP_TARGET_OPENCL_VERSION

    Defines the target OpenCL runtime version to build the header against. Defaults to 200, representing OpenCL 2.0.

  • CL_HPP_NO_STD_STRING

    Do not use the standard library string class. cl::string is not defined and may be defined by the user before opencl.hpp is included.

  • CL_HPP_NO_STD_VECTOR

    Do not use the standard library vector class. cl::vector is not defined and may be defined by the user before opencl.hpp is included.

  • CL_HPP_NO_STD_ARRAY

    Do not use the standard library array class. cl::array is not defined and may be defined by the user before opencl.hpp is included.

  • CL_HPP_NO_STD_UNIQUE_PTR

    Do not use the standard library unique_ptr class. cl::pointer and the cl::allocate_pointer functions are not defined and may be defined by the user before opencl.hpp is included.

  • CL_HPP_ENABLE_EXCEPTIONS

    Enable exceptions for use in the C++ bindings header. This is the preferred error handling mechanism but is not required.

  • CL_HPP_ENABLE_SIZE_T_COMPATIBILITY

    Backward compatibility option to support cl.hpp-style size_t class. Replaces the updated std::array derived version and removal of size_t from the namespace. Note that in this case the new size_t class is placed in the cl::compatibility namespace and thus requires an additional using declaration for direct backward compatibility.

  • CL_HPP_ENABLE_PROGRAM_CONSTRUCTION_FROM_ARRAY_COMPATIBILITY

    Enable older vector of pairs interface for construction of programs.

  • CL_HPP_CL_1_2_DEFAULT_BUILD

    Default to OpenCL C 1.2 compilation rather than OpenCL C 2.0. This applies to cl::Program construction and other program build variants.

  • CL_HPP_USE_CL_DEVICE_FISSION

    Enable the cl_ext_device_fission extension.

  • CL_HPP_USE_CL_IMAGE2D_FROM_BUFFER_KHR

    Enable the cl_khr_image2d_from_buffer extension.

  • CL_HPP_USE_CL_SUB_GROUPS_KHR

    Enable the cl_khr_subgroups extension.

  • CL_HPP_USE_DX_INTEROP

    Enable the cl_khr_d3d10_sharing extension.

  • CL_HPP_USE_IL_KHR

    Enable the cl_khr_il_program extension.

Example

The following example shows a general use case for the C++ bindings, including support for the optional exception feature and also the supplied vector and string classes. See the following sections for descriptions of these features.

Note: the C++ bindings use std::call_once and therefore may need to be compiled using special command-line options (such as "-pthread") on some platforms!

#define CL_HPP_ENABLE_EXCEPTIONS
#define CL_HPP_TARGET_OPENCL_VERSION 200
 
#include <CL/opencl.hpp>
#include <iostream>
#include <vector>
#include <memory>
#include <algorithm>
 
const int numElements = 32;
 
int main(void)
{
    // Filter for a 2.0 or newer platform and set it as the default
    std::vector<cl::Platform> platforms;
    cl::Platform::get(&platforms);
    cl::Platform plat;
    for (auto &p : platforms) {
        std::string platver = p.getInfo<CL_PLATFORM_VERSION>();
        if (platver.find("OpenCL 2.") != std::string::npos ||
            platver.find("OpenCL 3.") != std::string::npos) {
            // Note: an OpenCL 3.x platform may not support all required features!
            plat = p;
        }
    }
    if (plat() == 0) {
        std::cout << "No OpenCL 2.0 or newer platform found.\n";
        return -1;
    }
 
    cl::Platform newP = cl::Platform::setDefault(plat);
    if (newP != plat) {
        std::cout << "Error setting default platform.\n";
        return -1;
    }
 
    // C++11 raw string literal for the first kernel
    std::string kernel1{R"CLC(
        global int globalA;
        kernel void updateGlobal()
        {
          globalA = 75;
        }
    )CLC"};
 
    // Raw string literal for the second kernel
    std::string kernel2{R"CLC(
        typedef struct { global int *bar; } Foo;
        kernel void vectorAdd(global const Foo* aNum, global const int *inputA, global const int *inputB,
                              global int *output, int val, write_only pipe int outPipe, queue_t childQueue)
        {
          output[get_global_id(0)] = inputA[get_global_id(0)] + inputB[get_global_id(0)] + val + *(aNum->bar);
          write_pipe(outPipe, &val);
          queue_t default_queue = get_default_queue();
          ndrange_t ndrange = ndrange_1D(get_global_size(0)/2, get_global_size(0)/2);
 
          // Have a child kernel write into third quarter of output
          enqueue_kernel(default_queue, CLK_ENQUEUE_FLAGS_WAIT_KERNEL, ndrange,
            ^{
                output[get_global_size(0)*2 + get_global_id(0)] =
                  inputA[get_global_size(0)*2 + get_global_id(0)] + inputB[get_global_size(0)*2 + get_global_id(0)] + globalA;
            });
 
          // Have a child kernel write into last quarter of output
          enqueue_kernel(childQueue, CLK_ENQUEUE_FLAGS_WAIT_KERNEL, ndrange,
            ^{
                output[get_global_size(0)*3 + get_global_id(0)] =
                  inputA[get_global_size(0)*3 + get_global_id(0)] + inputB[get_global_size(0)*3 + get_global_id(0)] + globalA + 2;
            });
        }
    )CLC"};
 
    std::vector<std::string> programStrings;
    programStrings.push_back(kernel1);
    programStrings.push_back(kernel2);
 
    cl::Program vectorAddProgram(programStrings);
    try {
        vectorAddProgram.build("-cl-std=CL2.0");
    }
    catch (...) {
        // Print build info for all devices
        cl_int buildErr = CL_SUCCESS;
        auto buildInfo = vectorAddProgram.getBuildInfo<CL_PROGRAM_BUILD_LOG>(&buildErr);
        for (auto &pair : buildInfo) {
            std::cerr << pair.second << std::endl << std::endl;
        }
 
        return 1;
    }
 
    typedef struct { int *bar; } Foo;
 
    // Get and run kernel that initializes the program-scope global
    // A test for kernels that take no arguments
    auto program2Kernel =
        cl::KernelFunctor<>(vectorAddProgram, "updateGlobal");
    program2Kernel(
        cl::EnqueueArgs(
        cl::NDRange(1)));
 
    // SVM allocations
 
    auto anSVMInt = cl::allocate_svm<int, cl::SVMTraitCoarse<>>();
    *anSVMInt = 5;
    cl::SVMAllocator<Foo, cl::SVMTraitCoarse<cl::SVMTraitReadOnly<>>> svmAllocReadOnly;
    auto fooPointer = cl::allocate_pointer<Foo>(svmAllocReadOnly);
    fooPointer->bar = anSVMInt.get();
    cl::SVMAllocator<int, cl::SVMTraitCoarse<>> svmAlloc;
    std::vector<int, cl::SVMAllocator<int, cl::SVMTraitCoarse<>>> inputA(numElements, 1, svmAlloc);
    cl::coarse_svm_vector<int> inputB(numElements, 2, svmAlloc);
 
    // Traditional cl_mem allocations
 
    std::vector<int> output(numElements, 0xdeadbeef);
    cl::Buffer outputBuffer(begin(output), end(output), false);
    cl::Pipe aPipe(sizeof(cl_int), numElements / 2);
 
    // Default command queue, also passed in as a parameter
    cl::DeviceCommandQueue defaultDeviceQueue = cl::DeviceCommandQueue::makeDefault(
        cl::Context::getDefault(), cl::Device::getDefault());
 
    auto vectorAddKernel =
        cl::KernelFunctor<
            decltype(fooPointer)&,
            int*,
            cl::coarse_svm_vector<int>&,
            cl::Buffer,
            int,
            cl::Pipe&,
            cl::DeviceCommandQueue
            >(vectorAddProgram, "vectorAdd");
 
    // Ensure that the additional SVM pointer is available to the kernel
    // This one was not passed as a parameter
    vectorAddKernel.setSVMPointers(anSVMInt);
 
    cl_int error;
    vectorAddKernel(
        cl::EnqueueArgs(
            cl::NDRange(numElements/2),
            cl::NDRange(numElements/2)),
        fooPointer,
        inputA.data(),
        inputB,
        outputBuffer,
        3,
        aPipe,
        defaultDeviceQueue,
        error
        );
 
    cl::copy(outputBuffer, begin(output), end(output));
 
    cl::Device d = cl::Device::getDefault();
 
    std::cout << "Output:\n";
    for (int i = 0; i < numElements; ++i) {
        std::cout << "\t" << output[i] << "\n";
    }
    std::cout << "\n\n";
 
    return 0;
}
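
The platform filter in the example relies on substring matches. A slightly stricter alternative is to parse the numbers out of the version string, which the OpenCL specification formats as "OpenCL <major>.<minor> <platform-specific information>". parsePlatformVersion and isAtLeast20 below are hypothetical helpers, not part of the bindings; they operate on the string that p.getInfo<CL_PLATFORM_VERSION>() returns.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: extracts major/minor from a CL_PLATFORM_VERSION
// string. Returns false if the text does not begin with
// "OpenCL <major>.<minor>"; the outputs are left untouched in that case.
inline bool parsePlatformVersion(const std::string& text,
                                 int& majorOut, int& minorOut)
{
    int parsedMajor = 0;
    int parsedMinor = 0;
    if (std::sscanf(text.c_str(), "OpenCL %d.%d",
                    &parsedMajor, &parsedMinor) != 2) {
        return false;
    }
    majorOut = parsedMajor;
    minorOut = parsedMinor;
    return true;
}

// True for any platform reporting OpenCL 2.0 or newer.
inline bool isAtLeast20(const std::string& text)
{
    int majorVersion = 0;
    int minorVersion = 0;
    return parsePlatformVersion(text, majorVersion, minorVersion) &&
           majorVersion >= 2;
}
```

As the comment in the example notes, a platform that reports 3.x may still lack optional features, so a version check like this narrows the candidates but does not replace a feature query.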