OpenCL™ Specification 4.3. Partitioning a Device

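This article walks through device partitioning in OpenCL, in particular the clCreateSubDevices function, which creates sub-devices and assigns compute units to them. It also covers clRetainDevice and clReleaseDevice, which manage device reference counts.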

4.3. Partitioning a Device

Device partitioning is missing before version 1.2.

To create sub-devices that partition an OpenCL device, call the function:

cl_int clCreateSubDevices(
    cl_device_id in_device,
    const cl_device_partition_property* properties,
    cl_uint num_devices,
    cl_device_id* out_devices,
    cl_uint* num_devices_ret);

clCreateSubDevices is missing before version 1.2.

  • in_device is the device to be partitioned.

  • properties specifies how in_device is to be partitioned, described by a partition name and its corresponding value. Each partition name is immediately followed by the corresponding desired value. The list is terminated with 0. The list of supported partitioning schemes is described in the Sub-device Partition table. Only one of the listed partitioning schemes can be specified in properties.

  • num_devices is the size of memory pointed to by out_devices, specified as the number of cl_device_id entries.

  • out_devices is the buffer where the OpenCL sub-devices will be returned. If out_devices is NULL, this argument is ignored. If out_devices is not NULL, num_devices must be greater than or equal to the number of sub-devices that in_device may be partitioned into according to the partitioning scheme specified in properties.

  • num_devices_ret returns the number of sub-devices that in_device may be partitioned into according to the partitioning scheme specified in properties. If num_devices_ret is NULL, it is ignored.

clCreateSubDevices creates an array of sub-devices that each reference a non-intersecting set of compute units within in_device, according to the partition scheme given by properties. The output sub-devices may be used in every way that the root (or parent) device can be used, including creating contexts, building programs, further calls to clCreateSubDevices and creating command-queues. When a command-queue is created against a sub-device, the commands enqueued on the queue are executed only on the sub-device.

Table 6. List of supported partition schemes by clCreateSubDevices

Partition Property: CL_DEVICE_PARTITION_EQUALLY (missing before version 1.2)
Partition Value: cl_uint
Description: Split the aggregate device into as many smaller aggregate devices as can be created, each containing n compute units. The value n is passed as the value accompanying this property. If n does not divide evenly into CL_DEVICE_MAX_COMPUTE_UNITS, then the remaining compute units are not used.

Partition Property: CL_DEVICE_PARTITION_BY_COUNTS (missing before version 1.2)
Partition Value: cl_uint
Description: This property is followed by a list of compute unit counts terminated with 0 or CL_DEVICE_PARTITION_BY_COUNTS_LIST_END. For each non-zero count m in the list, a sub-device is created with m compute units in it. The number of non-zero count entries in the list may not exceed CL_DEVICE_PARTITION_MAX_SUB_DEVICES. The total number of compute units specified may not exceed CL_DEVICE_MAX_COMPUTE_UNITS.

Partition Property: CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN (missing before version 1.2)
Partition Value: cl_device_affinity_domain
Description: Split the device into smaller aggregate devices containing one or more compute units that all share part of a cache hierarchy. The value accompanying this property may be drawn from the following list:

  • CL_DEVICE_AFFINITY_DOMAIN_NUMA - Split the device into sub-devices comprised of compute units that share a NUMA node.

  • CL_DEVICE_AFFINITY_DOMAIN_L4_CACHE - Split the device into sub-devices comprised of compute units that share a level 4 data cache.

  • CL_DEVICE_AFFINITY_DOMAIN_L3_CACHE - Split the device into sub-devices comprised of compute units that share a level 3 data cache.

  • CL_DEVICE_AFFINITY_DOMAIN_L2_CACHE - Split the device into sub-devices comprised of compute units that share a level 2 data cache.

  • CL_DEVICE_AFFINITY_DOMAIN_L1_CACHE - Split the device into sub-devices comprised of compute units that share a level 1 data cache.

  • CL_DEVICE_AFFINITY_DOMAIN_NEXT_PARTITIONABLE - Split the device along the next partitionable affinity domain. The implementation shall find the first level along which the device or sub-device may be further subdivided in the order NUMA, L4, L3, L2, L1, and partition the device into sub-devices comprised of compute units that share memory subsystems at this level. The user may determine what happened by calling clGetDeviceInfo(CL_DEVICE_PARTITION_TYPE) on the sub-devices.

clCreateSubDevices returns CL_SUCCESS if the partition is created successfully. Otherwise, it returns one of the following errors:

  • CL_INVALID_DEVICE if in_device is not a valid device.

  • CL_INVALID_VALUE if values specified in properties are not valid or if values specified in properties are valid but not supported by the device.

  • CL_INVALID_VALUE if out_devices is not NULL and num_devices is less than the number of sub-devices created by the partition scheme.

  • CL_DEVICE_PARTITION_FAILED if the partition name is supported by the implementation but in_device could not be further partitioned.

  • CL_INVALID_DEVICE_PARTITION_COUNT if the partition name specified in properties is CL_DEVICE_PARTITION_BY_COUNTS and any of the following holds: the number of sub-devices requested exceeds CL_DEVICE_PARTITION_MAX_SUB_DEVICES; the total number of compute units requested exceeds CL_DEVICE_MAX_COMPUTE_UNITS for in_device; the number of compute units requested for one or more sub-devices is less than zero; or the number of sub-devices requested exceeds CL_DEVICE_MAX_COMPUTE_UNITS for in_device.

  • CL_OUT_OF_RESOURCES if there is a failure to allocate resources required by the OpenCL implementation on the device.

  • CL_OUT_OF_HOST_MEMORY if there is a failure to allocate resources required by the OpenCL implementation on the host.
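The CL_INVALID_DEVICE_PARTITION_COUNT conditions can be checked on the host before calling clCreateSubDevices. The sketch below is only an illustration: by_counts_is_valid is a hypothetical helper, not an OpenCL function, and it assumes the two limits were already queried with clGetDeviceInfo (CL_DEVICE_PARTITION_MAX_SUB_DEVICES and CL_DEVICE_MAX_COMPUTE_UNITS). It does not try to cover everything an implementation may check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pre-check, not part of OpenCL.
 * Returns 1 if the counts respect the CL_DEVICE_PARTITION_BY_COUNTS limits
 * described above, 0 otherwise. */
int by_counts_is_valid(const unsigned *counts, size_t num_counts,
                       unsigned max_sub_devices, unsigned max_compute_units)
{
    size_t non_zero = 0;          /* one sub-device per non-zero count */
    unsigned long long total = 0; /* total compute units requested */
    size_t i;

    assert(counts != NULL);

    for (i = 0; i < num_counts; ++i) {
        if (counts[i] == 0)
            break;                /* a 0 entry terminates the list of counts */
        ++non_zero;
        total += counts[i];
    }

    if (non_zero > max_sub_devices)
        return 0;                 /* too many sub-devices requested */
    if (total > max_compute_units)
        return 0;                 /* more compute units than the device has */
    return 1;
}
```

Because every sub-device needs at least one compute unit, the check on the total also covers the case where the number of sub-devices exceeds CL_DEVICE_MAX_COMPUTE_UNITS.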

A few examples of how to specify partition properties in the properties argument to clCreateSubDevices are given below:

To partition a device containing 16 compute units into two sub-devices, each containing 8 compute units, pass the following in properties:

{ CL_DEVICE_PARTITION_EQUALLY, 8,
  0 } // 0 terminates the property list
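The arithmetic behind this example can be sketched in plain C. The helper equal_partition below is hypothetical, not an OpenCL function; it only models the rule given in the table for CL_DEVICE_PARTITION_EQUALLY: as many sub-devices of n compute units as fit, with any remainder left unused. In real code the total would come from clGetDeviceInfo with CL_DEVICE_MAX_COMPUTE_UNITS.

```c
#include <assert.h>

/* Hypothetical helper, not part of OpenCL: models CL_DEVICE_PARTITION_EQUALLY. */
typedef struct {
    unsigned sub_devices;   /* number of sub-devices that would be created */
    unsigned unused_units;  /* compute units left over and not used */
} equal_partition_result;

equal_partition_result equal_partition(unsigned max_compute_units, unsigned n)
{
    equal_partition_result r;

    assert(n > 0);  /* each sub-device needs at least one compute unit */

    r.sub_devices = max_compute_units / n;   /* whole groups of n units */
    r.unused_units = max_compute_units % n;  /* remainder is not used */
    return r;
}
```

For 16 compute units and n = 8 this gives 2 sub-devices and no unused units; with n = 5 it gives 3 sub-devices and 1 unused unit.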

To partition a device with four compute units into two sub-devices, one containing 3 compute units and the other 1 compute unit, pass the following in the properties argument:

{ CL_DEVICE_PARTITION_BY_COUNTS,
    3, 1, CL_DEVICE_PARTITION_BY_COUNTS_LIST_END,
  0 } // 0 terminates the property list
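When the counts are only known at run time, the same list can be assembled in a buffer. The following is a minimal sketch under stated assumptions: build_by_counts_properties is a hypothetical helper, and the element type is intptr_t, which is how the Khronos headers define cl_device_partition_property to the best of my knowledge. The property name and the list-end marker are passed in as parameters so that the sketch does not hard-code any header values; real code would pass CL_DEVICE_PARTITION_BY_COUNTS and CL_DEVICE_PARTITION_BY_COUNTS_LIST_END.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of OpenCL. Writes
 *   { by_counts_name, counts[0], ..., counts[n-1], list_end, 0 }
 * into out. Returns the number of entries written, or 0 if out is too small. */
size_t build_by_counts_properties(intptr_t by_counts_name, intptr_t list_end,
                                  const unsigned *counts, size_t num_counts,
                                  intptr_t *out, size_t out_capacity)
{
    size_t needed = num_counts + 3; /* name + counts + list end + terminator */
    size_t i;

    assert(counts != NULL && out != NULL);

    if (out_capacity < needed)
        return 0;

    out[0] = by_counts_name;
    for (i = 0; i < num_counts; ++i)
        out[1 + i] = (intptr_t)counts[i];
    out[1 + num_counts] = list_end; /* ends the list of counts */
    out[2 + num_counts] = 0;        /* ends the property list */
    return needed;
}
```

For the counts 3 and 1 the helper writes five entries, matching the literal list shown above.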

To split a device along the outermost cache level (if any), pass the following in the properties argument:

{ CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN,
    CL_DEVICE_AFFINITY_DOMAIN_NEXT_PARTITIONABLE,
  0 } // 0 terminates the property list
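The search order that CL_DEVICE_AFFINITY_DOMAIN_NEXT_PARTITIONABLE follows (NUMA, L4, L3, L2, L1) can be modeled in a few lines. This is only a sketch of the ordering rule: the DOMAIN_* flags and next_partitionable are stand-ins invented for the example, not the real CL_DEVICE_AFFINITY_DOMAIN_* values, and how an implementation decides which levels are partitionable is outside its scope.

```c
#include <assert.h>

/* Stand-in flags for this sketch only; real code would use the
 * CL_DEVICE_AFFINITY_DOMAIN_* values from the OpenCL headers. */
enum {
    DOMAIN_NUMA = 1 << 0,
    DOMAIN_L4   = 1 << 1,
    DOMAIN_L3   = 1 << 2,
    DOMAIN_L2   = 1 << 3,
    DOMAIN_L1   = 1 << 4
};

/* Returns the first domain, in the order NUMA, L4, L3, L2, L1, whose flag is
 * set in partitionable, or 0 if none is set. */
int next_partitionable(int partitionable)
{
    static const int order[] = {
        DOMAIN_NUMA, DOMAIN_L4, DOMAIN_L3, DOMAIN_L2, DOMAIN_L1
    };
    unsigned i;

    assert(partitionable >= 0);

    for (i = 0; i < sizeof order / sizeof order[0]; ++i) {
        if (partitionable & order[i])
            return order[i];
    }
    return 0; /* nothing left to partition along */
}
```

A device that can be split along L3 and L1 would be split along L3 first; the resulting sub-devices could then be split again along L1.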

To retain a device, call the function:

cl_int clRetainDevice(
    cl_device_id device);

clRetainDevice is missing before version 1.2.

  • device is the OpenCL device to retain.

clRetainDevice increments the device reference count if device is a valid sub-device created by a call to clCreateSubDevices. If device is a root-level device, i.e. a cl_device_id returned by clGetDeviceIDs, the device reference count remains unchanged.

clRetainDevice returns CL_SUCCESS if the function is executed successfully or the device is a root-level device. Otherwise, it returns one of the following errors:

  • CL_INVALID_DEVICE if device is not a valid device.

  • CL_OUT_OF_RESOURCES if there is a failure to allocate resources required by the OpenCL implementation on the device.

  • CL_OUT_OF_HOST_MEMORY if there is a failure to allocate resources required by the OpenCL implementation on the host.

To release a device, call the function:

cl_int clReleaseDevice(
    cl_device_id device);

clReleaseDevice is missing before version 1.2.

  • device is the OpenCL device to release.

clReleaseDevice decrements the device reference count if device is a valid sub-device created by a call to clCreateSubDevices. If device is a root-level device, i.e. a cl_device_id returned by clGetDeviceIDs, the device reference count remains unchanged.

clReleaseDevice returns CL_SUCCESS if the function is executed successfully. Otherwise, it returns one of the following errors:

  • CL_INVALID_DEVICE if device is not a valid device.

  • CL_OUT_OF_RESOURCES if there is a failure to allocate resources required by the OpenCL implementation on the device.

  • CL_OUT_OF_HOST_MEMORY if there is a failure to allocate resources required by the OpenCL implementation on the host.
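The retain and release rules for root-level devices and sub-devices can be illustrated with a toy model. Nothing here is OpenCL API: model_device, model_retain and model_release are hypothetical names, and the starting count of 1 for a new sub-device is an assumption based on the usual OpenCL convention for newly created objects.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not the OpenCL API: tracks only the reference-count rules. */
typedef struct {
    int is_root;        /* 1 for a root-level device, 0 for a sub-device */
    unsigned ref_count; /* reference count; only changes for sub-devices */
} model_device;

/* Mirrors clRetainDevice: increments the count for sub-devices only. */
void model_retain(model_device *d)
{
    assert(d != NULL);
    if (!d->is_root)
        ++d->ref_count;
}

/* Mirrors clReleaseDevice: decrements the count for sub-devices only.
 * Returns 1 once a sub-device's count has dropped to zero, 0 otherwise. */
int model_release(model_device *d)
{
    assert(d != NULL);
    if (d->is_root)
        return 0;             /* root-level devices are left unchanged */
    assert(d->ref_count > 0); /* releasing a reference you do not hold is a bug */
    --d->ref_count;
    return d->ref_count == 0;
}
```

A sub-device that starts at a count of 1 and is retained once needs two releases before its count reaches zero, while a root-level device keeps the same count no matter how often either function is called.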

After the device reference count becomes zero and all the objects attached to device (such as command-queues) are released, the device object is deleted. Using this function to release a reference that was not obtained by creating the object or by calling clRetainDevice causes undefined behavior.
