OpenCL™ Specification 3.2.1. Mapping work-items onto an NDRange

꧁白杨树下꧂

3.2.1. Mapping work-items onto an NDRange

The index space supported by OpenCL is called an NDRange. An NDRange is an N-dimensional index space, where N is one, two or three. The NDRange is decomposed into work-groups, which form blocks that cover the index space. An NDRange is defined by three integer arrays of length N:

The extent of the index space (or global size) in each dimension.
An offset index F indicating the initial value of the indices in each dimension (zero by default).
The size of a work-group (local size) in each dimension.

Each work-item's global ID is an N-dimensional tuple. In each dimension, the global ID component takes values in the range from F to F plus the number of elements in that dimension minus one.
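The range of global ID components can be written down per dimension. The following is a minimal sketch in plain C; the type `id_range` and the function `global_id_range` are hypothetical names introduced only for illustration and are not part of the OpenCL API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: inclusive range of global ID
 * components in one dimension. */
typedef struct {
    size_t first; /* smallest global ID component: the offset F */
    size_t last;  /* largest global ID component: F + G - 1 */
} id_range;

/* Range of global ID components in one dimension, given the
 * offset F and the global size G (G must be at least 1). */
id_range global_id_range(size_t offset, size_t global_size)
{
    assert(global_size > 0);
    id_range r = { offset, offset + global_size - 1 };
    return r;
}
```

For example, with an offset of 16 and a global size of 8, the global ID components in that dimension run from 16 to 23.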

Unless a kernel comes from a source that disallows it, for example a kernel written in OpenCL C 1.x or compiled with -cl-uniform-work-group-size, the size of work-groups in an NDRange (the local size) need not be the same for all work-groups. In this case, any single dimension for which the global size is not divisible by the local size will be partitioned into two regions. One region will have work-groups that have the same number of work-items as was specified for that dimension by the programmer (the local size). The other region will have work-groups with fewer work-items than the local size specified for that dimension (the remainder work-groups). Work-group sizes could be non-uniform in multiple dimensions, potentially producing work-groups of up to 4 different sizes in a 2D range and 8 different sizes in a 3D range.
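The partition into full-size and remainder work-groups can be modeled with a little integer arithmetic. The sketch below is plain C rather than OpenCL code, and the function names `work_group_extent` and `distinct_extents` are hypothetical. It assumes that the remainder work-groups sit at the high end of each dimension.

```c
#include <assert.h>
#include <stddef.h>

/* Number of work-items, in one dimension, of the work-group whose
 * ID component is group_id, for global size G and requested local
 * size S. Assumes the remainder work-group is the last one. */
size_t work_group_extent(size_t global_size, size_t local_size,
                         size_t group_id)
{
    assert(local_size > 0);
    size_t start = group_id * local_size; /* relative to the offset F */
    assert(start < global_size);          /* group_id must be in range */
    size_t remaining = global_size - start;
    return remaining < local_size ? remaining : local_size;
}

/* Number of distinct work-group extents in one dimension: 1 or 2. */
size_t distinct_extents(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size > 0);
    if (global_size % local_size == 0)
        return 1;                          /* uniform in this dimension */
    return global_size > local_size ? 2 : 1;
}
```

With a global size of 10 and a local size of 4, work-groups 0 and 1 hold 4 work-items each and work-group 2 holds the remaining 2. Multiplying `distinct_extents` over the dimensions gives the upper bounds quoted above: 2 × 2 = 4 sizes in 2D and 2 × 2 × 2 = 8 in 3D.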

Non-uniform work-group sizes are not available before version 2.0.

Each work-item is assigned to a work-group and given a local ID to represent its position within the work-group. A work-item's local ID is an N-dimensional tuple with components in the range from zero to the size of the work-group in that dimension minus one.

Work-groups are assigned IDs similarly. The number of work-groups in each dimension is not directly defined but is inferred from the local and global NDRanges provided when a kernel-instance is enqueued. A work-group's ID is an N-dimensional tuple with components in the range from zero to the number of work-groups in that dimension minus one, where the number of work-groups is the ceiling of the global size in that dimension divided by the local size in the same dimension. As a result, the combination of a work-group ID and the local ID within a work-group uniquely defines a work-item. Each work-item is identifiable in two ways: in terms of a global index, and in terms of a work-group index plus a local index within a work-group.

For example, consider the 2-dimensional index space shown below. We input the index space for the work-items (Gx, Gy), the size of each work-group (Sx, Sy) and the global ID offset (Fx, Fy). The global indices define a Gx by Gy index space where the total number of work-items is the product of Gx and Gy. The local indices define an Sx by Sy index space where the number of work-items in a single work-group is the product of Sx and Sy. Given the size of each work-group and the total number of work-items we can compute the number of work-groups. A 2-dimensional index space is used to uniquely identify a work-group. Each work-item is identified by its global ID (gx, gy) or by the combination of the work-group ID (wx, wy), the size of each work-group (Sx, Sy) and the local ID (sx, sy) inside the work-group such that

(gx, gy) = (wx × Sx + sx + Fx, wy × Sy + sy + Fy)
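The relation above applies independently in each dimension, so a one-dimensional helper is enough to illustrate it. This is a minimal sketch in plain C; `global_id` is a hypothetical function name and not the OpenCL C built-in `get_global_id`.

```c
#include <assert.h>
#include <stddef.h>

/* One component of the global ID: g = w * S + s + F, where w is the
 * work-group ID, S the local size, s the local ID and F the offset. */
size_t global_id(size_t group_id, size_t local_size,
                 size_t local_id, size_t offset)
{
    assert(local_id < local_size); /* local IDs run from 0 to S - 1 */
    return group_id * local_size + local_id + offset;
}
```

For instance, the work-item with local ID 3 in work-group 2, with a local size of 4 and an offset of 5, has global ID 2 × 4 + 3 + 5 = 16 in that dimension.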

The number of work-groups can be computed as:

(Wx, Wy) = (ceil(Gx / Sx), ceil(Gy / Sy))
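Since the sizes are integers, the ceiling can be taken without floating-point arithmetic. The sketch below is plain C with a hypothetical function name, `num_work_groups`. It uses a quotient-plus-remainder form rather than the common `(G + S - 1) / S` idiom, which can overflow when G is close to the maximum value of `size_t`.

```c
#include <assert.h>
#include <stddef.h>

/* Number of work-groups in one dimension: ceil(G / S) computed
 * with integer arithmetic only. */
size_t num_work_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0);
    /* add one extra (remainder) work-group if S does not divide G */
    return global_size / local_size + (global_size % local_size != 0);
}
```

A global size of 8 with a local size of 4 gives 2 work-groups, while a global size of 10 gives 3, the last of which is a remainder work-group.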

Given a global ID and the work-group size, the work-group ID for a work-item is computed as:

(wx, wy) = ( (gx - sx - Fx) / Sx, (gy - sy - Fy) / Sy )
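The formula above uses the local ID, which is often not known in advance. Because the local ID always lies between 0 and S - 1, integer division and remainder recover both IDs from the global ID alone. The following plain C sketch shows this for one dimension; `split_id` and `split_global_id` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: work-group ID and local ID components
 * of a work-item in one dimension. */
typedef struct {
    size_t group_id;
    size_t local_id;
} split_id;

/* Recover w and s from g, S and F using g - F = w * S + s
 * with 0 <= s < S. */
split_id split_global_id(size_t gid, size_t local_size, size_t offset)
{
    assert(local_size > 0);
    assert(gid >= offset);      /* global IDs start at the offset F */
    size_t rel = gid - offset;  /* position relative to the offset */
    split_id r = { rel / local_size, rel % local_size };
    return r;
}
```

With a global ID of 16, a local size of 4 and an offset of 5, the relative position is 11, which gives work-group ID 2 and local ID 3. Substituting the local ID back into the formula gives the same work-group ID: (16 - 3 - 5) / 4 = 2.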

Figure 3. An example of an NDRange index space showing work-items, their global IDs and their mapping onto the pair of work-group and local IDs. In this case, we assume that in each dimension, the size of the work-group evenly divides the global NDRange size (i.e. all work-groups have the same size) and that the offset is equal to zero.

Within a work-group, work-items may be divided into sub-groups. The mapping of work-items to sub-groups is implementation-defined and may be queried at runtime. While sub-groups may be used in multi-dimensional work-groups, each sub-group is 1-dimensional and any given work-item may query which sub-group it is a member of.

Sub-groups are not available before version 2.1.

Work-items are mapped into sub-groups through a combination of compile-time decisions and the parameters of the dispatch. The mapping to sub-groups is invariant for the duration of a kernel's execution, across dispatches of a given kernel with the same work-group dimensions, between dispatches and query operations consistent with the dispatch parameterization, and from one work-group to another within the dispatch (excluding the trailing edge work-groups in the presence of non-uniform work-group sizes). In addition, all sub-groups within a work-group will be the same size, apart from the sub-group with the maximum index, which may be smaller if the size of the work-group is not evenly divisible by the size of the sub-groups.
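The size rule in the last sentence can be illustrated with the same kind of integer arithmetic. The sketch below is plain C and models only the sub-group sizes, not which work-items land in which sub-group, since that mapping is implementation-defined. The function names are hypothetical, `work_group_size` is assumed to be the total number of work-items in the work-group, and `sub_group_size` stands for whatever size the implementation chose.

```c
#include <assert.h>
#include <stddef.h>

/* Number of sub-groups in a work-group of the given total size. */
size_t num_sub_groups(size_t work_group_size, size_t sub_group_size)
{
    assert(sub_group_size > 0);
    return work_group_size / sub_group_size
         + (work_group_size % sub_group_size != 0);
}

/* Size of the sub-group with the given index. Only the sub-group
 * with the maximum index may be smaller than sub_group_size. */
size_t sub_group_extent(size_t work_group_size, size_t sub_group_size,
                        size_t sub_group_id)
{
    assert(sub_group_size > 0);
    size_t start = sub_group_id * sub_group_size;
    assert(start < work_group_size); /* sub_group_id must be in range */
    size_t remaining = work_group_size - start;
    return remaining < sub_group_size ? remaining : sub_group_size;
}
```

For a work-group of 70 work-items and a sub-group size of 32, this gives three sub-groups of sizes 32, 32 and 6.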

In the degenerate case, a single sub-group must be supported for each work-group. In this situation all sub-group scope functions are equivalent to their work-group level equivalents.