OpenCL™ Specification: 2. Glossary

2. Glossary

2.术语表

Application

应用程序

The combination of the program running on the host and OpenCL devices.

在主机和OpenCL设备上运行的程序的组合。

Acquire semantics

Acquire语义

One of the memory order semantics defined for synchronization operations. Acquire semantics apply to atomic operations that load from memory. Given two units of execution, A and B, acting on a shared atomic object M, if A uses an atomic load of M with acquire semantics to synchronize-with an atomic store to M by B that used release semantics, then A's atomic load will occur before any subsequent operations by A. Note that the memory orders release, sequentially consistent, and acquire_release all include release semantics and effectively pair with a load using acquire semantics.

为同步操作定义的内存顺序语义之一。获取语义应用于从内存加载的原子操作。给定作用于共享原子对象M的两个执行单元A和B,如果A使用具有获取语义的M的原子加载与使用释放语义的B对M的原子存储同步,则A的原子加载将发生在A的任何后续操作之前。请注意,内存顺序release、sequentially consistent和acquire_release都包含release语义,并与使用acquire语义的加载有效配对。
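The pairing described in this entry can be sketched on the host with C11 atomics, which the OpenCL memory model is based on. This is a minimal illustration, not OpenCL kernel code: the names `payload`, `ready`, `producer`, `consumer`, and `run_message_passing` are hypothetical, and POSIX threads stand in for the two units of execution A and B.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names for illustration: `payload` is ordinary data,
   `ready` plays the role of the shared atomic object M. */
static int payload;
static atomic_int ready;

/* Unit of execution B: write the data, then store to M with release semantics. */
static void *producer(void *arg) {
    (void)arg;
    payload = 42;
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* Unit of execution A: load M with acquire semantics until the store is seen. */
static void *consumer(void *arg) {
    int *out = arg;
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0) {
        /* spin until the release store becomes visible */
    }
    *out = payload; /* ordered after the acquire load, so it reads B's write */
    return NULL;
}

/* Runs both threads once and returns the value the consumer observed. */
static int run_message_passing(void) {
    pthread_t a, b;
    int seen = -1;
    int rc;

    payload = 0;
    atomic_store(&ready, 0);

    rc = pthread_create(&a, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return seen;
}
```

Because the acquire load that reads 1 synchronizes with the release store, `run_message_passing()` returns 42. With relaxed ordering on both sides the read of `payload` would be a data race.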

Acquire release semantics

Acquire release语义

A memory order semantics for synchronization operations (such as atomic operations) that has the properties of both acquire and release memory orders. It is used with read-modify-write operations.

同步操作(如原子操作)的内存顺序语义,具有获取和释放内存顺序的属性。它用于读-修改-写操作。

Atomic operations

原子操作

Operations that at any point, and from any perspective, have either occurred completely, or not at all. Memory orders associated with atomic operations may constrain the visibility of loads and stores with respect to the atomic operations (see relaxed semantics, acquire semantics, release semantics or acquire release semantics).

在任何时候,从任何角度来看,要么完全发生,要么根本没有发生的操作。与原子操作相关联的内存顺序可以限制加载和存储相对于原子操作的可见性(参见relaxed语义、acquire语义、release语义或acquire release语义)。

Blocking and Non-Blocking Enqueue API calls

阻塞和非阻塞入队API调用

A non-blocking enqueue API call places a command on a command-queue and returns immediately to the host. The blocking-mode enqueue API calls do not return to the host until the command has completed.

非阻塞入队API调用将命令放在命令队列上,并立即返回到主机。阻塞模式入队API调用在命令完成之前不会返回到主机。

Barrier

栅栏

There are three types of barriers: a command-queue barrier, a work-group barrier and a sub-group barrier.

有三种类型的栅栏:命令队列栅栏、工作组栅栏和子组栅栏。

  • The OpenCL API provides a function to enqueue a command-queue barrier command. This barrier command ensures that all previously enqueued commands to a command-queue have finished execution before any following commands enqueued in the command-queue can begin execution.

  • OpenCL API提供了一个将命令队列栅栏命令排入队列的函数。此barrier命令确保在命令队列中排队的任何后续命令可以开始执行之前,命令队列中所有先前排队的命令都已完成执行。

  • The OpenCL kernel execution model provides built-in work-group barrier functionality. This barrier built-in function can be used by a kernel executing on a device to perform synchronization between work-items in a work-group executing the kernel. All the work-items of a work-group must execute the barrier construct before any are allowed to continue execution beyond the barrier.

  • OpenCL内核执行模型提供了内置的工作组栅栏功能。在设备上执行的内核可以使用此栅栏内置功能来执行执行内核的工作组中的工作项之间的同步。一个工作组的所有工作项都必须执行栅栏构造,然后允许任何工作项在栅栏之外继续执行。

  • The OpenCL kernel execution model provides built-in sub-group barrier functionality. This barrier built-in function can be used by a kernel executing on a device to perform synchronization between work-items in a sub-group executing the kernel. All the work-items of a sub-group must execute the barrier construct before any are allowed to continue execution beyond the barrier.

  • OpenCL内核执行模型提供了内置的子组栅栏功能。在设备上执行的内核可以使用此栅栏内置功能来执行执行内核的子组中的工作项之间的同步。子组的所有工作项都必须执行栅栏构造,然后才允许任何工作项在栅栏之外继续执行。

Buffer Object

缓冲对象

A memory object that stores a linear collection of bytes. Buffer objects are accessible using a pointer in a kernel executing on a device. Buffer objects can be manipulated by the host using OpenCL API calls. A buffer object encapsulates the following information:

存储线性字节集合的内存对象。使用在设备上执行的内核中的指针可以访问缓冲区对象。主机可以使用OpenCL API调用来操作缓冲区对象。缓冲区对象封装以下信息:

  • Size in bytes.

  • 大小(以字节为单位)。

  • Properties that describe usage information and which region to allocate from.

  • 描述使用情况信息以及从哪个区域进行分配的属性。

  • Buffer data.

  • 缓冲区数据。

Built-in Kernel

内置内核

A built-in kernel is a kernel that is executed on an OpenCL device or custom device by fixed-function hardware or in firmware. Applications can query the built-in kernels supported by a device or custom device. A program object can only contain kernels written in OpenCL C or built-in kernels but not both. See also Kernel and Program.

内置内核是通过固定功能硬件或固件在OpenCL设备或自定义设备上执行的内核。应用程序可以查询设备或自定义设备支持的内置内核。程序对象只能包含用OpenCL C编写的内核或内置内核,但不能同时包含这两种内核。另请参阅内核和程序。

Child kernel

子内核

See Device-side enqueue.

请参阅设备端排队。

Command

指令

The OpenCL operations that are submitted to a command-queue for execution. For example, OpenCL commands issue kernels for execution on a compute device, manipulate memory objects, etc.

提交到指令队列以供执行的OpenCL操作。例如,OpenCL命令发布内核以在计算设备上执行、操作内存对象等。

Command-queue

指令队列

An object that holds commands that will be executed on a specific device. The command-queue is created on a specific device in a context. Commands to a command-queue are queued in-order but may be executed in-order or out-of-order. Refer to In-order Execution and Out-of-order Execution.

保存将在特定设备上执行的指令的对象。指令队列是在上下文中的特定设备上创建的。指令队列中的命令按顺序排队,但可以按顺序或无序执行。请参阅有序执行和无序执行。

Command-queue Barrier

指令队列栅栏

See Barrier.

请参见栅栏。

Command synchronization

指令同步

Constraints on the order that commands are launched for execution on a device defined in terms of the synchronization points that occur between commands in host command-queues and between commands in device-side command-queues. See synchronization points.

根据主机指令队列中的指令之间和设备端指令队列中指令之间的同步点定义的对指令在设备上执行的启动顺序的约束。请参见同步点。

Complete

完成

The final state in the six state model for the execution of a command. The transition into this state is signaled through event objects or callback functions associated with a command.

六态模型中用于执行指令的最终状态。通过与指令相关联的事件对象或回调函数来发出向该状态转换的信号。

Compute Device Memory

计算设备内存

This refers to one or more memories attached to the compute device.

这是指连接到计算设备的一个或多个存储器。

Compute Unit

计算单元

An OpenCL device has one or more compute units. A work-group executes on a single compute unit. A compute unit is composed of one or more processing elements and local memory. A compute unit may also include dedicated texture filter units that can be accessed by its processing elements.

OpenCL设备具有一个或多个计算单元。工作组在单个计算单元上执行。计算单元由一个或多个处理元件和本地存储器组成。计算单元还可以包括可以由其处理元件访问的专用纹理过滤器单元。

Concurrency

并发性

A property of a system in which a set of tasks in a system can remain active and make progress at the same time. To utilize concurrent execution when running a program, a programmer must identify the concurrency in their problem, expose it within the source code, and then exploit it using a notation that supports concurrency.

系统的一种属性,其中系统中的一组任务可以保持活动状态并同时进行。要在运行程序时利用并发执行,程序员必须识别问题中的并发性,在源代码中公开它,然后使用支持并发性的表示法来利用它。

Constant Memory

常量内存

A region of global memory that remains constant during the execution of a kernel. The host allocates and initializes memory objects placed into constant memory.

在内核执行过程中保持不变的全局内存区域。主机分配并初始化放置在常量内存中的内存对象。

Context

上下文

The environment within which the kernels execute and the domain in which synchronization and memory management is defined. The context includes a set of devices, the memory accessible to those devices, the corresponding memory properties and one or more command-queues used to schedule execution of a kernel(s) or operations on memory objects.

内核执行的环境以及定义同步和内存管理的域。上下文包括一组设备、这些设备可访问的存储器、相应的存储器属性以及用于调度内核的执行或对存储器对象的操作的一个或多个指令队列。

Control flow

控制流

The flow of instructions executed by a work-item. Multiple logically related work-items may or may not execute the same control flow. The control flow is said to be converged if all the work-items in the set execute the same stream of instructions. In a diverged control flow, the work-items in the set execute different instructions. At a later point, if a diverged control flow becomes converged, it is said to be a re-converged control flow.

由工作项执行的指令流。多个逻辑相关的工作项可以执行也可以不执行相同的控制流。如果集合中的所有工作项执行相同的指令流,则称控制流收敛。在分散的控制流中,集合中的工作项执行不同的指令。在稍后的点上,如果发散的控制流变得收敛,则称其为重新收敛的控制流。

Converged control flow

收敛控制流

See Control flow.

请参见控制流。

Custom Device

自定义设备

An OpenCL device that fully implements the OpenCL Runtime but does not support programs written in OpenCL C. A custom device may be specialized non-programmable hardware that is very power efficient and performant for directed tasks or hardware with limited programmable capabilities such as specialized DSPs. Custom devices are not OpenCL conformant. Custom devices may support an online compiler. Programs for custom devices can be created using the OpenCL runtime APIs that allow OpenCL programs to be created from source (if an online compiler is supported) and/or binary, or from built-in kernels supported by the device. See also Device.

一种完全实现OpenCL运行时但不支持用OpenCL C编写的程序的OpenCL设备。自定义设备可以是专门的不可编程硬件,该硬件非常节能,可用于定向任务或具有有限可编程能力的硬件,如专门的DSP。自定义设备不符合OpenCL。自定义设备可能支持在线编译器。可以使用OpenCL运行时API创建用于自定义设备的程序,这些API允许从源代码(如果支持在线编译器)和/或二进制文件或设备支持的内置内核创建OpenCL程序。另请参见设备。

Data Parallel Programming Model

数据并行编程模型

Traditionally, this term refers to a programming model where concurrency is expressed as instructions from a single program applied to multiple elements within a set of data structures. The term has been generalized in OpenCL to refer to a model wherein a set of instructions from a single program are applied concurrently to each point within an abstract domain of indices.

传统上,这个术语指的是一种编程模型,其中并发性表示为来自单个程序的指令,应用于一组数据结构中的多个元素。该术语在OpenCL中被推广为指一种模型,其中来自单个程序的一组指令被并发地应用于索引的抽象域内的每个点。

Data race

数据竞争

The execution of a program contains a data race if it contains two actions in different work-items or host threads where (1) one action modifies a memory location and the other action reads or modifies the same memory location, and (2) at least one of these actions is not atomic, or the corresponding memory scopes are not inclusive, and (3) the actions are global actions unordered by the global-happens-before relation or are local actions unordered by the local-happens-before relation.

如果程序的执行在不同的工作项或主机线程中包含两个操作,并且满足以下条件,则该执行包含数据竞争:(1)一个操作修改内存位置,另一个操作读取或修改相同的内存位置,(2)这些操作中至少有一个不是原子的,或者相应的内存作用域不具有包含关系,以及(3)这些操作是未被全局先发生(global-happens-before)关系排序的全局操作,或者是未被局部先发生(local-happens-before)关系排序的局部操作。

Deprecation

废弃

Existing features are marked as deprecated if their usage is not recommended as that feature is being de-emphasized, superseded and may be removed from a future version of the specification.

如果不建议使用现有功能,则将其标记为不推荐使用,因为该功能正在被淡化、取代,并且可能会从未来版本的规范中删除。

Device

设备

A device is a collection of compute units. A command-queue is used to queue commands to a device. Examples of commands include executing kernels, or reading and writing memory objects. OpenCL devices typically correspond to a GPU, a multi-core CPU, and other processors such as DSPs and the Cell/B.E. processor.

设备是计算单元的集合。指令队列用于将指令排队到设备。指令的示例包括执行内核或读取和写入内存对象。OpenCL设备通常对应于GPU、多核CPU以及诸如DSP和Cell/B.E.之类的其他处理器。

Device-side enqueue

设备侧排队

A mechanism whereby a kernel-instance is enqueued by a kernel-instance running on a device without direct involvement by the host program. This produces nested parallelism; i.e. additional levels of concurrency are nested inside a running kernel-instance. The kernel-instance executing on a device (the parent kernel) enqueues a kernel-instance (the child kernel) to a device-side command queue. Child and parent kernels execute asynchronously though a parent kernel does not complete until all of its child-kernels have completed.

一种机制,在这种机制中,内核实例由运行在设备上的内核实例排队,而不需要主机程序的直接参与。这会产生嵌套的并行性;即,附加级别的并发嵌套在运行的内核实例内。在设备上执行的内核实例(父内核)将内核实例(子内核)排入设备端指令队列。子内核和父内核异步执行,尽管父内核直到其所有子内核都完成后才完成。

Diverged control flow

分流控制流

See Control flow.

请参见控制流。

Ended

结束

The fifth state in the six state model for the execution of a command. The transition into this state occurs when execution of a command has ended. When a Kernel-enqueue command ends, all of the work-groups associated with that command have finished their execution.

六态模型中用于执行指令的第五种状态。当指令的执行结束时,会发生向该状态的转换。当内核入队命令结束时,与该命令相关联的所有工作组都已完成执行。

Event Object

事件对象

An event object encapsulates the status of an operation such as a command. It can be used to synchronize operations in a context.

事件对象封装了操作(如指令)的状态。它可以用于同步上下文中的操作。

Event Wait List

事件等待列表

An event wait list is a list of event objects that can be used to control when a particular command begins execution.

事件等待列表是一个事件对象列表,可用于控制特定命令何时开始执行。

Fence

围栏

A memory ordering operation without an associated atomic object. A fence can use the acquire semantics, release semantics, or acquire release semantics.

一种没有关联原子对象的内存排序操作。围栏可以使用acquire语义、release语义或acquire release语义。

Framework

框架

A software system that contains the set of components to support software development and execution. A framework typically includes libraries, APIs, runtime systems, compilers, etc.

一种软件系统,包含一组支持软件开发和执行的组件。框架通常包括库、API、运行时系统、编译器等。

Generic address space

通用地址空间

An address space that includes the private, local, and global address spaces available to a device. The generic address space supports conversion of pointers to and from private, local and global address spaces, and hence lets a programmer write a single function that at compile time can take arguments from any of the three named address spaces.

一种地址空间,包括设备可用的私有、本地和全局地址空间。通用地址空间支持指针与私有、本地和全局地址空间之间的相互转换,因此允许程序员编写一个函数,该函数在编译时可以从三个命名地址空间中的任何一个获取参数。

Global Happens before

See Happens before.

Global ID

全局ID

A global ID is used to uniquely identify a work-item and is derived from the number of global work-items specified when executing a kernel. The global ID is an N-dimensional value that starts at (0, 0, …, 0). See also Local ID.

全局ID用于唯一标识工作项,它是从执行内核时指定的全局工作项的数量派生而来的。全局ID是从(0,0,…0)开始的N维值。另请参阅本地ID。

Global Memory

全局内存

A memory region accessible to all work-items executing in a context. It is accessible to the host using commands such as read, write and map. Global memory is included within the generic address space that includes the private and local address spaces.

上下文中执行的所有工作项都可以访问的内存区域。主机可以使用读取、写入和映射等命令访问它。全局存储器被包括在通用地址空间内,通用地址空间包括专用地址空间和本地地址空间。

GL share group

GL共享组

A GL share group object manages shared OpenGL or OpenGL ES resources such as textures, buffers, framebuffers, and renderbuffers and is associated with one or more GL context objects. The GL share group is typically an opaque object and not directly accessible.

GL共享组对象管理共享的OpenGL或OpenGL ES资源,如纹理、缓冲区、帧缓冲区和渲染缓冲区,并与一个或多个GL上下文对象相关联。GL共享组通常是一个不透明的对象,不能直接访问。

Handle

An opaque type that references an object allocated by OpenCL. Any operation on an object occurs by reference to that object’s handle. Each object must have a unique handle value during the course of its lifetime. Handle values may be, but are not required to be, re-used by an implementation.

一种不透明类型,引用由OpenCL分配的对象。对对象的任何操作都是通过引用该对象的句柄来进行的。每个对象在其生命周期中都必须具有唯一的句柄值。句柄值可以被实现重复使用,但不是必须的。

Happens before

An ordering relationship between operations that execute on multiple units of execution. If an operation A happens-before operation B then A must occur before B; in particular, any value written by A will be visible to B. We define two separate happens-before relations: global-happens-before and local-happens-before. These are defined in Memory Ordering Rules.

在多个执行单元上执行的操作之间的排序关系。如果操作A先发生于(happens-before)操作B,则A必须发生在B之前;特别地,A写入的任何值对B都是可见的。我们定义了两个独立的先发生关系:全局先发生(global-happens-before)和局部先发生(local-happens-before)。这些在内存排序规则中进行了定义。

Host

主机

The host interacts with the context using the OpenCL API.

主机使用OpenCL API与上下文交互。

Host-thread

主机线程

The unit of execution that executes the statements in the host program.

执行主机程序中语句的执行单元。

Host pointer

主机指针

A pointer to memory that is in the virtual address space on the host.

指向主机上虚拟地址空间中的内存的指针。

Illegal

非法的

Behavior of a system that is explicitly not allowed and will be reported as an error when encountered by OpenCL.

系统的行为是明确不允许的,当OpenCL遇到时将报告为错误。

Image Object

图像对象

A memory object that stores a two- or three-dimensional structured array. Image data can only be accessed with read and write functions. The read functions use a sampler.

一种存储二维或三维结构化数组的内存对象。图像数据只能通过读取和写入功能进行访问。读取函数使用采样器。

The image object encapsulates the following information:

图像对象封装了以下信息:

  • Dimensions of the image.

  • 图像的尺寸。

  • Description of each element in the image.

  • 图像中每个元素的描述。

  • Properties that describe usage information and which region to allocate from.

  • 描述使用情况信息以及从哪个区域进行分配的属性。

  • Image data.

  • 图像数据。

The elements of an image are selected from a list of predefined image formats.

图像的元素是从预定义图像格式的列表中选择的。

Implementation Defined

实现定义

Behavior that is explicitly allowed to vary between conforming implementations of OpenCL. An OpenCL implementor is required to document the implementation-defined behavior.

明确允许在符合要求的OpenCL实现之间变化的行为。OpenCL实现者必须记录实现定义的行为。

Independent Forward Progress

独立前进

If an entity supports independent forward progress, then if it is otherwise not dependent on any actions due to be performed by any other entity (for example it does not wait on a lock held by, and thus that must be released by, any other entity), then its execution cannot be blocked by the execution of any other entity in the system (it will not be starved). Work-items in a subgroup, for example, typically do not support independent forward progress, so one work-item in a subgroup may be completely blocked (starved) if a different work-item in the same subgroup enters a spin loop.

如果一个实体支持独立的向前推进,那么如果它不依赖于任何其他实体执行的任何操作(例如,它不等待任何其他实体持有的锁,因此必须由任何其他实体释放),那么它的执行就不能被系统中任何其他实体的执行阻止(它不会饿死)。例如,子组中的工作项通常不支持独立的正向进度,因此,如果同一子组中不同的工作项进入旋转循环,则子组中一个工作项可能会被完全阻塞(匮乏)。

In-order Execution

有序执行

A model of execution in OpenCL where the commands in a command-queue are executed in order of submission with each command running to completion before the next one begins. See Out-of-order Execution.

OpenCL中的一种执行模型,其中命令队列中的命令按提交顺序执行,每个命令在下一个命令开始之前运行到完成。请参阅无序执行。

Intermediate Language

中间语言

A lower-level language that may be used to create programs. SPIR-V is a required intermediate language (IL) for OpenCL 2.1 and 2.2 devices. Other OpenCL devices may optionally support SPIR-V or other ILs.

一种较低级别的语言,可用于创建程序。SPIR-V是OpenCL 2.1和2.2设备所需的中间语言(IL)。其他OpenCL设备可以可选地支持SPIR-V或其他IL。

Kernel

内核

A kernel is a function declared in a program and executed on an OpenCL device. A kernel is identified by the __kernel or kernel qualifier applied to any function defined in a program.

内核是在程序中声明并在OpenCL设备上执行的函数。内核由应用于程序中定义的任何函数的__kernel或kernel限定符来标识。

Kernel-instance

内核实例

The work carried out by an OpenCL program occurs through the execution of kernel-instances on devices. The kernel instance is the kernel object, the values associated with the arguments to the kernel, and the parameters that define the NDRange index space.

OpenCL程序执行的工作是通过在设备上执行内核实例来完成的。内核实例是内核对象、与内核参数相关联的值以及定义NDRange索引空间的参数。

Kernel Object

内核对象

A kernel object encapsulates a specific kernel function declared in a program and the argument values to be used when executing this kernel function.

内核对象封装程序中声明的特定内核函数以及执行该内核函数时要使用的参数值。

Kernel Language

内核语言

A language that is used to represent source code for a kernel. Kernels may be directly created from OpenCL C kernel language source strings. Other kernel languages may be supported by compiling to SPIR-V, another supported Intermediate Language, or to a device-specific program binary format.

一种用于表示内核源代码的语言。内核可以直接从OpenCL C内核语言源字符串创建。通过编译到SPIR-V、另一种受支持的中间语言或设备特定的程序二进制格式,可以支持其他内核语言。

Launch

启动

The transition of a command from the submitted state to the ready state. See Ready.

命令从已提交状态转换为就绪状态。请参阅准备就绪。

Local ID

本地ID

A local ID specifies a unique work-item ID within a given work-group that is executing a kernel. The local ID is an N-dimensional value that starts at (0, 0, …, 0). See also Global ID.

本地ID指定正在执行内核的给定工作组中的唯一工作项ID。本地ID是从(0,0,…0)开始的N维值。另请参阅全局ID。
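The relationship between a local ID and a global ID can be sketched for one dimension as follows. The function name `global_id_1d` is hypothetical, and the sketch assumes a uniform work-group size in that dimension (no remainder work-groups) and an optional global offset.

```c
#include <assert.h>
#include <stddef.h>

/* One-dimensional sketch: a work-item's global ID is its work-group's index
   times the work-group size, plus its local ID, plus any global offset.
   Assumes every work-group in this dimension has the same size. */
static size_t global_id_1d(size_t group_id, size_t local_size,
                           size_t local_id, size_t global_offset) {
    assert(local_id < local_size); /* local IDs run from 0 to local_size - 1 */
    return group_id * local_size + local_id + global_offset;
}
```

For example, work-item 5 in work-group 2 of size 64 with no offset has global ID 2 * 64 + 5 = 133. An N-dimensional ID applies the same computation independently in each dimension.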

Local Memory

本地内存

A memory region associated with a work-group and accessible only by work-items in that work-group. Local memory is included within the generic address space that includes the private and global address spaces.

与工作组相关联的内存区域,并且只能由该工作组中的工作项访问。本地存储器被包括在包括私有地址空间和全局地址空间的通用地址空间内。

Marker

标记

A command queued in a command-queue that can be used to tag all commands queued before the marker in the command-queue. The marker command returns an event which can be used by the application to queue a wait on the marker event, i.e. wait for all commands queued before the marker command to complete.

指令队列中排队的指令,可用于标记指令队列中标记之前排队的所有命令。标记指令返回一个事件,应用程序可以使用该事件来排队等待标记事件,即等待在标记指令之前排队的所有指令完成。

Memory Consistency Model

内存一致性模型

Rules that define which values are observed when multiple units of execution load data from any shared memory plus the synchronization operations that constrain the order of memory operations and define synchronization relationships. The memory consistency model in OpenCL is based on the memory model from the ISO C11 programming language.

定义当多个执行单元从任何共享内存加载数据时要观察哪些值的规则,加上约束内存操作顺序并定义同步关系的同步操作。OpenCL中的内存一致性模型基于ISO C11编程语言中的内存模型。

Memory Objects

内存对象

A memory object is a handle to a reference counted region of Global Memory. Also see Buffer Object and Image Object.

内存对象是全局内存中引用计数区域的句柄。另请参见缓冲区对象和图像对象。

Memory Regions (or Pools)

内存区域(或池)

A distinct address space in OpenCL. Memory regions may overlap in physical memory though OpenCL will treat them as logically distinct. The memory regions are denoted as private, local, constant, and global.

OpenCL中的一个不同地址空间。虽然OpenCL会将内存区域视为逻辑上不同的,但它们在物理内存中可能会重叠。内存区域表示为私有、局部、常量和全局。

Memory Scopes

内存作用域

These memory scopes define a hierarchy of visibilities when analyzing the ordering constraints of memory operations. They are defined by the values of the memory_scope enumeration constant. Current values are memory_scope_work_item (memory constraints only apply to a single work-item and in practice apply only to image operations), memory_scope_sub_group (memory-ordering constraints only apply to work-items executing in a sub-group), memory_scope_work_group (memory-ordering constraints only apply to work-items executing in a work-group), memory_scope_device (memory-ordering constraints only apply to work-items executing on a single device) and memory_scope_all_svm_devices or equivalently memory_scope_all_devices (memory-ordering constraints only apply to work-items executing across multiple devices and when using shared virtual memory).

在分析内存操作的排序约束时,这些内存作用域定义了可见性的层次结构。它们由memory_scope枚举常量的值定义。当前值为memory_scope_work_item(内存约束仅适用于单个工作项,实际上仅适用于图像操作)、memory_scope_sub_group(内存排序约束仅适用于子组中执行的工作项)、memory_scope_work_group(内存排序约束仅适用于工作组中执行的工作项)、memory_scope_device(内存排序约束仅适用于在单个设备上执行的工作项)和memory_scope_all_svm_devices或等效的memory_scope_all_devices(内存排序约束仅适用于跨多个设备执行的工作项以及使用共享虚拟内存时)。

Modification Order

修改顺序

All modifications to a particular atomic object M occur in some particular total order, called the modification order of M. If A and B are modifications of an atomic object M, and A happens-before B, then A shall precede B in the modification order of M. Note that the modification order of an atomic object M is independent of whether M is in local or global memory.

对特定原子对象M的所有修改都以某种特定的总顺序发生,称为M的修改顺序。如果A和B是对原子对象M进行的修改,而A发生在B之前,则A应在M的修改次序中位于B之前。注意,原子对象M的修改顺序与M是在局部内存中还是全局内存中无关。

Nested Parallelism

嵌套并行

See device-side enqueue.

请参阅设备端排队。

Object

对象

Objects are abstract representations of the resources that can be manipulated by the OpenCL API. Examples include program objects, kernel objects, and memory objects.

对象是可以由OpenCL API操作的资源的抽象表示。示例包括程序对象、内核对象和内存对象。

Out-of-Order Execution

无序执行

A model of execution in which commands placed in the work queue may begin and complete execution in any order consistent with constraints imposed by event wait lists and command-queue barriers. See In-order Execution.

一种执行模型,其中放置在工作队列中的指令可以按照与事件等待列表和指令队列栅栏施加的约束一致的任何顺序开始和完成执行。请参阅有序执行。

Parent device

父设备

The OpenCL device which is partitioned to create sub-devices. Not all parent devices are root devices. A root device might be partitioned and the sub-devices partitioned again. In this case, the first set of sub-devices would be parent devices of the second set, but not the root devices. Also see Device, parent device and root device.

被分区以创建子设备的OpenCL设备。并非所有父设备都是根设备。根设备可能会被分区,其子设备可能会再次被分区。在这种情况下,第一组子设备将是第二组的父设备,但不是根设备。另请参阅设备、父设备和根设备。

Parent kernel

父内核

See Device-side enqueue.

请参阅设备端排队。

Pipe

管道

The pipe memory object conceptually is an ordered sequence of data items. A pipe has two endpoints: a write endpoint into which data items are inserted, and a read endpoint from which data items are removed. At any one time, only one kernel instance may write into a pipe, and only one kernel instance may read from a pipe. To support the producer consumer design pattern, one kernel instance connects to the write endpoint (the producer) while another kernel instance connects to the reading endpoint (the consumer).

管道内存对象在概念上是一个有序的数据项序列。管道有两个端点:数据项被插入其中的写入端点,以及数据项从中被移除的读取端点。在任何时候,只有一个内核实例可以写入管道,并且只有一个内核实例可以从管道读取。为了支持生产者-消费者设计模式,一个内核实例连接到写入端点(生产者),而另一个内核实例连接到读取端点(消费者)。

Platform

平台

The host plus a collection of devices managed by the OpenCL framework that allow an application to share resources and execute kernels on devices in the platform.

主机加上由OpenCL框架管理的设备集合,这些设备允许应用程序在平台中的设备上共享资源和执行内核。

Private Memory

私有内存

A region of memory private to a work-item. Variables defined in one work-item's private memory are not visible to another work-item.

工作项专用的内存区域。在一个工作项专用内存中定义的变量对另一个工作项目不可见。

Processing Element

处理单元

A virtual scalar processor. A work-item may execute on one or more processing elements.

一个虚拟标量处理器。工作项可以在一个或多个处理元件上执行。

Program

程序

An OpenCL program consists of a set of kernels. Programs may also contain auxiliary functions called by the kernel functions and constant data.

OpenCL程序由一组内核组成。程序还可以包含内核函数调用的辅助函数和常量数据。

Program Object

程序对象

A program object encapsulates the following information:

程序对象封装以下信息:

  • A reference to an associated context.

  • 对相关上下文的引用。

  • A program source or binary.

  • 程序源或二进制文件。

  • The latest successfully built program executable, the list of devices for which the program executable is built, the build options used and a build log.

  • 最新成功生成的程序可执行文件、为其生成程序可执行程序的设备列表、使用的生成选项和生成日志。

  • The number of kernel objects currently attached.

  • 当前附加的内核对象数。

Queued

已入队列

The first state in the six state model for the execution of a command. The transition into this state occurs when the command is enqueued into a command-queue.

六态模型中用于执行指令的第一个状态。当指令被排入指令队列时,将发生向该状态的转换。
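The six states named across this glossary (Queued, Submitted, Ready, Running, Ended, Complete) can be summarized as a small state machine. This is a sketch only: the `cmd_state` enum and `cmd_next` function are hypothetical names, and the numeric values simply follow the ordinal positions the glossary gives, not the event status constants of the OpenCL API.

```c
#include <assert.h>

/* The six states, numbered by their position in the model.
   Submitted is second: Launch is defined as the transition from
   submitted to ready, and Ready is the third state. */
typedef enum {
    CMD_QUEUED = 1,
    CMD_SUBMITTED,
    CMD_READY,
    CMD_RUNNING,
    CMD_ENDED,
    CMD_COMPLETE
} cmd_state;

/* Advance a command by one transition; Complete is the final state. */
static cmd_state cmd_next(cmd_state s) {
    assert(s >= CMD_QUEUED && s <= CMD_COMPLETE);
    return s == CMD_COMPLETE ? CMD_COMPLETE : (cmd_state)(s + 1);
}
```

In this sketch a command only moves forward, so starting from `CMD_QUEUED` five calls to `cmd_next` reach `CMD_COMPLETE`.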

Ready

已就绪

The third state in the six state model for the execution of a command. The transition into this state occurs when pre-requisites constraining execution of a command have been met; i.e. the command has been launched. When a kernel-enqueue command is launched, work-groups associated with the command are placed in a device's work-pool from which they are scheduled for execution.

六态模型中用于执行指令的第三种状态。当约束指令执行的先决条件已经满足时,就会发生向该状态的转换;即指令已经启动。当启动内核入队指令时,与该命令相关联的工作组将被放置在设备工作池中,并从中计划执行。

Re-converged Control Flow

重新收敛的控制流

See Control flow.

请参见控制流。

Reference Count

引用计数器

The life span of an OpenCL object is determined by its reference count, an internal count of the number of references to the object. When you create an object in OpenCL, its reference count is set to one. Subsequent calls to the appropriate retain API (such as clRetainContext, clRetainCommandQueue) increment the reference count. Calls to the appropriate release API (such as clReleaseContext, clReleaseCommandQueue) decrement the reference count. Implementations may also modify the reference count, e.g. to track attached objects or to ensure correct operation of in-progress or scheduled activities. The object becomes inaccessible to host code when the number of release operations performed matches the number of retain operations plus the allocation of the object. At this point the reference count may be zero but this is not guaranteed.

OpenCL对象的寿命由其引用计数决定,引用计数是对该对象引用数量的内部计数。在OpenCL中创建对象时,其引用计数设置为1。对相应的retain API(如clRetainContext、clRetainCommandQueue)的后续调用会增加引用计数。对相应的release API(如clReleaseContext、clReleaseCommandQueue)的调用会递减引用计数。实现还可以修改引用计数,例如,以跟踪附加的对象或确保正在进行的或计划的活动的正确操作。当执行的释放操作的数量与保留操作的数量加上对象的分配相匹配时,主机代码将无法访问该对象。此时引用计数可能为零,但这并不能保证。

Relaxed Consistency

松散一致性

A memory consistency model in which the contents of memory visible to different work-items or commands may be different except at a barrier or other explicit synchronization points.

一种内存一致性模型,其中不同工作项或命令可见的内存内容可能不同,但在栅栏或其他显式同步点除外。

Relaxed Semantics

Relaxed语义

A memory order semantics for atomic operations that implies no order constraints. The operation is atomic but it has no impact on the order of memory operations.

原子操作的内存顺序语义,表示没有顺序约束。该操作是原子操作,但对内存操作的顺序没有影响。

Release Semantics

Release语义

One of the memory order semantics defined for synchronization operations. Release semantics apply to atomic operations that store to memory. Given two units of execution, A and B, acting on a shared atomic object M, if A uses an atomic store of M with release semantics to synchronize-with an atomic load to M by B that used acquire semantics, then A's atomic store will occur after any prior operations by A. Note that the memory orders acquire, sequentially consistent, and acquire_release all include acquire semantics and effectively pair with a store using release semantics.

为同步操作定义的内存顺序语义之一。Release语义适用于存储到内存的原子操作。给定作用于共享原子对象M的两个执行单元A和B,如果A使用具有释放语义的M的原子存储与使用获取语义的B对M的原子加载同步,则A的原子存储将在A的任何先前操作之后发生。请注意,内存顺序acquire、sequentially consistent和acquire_release都包含acquire语义,并与使用release语义的存储有效配对。

Remainder work-groups

剩余工作组

When the work-groups associated with a kernel-instance are defined, the sizes of a work-group in each dimension may not evenly divide the size of the NDRange in the corresponding dimensions. The result is a collection of work-groups on the boundaries of the NDRange that are smaller than the base work-group size. These are known as remainder work-groups.

当定义与内核实例相关联的工作组时,每个维度中工作组的大小可能无法平均划分相应维度中NDRange的大小。结果是NDRange边界上的工作组的集合,这些工作组小于基本工作组大小。这些被称为剩余工作组。
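The arithmetic behind remainder work-groups can be sketched for a single dimension. The `wg_split` struct and both function names are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* How one NDRange dimension divides into work-groups. */
typedef struct {
    size_t full_groups;    /* work-groups of the base size */
    size_t remainder_size; /* size of the trailing smaller group, 0 if none */
} wg_split;

/* Split one dimension of the NDRange by the base work-group size. */
static wg_split split_dimension(size_t global_size, size_t local_size) {
    wg_split s;
    assert(local_size > 0);
    s.full_groups = global_size / local_size;
    s.remainder_size = global_size % local_size;
    return s;
}

/* Total work-groups in the dimension, counting a remainder group if present. */
static size_t total_groups(wg_split s) {
    return s.full_groups + (s.remainder_size != 0 ? 1 : 0);
}
```

For a global size of 100 and a base work-group size of 32, this gives three full work-groups and one remainder work-group of size 4, for four work-groups in total.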

Running

运行中

The fourth state in the six state model for the execution of a command. The transition into this state occurs when the execution of the command starts. When a Kernel-enqueue command starts, one or more work-groups associated with the command start to execute.

六态模型中用于执行指令的第四种状态。当指令的执行开始时,将转换到该状态。当内核入队命令启动时,与该命令相关联的一个或多个工作组开始执行。

Root device

根设备

A root device is an OpenCL device that has not been partitioned. Also see Device, Parent device and Root device.

根设备是尚未分区的OpenCL设备。另请参阅设备、父设备和根设备。

Resource

资源

A class of objects defined by OpenCL. An instance of a resource is an object. The most common resources are the context, command-queue, program objects, kernel objects, and memory objects. Computational resources are hardware elements that participate in the action of advancing a program counter. Examples include the host, devices, compute units and processing elements.

由OpenCL定义的一类对象。资源的实例就是一个对象。最常见的资源是上下文、指令队列、程序对象、内核对象和内存对象。计算资源是参与推进程序计数器的动作的硬件元素。示例包括主机、设备、计算单元和处理元件。

Retain, Release

保留、释放

The action of incrementing (retain) and decrementing (release) the reference count using an OpenCL object. This is a bookkeeping functionality to make sure the system doesn't remove an object before all instances that use this object have finished. Refer to Reference Count.

使用OpenCL对象递增(保留)和递减(释放)引用计数的操作。这是一个记账功能,可以确保系统在所有使用该对象的实例完成之前不会删除该对象。请参阅引用计数。
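The retain and release bookkeeping can be modeled with a toy counter. This is a sketch of the host-visible rule only: `toy_object` and its functions are hypothetical and are not part of the OpenCL API, and the model ignores the extra references an implementation may hold internally.

```c
#include <assert.h>

/* Toy model of a reference-counted object as seen from host code. */
typedef struct {
    unsigned refcount; /* host-side references only */
    int accessible;    /* nonzero while host code may still use the object */
} toy_object;

/* Creation sets the reference count to one. */
static toy_object toy_create(void) {
    toy_object o = { 1u, 1 };
    return o;
}

/* Retain increments the count. */
static void toy_retain(toy_object *o) {
    assert(o->accessible);
    o->refcount++;
}

/* Release decrements the count; once releases match retains plus the
   creation, the object is no longer accessible to the host. */
static void toy_release(toy_object *o) {
    assert(o->accessible);
    o->refcount--;
    if (o->refcount == 0) {
        o->accessible = 0;
    }
}
```

One creation followed by one retain needs two releases before the object becomes inaccessible. In a real implementation the internal count at that point may still be nonzero, as the Reference Count entry notes.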

Sampler

采样器

An object that describes how to sample an image when the image is read in the kernel. The image read functions take a sampler as an argument. The sampler specifies the image addressing-mode i.e. how out-of-range image coordinates are handled, the filter mode, and whether the input image coordinate is a normalized or unnormalized value.

一个对象,描述在内核中读取图像时如何对图像进行采样。图像读取函数将采样器作为参数。采样器指定图像寻址模式,即如何处理超出范围的图像坐标、滤波器模式以及输入图像坐标是归一化值还是未归一化值。

Scope inclusion

范围包含

Two actions A and B are defined to have an inclusive scope if they have the same scope P such that: (1) if P is memory_scope_sub_group, and A and B are executed by work-items within the same sub-group, or (2) if P is memory_scope_work_group, and A and B are executed by work-items within the same work-group, or (3) if P is memory_scope_device, and A and B are executed by work-items on the same device, or (4) if P is memory_scope_all_svm_devices or memory_scope_all_devices, if A and B are executed by host threads or by work-items on one or more devices that can share SVM memory with each other and the host process.

如果两个动作A和B具有相同的作用域P,则它们被定义为具有包含作用域,使得:(1)如果P是memory_scope_sub_group,并且A和B由相同子组内的工作项执行,或者(2)如果P为memory_scope_work_group并且A和B由相同工作组内的工作项执行,或者(3)如果P是memory_scope_device并且A和B由同一设备上的工作项执行,或者(4)如果P是memory_scope_all_svm_devices或memory_scope_all_devices,如果A和B是由主机线程或由一个或多个设备上的可以彼此和主机进程共享svm存储器的工作项执行。
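
The four cases above can be read as a small decision procedure. The sketch below is one possible plain-Python rendering of it. The `Agent` class, the `has_inclusive_scope` function, and the `svm_shared` flag are all hypothetical names introduced for illustration; `svm_shared` stands in for "the devices involved can share SVM memory with each other and the host process".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Agent:
    """Who performs an action: a host thread or a work-item (hypothetical model)."""
    device: Optional[int] = None       # None means a host thread
    work_group: Optional[int] = None
    sub_group: Optional[int] = None

    @property
    def is_work_item(self):
        return self.device is not None


def has_inclusive_scope(scope_a, a, scope_b, b, svm_shared=True):
    """Return True if actions by agents a and b have an inclusive scope."""
    if scope_a != scope_b:             # both actions must name the same scope P
        return False
    same_device = a.is_work_item and b.is_work_item and a.device == b.device
    same_group = same_device and a.work_group == b.work_group
    if scope_a == "memory_scope_sub_group":
        return same_group and a.sub_group == b.sub_group
    if scope_a == "memory_scope_work_group":
        return same_group
    if scope_a == "memory_scope_device":
        return same_device
    if scope_a in ("memory_scope_all_svm_devices", "memory_scope_all_devices"):
        return svm_shared              # host threads and work-items both qualify
    return False
```

Two work-items in the same work-group but different sub-groups are therefore included under `memory_scope_work_group` but not under `memory_scope_sub_group`.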

Sequenced before

在之前排序

A relation between evaluations executed by a single unit of execution. Sequenced-before is an asymmetric, transitive, pair-wise relation that induces a partial order between evaluations. Given any two evaluations A and B, if A is sequenced-before B, then the execution of A shall precede the execution of B.

由单个执行单元执行的评估之间的关系。Sequenced before是一种不对称的、传递的、成对的关系,它在求值之间产生偏序。给定任意两个评估A和B,如果A在B之前排序,则A的执行应先于B的执行。

Sequential consistency

顺序一致性

Sequential consistency interleaves the steps executed by each unit of execution. Each access to a memory location sees the last assignment to that location in that interleaving.

顺序一致性交错每个执行单元执行的步骤。对存储器位置的每次访问都会看到在该交织中对该位置的最后一次分配。
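
The interleaving view of sequential consistency can be checked by brute force on a tiny example. The plain-Python sketch below enumerates every interleaving of two units of execution that preserves each unit's own program order, and records what the loads observe. The two-unit program (each unit stores to one location, then loads the other) is an illustrative assumption and is not taken from the specification.

```python
def interleavings(a, b):
    """Yield every merge of step lists a and b that keeps each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


# Each step is (kind, location, value-to-store or register-to-fill).
unit_a = [("store", "x", 1), ("load", "y", "r1")]
unit_b = [("store", "y", 1), ("load", "x", "r2")]

outcomes = set()
for order in interleavings(unit_a, unit_b):
    memory = {"x": 0, "y": 0}
    regs = {}
    for kind, loc, arg in order:
        if kind == "store":
            memory[loc] = arg
        else:
            regs[arg] = memory[loc]   # a load sees the last store in this order
    outcomes.add((regs["r1"], regs["r2"]))
```

Under this model the outcome `(r1, r2) == (0, 0)` never appears, because in every interleaving at least one store precedes the other unit's load.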

Sequentially consistent semantics

顺序一致语义

One of the memory order semantics defined for synchronization operations. When using sequentially-consistent synchronization operations, the loads and stores within one unit of execution appear to execute in program order (i.e., the sequenced-before order), and loads and stores from different units of execution appear to be simply interleaved.

为同步操作定义的内存顺序语义之一。当使用顺序一致的同步操作时,一个执行单元内的加载和存储看起来是按程序顺序执行的(即顺序在前的顺序),而来自不同执行单元的加载和保存看起来是简单交错的。

Shared Virtual Memory (SVM)

共享虚拟内存(SVM)

An address space exposed to both the host and the devices within a context. SVM causes addresses to be meaningful between the host and all of the devices within a context and therefore supports the use of pointer based data structures in OpenCL kernels. It logically extends a portion of the global memory into the host address space therefore giving work-items access to the host address space. There are three types of SVM in OpenCL:

在上下文中向主机和设备公开的地址空间。SVM使地址在主机和上下文中的所有设备之间具有意义,因此支持在OpenCL内核中使用基于指针的数据结构。它在逻辑上将全局内存的一部分扩展到主机地址空间,从而使工作项能够访问主机地址空间。OpenCL中有三种类型的SVM:

Coarse-Grained buffer SVM

粗粒度缓冲SVM

Sharing occurs at the granularity of regions of OpenCL buffer memory objects.

共享发生在OpenCL缓冲区内存对象区域的粒度上。

Fine-Grained buffer SVM

细粒缓冲SVM

Sharing occurs at the granularity of individual loads/stores into bytes within OpenCL buffer memory objects.

共享发生在OpenCL缓冲区内存对象中单个加载/存储到字节的粒度上。

Fine-Grained system SVM

细粒度系统SVM

Sharing occurs at the granularity of individual loads/stores into bytes occurring anywhere within the host memory.

共享以单个加载/存储到主机内存中任何位置的字节的粒度进行。

SIMD

单指令多数据

Single Instruction Multiple Data. A programming model where a kernel is executed concurrently on multiple processing elements each with its own data and a shared program counter. All processing elements execute a strictly identical set of instructions.

单指令多数据。一种编程模型,其中一个内核在多个处理单元上同时执行,每个处理单元都有自己的数据和共享的程序计数器。所有处理元素都执行一组完全相同的指令。

Specialization constants

特殊常量

Specialization constants are special constant objects that do not have known constant values in an intermediate language (e.g. SPIR-V). Applications may provide updated values for the specialization constants before a program is built. Specialization constants that do not receive a value from an application shall use the default specialization constant value.

特殊常量是在中间语言(例如SPIR-V)中没有已知常数值的特殊常量对象。应用程序可以在构建程序之前提供特殊化常数的更新值。没有从应用程序接收值的特殊常量应使用默认的特殊常数值。
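
The default-versus-override rule can be sketched as a simple merge. This is a plain-Python model only; the function `resolve_spec_constants` and the integer ids are hypothetical, and rejecting unknown ids is an assumption of this sketch.

```python
def resolve_spec_constants(defaults, overrides):
    """Return the values a build would use: overrides win, defaults fill the rest."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"no such specialization constant id(s): {sorted(unknown)}")
    return {**defaults, **overrides}
```

For example, with defaults `{0: 64, 1: 1.0}` and an application that only sets id 0 to 128, the build would see `{0: 128, 1: 1.0}`.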

SPMD

Single Program Multiple Data. A programming model where a kernel is executed concurrently on multiple processing elements, each with its own data and its own program counter. Hence, while all computational resources run the same kernel, they maintain their own instruction counter and, due to branches in a kernel, the actual sequence of instructions can be quite different across the set of processing elements.

单程序多数据。一种编程模型,其中一个内核在多个处理单元上同时执行,每个处理单元都有自己的数据和程序计数器。因此,当所有计算资源都运行相同的内核时,它们维护自己的指令计数器,并且由于内核中的分支,实际的指令序列在处理元素的集合中可能非常不同。
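
The difference from SIMD shows up as soon as a kernel branches on its data. The following plain-Python sketch runs the same toy "kernel" for two work-items and records the instructions each one executes. The function `run_work_item` and its instruction names are hypothetical.

```python
def run_work_item(value):
    """Run one toy kernel instance and record the instructions it executes."""
    trace = ["load"]
    if value % 2 == 0:          # data-dependent branch
        trace.append("halve")
        value //= 2
    else:
        trace += ["triple", "increment"]
        value = 3 * value + 1
    trace.append("store")
    return value, trace


# Same program, different data: the two instruction traces diverge.
results = [run_work_item(v) for v in (4, 7)]
```

Both work-items run the same program, yet their traces differ in both content and length, which is what having a separate program counter per processing element permits.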

Sub-device

子设备

An OpenCL device can be partitioned into multiple sub-devices. The new sub-devices alias specific collections of compute units within the parent device, according to a partition scheme. The sub-devices may be used in any situation that their parent device may be used. Partitioning a device does not destroy the parent device, which may continue to be used alongside and intermingled with its child sub-devices. Also see Device, Parent device and Root device.

一个OpenCL设备可以划分为多个子设备。根据分区方案,新的子设备对父设备内的计算单元的特定集合进行别名。子设备可以在其父设备可以被使用的任何情况下使用。对设备进行分区不会破坏父设备,父设备可以继续沿边使用,并与其子设备混合使用。另请参阅设备、父设备和根设备。
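
As an illustration of one partition scheme, the plain-Python sketch below splits a parent device's compute units into equally sized sub-devices. It is modeled loosely on the behavior of an equal partition (as with `CL_DEVICE_PARTITION_EQUALLY`), where leftover compute units are not used. The function `partition_equally` is hypothetical, and compute units are represented simply as a list of ids.

```python
def partition_equally(compute_units, n):
    """Split a list of compute-unit ids into sub-devices of exactly n units each."""
    if n <= 0:
        raise ValueError("n must be positive")
    full = len(compute_units) // n      # leftover units are not assigned
    return [compute_units[i * n:(i + 1) * n] for i in range(full)]
```

Note that the parent list is left untouched, mirroring the statement that partitioning does not destroy the parent device: the sub-devices only alias subsets of its compute units.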

Sub-group

子组

Sub-groups are an implementation-dependent grouping of work-items within a work-group. The size and number of sub-groups is implementation-defined.

子组是工作组中工作项的实现相关分组。子组的大小和数量由实现定义。

Sub-group Barrier

子组栅栏

See Barrier.

请参见栅栏。

Submitted

已提交

The second state in the six state model for the execution of a command. The transition into this state occurs when the command is flushed from the command-queue and submitted for execution on the device. Once submitted, a programmer can assume a command will execute once its prerequisites have been met.

六态模型中用于执行指令的第二种状态。当指令从指令队列中刷新并提交以在设备上执行时,将转换到此状态。提交后,程序员可以假设一个指令将在满足其先决条件后执行。
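
The position of this state in the six-state model can be sketched as a linear state machine. The state names below follow the OpenCL execution model (queued, submitted, ready, running, ended, complete); the `advance` helper is a hypothetical simplification that ignores errors and the prerequisites that gate each transition.

```python
STATES = ["queued", "submitted", "ready", "running", "ended", "complete"]


def advance(state):
    """Return the state that follows `state`; a complete command stays complete."""
    i = STATES.index(state)
    return STATES[min(i + 1, len(STATES) - 1)]
```

In this model `advance("queued")` yields `"submitted"`, and a submitted command next becomes `"ready"` once its prerequisites have been met.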

SVM Buffer

SVM缓冲区

A memory allocation enabled to work with Shared Virtual Memory (SVM). Depending on how the SVM buffer is created, it can be a coarse-grained or fine-grained SVM buffer. Optionally it may be wrapped by a Buffer Object. See Shared Virtual Memory (SVM).

一种允许使用共享虚拟内存(SVM)的内存分配。根据SVM缓冲区的创建方式,它可以是粗粒度或细粒度的SVM缓冲区。可选地,它可以由缓冲区对象包装。请参阅共享虚拟内存(SVM)。

Synchronization

同步

Synchronization refers to mechanisms that constrain the order of execution and the visibility of memory operations between two or more units of execution.

同步是指限制两个或多个执行单元之间的执行顺序和内存操作可见性的机制。

Synchronization operations

同步操作

Operations that define memory order constraints in a program. They play a special role in controlling how memory operations in one unit of execution (such as work-items or, when using SVM, a host thread) are made visible to another. Synchronization operations in OpenCL include atomic operations and fences.

在程序中定义内存顺序约束的操作。它们在控制一个执行单元中的内存操作(如工作项或使用SVM时的主线程)如何对另一个单元可见方面发挥着特殊作用。OpenCL中的同步操作包括原子操作和栅栏。

Synchronization point

同步点

A synchronization point between a pair of commands (A and B) assures that the results of command A happen-before command B is launched (i.e. enters the ready state).

一对指令(A和B)之间的同步点确保指令A的结果发生在指令B启动之前(即进入就绪状态)。

Synchronizes with

与同步

A relation between operations in two different units of execution that defines a memory order constraint in global memory (global-synchronizes-with) or local memory (local-synchronizes-with).

两个不同执行单元中的操作之间的关系,定义了全局内存(全局与同步)或本地内存(本地与同步)中的内存顺序约束。

Task Parallel Programming Model

任务并行编程模型

A programming model in which computations are expressed in terms of multiple concurrent tasks executing in one or more command-queues. The concurrent tasks can be running different kernels.

一种编程模型,其中计算是用在一个或多个指令队列中执行的多个并发任务来表示的。并发任务可以运行不同的内核。

Thread-safe

线程安全

An OpenCL API call is considered to be thread-safe if the internal state as managed by OpenCL remains consistent when called simultaneously by multiple host threads. OpenCL API calls that are thread-safe allow an application to call these functions in multiple host threads without having to implement mutual exclusion across these host threads i.e. they are also re-entrant-safe.

如果OpenCL管理的内部状态在多个主机线程同时调用时保持一致,则认为OpenCL API调用是线程安全的。线程安全的OpenCL API调用允许应用程序在多个主机线程中调用这些函数,而不必在这些主机线程之间实现互斥,即它们也是可重入安全的。

Undefined

未限定的

The behavior of an OpenCL API call, built-in function used inside a kernel or execution of a kernel that is explicitly not defined by OpenCL. A conforming implementation is not required to specify what occurs when an undefined construct is encountered in OpenCL.

OpenCL API调用的行为、内核内部使用的内置函数或未由OpenCL明确定义的内核的执行。当在OpenCL中遇到未定义的构造时,不需要一致的实现来指定会发生什么。

Unit of execution

执行单元

A generic term for a process, OS managed thread running on the host (a host-thread), kernel-instance, host program, work-item or any other executable agent that advances the work associated with a program.

进程、运行在主机(主机线程)上的操作系统管理线程、内核实例、主机程序、工作项或任何其他可执行代理的通用术语,用于推进与程序相关的工作。

Valid Object

有效对象

An OpenCL object is considered valid if it meets all of the following criteria:

如果OpenCL对象满足以下所有条件,则该对象被视为有效:

  • The object was created by a successful call to an OpenCL API function.

  • 该对象是通过对OpenCL API函数的成功调用创建的。

  • The object has a strictly positive application-owned reference count.

  • 该对象具有严格正的应用程序拥有的引用计数。

  • The object has not had its backing memory changed outside of normal usage by the OpenCL implementation (e.g. corrupted by the application, a library it uses, the implementation itself, or any other agent that can access the object’s backing memory).

  • 在OpenCL实现的正常使用之外,对象的后备内存没有发生更改(例如,被应用程序、它使用的库、实现本身或任何其他可以访问对象后备内存的代理损坏)。

An object is only valid in the platform where it was created.

对象仅在创建对象的平台中有效。

An OpenCL implementation must check for a NULL object to determine if an object is valid. The behavior for all other invalid objects is implementation-defined.

OpenCL实现必须检查NULL对象,以确定对象是否有效。所有其他无效对象的行为都是由实现定义的。

Work-group

工作组

A collection of related work-items that execute on a single compute unit. The work-items in the group execute the same kernel-instance and share local memory and work-group functions.

在单个计算单元上执行的相关工作项的集合。组中的工作项执行相同的内核实例,并共享本地内存和工作组功能。

Work-group Barrier

工作组栅栏

See Barrier.

请参见栅栏。

Work-group Function

工作组功能

A function that carries out collective operations across all the work-items in a work-group. Available collective operations are a barrier, reduction, broadcast, prefix sum, and evaluation of a predicate. A work-group function must occur within a converged control flow; i.e. all work-items in the work-group must encounter precisely the same work-group function.

对工作组中的所有工作项执行集体操作的函数。可用的集合运算是谓词的栅栏、归约、广播、前缀和和求值。工作组功能必须出现在聚合控制流中;即工作组中的所有工作项必须遇到完全相同的工作组功能。
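
The collective operations listed above can be modeled on a list that holds one value per work-item, indexed by local ID. The plain-Python sketch below is not OpenCL C; the function names echo the OpenCL C built-ins (`work_group_reduce_add`, `work_group_scan_exclusive_add`, `work_group_broadcast`, `work_group_all`), but the list-in, list-out signatures are an assumption made for illustration. Each function returns the value every work-item would receive.

```python
from itertools import accumulate


def work_group_reduce_add(values):
    """Every work-item receives the sum of all values in the group."""
    return [sum(values)] * len(values)


def work_group_scan_exclusive_add(values):
    """Work-item i receives the sum of the values of work-items 0..i-1."""
    if not values:
        return []
    return [0] + list(accumulate(values))[:-1]


def work_group_broadcast(values, local_id):
    """Every work-item receives the value held by work-item local_id."""
    return [values[local_id]] * len(values)


def work_group_all(predicates):
    """Every work-item learns whether the predicate held for all of them."""
    return [all(predicates)] * len(predicates)
```

Because each result depends on the values of all work-items, the model also hints at why the call must sit in converged control flow: a work-item that skipped the call would leave a hole in the input list.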

Work-group Synchronization

工作组同步

Constraints on the order of execution for work-items in a single work-group.

对单个工作组中工作项的执行顺序的约束。

Work-pool

工作池

A logical pool associated with a device that holds commands and work-groups from kernel-instances that are ready to execute. OpenCL does not constrain the order that commands and work-groups are scheduled for execution from the work-pool; i.e. a programmer must assume that they could be interleaved. There is one work-pool per device used by all command-queues associated with that device. The work-pool may be implemented in any manner as long as it assures that work-groups placed in the pool will eventually execute.

与设备相关联的逻辑池,用于保存准备执行的内核实例中的指令和工作组。OpenCL不限制指令和工作组计划从工作池执行的顺序;即程序员必须假设它们可以被交织。与该设备相关联的所有命令队列都使用每个设备一个工作池。工作池可以以任何方式实现,只要它确保放置在池中的工作组最终会执行即可。

Work-item

工作项

One of a collection of parallel executions of a kernel invoked on a device by a command. A work-item is executed by one or more processing elements as part of a work-group executing on a compute unit. A work-item is distinguished from other work-items by its global ID or the combination of its work-group ID and its local ID within a work-group.

由指令在设备上调用的内核的并行执行集合之一。工作项由一个或多个处理元件执行,作为在计算单元上执行的工作组的一部分。工作项与其他工作项的区别在于其全局ID或其工作组ID和工作组内的本地ID的组合。
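
The two ways of identifying a work-item are related by simple arithmetic. The plain-Python sketch below shows the one-dimensional case, assuming uniform work-group sizes; the function names `global_id` and `split_global_id` are hypothetical.

```python
def global_id(group_id, local_id, local_size, global_offset=0):
    """Global ID in one dimension from work-group ID and local ID."""
    if not 0 <= local_id < local_size:
        raise ValueError("local_id must lie in [0, local_size)")
    return group_id * local_size + local_id + global_offset


def split_global_id(gid, local_size, global_offset=0):
    """Inverse mapping: recover (work-group ID, local ID) from a global ID."""
    return divmod(gid - global_offset, local_size)
```

With a work-group size of 64, the work-item with local ID 3 in work-group 2 has global ID 131, and `split_global_id(131, 64)` recovers `(2, 3)`.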
