OpenCL™ Specification 5.14. Out-of-Order Execution of Kernels and Memory Object Commands

5.14. Out-of-Order Execution of Kernels and Memory Object Commands

The OpenCL functions that are submitted to a command-queue are enqueued in the order the calls are made but can be configured to execute in-order or out-of-order. The properties argument in clCreateCommandQueueWithProperties or clCreateCommandQueue can be used to specify the execution order.

If the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE property of a command-queue is not set, the commands enqueued to a command-queue execute in-order. For example, if an application calls clEnqueueNDRangeKernel to execute kernel A followed by a clEnqueueNDRangeKernel to execute kernel B, the application can assume that kernel A finishes first and then kernel B is executed. If the memory objects output by kernel A are inputs to kernel B then kernel B will see the correct data in memory objects produced by execution of kernel A. If the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE property of a command-queue is set, then there is no guarantee that kernel A will finish before kernel B starts execution.

Applications can configure the commands enqueued to a command-queue to execute out-of-order by setting the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE property when the command-queue is created. In out-of-order execution mode there is no guarantee that the enqueued commands will finish execution in the order they were queued. Kernels are not guaranteed to execute in the order in which the clEnqueueNDRangeKernel or clEnqueueTask calls were made within a command-queue, so an earlier clEnqueueNDRangeKernel call to execute kernel A (identified by event A) may execute and/or finish later than a clEnqueueNDRangeKernel call to execute kernel B that the application made at a later point in time. To guarantee a specific order of execution of kernels, kernel B can wait on event A: event A is passed in the event_wait_list argument of the clEnqueueNDRangeKernel call for kernel B.
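The ordering rule can be illustrated with a small Python model. This is only a sketch of the semantics, not OpenCL host code: `legal_orders` and `wait_lists` are hypothetical names, and each command is treated as one atomic step, so "executes before" and "finishes before" coincide. The function lists every completion order an out-of-order queue could produce, given which events each command waits on.

```python
from itertools import permutations

def legal_orders(wait_lists):
    """Return every order an out-of-order queue may complete commands in.

    wait_lists maps a command name to the names of the commands whose
    events appear in its event_wait_list (hypothetical toy model).
    """
    names = list(wait_lists)
    orders = []
    for order in permutations(names):
        position = {name: index for index, name in enumerate(order)}
        # An order is legal only if every command follows all commands it waits on.
        if all(position[dep] < position[name]
               for name in names for dep in wait_lists[name]):
            orders.append(order)
    return orders

# No wait list: kernel B may complete before kernel A.
unordered = legal_orders({"A": [], "B": []})
# Kernel B waits on event A: only one order remains.
ordered = legal_orders({"A": [], "B": ["A"]})
```

In this model, `unordered` contains both `("A", "B")` and `("B", "A")`, while `ordered` contains only `("A", "B")`.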

In addition, a marker (clEnqueueMarker or clEnqueueMarkerWithWaitList) or a barrier (clEnqueueBarrier or clEnqueueBarrierWithWaitList) command can be enqueued to the command-queue. The marker command ensures that previously enqueued commands identified by the list of events to wait for (or all previous commands) have finished. A barrier command is similar to a marker command, but additionally guarantees that no later-enqueued commands will execute until the waited-for commands have executed.
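The difference between a marker and a barrier can be sketched with a toy Python model. It is an illustration under assumptions, not OpenCL code: `legal_orders_with_sync` is a hypothetical name, every command is one atomic step, and the marker or barrier is assumed to have an empty wait list, so it waits for all previously enqueued commands.

```python
from itertools import permutations

def legal_orders_with_sync(enqueued):
    """List the completion orders allowed for an out-of-order queue.

    enqueued is a list of (name, kind) pairs in enqueue order, where kind
    is "command", "marker" or "barrier" (hypothetical toy model).
    """
    must_precede = set()  # (earlier, later) pairs that every legal order obeys
    for index, (name, kind) in enumerate(enqueued):
        if kind in ("marker", "barrier"):
            # Both complete only after everything enqueued before them.
            for earlier, _ in enqueued[:index]:
                must_precede.add((earlier, name))
        if kind == "barrier":
            # Only a barrier also holds back everything enqueued after it.
            for later, _ in enqueued[index + 1:]:
                must_precede.add((name, later))
    names = [name for name, _ in enqueued]
    orders = []
    for order in permutations(names):
        position = {n: i for i, n in enumerate(order)}
        if all(position[a] < position[b] for a, b in must_precede):
            orders.append(order)
    return orders

with_marker = legal_orders_with_sync(
    [("A", "command"), ("M", "marker"), ("B", "command")])
with_barrier = legal_orders_with_sync(
    [("A", "command"), ("M", "barrier"), ("B", "command")])
```

With the marker, B is still free to run before A; with the barrier, the only order left is A, then the barrier, then B.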

Similarly, commands to read, write, copy or map memory objects that are enqueued after clEnqueueNDRangeKernel, clEnqueueTask or clEnqueueNativeKernel commands are not guaranteed to wait for kernels scheduled for execution to have completed (if the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE property is set). To ensure correct ordering of commands, the event object returned by clEnqueueNDRangeKernel, clEnqueueTask or clEnqueueNativeKernel can be used to enqueue a wait for that event, or a barrier command can be enqueued that must complete before reads or writes to the memory object(s) occur.
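Why the read needs that wait can be shown with a deliberately tiny Python model. Nothing here is OpenCL API: `host_read_result` is a hypothetical helper, the buffer is a plain dictionary, and the values 0 and 42 are arbitrary stand-ins for "old contents" and "kernel output".

```python
def host_read_result(order):
    """Run 'kernel' and 'read' in the given order; return what the read saw."""
    buffer = {"value": 0}  # contents of the memory object before the kernel runs
    seen = {}

    def kernel():
        buffer["value"] = 42  # the kernel writes its output into the buffer

    def read():
        seen["host"] = buffer["value"]  # the read command copies the buffer to the host

    actions = {"kernel": kernel, "read": read}
    for name in order:
        actions[name]()
    return seen["host"]

# The read did not wait on the kernel's event, so it may run first.
stale = host_read_result(["read", "kernel"])
# The read waited on the kernel's event (or a barrier sat between them).
fresh = host_read_result(["kernel", "read"])
```

Both orders are possible in an out-of-order queue unless the read waits on the kernel's event or a barrier separates them; only the second returns the kernel's output.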
