OpenCL™ Specification 3.3.7. Memory Ordering Rules

3.3.7. Memory Ordering Rules

Fundamentally, the issue in a memory model is to understand the orderings in time of modifications to objects in memory. Modifying an object or calling a function that modifies an object are side effects, i.e. changes in the state of the execution environment. Evaluation of an expression in general includes both value computations and initiation of side effects. Value computation for an lvalue expression includes determining the identity of the designated object. [C11 standard, Section 5.1.2.3, paragraph 2, modified.]

We assume that the OpenCL kernel language and host programming languages have a sequenced-before relation between the evaluations executed by a single unit of execution. This sequenced-before relation is an asymmetric, transitive, pair-wise relation between those evaluations, which induces a partial order among them. Given any two evaluations A and B, if A is sequenced-before B, then the execution of A shall precede the execution of B. (Conversely, if A is sequenced-before B, then B is sequenced-after A.) If A is not sequenced-before or sequenced-after B, then A and B are unsequenced. Evaluations A and B are indeterminately sequenced when A is either sequenced-before or sequenced-after B, but it is unspecified which. [C11 standard, Section 5.1.2.3, paragraph 3, modified.]

Sequenced-before is a partial order of the operations executed by a single unit of execution (e.g. a host thread or work-item). It generally corresponds to the source program order of those operations, and is partial because of the undefined argument evaluation order of the OpenCL C kernel language.
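
As a minimal sketch of the definitions above, the following Python model represents sequenced-before as a set of ordered pairs and classifies two evaluations. It is an illustration only, not OpenCL code: the evaluation names (read_x, read_y, call_f, store_r) and the helpers transitive_closure and classify are hypothetical, and the example assumes a statement of the form r = f(x, y), where the two argument evaluations are not ordered relative to each other.

```python
from itertools import product

def transitive_closure(pairs):
    """Return the smallest transitive relation containing `pairs`."""
    closure = set(pairs)
    while True:
        extra = {(a, d) for (a, b), (c, d) in product(closure, closure) if b == c}
        if extra <= closure:
            return closure
        closure |= extra

def classify(sequenced_before, a, b):
    """Describe how evaluation `a` relates to evaluation `b`."""
    if (a, b) in sequenced_before:
        return "sequenced-before"
    if (b, a) in sequenced_before:
        return "sequenced-after"
    return "unsequenced"

# Hypothetical evaluations for the statement: r = f(x, y);
# Both argument reads precede the call; the call precedes the store to r.
direct = {("read_x", "call_f"), ("read_y", "call_f"), ("call_f", "store_r")}
sb = transitive_closure(direct)

print(classify(sb, "read_x", "store_r"))  # sequenced-before
print(classify(sb, "store_r", "read_y"))  # sequenced-after
print(classify(sb, "read_x", "read_y"))   # unsequenced
```

The last line shows why the order is only partial: neither argument read is ordered with respect to the other.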

In an OpenCL kernel language, the value of an object visible to a work-item W at a particular point is the initial value of the object, a value stored in the object by W, or a value stored in the object by another work-item or host thread, according to the rules below. Depending on details of the host programming language, the value of an object visible to a host thread may also be the value stored in that object by another work-item or host thread. [C11 standard, Section 5.1.2.4, paragraph 2, modified.]

Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location. [C11 standard, Section 5.1.2.4, paragraph 4.]

All modifications to a particular atomic object M occur in some particular total order, called the modification order of M. If A and B are modifications of an atomic object M, and A happens-before B (a relation defined below), then A shall precede B in the modification order of M. Note that the modification order of an atomic object M is independent of whether M is in local or global memory. [C11 standard, Section 5.1.2.4, paragraph 7, modified.]

A release sequence begins with a release operation A on an atomic object M and is the maximal contiguous sub-sequence of side effects in the modification order of M, where the first operation is A and every subsequent operation either is performed by the same work-item or host thread that performed the release or is an atomic read-modify-write operation. [C11 standard, Section 5.1.2.4, paragraph 10, modified.]
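
The definition of a release sequence can be sketched as a scan over the modification order. The Python model below is an illustration under stated assumptions, not part of the specification: the Write record, its fields, and the work-item names wi0, wi1 and wi2 are hypothetical, and the modification order is simply given as a list.

```python
from typing import NamedTuple

class Write(NamedTuple):
    unit: str         # work-item or host thread that performed the side effect
    is_release: bool  # True if the operation has release semantics
    is_rmw: bool      # True for an atomic read-modify-write operation

def release_sequence(mod_order, head):
    """Return the indices of the release sequence headed by mod_order[head]."""
    first = mod_order[head]
    if not first.is_release:
        raise ValueError("a release sequence must begin with a release operation")
    seq = [head]
    for i in range(head + 1, len(mod_order)):
        w = mod_order[i]
        if w.unit == first.unit or w.is_rmw:
            seq.append(i)
        else:
            break  # a plain store from another unit ends the contiguous run
    return seq

mod_order = [
    Write("wi0", is_release=True,  is_rmw=False),  # 0: release store by wi0
    Write("wi1", is_release=False, is_rmw=True),   # 1: RMW by another work-item
    Write("wi0", is_release=False, is_rmw=False),  # 2: later store by wi0
    Write("wi2", is_release=False, is_rmw=False),  # 3: plain store by wi2
    Write("wi0", is_release=False, is_rmw=False),  # 4: after the break, excluded
]
print(release_sequence(mod_order, 0))  # [0, 1, 2]
```

Entry 4 is excluded even though wi0 performed it, because the sub-sequence must be contiguous.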

OpenCL’s local and global memories are disjoint. Kernels may access both kinds of memory while host threads may only access global memory. Furthermore, the flags argument of OpenCL’s work_group_barrier function specifies which memory operations the function will make visible: these memory operations can be, for example, just the ones to local memory, or the ones to global memory, or both. Since the visibility of memory operations can be specified for local memory separately from global memory, we define two related but independent relations, global-synchronizes-with and local-synchronizes-with. Certain operations on global memory may global-synchronize-with other operations performed by another work-item or host thread. An example is a release atomic operation in one work-item that global-synchronizes-with an acquire atomic operation in a second work-item. Similarly, certain atomic operations on local objects in kernels can local-synchronize-with other atomic operations on those local objects. [C11 standard, Section 5.1.2.4, paragraph 11, modified.]

We define two separate happens-before relations: global-happens-before and local-happens-before.

A global memory action A global-happens-before a global memory action B if

  • A is sequenced before B, or

  • A global-synchronizes-with B, or

  • for some global memory action C, A global-happens-before C and C global-happens-before B.

A local memory action A local-happens-before a local memory action B if

  • A is sequenced before B, or

  • A local-synchronizes-with B, or

  • for some local memory action C, A local-happens-before C and C local-happens-before B.
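
Both definitions have the same shape: the transitive closure of sequenced-before together with the matching synchronizes-with relation. The Python sketch below illustrates this for global memory actions. The function name happens_before and the action names are hypothetical, and the relations are supplied by hand rather than derived from a real program. The same function would be applied separately to local memory actions with local-synchronizes-with edges.

```python
def happens_before(sequenced_before, synchronizes_with):
    """Transitive closure of sequenced-before united with synchronizes-with."""
    hb = set(sequenced_before) | set(synchronizes_with)
    changed = True
    while changed:
        changed = False
        for (a, c) in list(hb):
            for (c2, b) in list(hb):
                if c == c2 and (a, b) not in hb:
                    hb.add((a, b))  # third rule: A hb C and C hb B gives A hb B
                    changed = True
    return hb

# Hypothetical global memory actions in two work-items.
sb = {
    ("wi0.store_data", "wi0.release_flag"),   # program order in work-item 0
    ("wi1.acquire_flag", "wi1.load_data"),    # program order in work-item 1
}
global_sw = {("wi0.release_flag", "wi1.acquire_flag")}

ghb = happens_before(sb, global_sw)
print(("wi0.store_data", "wi1.load_data") in ghb)  # True
print(("wi1.load_data", "wi0.store_data") in ghb)  # False
```

Without the synchronizes-with edge, the store and the load in different work-items would not be ordered at all.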

An OpenCL 2.x implementation shall ensure that no program execution demonstrates a cycle in either the local-happens-before relation or the global-happens-before relation.
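
One way to picture this requirement is as a check that a candidate happens-before relation, viewed as a directed graph, is acyclic. The following Python sketch is an assumption-laden illustration (the function has_cycle and the action names A, B, C are hypothetical). It shows how a tool reasoning about executions might test the property, not how an implementation enforces it.

```python
def has_cycle(edges):
    """Detect a cycle in a directed relation given as (before, after) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:
                return True  # reached a node already on the current path
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

acyclic = {("A", "B"), ("B", "C")}
cyclic = acyclic | {("C", "A")}
print(has_cycle(acyclic))  # False
print(has_cycle(cyclic))   # True
```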

The global- and local-happens-before relations are critical to defining what values are read and when data races occur. The global-happens-before relation, for example, defines which global memory operations definitely happen before which other global memory operations. If an operation A global-happens-before operation B then A must occur before B; in particular, any write done by A will be visible to B. The local-happens-before relation has similar properties for local memory. Programmers can use the local- and global-happens-before relations to reason about the order of program actions.

A visible side effect A on a global object M with respect to a value computation B of M satisfies the conditions:

  • A global-happens-before B, and

  • there is no other side effect X to M such that A global-happens-before X and X global-happens-before B.
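
The two conditions translate directly into a filter over the side effects on M. The Python sketch below is a hedged illustration: visible_side_effects and the action names W1, W2, W3 and R are hypothetical, and the happens-before relation hb is assumed to be already transitively closed.

```python
def visible_side_effects(side_effects, hb, b):
    """Return the side effects on M that are visible to value computation b."""
    visible = []
    for a in side_effects:
        if (a, b) not in hb:
            continue  # first condition fails: a does not happen-before b
        hidden = any((a, x) in hb and (x, b) in hb
                     for x in side_effects if x != a)
        if not hidden:
            visible.append(a)  # no intervening side effect x between a and b
    return visible

# Hypothetical execution: W1 happens-before W2, which happens-before read R.
# W3 is a write that is not ordered with respect to R.
hb = {("W1", "W2"), ("W2", "R"), ("W1", "R")}
print(visible_side_effects(["W1", "W2", "W3"], hb, "R"))  # ['W2']
```

W1 is hidden by W2, and W3 is not visible because it does not happen-before R.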

We define visible side effects for local objects M similarly. The value of a non-atomic scalar object M, as determined by evaluation B, shall be the value stored by the visible side effect A. [C11 standard, Section 5.1.2.4, paragraph 19, modified.]

The execution of a program contains a data race if it contains two conflicting actions A and B in different units of execution, and

  • (1) at least one of A or B is not atomic, or A and B do not have inclusive memory scope, and

  • (2) the actions are global actions unordered by the global-happens-before relation or are local actions unordered by the local-happens-before relation.
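
The conditions above can be written as a predicate over two actions. The Python sketch below is only a model under stated assumptions: the Action record and its fields are hypothetical, the happens-before relations are passed in as sets of name pairs, and whether the two actions have inclusive memory scope is supplied by the caller as a boolean rather than computed.

```python
from typing import NamedTuple

class Action(NamedTuple):
    name: str
    unit: str       # work-item or host thread performing the action
    location: str   # memory location accessed
    is_write: bool
    is_atomic: bool
    space: str      # "global" or "local"

def conflict(a, b):
    """Two actions conflict if they touch one location and at least one writes."""
    return a.location == b.location and (a.is_write or b.is_write)

def is_data_race(a, b, ghb, lhb, inclusive_scope=True):
    if not conflict(a, b) or a.unit == b.unit:
        return False
    # Condition (1): a non-atomic action, or no inclusive memory scope.
    cond1 = (not a.is_atomic) or (not b.is_atomic) or (not inclusive_scope)
    # Condition (2): unordered by the happens-before relation for that memory.
    hb = ghb if a.space == "global" else lhb
    unordered = (a.name, b.name) not in hb and (b.name, a.name) not in hb
    return cond1 and unordered

w = Action("w", "wi0", "buf[0]", True, False, "global")
r = Action("r", "wi1", "buf[0]", False, False, "global")
print(is_data_race(w, r, ghb=set(), lhb=set()))         # True
print(is_data_race(w, r, ghb={("w", "r")}, lhb=set()))  # False
```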

Any such data race results in undefined behavior. [C11 standard, Section 5.1.2.4, paragraph 25, modified.]

We also define the visible sequence of side effects on local and global atomic objects. The remaining paragraphs of this subsection define this sequence for a global atomic object M; the visible sequence of side effects for a local atomic object is defined similarly by using the local-happens-before relation.

The visible sequence of side effects on a global atomic object M, with respect to a value computation B of M, is a maximal contiguous sub-sequence of side effects in the modification order of M, where the first side effect is visible with respect to B, and for every side effect, it is not the case that B global-happens-before it. The value of M, as determined by evaluation B, shall be the value stored by some operation in the visible sequence of M with respect to B. [C11 standard, Section 5.1.2.4, paragraph 22, modified.]
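
The visible sequence can be sketched as a walk along the modification order. In the Python model below, visible_sequence and the action names are hypothetical, hb is assumed to be a transitively closed global-happens-before relation, and the sketch assumes that if several side effects qualify as visible, the latest one in modification order is the starting point.

```python
def visible_sequence(mod_order, hb, b):
    """Return the side effects whose values value computation b may observe."""
    def is_visible(a):
        return (a, b) in hb and not any(
            (a, x) in hb and (x, b) in hb for x in mod_order if x != a)

    visible = [i for i, w in enumerate(mod_order) if is_visible(w)]
    if not visible:
        return []
    start = visible[-1]  # assumption: start from the latest visible side effect
    seq = []
    for w in mod_order[start:]:
        if (b, w) in hb:
            break  # b happens-before w, so w cannot belong to the sequence
        seq.append(w)
    return seq

# Hypothetical execution: W0 hb W1 hb R hb W3; W2 is unordered relative to R.
mod_order = ["W0", "W1", "W2", "W3"]
hb = {("W0", "W1"), ("W1", "R"), ("W0", "R"),
      ("R", "W3"), ("W0", "W3"), ("W1", "W3")}
print(visible_sequence(mod_order, hb, "R"))  # ['W1', 'W2']
```

In this example the read R may return the value stored by W1 or by W2, but not the value of W0 (hidden by W1) or W3 (which R happens-before).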

If an operation A that modifies an atomic object M global-happens-before an operation B that modifies M, then A shall be earlier than B in the modification order of M. This requirement is known as write-write coherence.

If a value computation A of an atomic object M global-happens-before a value computation B of M, and A takes its value from a side effect X on M, then the value computed by B shall either equal the value stored by X, or be the value stored by a side effect Y on M, where Y follows X in the modification order of M. This requirement is known as read-read coherence. [C11 standard, Section 5.1.2.4, paragraph 22, modified.]

If a value computation A of an atomic object M global-happens-before an operation B on M, then A shall take its value from a side effect X on M, where X precedes B in the modification order of M. This requirement is known as read-write coherence.

If a side effect X on an atomic object M global-happens-before a value computation B of M, then the evaluation B shall take its value from X or from a side effect Y that follows X in the modification order of M. This requirement is known as write-read coherence.
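
The four coherence requirements can be checked mechanically against a candidate execution. The Python sketch below is an illustrative model, not a definitive checker: coherence_violations and the action names are hypothetical, reads_from maps each read to the write it takes its value from, hb is a global-happens-before relation given as name pairs, and reads and writes are treated as separate actions (read-modify-write operations are not modelled).

```python
def coherence_violations(mod_order, hb, reads_from):
    """Return the names of the coherence rules a candidate execution violates."""
    pos = {w: i for i, w in enumerate(mod_order)}  # position in modification order
    bad = set()
    for a, b in hb:
        a_w, b_w = a in pos, b in pos
        a_r, b_r = a in reads_from, b in reads_from
        if a_w and b_w and pos[a] > pos[b]:
            bad.add("write-write")
        if a_r and b_r and pos[reads_from[b]] < pos[reads_from[a]]:
            bad.add("read-read")
        if a_r and b_w and pos[reads_from[a]] >= pos[b]:
            bad.add("read-write")
        if a_w and b_r and pos[reads_from[b]] < pos[a]:
            bad.add("write-read")
    return sorted(bad)

mod_order = ["W1", "W2"]  # W1 precedes W2 in the modification order of M
# R1 happens-before R2, yet R2 observes the older write: not allowed.
print(coherence_violations(mod_order, {("R1", "R2")}, {"R1": "W2", "R2": "W1"}))
# ['read-read']
print(coherence_violations(mod_order, {("R1", "R2")}, {"R1": "W1", "R2": "W2"}))
# []
```

For a local atomic object the same checks would apply with the local-happens-before relation in place of hb.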
