OpenCL™ Specification 5.8.6.3. Optimization Options

5.8.6.3. Optimization Options

These options control various sorts of optimizations. Turning on optimization flags makes the compiler attempt to improve performance and/or code size at the expense of compilation time and possibly the ability to debug the program.

-cl-opt-disable

This option disables all optimizations. By default, optimizations are enabled.

-cl-strict-aliasing

This option allows the compiler to assume the strictest aliasing rules.

Note: This option is deprecated by version 1.1.

-cl-uniform-work-group-size

This requires that the global work-size be a multiple of the work-group size specified to clEnqueueNDRangeKernel. Allows optimizations that are made possible by this restriction.

Note: This option is missing before version 2.0.
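
The multiple-of restriction can be checked on the host before a kernel is enqueued. The following is a minimal sketch in plain Python, not an OpenCL API: the helper names `is_uniform` and `round_up_global` are hypothetical, and the sizes are illustrative tuples with one entry per dimension.

```python
def is_uniform(global_size, local_size):
    """Return True when every global dimension is a whole multiple of the
    corresponding work-group dimension (hypothetical helper)."""
    return all(g % l == 0 for g, l in zip(global_size, local_size))


def round_up_global(global_size, local_size):
    """Pad each global dimension up to the next multiple of the work-group
    size (hypothetical helper)."""
    return tuple(-(-g // l) * l for g, l in zip(global_size, local_size))


print(is_uniform((1024, 768), (16, 16)))       # True
print(is_uniform((1000, 768), (16, 16)))       # False
print(round_up_global((1000, 768), (16, 16)))  # (1008, 768)
```

If the global size is padded this way, the kernel would typically need a bounds check so that the extra work-items do nothing.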

-cl-no-subgroup-ifp

This indicates that kernels in this program do not require sub-groups to make independent forward progress. Allows optimizations that are made possible by this restriction. This option has no effect for devices that do not support independent forward progress for sub-groups.

Note: This option is missing before version 2.1.

The following options control compiler behavior regarding floating-point arithmetic. These options trade off between performance and correctness and must be specifically enabled. They are not turned on by default because they can result in incorrect output for programs that depend on an exact implementation of IEEE 754 rules/specifications for math functions.

-cl-mad-enable

Allow a * b + c to be replaced by a mad instruction. The mad instruction may compute a * b + c with reduced accuracy in the embedded profile. See the OpenCL C or OpenCL SPIR-V Environment specification for accuracy details. On some hardware the mad instruction may provide better performance than the expanded computation.
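
To see why replacing a * b + c with a single instruction can change the result, the sketch below compares two ways of rounding the same expression. It rests on assumptions: it uses Python doubles rather than OpenCL floats, and it models the replacement as a single rounding of the exact result, which is only one possible behavior, since the text above says only that mad may have reduced accuracy. The input values are chosen purely for illustration.

```python
from fractions import Fraction

# Illustrative inputs; both a and b are exactly representable as doubles.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

# Expanded form: a * b is rounded to a double, then c is added and the sum
# is rounded again.
expanded = a * b + c

# Single-rounding form: evaluate a * b + c exactly, then round once.
single_rounded = float(Fraction(a) * Fraction(b) + Fraction(c))

print(expanded)        # 0.0
print(single_rounded)  # -5.551115123125783e-17
```

The two results differ because the exact product 1 - 2^-54 rounds to 1.0 before the addition in the expanded form.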

-cl-no-signed-zeros

Allow optimizations for floating-point arithmetic that ignore the signedness of zero. IEEE 754 arithmetic specifies distinct behavior for +0.0 and -0.0 values, which prohibits simplification of expressions such as x + 0.0 or 0.0 * x (even with -cl-finite-math-only). This option implies that the sign of a zero result is not significant.
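
The two expressions named above can be checked directly. This is a small sketch using Python floats, which follow IEEE 754 double-precision arithmetic on common platforms; `sign_of` is a hypothetical helper that exposes the sign of a zero.

```python
import math


def sign_of(value):
    """Return 1.0 or -1.0 according to the sign bit, including for zeros."""
    return math.copysign(1.0, value)


x = -0.0
print(sign_of(x))        # -1.0
print(sign_of(x + 0.0))  # 1.0, so x + 0.0 cannot be simplified to x

y = -1.0
print(sign_of(0.0 * y))  # -1.0, so 0.0 * y cannot be simplified to 0.0
```

With -cl-no-signed-zeros, a compiler is allowed to treat both simplifications as valid because the sign of the zero result is declared insignificant.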

-cl-unsafe-math-optimizations

Allow optimizations for floating-point arithmetic that (a) assume that arguments and results are valid, (b) may violate the IEEE 754 standard, (c) assume relaxed OpenCL numerical compliance requirements as defined in the unsafe math optimization section of the OpenCL C or OpenCL SPIR-V Environment specifications, and (d) may violate edge case behavior in the OpenCL C or OpenCL SPIR-V Environment specifications. This option includes the -cl-no-signed-zeros, -cl-mad-enable, and -cl-denorms-are-zero [25] options.

-cl-finite-math-only

Allow optimizations for floating-point arithmetic that assume that arguments and results are not NaN, +Inf, or -Inf. This option may violate the OpenCL numerical compliance requirements for single precision and double precision floating-point, as well as edge case behavior.
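
A few identities that hold for finite values fail once NaN or infinities appear, which is why a compiler needs this assumption before it can apply them. The sketch below uses Python floats to show the IEEE 754 results; which rewrites a given OpenCL compiler actually performs is not specified here.

```python
import math

inf = float("inf")
nan = float("nan")

# x - x is 0.0 for every finite x, but not for infinity.
print(inf - inf)   # nan

# x == x is True for every non-NaN x, but not for NaN.
print(nan == nan)  # False

# 0.0 * x is a zero for every finite x, but not for infinity.
print(0.0 * inf)   # nan
```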

-cl-fast-relaxed-math

Sets the optimization options -cl-finite-math-only and -cl-unsafe-math-optimizations. This option causes the preprocessor macro __FAST_RELAXED_MATH__ to be defined in the OpenCL program.
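
The implication relationships described in this section can be summarized as a small lookup. The sketch below is purely illustrative: the `IMPLIED` table and the `expand_options` helper are hypothetical and not part of any OpenCL API; they only restate which options the text says are set or included by another option.

```python
# Implications as stated in this section (illustrative table, not an API).
IMPLIED = {
    "-cl-fast-relaxed-math": [
        "-cl-finite-math-only",
        "-cl-unsafe-math-optimizations",
    ],
    "-cl-unsafe-math-optimizations": [
        "-cl-no-signed-zeros",
        "-cl-mad-enable",
        "-cl-denorms-are-zero",
    ],
}


def expand_options(options):
    """Return the given flags plus every flag they imply, without duplicates
    (hypothetical helper)."""
    result = []
    pending = options.split()
    while pending:
        flag = pending.pop(0)
        if flag not in result:
            result.append(flag)
            # Visit the implied flags before the remaining explicit ones.
            pending = IMPLIED.get(flag, []) + pending
    return result


for flag in expand_options("-cl-fast-relaxed-math"):
    print(flag)
```

In practice the options string is passed unchanged to the compiler, which applies these implications itself; the expansion here only makes the relationships visible.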
