LLVM-IR

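Instruction flags

reassoc: reassociation. Allows the optimizer to regroup floating-point additions and multiplications, for example (a + b) + c into a + (b + c).
nsz: no signed zeros. IEEE 754 distinguishes +0.0 from -0.0; with this flag the optimizer may treat the sign of a zero as insignificant.
arcp: allow reciprocal. A floating-point division may be replaced by a multiplication with the reciprocal of the divisor, which is usually faster but can introduce a small rounding error.
contract: contraction. Adjacent floating-point operations may be fused to reduce the instruction count, for example a multiply followed by an add into a fused multiply-add. Clang controls this with -ffp-contract, whose modes include fast and on. A fused operation rounds only once, so its result can differ from that of the separate operations.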
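To see why reassoc and arcp are opt-in, the short sketch below evaluates both forms of each expression in ordinary double precision. The variable names and the sample values are illustrative assumptions, not taken from the program analysed in this post.

```python
# reassoc: regrouping an addition changes the rounded result.
a, b, c = 1e16, -1e16, 1.0
grouped_left = (a + b) + c    # the large terms cancel first, then 1.0 is added
grouped_right = a + (b + c)   # 1.0 is absorbed into -1e16 before the cancel

# arcp: dividing versus multiplying by the reciprocal of the divisor.
x, y = 49.0, 49.0
divided = x / y               # one correctly rounded division
via_reciprocal = x * (1.0 / y)  # two roundings: the reciprocal, then the product

print(grouped_left, grouped_right)
print(divided, via_reciprocal)
```

Both pairs are equal in real arithmetic, but not in floating point, which is why the optimizer needs explicit permission to swap one form for the other.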

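Instructions

%16 = and i32 %1, 3    ; bitwise AND
%4 = fpext double %0 to x86_fp80    ; fpext is a conversion that extends a floating-point value to a wider floating-point type
%10 = icmp sgt i32 %1, 0    ; sgt is a signed greater-than comparison
%18 = add nsw i64 %17, -4    ; nsw (no signed wrap): if the addition overflows as a signed value, the result is a poison value rather than a wrapped one
%19 = lshr exact i64 %18, 2    ; lshr is a logical right shift; exact means no non-zero bits are shifted out, otherwise the result is poison
%20 = add nuw nsw i64 %19, 1    ; nuw means no unsigned overflow, nsw means no signed overflow
%46 = fcmp oge double %41, %45    ; oge is ordered greater than or equal: true only if neither operand is NaN and %41 >= %45
%0 = bitcast [10 x i8]* %str to i8*    ; type conversion that leaves the bits unchanged
%4 = getelementptr inbounds i8*, i8** %1, i64 1    ; getelementptr computes the address of an element; it only does address arithmetic and does not access memory
%104 = insertelement <2 x double> undef, double %6, i32 0    ; inserts a scalar into a vector at the given index (arrays and structs use insertvalue instead)
%129 = shufflevector <2 x double> %128, <2 x double> undef, <2 x i32> zeroinitializer    ; shufflevector picks elements from its two input vectors according to the index mask; an all-zero mask broadcasts element 0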
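The flags on the integer instructions above are easiest to understand by modelling them. The following is a minimal sketch, not LLVM's implementation: `POISON`, `to_signed`, `add`, `lshr` and `shufflevector` are hypothetical helper names, and integers are passed as unsigned bit patterns of a given width.

```python
POISON = "poison"  # hypothetical stand-in for LLVM's poison value


def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's-complement integer."""
    return value - (1 << bits) if value >> (bits - 1) else value


def add(a, b, bits, nsw=False, nuw=False):
    """Model `add` on bit patterns of width `bits`."""
    mask = (1 << bits) - 1
    a, b = a & mask, b & mask
    if nuw and a + b > mask:
        return POISON  # unsigned overflow
    signed_sum = to_signed(a, bits) + to_signed(b, bits)
    if nsw and not -(1 << (bits - 1)) <= signed_sum < (1 << (bits - 1)):
        return POISON  # signed overflow
    return (a + b) & mask  # without flags the result simply wraps


def lshr(a, shift, bits, exact=False):
    """Model `lshr` for shift amounts smaller than `bits`."""
    a &= (1 << bits) - 1
    if exact and a & ((1 << shift) - 1):
        return POISON  # a non-zero bit would be shifted out
    return a >> shift


def shufflevector(v1, v2, mask):
    """Pick lanes from the concatenation of v1 and v2."""
    lanes = list(v1) + list(v2)
    return [lanes[i] for i in mask]


print(add(127, 1, 8), add(127, 1, 8, nsw=True))
print(lshr(12, 2, 8, exact=True), lshr(13, 2, 8, exact=True))
print(shufflevector([1.5, 2.5], [None, None], [0, 0]))
```

The unflagged `add` wraps, while the flagged versions return the poison marker on overflow. That is the information the optimizer exploits: it may assume the overflow never happens.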

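Function attributes

uwtable: tells the compiler to emit an unwind table entry for the function even if it can prove that no exception passes through it. An unwind table is a data structure that records call-stack information so that the stack can be unwound when an exception occurs.
readnone: as a function attribute, the function neither reads nor writes memory visible to the caller, so its result depends only on its arguments (newer LLVM releases spell this memory(none)). The wording often quoted for it, that the function does not dereference a pointer argument although the memory it points to may still be read or written through other pointers, describes readnone as a parameter attribute.
nounwind: the function never raises an exception. If it does unwind anyway, its runtime behavior is undefined.
speculatable: the function has no effects other than computing its result and has no undefined behavior, so a call to it can be executed speculatively without any observable change to program state.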

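Fast-math flags

The LLVM IR floating-point operations (fneg, fadd, fsub, fmul, fdiv, frem, fcmp), as well as phi, select and call, may carry the following flags to enable otherwise unsafe floating-point transformations: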
nnan
No NaNs - Allow optimizations to assume the arguments and result are not NaN. If an argument is a nan, or the result would be a nan, it produces a poison value instead.

ninf
No Infs - Allow optimizations to assume the arguments and result are not +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it produces a poison value instead.

nsz
No Signed Zeros - Allow optimizations to treat the sign of a zero argument or zero result as insignificant. This does not imply that -0.0 is poison and/or guaranteed to not exist in the operation.

arcp
Allow Reciprocal - Allow optimizations to use the reciprocal of an argument rather than perform division.

contract
Allow floating-point contraction (e.g. fusing a multiply followed by an addition into a fused multiply-and-add). This does not enable reassociating to form arbitrary contractions. For example, (a * b) + (c * d) + e can not be transformed into (a * b) + ((c * d) + e) to create two fma operations.
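A small numeric sketch of what contraction changes. The helper `fused_multiply_add` is a hypothetical emulation: it computes a * b + c exactly with rational arithmetic and rounds once, which is the defining property of a hardware fma. The input values are chosen for illustration only.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Emulate fma: evaluate a * b + c exactly, then round once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

separate = a * b + c                  # the product is rounded to 1.0 before the add
fused = fused_multiply_add(a, b, c)   # keeps the exact product 1 - 2**-60

print(separate, fused)
```

Here the fused result is the more accurate one, yet it is not bit-identical to the unfused result, so enabling contract can change program output.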

afn
Approximate functions - Allow substitution of approximate calculations for functions (sin, log, sqrt, etc). See floating-point intrinsic definitions for places where this can apply to LLVM’s intrinsic math functions.

reassoc
Allow reassociation transformations for floating-point instructions. This may dramatically change results in floating-point.

fast
This flag implies all of the others.
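Two of these flags remove special cases that IEEE 754 otherwise requires. The sketch below shows those special cases in plain Python doubles. It does not invoke LLVM, and the variable names are illustrative.

```python
import math

# nnan: without it, a comparison such as x == x cannot be folded to true,
# because it is false when x is NaN.
nan = float("nan")
nan_equals_itself = (nan == nan)

# nsz: without it, x + 0.0 cannot be folded to x,
# because -0.0 + 0.0 is +0.0 under round-to-nearest.
neg_zero = -0.0
sum_with_zero = neg_zero + 0.0
sign_before = math.copysign(1.0, neg_zero)
sign_after = math.copysign(1.0, sum_with_zero)

print(nan_equals_itself, sign_before, sign_after)
```

With nnan or nsz set, the optimizer is allowed to ignore exactly these cases.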

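The C99 floating-point environment serves scientific and mathematical applications that need high precision, but some applications value speed over precision. For these, the -ffast-math option defines the preprocessor macro __FAST_MATH__ and tells the compiler that it need not follow the IEEE and ISO rules for floating-point arithmetic. -ffast-math is a group option; the six optimization options below can also be enabled individually: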

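-fno-math-errno

Math functions that can be executed as a single floating-point instruction, such as sqrt, no longer set the global variable errno.

-funsafe-math-optimizations

Enables optimizations that may violate the floating-point standards or skip the validation of arguments and results. When used at link time, it may also pull in code that changes the floating-point unit's control flags.

-fno-trapping-math

The compiler generates "nonstop" code: it assumes that floating-point operations raise no exceptions that the user program needs to handle, so it does not have to preserve them.
Such exceptions include floating-point overflow and division by zero.

-ffinite-math-only

The compiler assumes that arguments and results are never infinities or NaNs ("not a number") and optimizes accordingly. This does not make the generated code handle such values safely: if one does occur at run time, the result may simply be wrong.

-fno-rounding-math

States that the program does not depend on a particular rounding behavior. The rounding mode determines how a result that cannot be represented exactly is rounded (to nearest, toward zero, upward or downward). With this option the compiler may assume the default round-to-nearest mode.

-fno-signaling-nans

The compiler assumes that signaling NaNs do not raise user-visible floating-point exceptions. A signaling NaN is a special NaN value that can raise an exception when it takes part in an operation; under this option, optimizations are allowed to change the number of such exceptions.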

According to the Clang manual, -ffast-math turns on the following options:
-fno-honor-infinities
Allow floating-point optimizations that assume arguments and results are not ±Inf. Defaults to -fhonor-infinities.

If both -fno-honor-infinities and -fno-honor-nans are used, has the same effect as specifying -ffinite-math-only.
-fno-honor-nans
Allow floating-point optimizations that assume arguments and results are not NaNs. Defaults to -fhonor-nans.
-fapprox-func
-funsafe-math-optimizations
Allow unsafe floating-point optimizations. -funsafe-math-optimizations also implies:
  • -fapprox-func
  • -fassociative-math
  • -freciprocal-math
  • -fno-signed-zeros
  • -fno-trapping-math
  • -ffp-contract=fast
-fno-math-errno
-fmath-errno requires math functions to indicate errors by setting errno. The default varies by ToolChain. -fno-math-errno allows optimizations that might cause standard C math functions to not set errno. For example, on some systems, the math function sqrt is specified as setting errno to EDOM when the input is negative. On these systems, the compiler cannot normally optimize a call to sqrt to use inline code (e.g. the x86 sqrtsd instruction) without additional checking to ensure that errno is set appropriately. -fno-math-errno permits these transformations.

On some targets, math library functions never set errno, and so -fno-math-errno is the default. This includes most BSD-derived systems, including Darwin.
-ffinite-math-only
Allow floating-point optimizations that assume arguments and results are not NaNs or ±Inf. -ffinite-math-only defines the __FINITE_MATH_ONLY__ preprocessor macro.
-fassociative-math
-freciprocal-math
-fno-signed-zeros
-fno-trapping-math
Control floating point exception behavior. -fno-trapping-math allows optimizations that assume that floating point operations cannot generate traps such as divide-by-zero, overflow and underflow.
-fno-rounding-math
Control whether floating-point operations must honor the dynamically-set rounding mode.

The result of a floating-point operation often cannot be exactly represented in the result type and therefore must be rounded. IEEE 754 describes different rounding modes that control how to perform this rounding, not all of which are supported by all implementations. C provides interfaces (fesetround and fesetenv) for dynamically controlling the rounding mode, and while it also recommends certain conventions for changing the rounding mode, these conventions are not typically enforced in the ABI. Since the rounding mode changes the numerical result of operations, the compiler must understand something about it in order to optimize floating point operations.

Note that floating-point operations performed as part of constant initialization are formally performed prior to the start of the program and are therefore not subject to the current rounding mode. This includes the initialization of global variables and local static variables. Floating-point operations in these contexts will be rounded using FE_TONEAREST.

The option -fno-rounding-math allows the compiler to assume that the rounding mode is set to FE_TONEAREST. This is the default.

The option -frounding-math forces the compiler to honor the dynamically-set rounding mode. This prevents optimizations which might affect results if the rounding mode changes or is different from the default; for example, it prevents floating-point operations from being reordered across most calls and prevents constant-folding when the result is not exactly representable.
-ffp-contract=fast
Specify when the compiler is permitted to form fused floating-point operations, such as fused multiply-add (FMA). Fused operations are permitted to produce more precise results than performing the same operations separately.

The C standard permits intermediate floating-point results within an expression to be computed with more precision than their type would normally allow. This permits operation fusing, and Clang takes advantage of this by default. This behavior can be controlled with the FP_CONTRACT and clang fp contract pragmas. Please refer to the pragma documentation for a description of how the pragmas interact with this option.

Valid values are:

fast (fuse across statements disregarding pragmas, default for CUDA)

on (fuse in the same statement unless dictated by pragmas, default for languages other than CUDA/HIP)

off (never fuse)

fast-honor-pragmas (fuse across statements unless dictated by pragmas, default for HIP)

Clang documentation: https://clang.llvm.org/docs/UsersManual.html#options-to-emit-optimization-reports

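Inconsistent results

Compared with -O3, building with -O3 -ffast-math -fno-finite-math-only changes the IR as follows:

  • The multiply, divide, add and subtract instructions gain four flags: reassoc, nsz, arcp and contract. These flags control floating-point optimization
  • Calls to the log and pow functions become llvm.log10.f80 and llvm.pow.f80
  • "no-signed-zeros-fp-math", "no-trapping-math" and "unsafe-fp-math" change from false to true. These attributes describe function-level behavior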
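One plausible way the reassoc flag leads to different output is a reordered reduction, for example when a loop is vectorized with several partial sums. The sketch below imitates that in Python; `sum_in_order` and `sum_with_lanes` are hypothetical helpers, and the input is contrived to make the effect visible. It does not reproduce what the compiler emitted for the program discussed here.

```python
def sum_in_order(values):
    """Strict left-to-right summation, as the source code is written."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_with_lanes(values, lanes=2):
    """Summation with `lanes` partial sums, as a vectorized loop might do it."""
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    total = 0.0
    for p in partial:
        total += p
    return total


values = [1e16, 1.0, -1e16, 1.0]
print(sum_in_order(values), sum_with_lanes(values))
```

Both functions add the same four numbers, but the grouping differs, so the rounded results differ. Comparing a -O3 build against a -ffast-math build can expose the same kind of discrepancy.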
