LLVM Compiler Study Notes, Part 67 -- Floating-Point Optimization with -ffast-math

清钟沁桐

Last modified 2024-04-29 17:45:12


First published 2023-07-31 11:42:38

Article link: https://blog.csdn.net/zhongyunde/article/details/132019342


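1. -ffast-math enables a group of sub-options, listed below (at the IR level it corresponds to the fast flag), and is the option expected to give the best performance. Reference: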

https://llvm.org/docs/LangRef.html#fast-math-flags

-fno-honor-infinities
-fno-honor-nans
-fno-math-errno
-ffinite-math-only
-fassociative-math
-freciprocal-math
-fno-signed-zeros
-fno-trapping-math
-ffp-contract=fast (this one corresponds to the contract flag, not to fast)
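
To see why an option such as -fassociative-math is not value-preserving, the following C sketch evaluates the same three-term sum in both groupings. The function names are made up for illustration, and the stated results assume IEEE 754 double arithmetic compiled without fast-math.

```c
/* Two groupings of the same sum. Under strict IEEE 754 semantics the
 * compiler must keep the grouping written in the source; with
 * -fassociative-math it may pick either one. */
double sum_left(double a, double b, double c)  { return (a + b) + c; }
double sum_right(double a, double b, double c) { return a + (b + c); }
```

With a = 1e30, b = -1e30 and c = 1.0, sum_left returns 1.0, while sum_right returns 0.0 because the 1.0 is absorbed when it is added to -1e30.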

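2. Vectorizing floating-point operations that have to be reordered (a reduction over values loaded from memory, for instance) is only allowed when -ffast-math is enabled; otherwise the vectorizer gives up and reports unsafe algebra.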
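
The reordering problem behind the unsafe algebra report can be sketched in plain C. The two functions below are hypothetical illustrations, not compiler output: the first adds in source order, the second mimics the four partial sums that a 4-wide vectorized reduction would keep.

```c
#include <stddef.h>

/* Strict source order: one accumulator, one addition per element. */
double sum_in_order(const double *x, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += x[i];
    return acc;
}

/* Roughly what a 4-wide vectorized reduction computes: four partial sums
 * that are combined at the end. This reorders the additions, so it is only
 * a legal replacement when reassociation is permitted. */
double sum_four_lanes(const double *x, size_t n) {
    double lane[4] = {0.0, 0.0, 0.0, 0.0};
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        for (size_t k = 0; k < 4; ++k)
            lane[k] += x[i + k];
    double acc = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    for (; i < n; ++i)  /* scalar tail for leftover elements */
        acc += x[i];
    return acc;
}
```

For the input {1e30, 1, 1, 1, -1e30, 1, 1, 1} the in-order sum is 3.0 and the four-lane sum is 6.0, which is why the compiler may not switch between them on its own.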

3. Some special-case analysis, quoted from the review of "[InstCombine] optimize powi(X,Y)/X with Ofast" by vfdff (Pull Request #67236, llvm/llvm-project on GitHub):

reassoc is the general flag we've been using for pow combines, so that checks out.

Special case analysis:

X is nonspecial, Y is INT_MIN -> result should be +/-0, Y wraparound produces +/-infinity instead

X is +/-0, Y is INT_MIN -> result should be +/-infinity, Y wraparound produces +/-0 instead

X is +/-inf, Y is INT_MIN -> result should be +/-0, Y wraparound produces +/-infinity instead

X is NaN, Y is INT_MIN -> result should be NaN, wraparound produces NaN

X is NaN, Y is not 1 -> result should be NaN, transform is correct

X is NaN, Y is 1 -> result should be NaN, transform makes it 1 instead

X is +/- 0, Y > 1 -> result should be NaN, transform makes it +/- 0 instead

X is +/- 0, Y is 1 -> result should be NaN, transform makes it 1 instead

X is +/- 0, Y is 0 -> result should be +/-inf, transform is correct

X is +/- 0, Y < 0 -> result should be inf, transform is correct

X is +/- inf, Y > 1 -> result should be NaN, transform makes it +/-inf instead

X is +/- inf, Y is 1 -> result should be NaN, transform makes it 1 instead

X is +/- inf, Y is 0 -> result should be +/-0, transform is correct

X is +/- inf, Y < 0 -> result should be +/-0, transform is correct

(assuming powi(X, 1) is exactly X and powi(X, -1) is exactly 1.0/X)

X is nonspecial, Y is 1 -> result should be 1, transform is correct
X is nonspecial, Y is 0 -> result should be 1/x, transform is correct

Ignoring the issue of INT_MIN - 1 wraparound, the cases where the transformation is incorrect is when the result would have been NaN (largely via intermediate 0.0/0.0 or infinity/infinity), so nnan is sufficient for that case. Taking into account the potential for wraparound, however, there is no set of fast-math flags that makes the transformation legal: powi(2.0, INT_MIN) is 0.0, and 0.0/2.0 is legally 0.0.
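
A few rows of the analysis above can be reproduced with a small C sketch. powi_ref is a hypothetical stand-in for llvm.powi based on repeated squaring; it is an assumption made for illustration and does not model how any real implementation orders its multiplications.

```c
#include <limits.h>

/* Reference powi by repeated squaring. powi_ref(x, 0) is 1.0 for every x,
 * powi_ref(x, 1) is x and powi_ref(x, -1) is 1.0 / x. */
double powi_ref(double x, int n) {
    unsigned int m = n < 0 ? 0u - (unsigned int)n : (unsigned int)n;
    double result = 1.0;
    double base = x;
    while (m != 0) {
        if (m & 1u)
            result *= base;
        base *= base;
        m >>= 1;
    }
    return n < 0 ? 1.0 / result : result;
}

/* Original expression: powi(X, Y) / X. */
double powi_div(double x, int y) { return powi_ref(x, y) / x; }

/* Folded expression: powi(X, Y - 1), with the INT_MIN - 1 wraparound
 * written out explicitly so the C code itself has no signed overflow. */
double powi_folded(double x, int y) {
    int y_minus_1 = (y == INT_MIN) ? INT_MAX : y - 1;
    return powi_ref(x, y_minus_1);
}
```

With X = 0.0 and Y = 1, powi_div yields NaN (0.0/0.0) while powi_folded yields 1.0. With X = 2.0 and Y = INT_MIN, powi_div yields 0.0 while powi_folded overflows to infinity, which is the wraparound case that no fast-math flag covers.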

4. For a detailed description of fast-math, see https://clang.llvm.org/docs/UsersManual.html#cmdoption-ffast-math

nnan, ninf, and poison - #2 by rotateright - LLVM Dev List Archives - LLVM Discussion Forums

5. float_control(precise) lets a specific region of code carry its own floating-point optimization requirements.

pragma float_control - Intel Community

Propogation of fpclass assumptions vis a vis fast-math flags - #2 by efriedma-quic - IR & Optimizations - LLVM Discussion Forums
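
As a sketch of how such a region might look, the example below wraps a compensated (Kahan) summation in a float_control region. It assumes a compiler that accepts the push / precise / pop forms of the pragma, as Clang does; other compilers typically ignore an unknown pragma, possibly with a warning. The function name is chosen here for illustration.

```c
#include <stddef.h>

/* Request strict floating-point semantics for this function even if the
 * rest of the file is compiled with -ffast-math. */
#pragma float_control(push)
#pragma float_control(precise, on)
double kahan_sum(const double *x, size_t n) {
    double sum = 0.0;
    double comp = 0.0;         /* running compensation for lost low bits */
    for (size_t i = 0; i < n; ++i) {
        double y = x[i] - comp;
        double t = sum + y;
        comp = (t - sum) - y;  /* algebraically zero; fast-math may fold it away */
        sum = t;
    }
    return sum;
}
#pragma float_control(pop)
```

Kahan summation is a typical candidate for this treatment: its correction term is zero in real arithmetic, so a compiler that is free to reassociate can legally delete it and silently turn the routine into a plain sum.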

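6. When creating IR, if Builder.setFastMathFlags is used to set fast, every floating-point instruction subsequently created through that builder carries the fast flag (commit 92057604).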

7. RP63476: fp-contract=fast => fmul contract + fadd contract; fp-contract=on => fmuladd.
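
The numerical effect of contraction can be illustrated without relying on hardware FMA. In the sketch below (both function names are hypothetical), the first function rounds the product to float before adding, as separate fmul and fadd would. The second emulates a fused multiply-add by doing the arithmetic in double, where the product of two floats is exact. This is only an approximation of a real fused operation, since double rounding can differ in rare cases.

```c
/* Unfused: the product is rounded to float before the addition. The
 * volatile store keeps the compiler from contracting the two operations. */
float mul_then_add(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

/* Emulated fused multiply-add: the float-by-float product is exact in
 * double, so it is not rounded to float before the addition. */
float fused_emulated(float a, float b, float c) {
    return (float)((double)a * (double)b + (double)c);
}
```

With a = b = 1 + 2^-12 and c = -(1 + 2^-11), mul_then_add returns 0.0f because the 2^-24 term of the product is lost when it is rounded to float, while fused_emulated returns 2^-24.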

8. A pragma can also be used to request the fast flag; see https://github.com/llvm/llvm-project/pull/90377
