[LLVM] An example of nsw and nuw

道在屎溺

Published 2024-02-20 00:09:11


Category: Programming Languages. Tags: llvm, IR, compilation

Article link: https://blog.csdn.net/weixin_45207619/article/details/136175204


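nsw and nuw are flags that LLVM IR provides for binary operations. They stand for "no signed wrap" and "no unsigned wrap" respectively. The LLVM 2.6 release notes describe them as follows: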
The add, sub and mul instructions now support optional “nsw” and “nuw” bits which indicate that the operation is guaranteed to not overflow (in the signed or unsigned case, respectively). This gives the optimizer more information and can be used for things like C signed integer values, which are undefined on overflow.
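This takes a different angle from the official documentation: adding the flag amounts to giving the compiler a guarantee that the operation does not overflow, and the compiler can optimize on the basis of that guarantee.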
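
To make the "guarantee" concrete, here is a minimal sketch of the kind of code where it matters. The function names are made up for illustration. For the signed version a compiler is allowed to assume that `x + 1` does not overflow, so it may fold the comparison to `true`; for the unsigned version wraparound is defined, so the comparison has to be evaluated.

```cpp
#include <cassert>
#include <climits>

// Signed: overflow is undefined behavior, so the addition may be emitted
// as `add nsw` and an optimizer is allowed to fold this to `true`.
bool signed_increment_is_greater(int x) {
    return x + 1 > x;
}

// Unsigned: wraparound is well defined, so the comparison must be kept.
// It is false exactly when x == UINT_MAX.
bool unsigned_increment_is_greater(unsigned x) {
    return x + 1u > x;
}

// Small self-check that only uses inputs with defined behavior.
void run_examples() {
    assert(signed_increment_is_greater(41));
    assert(unsigned_increment_is_greater(41u));
    assert(!unsigned_increment_is_greater(UINT_MAX));
}
```

Calling `signed_increment_is_greater(INT_MAX)` would be undefined behavior, which is exactly the case the flag lets the optimizer ignore.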

An example

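While looking for material on this, I found a thread in which LLVM developers discuss how similar an nsw instruction and its non-nsw counterpart are, and whether the way GVN eliminates one of them is safe when both appear together.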

int func(int a, int b) {
    int c = a + b;
    return c;
}

Compile the example above with clang 14:

clang -emit-llvm -S -O example.cc

The resulting IR is:

; ModuleID = 'nuw_and_nsw/nsw_example1.cc'
source_filename = "nuw_and_nsw/nsw_example1.cc"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define dso_local noundef i32 @_Z4funcii(i32 noundef %0, i32 noundef %1) local_unnamed_addr #0 {
  %3 = add nsw i32 %1, %0
  ret i32 %3
}

attributes #0 = { mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn "frame-pointer"="none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }

!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{!"clang version 14.0.0"}

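The add instruction carries an nsw flag. Signed integer overflow is undefined behavior in C and C++, so the source code implies that overflow does not need to be considered here.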

Editing the IR by hand gives the following, which matches the example in the thread:

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define dso_local noundef i32 @_Z4funcii(i32 noundef %0, i32 noundef %1) local_unnamed_addr #0 {
  %3 = add nsw i32 %1, %0
  %4 = add i32 %1, %0
  ret i32 %4
}

Run the GVN pass with the following command:

opt -gvn source.ll -o target.ll

The core of the optimized result is:

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define dso_local noundef i32 @_Z4funcii(i32 noundef %0, i32 noundef %1) local_unnamed_addr #0 {
  %3 = add i32 %1, %0
  ret i32 %3
}

A second attempt uses the following input IR, with the two adds in the opposite order:

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define dso_local noundef i32 @_Z4funcii(i32 noundef %0, i32 noundef %1) local_unnamed_addr #0 {
  %3 = add i32 %1, %0
  %4 = add nsw i32 %1, %0
  ret i32 %4
}

The optimized result is the same: a single add remains, and the version carrying nsw is the one that disappears.
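
One way to read this result: when two instructions that compute the same value are merged, the survivor can only keep the guarantees that held for both. The sketch below is a toy model of that rule, not LLVM's API. `WrapFlags` and `intersect` are hypothetical names; as far as I know, LLVM itself does the equivalent through `Instruction::andIRFlags`.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the wrap flags on an LLVM add.
struct WrapFlags {
    bool nsw;
    bool nuw;
};

// The merged instruction keeps a flag only if both originals had it.
WrapFlags intersect(WrapFlags kept, WrapFlags removed) {
    return WrapFlags{kept.nsw && removed.nsw, kept.nuw && removed.nuw};
}

// Models the two experiments above: `add nsw` merged with plain `add`,
// in either order, ends up without nsw.
void run_examples() {
    WrapFlags first = intersect(WrapFlags{true, false}, WrapFlags{false, false});
    WrapFlags second = intersect(WrapFlags{false, false}, WrapFlags{true, false});
    assert(!first.nsw && !first.nuw);
    assert(!second.nsw && !second.nuw);
}
```

Taking the intersection is the safe choice: a user of the plain add never promised that the addition does not overflow, so the merged instruction cannot promise it either.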

How do we generate an add without nsw?

int func(unsigned a, unsigned b) {
    unsigned c = a + b;
    return (int)c;
}

The core of the generated IR is:

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define dso_local noundef i32 @_Z4funcjj(i32 noundef %0, i32 noundef %1) local_unnamed_addr #0 {
  %3 = add i32 %1, %0
  ret i32 %3
}
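
The same trick can be used deliberately when wrapping arithmetic is what you want. The sketch below uses hypothetical helper names and assumes a two's complement target where converting an out-of-range unsigned value to int wraps (guaranteed since C++20, and the behavior of GCC and Clang before that). Both additions are performed on unsigned operands, so neither invokes undefined behavior.

```cpp
#include <cassert>
#include <climits>

// Unsigned addition wraps modulo 2^N by definition.
unsigned wrapping_add_unsigned(unsigned a, unsigned b) {
    return a + b;
}

// Signed wrapping add routed through unsigned arithmetic, so the
// addition itself can never overflow in the undefined-behavior sense.
int wrapping_add_signed(int a, int b) {
    return static_cast<int>(static_cast<unsigned>(a) +
                            static_cast<unsigned>(b));
}

void run_examples() {
    assert(wrapping_add_unsigned(UINT_MAX, 1u) == 0u);
    assert(wrapping_add_signed(2, 3) == 5);
    assert(wrapping_add_signed(INT_MAX, 1) == INT_MIN);
}
```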

The main code that generates the nsw flag

Start with the function EmitAdd in clang/lib/CodeGen/CGExprScalar.cpp, which can create IR carrying the nsw flag. Part of the function is excerpted below:

Value *ScalarExprEmitter::EmitAdd(const BinOpInfo &op) {
  if (op.LHS->getType()->isPointerTy() ||
      op.RHS->getType()->isPointerTy())
    return emitPointerArithmetic(CGF, op, CodeGenFunction::NotSubtraction);

  if (op.Ty->isSignedIntegerOrEnumerationType()) {
    switch (CGF.getLangOpts().getSignedOverflowBehavior()) {
    case LangOptions::SOB_Defined:
      return Builder.CreateAdd(op.LHS, op.RHS, "add");
    case LangOptions::SOB_Undefined:
      if (!CGF.SanOpts.has(SanitizerKind::SignedIntegerOverflow))
        return Builder.CreateNSWAdd(op.LHS, op.RHS, "add");
      [[fallthrough]];
    case LangOptions::SOB_Trapping:
      if (CanElideOverflowCheck(CGF.getContext(), op))
        return Builder.CreateNSWAdd(op.LHS, op.RHS, "add");
      return EmitOverflowCheckedBinOp(op);
    }
  }

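As the excerpt shows, the function first checks whether either operand is a pointer. If neither is, it checks whether the expression has a signed integer or enumeration type, and then handles the different cases according to CGF.getLangOpts().getSignedOverflowBehavior().
The definition of SignedOverflowBehaviorTy can be found in clang/include/clang/Basic/LangOptions.h: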

  enum SignedOverflowBehaviorTy {
    // Default C standard behavior.
    SOB_Undefined,

    // -fwrapv
    SOB_Defined,

    // -ftrapv
    SOB_Trapping
  };

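As the enum shows, the default value of SignedOverflowBehaviorTy is SOB_Undefined. The other two values correspond to the -fwrapv and -ftrapv flags, which make signed overflow wrap and trap respectively. See the author's earlier posts for details.
If the -fwrapv flag is added to the compilation that produced the nsw IR above, the generated IR no longer contains the nsw flag.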
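
The decision made by the signed branch of EmitAdd can be restated as a small standalone function. This is a sketch for reading along, not clang code: `OverflowBehavior`, `choose_signed_add`, and the returned strings are hypothetical, and the two boolean parameters stand in for the sanitizer check and for CanElideOverflowCheck.

```cpp
#include <cassert>
#include <string>

// Mirrors the three enumerators of SignedOverflowBehaviorTy quoted above.
enum class OverflowBehavior { Undefined, Defined, Trapping };

// Returns a label for the kind of add the excerpt would emit for a
// signed integer addition.
std::string choose_signed_add(OverflowBehavior sob,
                              bool sanitizer_enabled,
                              bool can_elide_check) {
    // -fwrapv: overflow is defined to wrap, so no flag may be attached.
    if (sob == OverflowBehavior::Defined)
        return "add";
    // Default behavior without the signed-overflow sanitizer.
    if (sob == OverflowBehavior::Undefined && !sanitizer_enabled)
        return "add nsw";
    // -ftrapv, or the default behavior with the sanitizer enabled.
    return can_elide_check ? "add nsw" : "overflow-checked add";
}

void run_examples() {
    assert(choose_signed_add(OverflowBehavior::Undefined, false, false) == "add nsw");
    assert(choose_signed_add(OverflowBehavior::Defined, false, false) == "add");
    assert(choose_signed_add(OverflowBehavior::Trapping, false, false) == "overflow-checked add");
}
```

The first two asserts in `run_examples` correspond to the two experiments in this post: the default build yields `add nsw`, and the -fwrapv build yields a plain `add`.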

An example that contains the nuw flag

#include <stdio.h>
void func(unsigned a, unsigned b) {
    unsigned c = 0;
    if(a < 10000) {
        c = a + 10;
    }
    printf("%u", c);
}

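In this example a is less than 10000 whenever the addition is used, so a + 10 can never overflow. A sufficiently smart analysis should recognize this and mark the add as both nuw and nsw.
The core of the IR generated for this program is: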

; Function Attrs: mustprogress nofree nounwind uwtable
define dso_local void @_Z4funcjj(i32 noundef %a, i32 noundef %b) local_unnamed_addr #0 {
entry:
  %cmp = icmp ult i32 %a, 10000
  %add = add nuw nsw i32 %a, 10
  %spec.select = select i1 %cmp, i32 %add, i32 0
  %call = tail call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str, i32 noundef %spec.select)
  ret void
}

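In other words, an expression carrying nuw and nsw is more precise than the same expression without them. When an optimization encounters both forms together, it should not simply merge them: keeping the flagged instruction in place of the unflagged one is only valid if the flags are dropped, as GVN did above, and dropping them costs later analyses precision.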

Debugging shows that the add generated at -O0 carries neither nuw nor nsw, so the flags are set by later optimization passes. A simplified call stack:

llvm::Instruction::setHasNoUnsignedWrap llvm/lib/IR/Instruction.cpp:349
llvm::refineInstruction llvm/lib/Transforms/Utils/SCCPSolver.cpp:130

This confirms once more that nuw and nsw are flags closely tied to optimization.
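
The reasoning behind setting the flags on a + 10 can be sketched with a simple value range. This is a toy model under stated assumptions, not the code in SCCPSolver.cpp: `UnsignedRange`, `can_set_nuw`, and `can_set_nsw` are hypothetical names, the range is inclusive, and the nsw check is deliberately conservative (it only accepts operands that are nonnegative when read as signed numbers).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical inclusive range of values an unsigned 32-bit operand can take.
struct UnsignedRange {
    std::uint32_t lo;
    std::uint32_t hi;
};

// nuw is justified if even the largest value plus c fits in 32 bits.
bool can_set_nuw(UnsignedRange r, std::uint32_t c) {
    return r.hi <= UINT32_MAX - c;
}

// Conservative nsw check: both operands are nonnegative as signed values
// and the largest possible sum does not exceed INT32_MAX.
bool can_set_nsw(UnsignedRange r, std::uint32_t c) {
    const std::uint32_t int_max = INT32_MAX;
    return r.hi <= int_max && c <= int_max && r.hi <= int_max - c;
}

void run_examples() {
    // Inside `if (a < 10000)` the operand lies in [0, 9999].
    UnsignedRange guarded{0, 9999};
    assert(can_set_nuw(guarded, 10));
    assert(can_set_nsw(guarded, 10));

    // Without the guard nothing is known about a.
    UnsignedRange unknown{0, UINT32_MAX};
    assert(!can_set_nuw(unknown, 10));
    assert(!can_set_nsw(unknown, 10));
}
```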
