LLVM-TransformUtils-Mem2Reg

yeshahayes

Article link: https://blog.csdn.net/yeshahayes/article/details/97018670

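This article describes in detail how the mem2reg pass turns LLVM IR from a non-strict-SSA form into pruned SSA, in four steps: preprocessing, PHI-node placement, renaming, and instruction simplification. It covers the optimization strategies and the SSA construction algorithm involved in promoting alloca instructions to registers.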

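As is well known, the LLVM IR that clang's codegen emits is not yet in strict SSA form. At that stage local variables appear as alloca instructions and are read and written through load and store, so a local variable may have several defs (several store instructions), whereas SSA requires each variable to have exactly one def. LLVM therefore applies the standard SSA construction algorithm to convert the original IR into minimal SSA and ultimately into pruned SSA, all of which is implemented in the mem2reg pass.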

Preprocessing

Mem2Reg.cpp is only the entry point; the main functionality is implemented in PromoteMemoryToRegister.cpp.
The main code in Mem2Reg.cpp:

    // Find allocas that are safe to promote, by looking at all instructions in
    // the entry node
    for (BasicBlock::iterator I = BB.begin(), E = --BB.end(); I != E; ++I)
      if (AllocaInst *AI = dyn_cast<AllocaInst>(I)) // Is it an alloca?
        // If it can be promoted, add it to the list
        if (isAllocaPromotable(AI))
          Allocas.push_back(AI);

    // Nothing left to promote: leave the enclosing loop
    if (Allocas.empty())
      break;

    // Promote every alloca that was collected
    PromoteMemToReg(Allocas, DT, &AC);

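Here isAllocaPromotable decides whether an alloca can be promoted. When can a local variable not be promoted to a register? First, variables accessed through volatile loads or stores: these clearly must keep a memory location. Second, variables whose address is taken: in LLVM IR this may show up as the alloca's result being used in pointer arithmetic instead of directly by a load or store. There are further conditions on bitcast and gep; gep is how arrays and structs are accessed, and those are too hard for SSA construction to handle.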
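To make the rule concrete, here is a minimal sketch of such a check over a toy model. The names `UseKind`, `Use`, and `isPromotableSketch` are hypothetical and not part of LLVM; the real isAllocaPromotable inspects actual instructions and handles more cases (for example the bitcast and gep conditions mentioned above).

```cpp
#include <cassert>
#include <vector>

// Toy model of one use of an alloca's pointer (hypothetical, not LLVM's types).
enum class UseKind {
  Load,        // value = load ptr
  StoreTo,     // store value, ptr  (the alloca is the address written to)
  StoreOfAddr, // store ptr, other  (the address itself is stored, so it escapes)
  Other        // call argument, pointer arithmetic, cast, ...
};

struct Use {
  UseKind kind;
  bool isVolatile;
};

// Returns true when every use is a plain, non-volatile load from or store to
// the slot. Any other use means the variable needs a real memory location.
bool isPromotableSketch(const std::vector<Use> &uses) {
  for (const Use &u : uses) {
    if (u.isVolatile)
      return false;
    if (u.kind != UseKind::Load && u.kind != UseKind::StoreTo)
      return false;
  }
  return true;
}
```

In this model a slot that is only loaded and stored is promotable, while a single volatile access or a single escaping use disqualifies it.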
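Once all promotable allocas have been collected, PromoteMemToReg promotes them.
PromoteMem2Reg runs the standard SSA construction algorithm. The theory is not covered in detail here; interested readers can consult the paper "Efficiently computing static single assignment form and the control dependence graph". The process consists of the following steps: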

Placing PHI nodes.
Renaming.
Instruction simplification.
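For readers who do not want to open the paper, the first step can be sketched as a worklist over dominance frontiers. This is a simplified model and not LLVM's code: blocks are plain integers, the dominance frontiers are assumed to be precomputed, and `placePhis` is a hypothetical name. It yields the minimal-SSA placement; the liveness-based pruning is not modelled.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using Block = int;
// Dominance frontier of each block, assumed to be computed elsewhere.
using Frontiers = std::map<Block, std::set<Block>>;

// Returns the blocks that need a PHI node for a variable defined in defBlocks.
std::set<Block> placePhis(const std::set<Block> &defBlocks, const Frontiers &df) {
  std::set<Block> phiBlocks;
  std::vector<Block> worklist(defBlocks.begin(), defBlocks.end());
  while (!worklist.empty()) {
    Block b = worklist.back();
    worklist.pop_back();
    auto it = df.find(b);
    if (it == df.end())
      continue; // no recorded frontier for this block
    for (Block f : it->second) {
      // A PHI is itself a new definition, so a block that receives one for
      // the first time goes back on the worklist.
      if (phiBlocks.insert(f).second)
        worklist.push_back(f);
    }
  }
  return phiBlocks;
}
```

For a diamond CFG (0 branches to 1 and 2, both join at 3), stores in blocks 1 and 2 give a single PHI in block 3, since 3 is the dominance frontier of both.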

Placing PHI nodes

First check whether the alloca is dead code. In SSA this is easy: see whether its list of users is empty:

		if (AI->use_empty()) {
			// If there are no uses of the alloca, just delete it now.
			AI->eraseFromParent();

			// Remove the alloca from the Allocas list, since it has been processed
			RemoveFromAllocasList(AllocaNum);
			++NumDeadAlloca;
			continue;
		}

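LLVM then uses a helper class to compute some properties of the alloca, because certain cases allow shortcuts that skip the full algorithm. Papers and textbooks generally do not cover optimizations learned from engineering practice like these, so reading the source is worthwhile.
Here AllocaInfo records all DefiningBlocks and UsingBlocks, as well as whether there is a single store (OnlyStore) and whether every use sits in one block (OnlyUsedInOneBlock). The code below shows how this information drives the shortcuts: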

		// Analyze some properties of the alloca
		Info.AnalyzeAlloca(AI);

		// If there is only one store, replace the uses of every load that
		// it dominates with the stored value
		if (Info.DefiningBlocks.size() == 1) {
			if (rewriteSingleStoreAlloca(AI, Info, LBI, SQ.DL, DT, AC)) {
				// The alloca has been processed, move on.
				RemoveFromAllocasList(AllocaNum);
				++NumSingleStore;
				continue;
			}
		}

		// If the alloca is only used in one block, replace the uses of each
		// load with the operand of the nearest preceding store
		if (Info.OnlyUsedInOneBlock &&
			promoteSingleBlockAlloca(AI, Info, LBI, SQ.DL, DT, AC)) {
			// The alloca has been processed, move on.
			RemoveFromAllocasList(AllocaNum);
			continue;
		}
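The second shortcut in the code above (an alloca used within a single block) amounts to forwarding the most recent store to each load. A minimal sketch of that idea, using hypothetical types `Inst` and `LoadValue` and a hypothetical function `forwardStores` rather than LLVM's data structures:

```cpp
#include <cassert>
#include <vector>

// One access to the slot, in program order within a single block.
struct Inst {
  bool isStore; // true: store `value` into the slot; false: load from the slot
  int value;    // the stored value (ignored for loads)
};

// The value a load would be replaced with. `defined` is false when no store
// precedes the load in the block.
struct LoadValue {
  bool defined;
  int value;
};

// Walks the block once and returns, for every load in order, the value of the
// nearest preceding store.
std::vector<LoadValue> forwardStores(const std::vector<Inst> &block) {
  std::vector<LoadValue> result;
  LoadValue last = {false, 0};
  for (const Inst &inst : block) {
    if (inst.isStore)
      last = {true, inst.value};
    else
      result.push_back(last);
  }
  return result;
}
```

This toy version simply marks a load with no earlier store as undefined; the real pass has to treat that situation more carefully, for the same reason as the loop example discussed for the single-store case.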

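Case one: there is only one store. If every load is dominated by that store, the variable already satisfies strict SSA, so it suffices to find all loads dominated by the store and replace their uses with the store's value operand. There is one more situation: a load that executes before the only store and therefore reads an undefined value. We cannot say such a program is wrong; the LLVM comments give an example: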

///  for (...) {
///    int t = *A;
///    if (!first_iteration)
///      use(t);
///    *A = 42;
///  }

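Although the first read of A (int t = *A) is undefined here, the program is correct. A load like this, which the store does not dominate, therefore still has to go through the normal flow.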
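The per-load decision described above can be sketched as follows. This is a toy model under stated assumptions: `Position`, `DomTable`, and `storeReachesLoad` are hypothetical names, and block dominance is assumed to be available as a precomputed table instead of LLVM's DominatorTree.

```cpp
#include <cassert>
#include <vector>

struct Position {
  int block; // index of the basic block
  int index; // position of the instruction inside that block
};

// dom[a][b] is true when block a dominates block b (assumed precomputed).
using DomTable = std::vector<std::vector<bool>>;

// Returns true when the single store is guaranteed to run before the load,
// so the load can be replaced by the stored value directly.
bool storeReachesLoad(Position store, Position load, const DomTable &dom) {
  if (store.block == load.block)
    return store.index < load.index; // same block: compare instruction order
  return dom[store.block][load.block]; // different blocks: block dominance
}
```

In the loop example, the load and the store share a block and the load comes first, so the function returns false and that load is left for the general algorithm.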

static bool rewriteSingleStoreAlloca(AllocaInst* AI, AllocaInfo& Info,
	LargeBlockInfo& LBI, const DataLayout& DL,
	DominatorTree& DT, AssumptionCache* AC) {
	StoreInst* OnlyStore = Info.OnlyStore;
	// Whether the stored value is a non-instruction (for example a global or
	// a constant), which needs no dominance check
	bool StoringGlobalVal = !isa<Instruction>(OnlyStore->getOperand(0));
	BasicBlock* StoreBB = OnlyStore->getParent();
	int StoreIndex = -1;

	// Clear out UsingBlocks.  We will reconstruct it here if needed.
	Info.UsingBlocks.clear();

	for (auto UI = AI->user_begin(), E = AI->user_end(); UI != E;) {
		Instruction* UserInst = cast<Instruction>(*UI++);
		// A collected alloca should only have loads and stores as users.
		// The store is skipped; only check whether each load is dominated
		// by the store.
		if (!isa<LoadInst>(UserInst)) {
			assert(UserInst == OnlyStore && "Should only have load/stores");
			continue;
		}
		LoadInst* LI = cast<LoadInst>(UserInst);

		// Okay, if we have a load from the alloca, we want to replace it with the
		// only value stored to the alloca.  We can do this if the value is
		// dominated by the store.  If not, we use the rest of the mem2reg machinery
		// to insert the phi nodes as needed.
		if (!StoringGlobalVal) { // Non-instructions are always dominated.
			// The load and the store are in the same block, so their order
			// matters: if the load comes before the store, leave it alone.
			if (LI->getParent() == StoreBB) {
				if (StoreIndex == -1)
					StoreIndex = LB
