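1.3.5. The LoopInfo Pass
Like DominatorTree, LoopInfo derives from FunctionPass and has similar requirements. The job of this pass is to find the natural loops in a function.
1.3.5.1. Instantiating LoopInfo
Similarly, the LoopInfo instance is constructed when the LoopInfo pass is registered.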
567 LoopInfo() : FunctionPass(ID) {
568   initializeLoopInfoPass(*PassRegistry::getPassRegistry());
569 }
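initializeLoopInfoPass and the related initialization methods are defined by the macros below. Two methods are defined: initializeLoopInfoPassOnce, which performs the actual registration, and initializeLoopInfoPass, which wraps it in CALL_ONCE_INITIALIZATION so that, as before, the initialization runs only once.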
49 INITIALIZE_PASS_BEGIN(LoopInfo, "loops", "Natural Loop Information", true, true)
50 INITIALIZE_PASS_DEPENDENCY(DominatorTree)
51 INITIALIZE_PASS_END(LoopInfo, "loops", "Natural Loop Information", true, true)
164 #define INITIALIZE_PASS_BEGIN(passName, arg, name, cfg, analysis) \
165   static void* initialize##passName##PassOnce(PassRegistry &Registry) {
166 #define INITIALIZE_PASS_DEPENDENCY(depName) \
167   initialize##depName##Pass(Registry);
172 #define INITIALIZE_PASS_END(passName, arg, name, cfg, analysis) \
173   PassInfo *PI = new PassInfo(name, arg, & passName ::ID, \
174     PassInfo::NormalCtor_t(callDefaultCtor< passName >), cfg, analysis); \
175   Registry.registerPass(*PI, true); \
176   return PI; \
177   } \
178   void llvm::initialize##passName##Pass(PassRegistry &Registry) { \
179     CALL_ONCE_INITIALIZATION(initialize##passName##PassOnce) \
180   }
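Because of the macro on line 50, LoopInfo calls initializeDominatorTreePass during its own initialization. That method was already called when the DominatorTree pass was registered, so the call has no effect here.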
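The "first call does the work, later calls are no-ops" behaviour can be illustrated with a small standalone sketch. It does not reproduce CALL_ONCE_INITIALIZATION; it swaps in a C++11 function-local static, and Registry, initializeLoopsOnce and initializeLoops are hypothetical names invented for this illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for PassRegistry; it only counts registrations.
struct Registry {
  int Registrations = 0;
};

// Plays the role of initialize##passName##PassOnce: the real work.
static bool initializeLoopsOnce(Registry &R) {
  ++R.Registrations;
  return true;
}

// Plays the role of initialize##passName##Pass: safe to call repeatedly.
// A function-local static is initialized exactly once (assumption: this
// replaces LLVM's own call-once macro purely for illustration).
void initializeLoops(Registry &R) {
  static bool Done = initializeLoopsOnce(R);
  (void)Done;
}
```

Calling initializeLoops twice on the same Registry leaves Registrations at 1, which mirrors why the second initializeDominatorTreePass call is harmless.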
1.3.5.2. Finding Natural Loops
As a class derived from FunctionPass, LoopInfo does its main work in runOnFunction.
576 bool LoopInfo::runOnFunction(Function &) {
577   releaseMemory();
578   LI.Analyze(getAnalysis<DominatorTree>().getBase());
579   return false;
580 }
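On line 578, getAnalysis<DominatorTree> returns the DominatorTree instance seen earlier, and its getBase method returns a reference to the familiar DT. LI on the same line has type LoopInfoBase<BasicBlock, Loop>; its Analyze method performs the actual search for natural loops. The Whale Book gives the definition of a natural loop. First, a "back edge" is an edge whose head dominates its tail. Then, given a back edge m→n, the natural loop of m→n is the subgraph of the flow graph made up of the following node set and edge set: the node set consists of n together with all nodes from which m can be reached without passing through n, and the edge set consists of all edges connecting nodes in that node set. Node n is the loop header.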
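The definition translates almost directly into a backward worklist walk. The following is a minimal sketch under stated assumptions, not LLVM code: blocks are plain integer ids, Preds is a hypothetical predecessor table, and naturalLoop is a name invented here.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Computes the node set of the natural loop of the back edge M -> N.
// Preds[b] lists the predecessors of block b (hypothetical integer ids).
std::set<int> naturalLoop(const std::vector<std::vector<int>> &Preds,
                          int M, int N) {
  std::set<int> Loop = {N};      // the header n always belongs to the loop
  std::vector<int> Worklist;
  if (Loop.insert(M).second)     // if m == n (self-loop) there is nothing to walk
    Worklist.push_back(M);
  while (!Worklist.empty()) {
    int B = Worklist.back();
    Worklist.pop_back();
    // Walk backwards; N is already in the set, so the walk never passes it.
    for (int P : Preds[B])
      if (Loop.insert(P).second)
        Worklist.push_back(P);
  }
  return Loop;
}
```

For an assumed graph with edges 0→1, 1→2, 2→3, 3→2, 3→4, 4→1 and 4→5, the back edge 3→2 yields the node set {2, 3} and the back edge 4→1 yields {1, 2, 3, 4}.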
517 template<class BlockT, class LoopT>
518 void LoopInfoBase<BlockT, LoopT>::
519 Analyze(DominatorTreeBase<BlockT> &DomTree) {
520
521   // Postorder traversal of the dominator tree.
522   DomTreeNodeBase<BlockT>* DomRoot = DomTree.getRootNode();
523   for (po_iterator<DomTreeNodeBase<BlockT>*> DomIter = po_begin(DomRoot),
524          DomEnd = po_end(DomRoot); DomIter != DomEnd; ++DomIter) {
525
526     BlockT *Header = DomIter->getBlock();
527     SmallVector<BlockT *, 4> Backedges;
528
529     // Check each predecessor of the potential loop header.
530     typedef GraphTraits<Inverse<BlockT*> > InvBlockTraits;
531     for (typename InvBlockTraits::ChildIteratorType PI =
532            InvBlockTraits::child_begin(Header),
533            PE = InvBlockTraits::child_end(Header); PI != PE; ++PI) {
534
535       BlockT *Backedge = *PI;
536
537       // If Header dominates predBB, this is a new loop. Collect the backedges.
538       if (DomTree.dominates(Header, Backedge)
539           && DomTree.isReachableFromEntry(Backedge)) {
540         Backedges.push_back(Backedge);
541       }
542     }
543     // Perform a backward CFG traversal to discover and map blocks in this loop.
544     if (!Backedges.empty()) {
545       LoopT *L = new LoopT(Header);
546       discoverAndMapSubloop(L, ArrayRef<BlockT*>(Backedges), this, DomTree);
547     }
548   }
549   // Perform a single forward CFG traversal to populate block and subloop
550   // vectors for all loops.
551   PopulateLoopsDFS<BlockT, LoopT> DFS(this);
552   DFS.traverse(DomRoot->getBlock());
553 }
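The template argument of the po_iterator on line 523 is DomTreeNodeBase<BlockT>*; the loop on line 523 uses it to traverse the previously built dominator tree in postorder. InvBlockTraits on line 530 is GraphTraits<Inverse<BasicBlock*> >, a class that provides "inverse" traversal of a function's basic blocks. What "inverse" means depends on the template argument: for BasicBlock, "forward" means visiting a node's successors and "inverse" means visiting its predecessors. InvBlockTraits::ChildIteratorType on line 531 is pred_iterator, so the loop on that line visits all predecessors of the given node.
Because the head of a back edge must dominate its tail, line 538 checks whether this condition holds. For example, in the figure below, nodes 1 and 2 are the headers of two loops; they dominate their predecessors 4 and 3 respectively, so 4→1 and 3→2 are both back edges.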
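The back-edge test can be sketched on its own with an immediate-dominator table. This is an illustration under assumptions rather than LLVM's code: IDom, Preds, dominates and findBackEdges are hypothetical names, blocks are integer ids, every block is assumed reachable, and dominance is decided by walking up the tree.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// IDom[b] is the immediate dominator of block b; the entry block is its own
// immediate dominator. Walks up the dominator tree from B looking for A.
bool dominates(const std::vector<int> &IDom, int A, int B) {
  while (B != A && IDom[B] != B)
    B = IDom[B];
  return B == A;
}

// Returns every edge P -> H whose head H dominates its tail P.
std::vector<std::pair<int, int>>
findBackEdges(const std::vector<std::vector<int>> &Preds,
              const std::vector<int> &IDom) {
  std::vector<std::pair<int, int>> Edges;
  for (int H = 0; H < static_cast<int>(Preds.size()); ++H)
    for (int P : Preds[H])
      if (dominates(IDom, H, P))
        Edges.emplace_back(P, H);
  return Edges;
}
```

On an assumed graph with edges 0→1, 1→2, 2→3, 3→2, 3→4, 4→1 and 4→5 (immediate dominators 0, 0, 1, 2, 3, 4 for blocks 0 to 5), the only edges reported are 4→1 and 3→2.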
The dominates method called on line 538 is defined as follows:
785 inline bool dominates(const BasicBlock* A, const BasicBlock* B) const {
786   return DT->dominates(A, B);
787 }
The dominates method called on line 786 is defined as follows:
681 template<class NodeT>
682 bool DominatorTreeBase<NodeT>::dominates(const NodeT *A, const NodeT *B) {
683   if (A == B)
684     return true;
685
686   // Cast away the const qualifiers here. This is ok since
687   // this function doesn't actually return the values returned
688   // from getNode.
689   return dominates(getNode(const_cast<NodeT *>(A)),
690                    getNode(const_cast<NodeT *>(B)));
691 }
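DomTreeNodes in DominatorTreeBase has type DenseMap<NodeT*, DomTreeNodeBase<NodeT>*>, where NodeT is BasicBlock here. The mapping was already populated when the dominator tree was built, so the lookup on line 336 returns the corresponding DomTreeNodeBase<BasicBlock> pointer.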
335 inline DomTreeNodeBase<NodeT> *getNode(NodeT *BB) const {
336   return DomTreeNodes.lookup(BB);
337 }
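The isReachableFromEntry overload for DomTreeNodeBase<NodeT>* used below does no real work: it simply returns its argument, which as a bool is true for any non-null node.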
378 inline bool dominates(const DomTreeNodeBase<NodeT> *A,
379                       const DomTreeNodeBase<NodeT> *B) {
380   // A node trivially dominates itself.
381   if (B == A)
382     return true;
383
384   // An unreachable node is dominated by anything.
385   if (!isReachableFromEntry(B))
386     return true;
387
388   // And dominates nothing.
389   if (!isReachableFromEntry(A))
390     return false;
391
392   // Compare the result of the tree walk and the dfs numbers, if expensive
393   // checks are enabled.
394 #ifdef XDEBUG
395   assert((!DFSInfoValid ||
396           (dominatedBySlowTreeWalk(A, B) == B->DominatedBy(A))) &&
397          "Tree walk disagrees with dfs numbers!");
398 #endif
399
400   if (DFSInfoValid)
401     return B->DominatedBy(A);
402
403   // If we end up with too many slow queries, just update the
404   // DFS numbers on the theory that we are going to keep querying.
405   SlowQueries++;
406   if (SlowQueries > 32) {
407     updateDFSNumbers();
408     return B->DominatedBy(A);
409   }
410
411   return dominatedBySlowTreeWalk(A, B);
412 }
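DFSInfoValid on line 400 has already been set to true by the updateDFSNumbers method seen earlier, so DominatedBy below gives the answer directly. The way updateDFSNumbers assigns DFSNumIn and DFSNumOut makes this check very concise: B is dominated by A exactly when B's [DFSNumIn, DFSNumOut] interval lies inside A's.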
145 bool DominatedBy(const DomTreeNodeBase<NodeT> *other) const {
146   return this->DFSNumIn >= other->DFSNumIn &&
147          this->DFSNumOut <= other->DFSNumOut;
148 }
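The interval test is easy to try out on a toy tree. The sketch below is a simplified stand-in, not the updateDFSNumbers implementation: TreeNode and numberTree are hypothetical, nodes are indices into a vector, and the numbering is done recursively for brevity.

```cpp
#include <cassert>
#include <vector>

// Hypothetical dominator-tree node: child indices plus entry/exit numbers.
struct TreeNode {
  std::vector<int> Children;
  unsigned DFSNumIn = 0;
  unsigned DFSNumOut = 0;
};

// Assigns DFSNumIn on entering a node and DFSNumOut on leaving it, so a
// descendant's interval is always nested inside its ancestor's interval.
void numberTree(std::vector<TreeNode> &Tree, int Node, unsigned &Counter) {
  Tree[Node].DFSNumIn = Counter++;
  for (int C : Tree[Node].Children)
    numberTree(Tree, C, Counter);
  Tree[Node].DFSNumOut = Counter++;
}

// Same comparison as DominatedBy: is Node's interval inside Other's?
bool dominatedBy(const TreeNode &Node, const TreeNode &Other) {
  return Node.DFSNumIn >= Other.DFSNumIn &&
         Node.DFSNumOut <= Other.DFSNumOut;
}
```

With a tree in which node 0 has children 1 and 2 and node 1 has child 3, node 3 is reported as dominated by node 1 but not by node 2, using two integer comparisons and no tree walk.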
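Once all back edges of the given node have been identified, line 545 of Analyze first creates LLVM's representation of a loop, an instance of the Loop class, and then calls the function below to discover and map subloops. Line 376 must use back because of the insert on line 390: new predecessors are appended to the end of the worklist, so taking from the back ensures that we always traverse the nodes making up the current loop first.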
364 template<class BlockT, class LoopT>
365 static void discoverAndMapSubloop(LoopT *L, ArrayRef<BlockT*> Backedges,
366                                   LoopInfoBase<BlockT, LoopT> *LI,
367                                   DominatorTreeBase<BlockT> &DomTree) {
368   typedef GraphTraits<Inverse<BlockT*> > InvBlockTraits;
369
370   unsigned NumBlocks = 0;
371   unsigned NumSubloops = 0;
372
373   // Perform a backward CFG traversal using a worklist.
374   std::vector<BlockT *> ReverseCFGWorklist(Backedges.begin(), Backedges.end());
375   while (!ReverseCFGWorklist.empty()) {
376     BlockT *PredBB = ReverseCFGWorklist.back();
377     ReverseCFGWorklist.pop_back();
378
379     LoopT *Subloop = LI->getLoopFor(PredBB);
380     if (!Subloop) {
381       if (!DomTree.isReachableFromEntry(PredBB))
382         continue;
383
384       // This is an undiscovered block. Map it to the current loop.
385       LI->changeLoopFor(PredBB, L);
386       ++NumBlocks;
387       if (PredBB == L->getHeader())
388         continue;
389       // Push all block predecessors on the worklist.
390       ReverseCFGWorklist.insert(ReverseCFGWorklist.end(),
391                                 InvBlockTraits::child_begin(PredBB),
392                                 InvBlockTraits::child_end(PredBB));
393     }
394     else {
395       // This is a discovered block. Find its outermost discovered loop.
396       while (LoopT *Parent = Subloop->getParentLoop())
397         Subloop = Parent;
398
399       // If it is already discovered to be a subloop of this loop, continue.
400       if (Subloop == L)
401         continue;
402
403       // Discover a subloop of this loop.
404       Subloop->setParentLoop(L);
405       ++NumSubloops;
406       NumBlocks += Subloop->getBlocks().capacity();
407       PredBB = Subloop->getHeader();
408       // Continue traversal along predecessors that are not loop-back edges from
409       // within this subloop tree itself. Note that a predecessor may directly
410       // reach another subloop that is not yet discovered to be a subloop of
411       // this loop, which we must traverse.
412       for (typename InvBlockTraits::ChildIteratorType PI =
413              InvBlockTraits::child_begin(PredBB),
414              PE = InvBlockTraits::child_end(PredBB); PI != PE; ++PI) {
415         if (LI->getLoopFor(*PI) != Subloop)
416           ReverseCFGWorklist.push_back(*PI);
417       }
418     }
419   }
420   L->getSubLoopsVector().reserve(NumSubloops);
421   L->getBlocksVector().reserve(NumBlocks);
422 }
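In the definition of LoopInfoBase, BBMap has type DenseMap<BasicBlock*, Loop*>; it maps each basic block to the loop that contains it. If, on line 379, PredBB is not yet mapped to any Loop, line 385 maps it to L. On line 545 of Analyze the Loop object was constructed with Header, the loop header, and getHeader on line 387 returns the address of that basic block. If PredBB is not that block, the traversal continues to its predecessors until it reaches the loop header or the header of a nested loop.
For a block that already belongs to a nested loop, Subloop is the Loop object its basic block maps to, so if the condition on line 400 holds, we are handling the current loop itself. For the nested loop closest to the current loop, getParentLoop on line 396 returns 0, while for any loop inside it getParentLoop returns a Loop* pointing to its parent (stored into Subloop on line 397); line 404 then makes the current loop the parent of that closest nested loop. The traversal on line 412 is needed to cover the different orders in which inner and outer loops can be encountered (the outer loop first, or the inner loop first).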
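The parent-chain walk and the adoption step (lines 396 to 404) can be isolated in a few lines. This is only a sketch under assumptions: LoopNode is a hypothetical stand-in for Loop that keeps nothing but a header id and a parent pointer, and outermost and adoptSubloop are names invented here.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::Loop: just a header id and a parent link.
struct LoopNode {
  int Header;
  LoopNode *Parent;
  explicit LoopNode(int H) : Header(H), Parent(nullptr) {}
};

// Follows parent links until reaching a loop that has no parent yet.
LoopNode *outermost(LoopNode *L) {
  while (L->Parent)
    L = L->Parent;
  return L;
}

// Called when the backward walk for Current reaches a block already mapped
// to Found. Returns true if a new subloop was attached under Current.
bool adoptSubloop(LoopNode *Found, LoopNode *Current) {
  LoopNode *Outer = outermost(Found);
  if (Outer == Current)
    return false;          // already known to be nested in Current
  Outer->Parent = Current; // attach the whole subloop tree under Current
  return true;
}
```

Because the headers are visited in postorder of the dominator tree, inner loops are created before the loops that enclose them, which is why the walk can meet blocks that already belong to a loop.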
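At the end of Analyze, the information produced above is gathered into the Loop object of each loop (these Loop objects are now held by LoopInfo) by the PopulateLoopsDFS::traverse function below.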
455 template<class BlockT, class LoopT>
456 void PopulateLoopsDFS<BlockT, LoopT>::traverse(BlockT *EntryBlock) {
457   pushBlock(EntryBlock);
458   VisitedBlocks.insert(EntryBlock);
459   while (!DFSStack.empty()) {
460     // Traverse the leftmost path as far as possible.
461     while (dfsSucc() != dfsSuccEnd()) {
462       BlockT *BB = *dfsSucc();
463       ++dfsSucc();
464       if (!VisitedBlocks.insert(BB).second)
465         continue;
466
467       // Push the next DFS successor onto the stack.
468       pushBlock(BB);
469     }
470     // Visit the top of the stack in postorder and backtrack.
471     insertIntoLoop(dfsSource());
472     DFSStack.pop_back();
473   }
474 }
DFSStack in PopulateLoopsDFS has type std::vector<std::pair<BlockT*, SuccIterTy> >, where BlockT is BasicBlock and SuccIterTy is SuccIterator<TerminatorInst*, BasicBlock>; line 449 below therefore pushes a basic block together with an iterator over its successors.
448 void pushBlock(BlockT *Block) {
449   DFSStack.push_back(std::make_pair(Block, BlockTraits::child_begin(Block)));
450 }
VisitedBlocks on line 458 is a DenseSet<const BasicBlock *>, and dfsSuccEnd on line 461 and its related functions are defined as follows:
444 BlockT *dfsSource() { return DFSStack.back().first; }
445 SuccIterTy &dfsSucc() { return DFSStack.back().second; }
446 SuccIterTy dfsSuccEnd() { return BlockTraits::child_end(dfsSource()); }
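Clearly, line 461 iterates over all successors of the given basic block, then over their successors, and so on; this is a depth-first traversal, and it calls the function below on each block in postorder.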
479 template<class BlockT, class LoopT>
480 void PopulateLoopsDFS<BlockT, LoopT>::insertIntoLoop(BlockT *Block) {
481   LoopT *Subloop = LI->getLoopFor(Block);
482   if (Subloop && Block == Subloop->getHeader()) {
483     // We reach this point once per subloop after processing all the blocks in
484     // the subloop.
485     if (Subloop->getParentLoop())
486       Subloop->getParentLoop()->getSubLoopsVector().push_back(Subloop);
487     else
488       LI->addTopLevelLoop(Subloop);
489
490     // For convenience, Blocks and Subloops are inserted in postorder. Reverse
491     // the lists, except for the loop header, which is always at the beginning.
492     std::reverse(Subloop->getBlocksVector().begin()+1,
493                  Subloop->getBlocksVector().end());
494     std::reverse(Subloop->getSubLoopsVector().begin(),
495                  Subloop->getSubLoopsVector().end());
496
497     Subloop = Subloop->getParentLoop();
498   }
499   for (; Subloop; Subloop = Subloop->getParentLoop())
500     Subloop->getBlocksVector().push_back(Block);
501 }
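As seen earlier, getLoopFor on line 481 returns the Loop object of the loop containing the given basic block, getHeader on line 482 returns that loop's header, and getParentLoop on line 485 returns the loop's parent loop (null if there is none). LoopBase, the base class of Loop, uses SubLoops to hold the nested subloops and Blocks to hold the basic blocks contained in the loop; the LI variable in this function is the enclosing LoopInfoBase.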
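The explicit-stack traversal that drives insertIntoLoop can be sketched independently of LLVM. In this illustration (an assumption-laden sketch, not the PopulateLoopsDFS code) blocks are integer ids, Succs is a hypothetical successor table, an index stands in for the successor iterator, and postorder is a name invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Iterative depth-first search that returns blocks in postorder.
// Each stack entry pairs a block with the index of its next successor,
// mirroring the (block, successor iterator) pairs kept in DFSStack.
std::vector<int> postorder(const std::vector<std::vector<int>> &Succs,
                           int Entry) {
  std::vector<int> Order;
  std::vector<bool> Visited(Succs.size(), false);
  std::vector<std::pair<int, std::size_t>> Stack;
  Stack.emplace_back(Entry, 0);
  Visited[Entry] = true;
  while (!Stack.empty()) {
    // Follow unvisited successors of the current top as far as possible.
    while (Stack.back().second != Succs[Stack.back().first].size()) {
      int Next = Succs[Stack.back().first][Stack.back().second];
      ++Stack.back().second;
      if (Visited[Next])
        continue;
      Visited[Next] = true;
      Stack.emplace_back(Next, 0);
    }
    // All successors are done: emit the block and backtrack.
    Order.push_back(Stack.back().first);
    Stack.pop_back();
  }
  return Order;
}
```

Since blocks arrive in postorder, a loop's blocks are appended in the reverse of the order a forward walk would list them, which is why insertIntoLoop reverses the vectors once the header is reached.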
This completes LoopInfo's search for natural loops.