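3.4.3. Automatic Generation of Instruction Selection Code
A substantial part of LLVM's instruction selection code is generated automatically. (TableGen currently still relies on hand-written code for certain special-case handling; its long-term goal is fully automatic generation of backend code, which is a grand and ambitious aim.) The file generated here is X86GenDAGISel.inc, taking X86 as the example.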
122 void DAGISelEmitter::run(raw_ostream &OS) {
123   emitSourceFileHeader("DAG Instruction Selector for the " +
124                        CGP.getTargetInfo().getName() + " target", OS);
125
126   OS << "// *** NOTE: This file is #included into the middle of the target\n"
127      << "// *** instruction selector class. These functions are really "
128      << "methods.\n\n";
129
130   DEBUG(errs() << "\n\nALL PATTERNS TO MATCH:\n\n";
131         for (CodeGenDAGPatterns::ptm_iterator I = CGP.ptm_begin(),
132              E = CGP.ptm_end(); I != E; ++I) {
133           errs() << "PATTERN: "; I->getSrcPattern()->dump();
134           errs() << "\nRESULT: "; I->getDstPattern()->dump();
135           errs() << "\n";
136         });
137
138   // Add all the patterns to a temporary list so we can sort them.
139   std::vector<const PatternToMatch*> Patterns;
140   for (CodeGenDAGPatterns::ptm_iterator I = CGP.ptm_begin(), E = CGP.ptm_end();
141        I != E; ++I)
142     Patterns.push_back(&*I);
143
144   // We want to process the matches in order of minimal cost. Sort the patterns
145   // so the least cost one is at the start.
146   std::sort(Patterns.begin(), Patterns.end(), PatternSortingPredicate(CGP));   <-- removed in v7.0
      std::stable_sort(Patterns.begin(), Patterns.end(),                           <-- added in v7.0
                       PatternSortingPredicate(CGP));
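The call to emitSourceFileHeader() on line 123 writes a header to the generated file, and lines 126-128 append a note after it. The result looks roughly like this: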
/*===- TableGen'erated file -------------------------------------*- C++ -*-===*\
|*                                                                            *|
|* DAG Instruction Selector for the X86 target                                *|
|*                                                                            *|
|* Automatically generated file, do not edit!                                 *|
|*                                                                            *|
\*===----------------------------------------------------------------------===*/
// *** NOTE: This file is #included into the middle of the target
// *** instruction selector class. These functions are really methods.
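Next, line 146 sorts all PatternToMatch instances (v7.0 uses std::stable_sort() instead). Instruction selection tries the patterns in order, and when several patterns can match the same input, the first one that matches is selected. The patterns therefore have to be sorted so that the best one is tried first.
3.4.3.1. Sorting PatternToMatch
Several factors determine the order of PatternToMatch instances. The first is the result type: non-vector patterns sort ahead of vector patterns, and when both sides agree on that, non-floating-point patterns sort ahead of floating-point ones.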
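To illustrate why the order matters under first-match selection, here is a minimal sketch. ToyPattern, selectFirstMatch and sortByComplexity are hypothetical names made up for this example and are not part of TableGen; the "input" is just an integer standing in for a DAG.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a pattern: a name, a complexity score, and a
// toy test saying whether the pattern applies to a given input.
struct ToyPattern {
  std::string Name;
  int Complexity;        // larger means the pattern covers more of the input
  bool (*Matches)(int);  // toy applicability test
};

bool matchesAny(int) { return true; }
bool matchesEven(int X) { return X % 2 == 0; }

// Walk the list in order and return the first pattern that applies.
std::string selectFirstMatch(const std::vector<ToyPattern> &Patterns,
                             int Input) {
  for (const ToyPattern &P : Patterns)
    if (P.Matches(Input))
      return P.Name;
  return "<no match>";
}

// Put the most complex pattern first; ties keep their original order.
void sortByComplexity(std::vector<ToyPattern> &Patterns) {
  std::stable_sort(Patterns.begin(), Patterns.end(),
                   [](const ToyPattern &L, const ToyPattern &R) {
                     return L.Complexity > R.Complexity;
                   });
}
```

Without sorting, a generic pattern listed first would shadow a more specific one; after sorting, the specific pattern is tried first and the generic one only serves as a fallback.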
78  struct PatternSortingPredicate {
79    PatternSortingPredicate(CodeGenDAGPatterns &cgp) : CGP(cgp) {}
80    CodeGenDAGPatterns &CGP;
81
82    bool operator()(const PatternToMatch *LHS, const PatternToMatch *RHS) {
83      const TreePatternNode *LHSSrc = LHS->getSrcPattern();
84      const TreePatternNode *RHSSrc = RHS->getSrcPattern();
85
86      MVT LHSVT = (LHSSrc->getNumTypes() != 0 ? LHSSrc->getType(0) : MVT::Other);
87      MVT RHSVT = (RHSSrc->getNumTypes() != 0 ? RHSSrc->getType(0) : MVT::Other);
88      if (LHSVT.isVector() != RHSVT.isVector())
89        return RHSVT.isVector();
90
91      if (LHSVT.isFloatingPoint() != RHSVT.isFloatingPoint())
92        return RHSVT.isFloatingPoint();
93
94      // Otherwise, if the patterns might both match, sort based on complexity,
95      // which means that we prefer to match patterns that cover more nodes in the
96      // input over nodes that cover fewer.
97      int LHSSize = LHS->getPatternComplexity(CGP);
98      int RHSSize = RHS->getPatternComplexity(CGP);
99      if (LHSSize > RHSSize) return true; // LHS -> bigger -> less cost
100     if (LHSSize < RHSSize) return false;
101
102     // If the patterns have equal complexity, compare generated instruction cost
103     unsigned LHSCost = getResultPatternCost(LHS->getDstPattern(), CGP);
104     unsigned RHSCost = getResultPatternCost(RHS->getDstPattern(), CGP);
105     if (LHSCost < RHSCost) return true;
106     if (LHSCost > RHSCost) return false;
107
108     unsigned LHSPatSize = getResultPatternSize(LHS->getDstPattern(), CGP);
109     unsigned RHSPatSize = getResultPatternSize(RHS->getDstPattern(), CGP);
110     if (LHSPatSize < RHSPatSize) return true;
111     if (LHSPatSize > RHSPatSize) return false;
112
113     // Sort based on the UID of the pattern, giving us a deterministic ordering   <-- removed in v7.0 (lines 113-115)
114     // if all other sorting conditions fail.
115     assert(LHS == RHS || LHS->ID != RHS->ID);
        // Sort based on the UID of the pattern, to reflect source order.   <-- added in v7.0
        // Note that this is not guaranteed to be unique, since a single source
        // pattern may have been resolved into multiple match patterns due to
        // alternative fragments. To ensure deterministic output, always use
        // std::stable_sort with this predicate.
116     return LHS->ID < RHS->ID;
117   }
118 };
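Second, the more complex the source pattern is (that is, the more nodes it contains), the higher its priority. The AddedComplexity field of an instruction definition can be used to give it extra complexity (the default is 0). The getAddedComplexity() call on line 836 below retrieves this value. (The comment introduced in v7.0 above explains why std::stable_sort() must be used and why the assertion was removed.)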
834 int PatternToMatch::
835 getPatternComplexity(const CodeGenDAGPatterns &CGP) const {
836   return getPatternSize(getSrcPattern(), CGP) + getAddedComplexity();
837 }
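The getPatternSize() function computes the so-called "size" of a pattern. TableGen wants "large" patterns to be tried before "small" ones in the generated code. As the code shows, a "large" pattern is one that matches a larger DAG, which usually means better, more efficient generated code.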
787 static unsigned getPatternSize(const TreePatternNode *P,
788                                const CodeGenDAGPatterns &CGP) {
789   unsigned Size = 3; // The node itself.
790   // If the root node is a ConstantSDNode, increases its size.
791   // e.g. (set R32:$dst, 0).
792   if (P->isLeaf() && isa<IntInit>(P->getLeafValue()))
793     Size += 2;
794
795   // FIXME: This is a hack to statically increase the priority of patterns   <-- removed in v7.0
796   // which maps a sub-dag to a complex pattern. e.g. favors LEA over ADD.
797   // Later we can allow complexity / cost for each pattern to be (optionally)
798   // specified. To get best possible pattern match we'll need to dynamically
799   // calculate the complexity of all patterns a dag can potentially map to.
800   const ComplexPattern *AM = P->getComplexPatternInfo(CGP);
801   if (AM) {
802     Size += AM->getNumOperands() * 3;   <-- removed in v7.0
        Size += AM->getComplexity();        <-- added in v7.0
803
804     // We don't want to count any children twice, so return early.
805     return Size;
806   }
807
808   // If this node has some predicate function that must match, it adds to the
809   // complexity of this node.
810   if (!P->getPredicateFns().empty())
811     ++Size;
812
813   // Count children in the count if they are also nodes.
814   for (unsigned i = 0, e = P->getNumChildren(); i != e; ++i) {
815     TreePatternNode *Child = P->getChild(i);
816     if (!Child->isLeaf() && Child->getNumTypes() &&   <-- removed in v7.0
817         Child->getType(0) != MVT::Other)
818       Size += getPatternSize(Child, CGP);
819     else if (Child->isLeaf()) {
820       if (isa<IntInit>(Child->getLeafValue()))
821         Size += 5; // Matches a ConstantSDNode (+3) and a specific value (+2).
822       else if (Child->getComplexPatternInfo(CGP))
823         Size += getPatternSize(Child, CGP);
824       else if (!Child->getPredicateFns().empty())
825         ++Size;
826     }
        if (!Child->isLeaf() && Child->getNumTypes()) {   <-- added in v7.0
          const TypeSetByHwMode &T0 = Child->getType(0);
          // At this point, all variable type sets should be simple, i.e. only
          // have a default mode.
          if (T0.getMachineValueType() != MVT::Other) {
            Size += getPatternSize(Child, CGP);
            continue;
          }
        }
        if (Child->isLeaf()) {
          if (isa<IntInit>(Child->getLeafValue()))
            Size += 5; // Matches a ConstantSDNode (+3) and a specific value (+2).
          else if (Child->getComplexPatternInfo(CGP))
            Size += getPatternSize(Child, CGP);
          else if (!Child->getPredicateFns().empty())
            ++Size;
        }
827   }
828
829   return Size;
830 }
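To make the scoring concrete, the following sketch applies a simplified version of these rules to a toy tree. ToyNode, opNode, intLeaf, regLeaf, toyPatternSize and toyComplexity are hypothetical names invented for this illustration. The ComplexPattern branch and the MVT::Other check are deliberately left out, so this is only an approximation of what getPatternSize() does.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical, simplified pattern node (not TableGen's TreePatternNode).
struct ToyNode {
  enum Kind { Operator, IntLeaf, OtherLeaf };
  Kind K;
  bool HasPredicate;
  std::vector<ToyNode> Children;
};

ToyNode opNode(std::vector<ToyNode> Children, bool HasPredicate = false) {
  return {ToyNode::Operator, HasPredicate, std::move(Children)};
}
ToyNode intLeaf() { return {ToyNode::IntLeaf, false, {}}; }
ToyNode regLeaf(bool HasPredicate = false) {
  return {ToyNode::OtherLeaf, HasPredicate, {}};
}

// Simplified scoring: 3 per node, +2 for a constant root, +1 per predicate,
// +5 for a constant child, recursion into operator children.
unsigned toyPatternSize(const ToyNode &P) {
  unsigned Size = 3;                       // the node itself
  if (P.K == ToyNode::IntLeaf) Size += 2;  // constant root pins a value
  if (P.HasPredicate) ++Size;
  for (const ToyNode &Child : P.Children) {
    if (Child.K == ToyNode::Operator)
      Size += toyPatternSize(Child);
    else if (Child.K == ToyNode::IntLeaf)
      Size += 5;                           // constant node (+3) and value (+2)
    else if (Child.HasPredicate)
      ++Size;
  }
  return Size;
}

// Mirrors the shape of getPatternComplexity(): size plus added complexity.
int toyComplexity(const ToyNode &Src, int AddedComplexity) {
  return static_cast<int>(toyPatternSize(Src)) + AddedComplexity;
}
```

Under these simplified rules, a pattern shaped like (add (mul x, y), 7) scores 3 + 3 + 5 = 11, while (add x, y) scores 3. Giving the smaller pattern an added complexity of 10 lifts it to 13, which shows how AddedComplexity can override the purely structural ordering.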
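If the complexities are equal, the "cost" of the result pattern is compared next. Each instruction in the result pattern adds 1 to the cost. If usesCustomInserter on line 50 is set, inserting that instruction needs special support, and it adds a further 10. Clearly, the more instructions the result pattern generates, the higher its cost.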
41 static unsigned getResultPatternCost(TreePatternNode *P,
42                                      CodeGenDAGPatterns &CGP) {
43   if (P->isLeaf()) return 0;
44
45   unsigned Cost = 0;
46   Record *Op = P->getOperator();
47   if (Op->isSubClassOf("Instruction")) {
48     Cost++;
49     CodeGenInstruction &II = CGP.getTargetInfo().getInstruction(Op);
50     if (II.usesCustomInserter)
51       Cost += 10;
52   }
53   for (unsigned i = 0, e = P->getNumChildren(); i != e; ++i)
54     Cost += getResultPatternCost(P->getChild(i), CGP);
55   return Cost;
56 }
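As a worked example of this cost rule, the sketch below recomputes it on a toy result tree. ToyResult, resultLeaf, resultInst and toyResultCost are hypothetical names used only here; the real function works on TreePatternNode and looks the flag up through CodeGenInstruction.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical, simplified result-pattern node.
struct ToyResult {
  bool IsLeaf;
  bool IsInstruction;
  bool UsesCustomInserter;
  std::vector<ToyResult> Children;
};

ToyResult resultLeaf() { return {true, false, false, {}}; }
ToyResult resultInst(std::vector<ToyResult> Ops, bool Custom = false) {
  return {false, true, Custom, std::move(Ops)};
}

// Leaves cost nothing; each instruction costs 1, plus 10 when it needs a
// custom inserter; children are summed recursively.
unsigned toyResultCost(const ToyResult &P) {
  if (P.IsLeaf) return 0;
  unsigned Cost = 0;
  if (P.IsInstruction) {
    ++Cost;
    if (P.UsesCustomInserter) Cost += 10;
  }
  for (const ToyResult &Child : P.Children)
    Cost += toyResultCost(Child);
  return Cost;
}
```

A single instruction over two leaf operands costs 1, two nested instructions cost 2, and one instruction that needs a custom inserter costs 11, so it loses to any result built from up to ten ordinary instructions.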
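If the costs are equal as well, the "size" of the result pattern is compared next. This time the deciding value is the CodeSize field of the Instruction definition.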
60 static unsigned getResultPatternSize(TreePatternNode *P,
61                                      CodeGenDAGPatterns &CGP) {
62   if (P->isLeaf()) return 0;
63
64   unsigned Cost = 0;
65   Record *Op = P->getOperator();
66   if (Op->isSubClassOf("Instruction")) {
67     Cost += Op->getValueAsInt("CodeSize");
68   }
69   for (unsigned i = 0, e = P->getNumChildren(); i != e; ++i)
70     Cost += getResultPatternSize(P->getChild(i), CGP);
71   return Cost;
72 }
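If all of that still fails to break the tie, only the ID is left to compare: the pattern defined earlier comes first. In v7.0 the ID is not guaranteed to be unique, so std::stable_sort() keeps patterns with equal IDs in their original relative order.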
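The whole chain of tie-breakers can be sketched as one comparator over precomputed values. ToyEntry, toyLess and sortedNames are hypothetical names for this illustration, and the fields stand in for the values the real predicate computes on the fly.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record holding the values the predicate compares.
struct ToyEntry {
  std::string Name;
  bool IsVector;
  bool IsFloat;
  int Complexity;       // source pattern size plus added complexity
  unsigned ResultCost;  // instruction count, +10 per custom inserter
  unsigned ResultSize;  // summed CodeSize
  unsigned ID;          // source order; may repeat, as noted for v7.0
};

// Same comparison order as the listing: type class, complexity (larger
// first), result cost, result size, then ID.
bool toyLess(const ToyEntry &L, const ToyEntry &R) {
  if (L.IsVector != R.IsVector) return R.IsVector;
  if (L.IsFloat != R.IsFloat) return R.IsFloat;
  if (L.Complexity != R.Complexity) return L.Complexity > R.Complexity;
  if (L.ResultCost != R.ResultCost) return L.ResultCost < R.ResultCost;
  if (L.ResultSize != R.ResultSize) return L.ResultSize < R.ResultSize;
  return L.ID < R.ID;
}

// Stable sort, so entries that compare equal keep their input order.
std::vector<std::string> sortedNames(std::vector<ToyEntry> Entries) {
  std::stable_sort(Entries.begin(), Entries.end(), toyLess);
  std::vector<std::string> Names;
  for (const ToyEntry &E : Entries)
    Names.push_back(E.Name);
  return Names;
}
```

Two entries that agree on every field, including the ID, compare as equivalent. With std::sort their relative order would be unspecified, whereas std::stable_sort leaves them in input order, which is what keeps the generated file deterministic.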