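2.2.2. Operand Descriptions
The OutOperandList and InOperandList in an Instruction definition are dags of the form (outs op1, op2, …) and (ins op1, op2, …). An op can be a register, in which case a RegisterClass specifies its register type.
2.2.2.1. Registers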
2.2.2.1.1. Register
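Every target in use today has registers (machines that run purely on a stack no longer exist), so describing registers is an important part of an LLVM backend. The register definitions live in the target's <Target>RegisterInfo.td file (for example X86RegisterInfo.td), and they all build on the following base class (Target.td).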
class Register<string n, list<string> altNames = []> {
  string Namespace = "";
  string AsmName = n;
  list<string> AltNames = altNames;

  // Aliases - A list of registers that this register overlaps with. A read or
  // modification of this register can potentially read or modify the aliased
  // registers.
  list<Register> Aliases = [];

  // SubRegs - A list of registers that are parts of this register. Note these
  // are "immediate" sub-registers and the registers within the list do not
  // themselves overlap. e.g. For X86, EAX's SubRegs list contains only [AX],
  // not [AX, AH, AL].
  list<Register> SubRegs = [];

  // SubRegIndices - For each register in SubRegs, specify the SubRegIndex used
  // to address it. Sub-sub-register indices are automatically inherited from
  // SubRegs.
  list<SubRegIndex> SubRegIndices = [];

  // RegAltNameIndices - The alternate name indices which are valid for this
  // register.
  list<RegAltNameIndex> RegAltNameIndices = [];

  // DwarfNumbers - Numbers used internally by gcc/gdb to identify the register.
  // These values can be determined by locating the <target>.h file in the
  // directory llvmgcc/gcc/config/<target>/ and looking for REGISTER_NAMES. The
  // order of these names correspond to the enumeration used by gcc. A value of
  // -1 indicates that the gcc number is undefined and -2 that register number
  // is invalid for this mode/flavour.
  list<int> DwarfNumbers = [];

  // CostPerUse - Additional cost of instructions using this register compared
  // to other registers in its class. The register allocator will try to
  // minimize the number of instructions using a register with a CostPerUse.
  // This is used by the x86-64 and ARM Thumb targets where some registers
  // require larger instruction encodings.
  int CostPerUse = 0;

  // CoveredBySubRegs - When this bit is set, the value of this register is
  // completely determined by the value of its sub-registers. For example, the
  // x86 register AX is covered by its sub-registers AL and AH, but EAX is not
  // covered by its sub-register AX.
  bit CoveredBySubRegs = 0;

  // HWEncoding - The target specific hardware encoding for this register.
  bits<16> HWEncoding = 0;
}
V7.0 adds one field to the Register definition: bit isArtificial = 0;
Targets derive further classes as needed. For X86, the derived definition is X86Reg (X86RegisterInfo.td):
class X86Reg<string n, bits<16> Enc, list<Register> subregs = []> : Register<n> {
  let Namespace = "X86";
  let HWEncoding = Enc;
  let SubRegs = subregs;
}
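Register is more complex than one might expect because some registers have addressable parts. On X86, for example, EAX, AX, and AL are each addressable registers, and each has its own Register definition in the TD file. Physically, though, EAX, AX, and AL all refer to the same register: EAX contains AX, and AX contains AL. To use registers well, the backend needs to know these relationships, and that is what SubRegs and SubRegIndices above express. SubRegIndices is a list of SubRegIndex (Target.td); each SubRegIndex gives an offset and a size, which together identify one addressable part of a register (a sub-register). Note that a sub-register index is independent of any particular register: a given offset and size needs only one index definition. SubRegs and SubRegIndices must correspond one to one.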
class SubRegIndex<int size, int offset = 0> {
  string Namespace = "";

  // Size - Size (in bits) of the sub-registers represented by this index.
  int Size = size;

  // Offset - Offset of the first bit that is part of this sub-register index.
  // Set it to -1 if the same index is used to represent sub-registers that can
  // be at different offsets (for example when using an index to access an
  // element in a register tuple).
  int Offset = offset;

  // ComposedOf - A list of two SubRegIndex instances, [A, B].
  // This indicates that this SubRegIndex is the result of composing A and B.
  // See ComposedSubRegIndex.
  list<SubRegIndex> ComposedOf = [];

  // CoveringSubRegIndices - A list of two or more sub-register indexes that
  // cover this sub-register.
  //
  // This field should normally be left blank as TableGen can infer it.
  //
  // TableGen automatically detects sub-registers that straddle the registers
  // in the SubRegs field of a Register definition. For example:
  //
  //   Q0    = dsub_0 -> D0, dsub_1 -> D1
  //   Q1    = dsub_0 -> D2, dsub_1 -> D3
  //   D1_D2 = dsub_0 -> D1, dsub_1 -> D2
  //   QQ0   = qsub_0 -> Q0, qsub_1 -> Q1
  //
  // TableGen will infer that D1_D2 is a sub-register of QQ0. It will be given
  // the synthetic index dsub_1_dsub_2 unless some SubRegIndex is defined with
  // CoveringSubRegIndices = [dsub_1, dsub_2].
  list<SubRegIndex> CoveringSubRegIndices = [];
}
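TableGen can automatically infer the relationships among sub-register indices from the SubRegIndices declared in Register definitions. Some cases, however, are too complex to describe and infer this way (see the ARM example below); for those, the ComposedOf and CoveringSubRegIndices fields of SubRegIndex are used. (LLVM 3.6 does not actually use CoveringSubRegIndices yet, and neither does 7.0.)
For example, suppose index Id1 refers to sub-register Rr of register R, and index Id2 refers to sub-register r of Rr. If index Id3 refers to the part of R that corresponds to r, then Id3 is equivalent to applying Id1 and then Id2 to R (called composition below). TableGen can usually infer such relationships; otherwise they must be stated explicitly in the ComposedOf field.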
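The composition of Id1 and Id2 can be sketched in Python. This is only a model of the idea, not TableGen's implementation: it assumes a hypothetical 32-bit register R whose upper half is Rr and whose top byte is r, and it identifies an index purely by its (offset, size) pair, whereas TableGen also tracks the register hierarchy. The names ID1, ID2, ID3, KNOWN, and compose are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubRegIndex:
    name: str
    size: int        # size in bits
    offset: int = 0  # bit offset inside the parent register


def compose(a, b, known):
    """Return the known index equivalent to applying a, then b, or None."""
    offset = a.offset + b.offset
    for idx in known:
        if idx.size == b.size and idx.offset == offset:
            return idx
    return None


# Hypothetical layout: R is 32 bits, Rr is its upper 16 bits,
# and r is the upper 8 bits of Rr.
ID1 = SubRegIndex("Id1", 16, 16)  # R  -> Rr
ID2 = SubRegIndex("Id2", 8, 8)    # Rr -> r
ID3 = SubRegIndex("Id3", 8, 24)   # R  -> r directly
KNOWN = [ID1, ID2, ID3]

print(compose(ID1, ID2, KNOWN).name)
```

When no known index matches the composed offset and size, compose returns None; that corresponds to the case where an index has to be synthesized or declared explicitly.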
2.2.2.1.2. The X86 Example
For example, X86RegisterInfo.td defines the following SubRegIndex instances for the X86 target:
let Namespace = "X86" in {
  def sub_8bit    : SubRegIndex<8>;
  def sub_8bit_hi : SubRegIndex<8, 8>;
  def sub_16bit   : SubRegIndex<16>;
  def sub_32bit   : SubRegIndex<32>;
  def sub_xmm     : SubRegIndex<128>;
  def sub_ymm     : SubRegIndex<256>;
}
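The registers available to the X86 target are defined in the listing further below. Lines 48~74 of that listing are the smallest addressable registers on X86; they need no indices. Starting with AX on line 78, however, every register has addressable parts, and each part is given an index together with the definition of that part. For example, the indices sub_8bit and sub_8bit_hi refer to AL and AH within AX.
V7.0 adds these two SubRegIndex definitions: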
def sub_8bit_hi_phony : SubRegIndex<8, 8>;
def sub_16bit_hi : SubRegIndex<16, 16>;
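The reason is that, for the register definitions starting at line 83 below, the SubRegs declarations are incomplete. Taking SP on line 87 as an example, v7.0 now declares it as: def SP : X86Reg<"sp", 4, [SPL,SPH]>;
SPH is defined as follows (v7.0 adds several similar definitions, not all listed here):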
let isArtificial = 1 in {
…
def SPH : X86Reg<"", -1>;
…
}
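Even so, the description here is incomplete. Consider the chain R8 → R8D → R8W → R8B: R8B is directly addressable within R8, yet no index for it is defined directly. Deriving such indices is TableGen's job, as shown later.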
46 // 8-bit registers
47 // Low registers
48 def AL : X86Reg<"al", 0>;
49 def DL : X86Reg<"dl", 2>;
50 def CL : X86Reg<"cl", 1>;
51 def BL : X86Reg<"bl", 3>;
52
53 // High registers. On x86-64, these cannot be used in any instruction
54 // with a REX prefix.
55 def AH : X86Reg<"ah", 4>;
56 def DH : X86Reg<"dh", 6>;
57 def CH : X86Reg<"ch", 5>;
58 def BH : X86Reg<"bh", 7>;
59
60 // X86-64 only, requires REX.
61 let CostPerUse = 1 in {
62 def SIL : X86Reg<"sil", 6>;
63 def DIL : X86Reg<"dil", 7>;
64 def BPL : X86Reg<"bpl", 5>;
65 def SPL : X86Reg<"spl", 4>;
66 def R8B : X86Reg<"r8b", 8>;
67 def R9B : X86Reg<"r9b", 9>;
68 def R10B : X86Reg<"r10b", 10>;
69 def R11B : X86Reg<"r11b", 11>;
70 def R12B : X86Reg<"r12b", 12>;
71 def R13B : X86Reg<"r13b", 13>;
72 def R14B : X86Reg<"r14b", 14>;
73 def R15B : X86Reg<"r15b", 15>;
74 }
75
76 // 16-bit registers
77 let SubRegIndices = [sub_8bit, sub_8bit_hi], CoveredBySubRegs = 1 in {
78 def AX : X86Reg<"ax", 0, [AL,AH]>;
79 def DX : X86Reg<"dx", 2, [DL,DH]>;
80 def CX : X86Reg<"cx", 1, [CL,CH]>;
81 def BX : X86Reg<"bx", 3, [BL,BH]>;
82 }
83 let SubRegIndices = [sub_8bit] in {
84 def SI : X86Reg<"si", 6, [SIL]>;
85 def DI : X86Reg<"di", 7, [DIL]>;
86 def BP : X86Reg<"bp", 5, [BPL]>;
87 def SP : X86Reg<"sp", 4, [SPL]>;
88 }
89 def IP : X86Reg<"ip", 0>;
90
91 // X86-64 only, requires REX.
92 let SubRegIndices = [sub_8bit], CostPerUse = 1 in {
93 def R8W : X86Reg<"r8w", 8, [R8B]>;
94 def R9W : X86Reg<"r9w", 9, [R9B]>;
95 def R10W : X86Reg<"r10w", 10, [R10B]>;
96 def R11W : X86Reg<"r11w", 11, [R11B]>;
97 def R12W : X86Reg<"r12w", 12, [R12B]>;
98 def R13W : X86Reg<"r13w", 13, [R13B]>;
99 def R14W : X86Reg<"r14w", 14, [R14B]>;
100 def R15W : X86Reg<"r15w", 15, [R15B]>;
101 }
102
103 // 32-bit registers
104 let SubRegIndices = [sub_16bit] in {
105 def EAX : X86Reg<"eax", 0, [AX]>, DwarfRegNum<[-2, 0, 0]>;
106 def EDX : X86Reg<"edx", 2, [DX]>, DwarfRegNum<[-2, 2, 2]>;
107 def ECX : X86Reg<"ecx", 1, [CX]>, DwarfRegNum<[-2, 1, 1]>;
108 def EBX : X86Reg<"ebx", 3, [BX]>, DwarfRegNum<[-2, 3, 3]>;
109 def ESI : X86Reg<"esi", 6, [SI]>, DwarfRegNum<[-2, 6, 6]>;
110 def EDI : X86Reg<"edi", 7, [DI]>, DwarfRegNum<[-2, 7, 7]>;
111 def EBP : X86Reg<"ebp", 5, [BP]>, DwarfRegNum<[-2, 4, 5]>;
112 def ESP : X86Reg<"esp", 4, [SP]>, DwarfRegNum<[-2, 5, 4]>;
113 def EIP : X86Reg<"eip", 0, [IP]>, DwarfRegNum<[-2, 8, 8]>;
114
115 // X86-64 only, requires REX
116 let CostPerUse = 1 in {
117 def R8D : X86Reg<"r8d", 8, [R8W]>;
118 def R9D : X86Reg<"r9d", 9, [R9W]>;
119 def R10D : X86Reg<"r10d", 10, [R10W]>;
120 def R11D : X86Reg<"r11d", 11, [R11W]>;
121 def R12D : X86Reg<"r12d", 12, [R12W]>;
122 def R13D : X86Reg<"r13d", 13, [R13W]>;
123 def R14D : X86Reg<"r14d", 14, [R14W]>;
124 def R15D : X86Reg<"r15d", 15, [R15W]>;
125 }}
126
127 // 64-bit registers, X86-64 only
128 let SubRegIndices = [sub_32bit] in {
129 def RAX : X86Reg<"rax", 0, [EAX]>, DwarfRegNum<[0, -2, -2]>;
130 def RDX : X86Reg<"rdx", 2, [EDX]>, DwarfRegNum<[1, -2, -2]>;
131 def RCX : X86Reg<"rcx", 1, [ECX]>, DwarfRegNum<[2, -2, -2]>;
132 def RBX : X86Reg<"rbx", 3, [EBX]>, DwarfRegNum<[3, -2, -2]>;
133 def RSI : X86Reg<"rsi", 6, [ESI]>, DwarfRegNum<[4, -2, -2]>;
134 def RDI : X86Reg<"rdi", 7, [EDI]>, DwarfRegNum<[5, -2, -2]>;
135 def RBP : X86Reg<"rbp", 5, [EBP]>, DwarfRegNum<[6, -2, -2]>;
136 def RSP : X86Reg<"rsp", 4, [ESP]>, DwarfRegNum<[7, -2, -2]>;
137
138 // These also require REX.
139 let CostPerUse = 1 in {
140 def R8 : X86Reg<"r8", 8, [R8D]>, DwarfRegNum<[ 8, -2, -2]>;
141 def R9 : X86Reg<"r9", 9, [R9D]>, DwarfRegNum<[ 9, -2, -2]>;
142 def R10 : X86Reg<"r10", 10, [R10D]>, DwarfRegNum<[10, -2, -2]>;
143 def R11 : X86Reg<"r11", 11, [R11D]>, DwarfRegNum<[11, -2, -2]>;
144 def R12 : X86Reg<"r12", 12, [R12D]>, DwarfRegNum<[12, -2, -2]>;
145 def R13 : X86Reg<"r13", 13, [R13D]>, DwarfRegNum<[13, -2, -2]>;
146 def R14 : X86Reg<"r14", 14, [R14D]>, DwarfRegNum<[14, -2, -2]>;
147 def R15 : X86Reg<"r15", 15, [R15D]>, DwarfRegNum<[15, -2, -2]>;
148 def RIP : X86Reg<"rip", 0, [EIP]>, DwarfRegNum<[16, -2, -2]>;
149 }}
150
151 // MMX Registers. These are actually aliased to ST0 .. ST7
152 def MM0 : X86Reg<"mm0", 0>, DwarfRegNum<[41, 29, 29]>;
153 def MM1 : X86Reg<"mm1", 1>, DwarfRegNum<[42, 30, 30]>;
154 def MM2 : X86Reg<"mm2", 2>, DwarfRegNum<[43, 31, 31]>;
155 def MM3 : X86Reg<"mm3", 3>, DwarfRegNum<[44, 32, 32]>;
156 def MM4 : X86Reg<"mm4", 4>, DwarfRegNum<[45, 33, 33]>;
157 def MM5 : X86Reg<"mm5", 5>, DwarfRegNum<[46, 34, 34]>;
158 def MM6 : X86Reg<"mm6", 6>, DwarfRegNum<[47, 35, 35]>;
159 def MM7 : X86Reg<"mm7", 7>, DwarfRegNum<[48, 36, 36]>;
160
161 // Pseudo Floating Point registers
162 def FP0 : X86Reg<"fp0", 0>;
163 def FP1 : X86Reg<"fp1", 0>;
164 def FP2 : X86Reg<"fp2", 0>;
165 def FP3 : X86Reg<"fp3", 0>;
166 def FP4 : X86Reg<"fp4", 0>;
167 def FP5 : X86Reg<"fp5", 0>;
168 def FP6 : X86Reg<"fp6", 0>;
169 def FP7 : X86Reg<"fp7", 0>;
170
171 // XMM Registers, used by the various SSE instruction set extensions.
172 def XMM0: X86Reg<"xmm0", 0>, DwarfRegNum<[17, 21, 21]>;
173 def XMM1: X86Reg<"xmm1", 1>, DwarfRegNum<[18, 22, 22]>;
174 def XMM2: X86Reg<"xmm2", 2>, DwarfRegNum<[19, 23, 23]>;
175 def XMM3: X86Reg<"xmm3", 3>, DwarfRegNum<[20, 24, 24]>;
176 def XMM4: X86Reg<"xmm4", 4>, DwarfRegNum<[21, 25, 25]>;
177 def XMM5: X86Reg<"xmm5", 5>, DwarfRegNum<[22, 26, 26]>;
178 def XMM6: X86Reg<"xmm6", 6>, DwarfRegNum<[23, 27, 27]>;
179 def XMM7: X86Reg<"xmm7", 7>, DwarfRegNum<[24, 28, 28]>;
180
181 // X86-64 only
182 let CostPerUse = 1 in {
183 def XMM8: X86Reg<"xmm8", 8>, DwarfRegNum<[25, -2, -2]>;
184 def XMM9: X86Reg<"xmm9", 9>, DwarfRegNum<[26, -2, -2]>;
185 def XMM10: X86Reg<"xmm10", 10>, DwarfRegNum<[27, -2, -2]>;
186 def XMM11: X86Reg<"xmm11", 11>, DwarfRegNum<[28, -2, -2]>;
187 def XMM12: X86Reg<"xmm12", 12>, DwarfRegNum<[29, -2, -2]>;
188 def XMM13: X86Reg<"xmm13", 13>, DwarfRegNum<[30, -2, -2]>;
189 def XMM14: X86Reg<"xmm14", 14>, DwarfRegNum<[31, -2, -2]>;
190 def XMM15: X86Reg<"xmm15", 15>, DwarfRegNum<[32, -2, -2]>;
191
192 def XMM16: X86Reg<"xmm16", 16>, DwarfRegNum<[60, -2, -2]>;
193 def XMM17: X86Reg<"xmm17", 17>, DwarfRegNum<[61, -2, -2]>;
194 def XMM18: X86Reg<"xmm18", 18>, DwarfRegNum<[62, -2, -2]>;
195 def XMM19: X86Reg<"xmm19", 19>, DwarfRegNum<[63, -2, -2]>;
196 def XMM20: X86Reg<"xmm20", 20>, DwarfRegNum<[64, -2, -2]>;
197 def XMM21: X86Reg<"xmm21", 21>, DwarfRegNum<[65, -2, -2]>;
198 def XMM22: X86Reg<"xmm22", 22>, DwarfRegNum<[66, -2, -2]>;
199 def XMM23: X86Reg<"xmm23", 23>, DwarfRegNum<[67, -2, -2]>;
200 def XMM24: X86Reg<"xmm24", 24>, DwarfRegNum<[68, -2, -2]>;
201 def XMM25: X86Reg<"xmm25", 25>, DwarfRegNum<[69, -2, -2]>;
202 def XMM26: X86Reg<"xmm26", 26>, DwarfRegNum<[70, -2, -2]>;
203 def XMM27: X86Reg<"xmm27", 27>, DwarfRegNum<[71, -2, -2]>;
204 def XMM28: X86Reg<"xmm28", 28>, DwarfRegNum<[72, -2, -2]>;
205 def XMM29: X86Reg<"xmm29", 29>, DwarfRegNum<[73, -2, -2]>;
206 def XMM30: X86Reg<"xmm30", 30>, DwarfRegNum<[74, -2, -2]>;
207 def XMM31: X86Reg<"xmm31", 31>, DwarfRegNum<[75, -2, -2]>;
208
209 } // CostPerUse
210
211 // YMM0-15 registers, used by AVX instructions and
212 // YMM16-31 registers, used by AVX-512 instructions.
213 let SubRegIndices = [sub_xmm] in {
214 foreach Index = 0-31 in {
215 def YMM#Index : X86Reg<"ymm"#Index, Index, [!cast<X86Reg>("XMM"#Index)]>,
216 DwarfRegAlias<!cast<X86Reg>("XMM"#Index)>;
217 }
218 }
219
220 // ZMM Registers, used by AVX-512 instructions.
221 let SubRegIndices = [sub_ymm] in {
222 foreach Index = 0-31 in {
223 def ZMM#Index : X86Reg<"zmm"#Index, Index, [!cast<X86Reg>("YMM"#Index)]>,
224 DwarfRegAlias<!cast<X86Reg>("XMM"#Index)>;
225 }
226 }
227
228 // Mask Registers, used by AVX-512 instructions.
229 def K0 : X86Reg<"k0", 0>, DwarfRegNum<[118, -2, -2]>;
230 def K1 : X86Reg<"k1", 1>, DwarfRegNum<[119, -2, -2]>;
231 def K2 : X86Reg<"k2", 2>, DwarfRegNum<[120, -2, -2]>;
232 def K3 : X86Reg<"k3", 3>, DwarfRegNum<[121, -2, -2]>;
233 def K4 : X86Reg<"k4", 4>, DwarfRegNum<[122, -2, -2]>;
234 def K5 : X86Reg<"k5", 5>, DwarfRegNum<[123, -2, -2]>;
235 def K6 : X86Reg<"k6", 6>, DwarfRegNum<[124, -2, -2]>;
236 def K7 : X86Reg<"k7", 7>, DwarfRegNum<[125, -2, -2]>;
237
238 // Floating point stack registers. These don't map one-to-one to the FP
239 // pseudo registers, but we still mark them as aliasing FP registers. That
240 // way both kinds can be live without exceeding the stack depth. ST registers
241 // are only live around inline assembly.
242 def ST0 : X86Reg<"st(0)", 0>, DwarfRegNum<[33, 12, 11]>;
243 def ST1 : X86Reg<"st(1)", 1>, DwarfRegNum<[34, 13, 12]>;
244 def ST2 : X86Reg<"st(2)", 2>, DwarfRegNum<[35, 14, 13]>;
245 def ST3 : X86Reg<"st(3)", 3>, DwarfRegNum<[36, 15, 14]>;
246 def ST4 : X86Reg<"st(4)", 4>, DwarfRegNum<[37, 16, 15]>;
247 def ST5 : X86Reg<"st(5)", 5>, DwarfRegNum<[38, 17, 16]>;
248 def ST6 : X86Reg<"st(6)", 6>, DwarfRegNum<[39, 18, 17]>;
249 def ST7 : X86Reg<"st(7)", 7>, DwarfRegNum<[40, 19, 18]>;
250
251 // Floating-point status word
252 def FPSW : X86Reg<"fpsw", 0>;
253
254 // Status flags register
255 def EFLAGS : X86Reg<"flags", 0>;
256
257 // Segment registers
258 def CS : X86Reg<"cs", 1>;
259 def DS : X86Reg<"ds", 3>;
260 def SS : X86Reg<"ss", 2>;
261 def ES : X86Reg<"es", 0>;
262 def FS : X86Reg<"fs", 4>;
263 def GS : X86Reg<"gs", 5>;
264
265 // Debug registers
266 def DR0 : X86Reg<"dr0", 0>;
267 def DR1 : X86Reg<"dr1", 1>;
268 def DR2 : X86Reg<"dr2", 2>;
269 def DR3 : X86Reg<"dr3", 3>;
270 def DR4 : X86Reg<"dr4", 4>;
271 def DR5 : X86Reg<"dr5", 5>;
272 def DR6 : X86Reg<"dr6", 6>;
273 def DR7 : X86Reg<"dr7", 7>;
274 def DR8 : X86Reg<"dr8", 8>;
275 def DR9 : X86Reg<"dr9", 9>;
276 def DR10 : X86Reg<"dr10", 10>;
277 def DR11 : X86Reg<"dr11", 11>;
278 def DR12 : X86Reg<"dr12", 12>;
279 def DR13 : X86Reg<"dr13", 13>;
280 def DR14 : X86Reg<"dr14", 14>;
281 def DR15 : X86Reg<"dr15", 15>;
282
283 // Control registers
284 def CR0 : X86Reg<"cr0", 0>;
285 def CR1 : X86Reg<"cr1", 1>;
286 def CR2 : X86Reg<"cr2", 2>;
287 def CR3 : X86Reg<"cr3", 3>;
288 def CR4 : X86Reg<"cr4", 4>;
289 def CR5 : X86Reg<"cr5", 5>;
290 def CR6 : X86Reg<"cr6", 6>;
291 def CR7 : X86Reg<"cr7", 7>;
292 def CR8 : X86Reg<"cr8", 8>;
293 def CR9 : X86Reg<"cr9", 9>;
294 def CR10 : X86Reg<"cr10", 10>;
295 def CR11 : X86Reg<"cr11", 11>;
296 def CR12 : X86Reg<"cr12", 12>;
297 def CR13 : X86Reg<"cr13", 13>;
298 def CR14 : X86Reg<"cr14", 14>;
299 def CR15 : X86Reg<"cr15", 15>;
300
301 // Pseudo index registers
302 def EIZ : X86Reg<"eiz", 4>;
303 def RIZ : X86Reg<"riz", 4>;
304
305 // Bound registers, used in MPX instructions
306 def BND0 : X86Reg<"bnd0", 0>;
307 def BND1 : X86Reg<"bnd1", 1>;
308 def BND2 : X86Reg<"bnd2", 2>;
309 def BND3 : X86Reg<"bnd3", 3>;
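These definitions can be checked against Table 3-2 in the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1 (all of these are programmer-accessible registers). Intel's names R8L-R15L denote the registers that the listing above calls R8B-R15B.
Register type | Without REX prefix | With REX prefix |
Byte registers | AL, BL, CL, DL, AH, BH, CH, DH | AL, BL, CL, DL, DIL, SIL, BPL, SPL, R8L-R15L |
Word registers | AX, BX, CX, DX, DI, SI, BP, SP | AX, BX, CX, DX, DI, SI, BP, SP, R8W-R15W |
Doubleword registers | EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP | EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP, R8D-R15D |
Quadword registers | N.A. | RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, R8-R15 |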
// The direction flag.
def DF : X86Reg<"dirflag", 0>;
// CET registers - Shadow Stack Pointer
def SSP : X86Reg<"ssp", 0>;
2.2.2.1.3. RegisterClass
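Registers also have types; a floating-point value, for instance, cannot be stored in a general-purpose register such as EBX. This differs from an ordinary type system, though: MMX registers, for example, can hold either integer or floating-point values, so they can hold integer values when the general-purpose registers run out, whereas the reverse is not possible. To describe what registers are used for, LLVM therefore defines RegisterClass. Registers with the same purpose belong to the same RegisterClass, and registers within one RegisterClass are interchangeable.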
151 class RegisterClass<string namespace, list<ValueType> regTypes, int alignment,
152 dag regList, RegAltNameIndex idx = NoRegAltName>
153 : DAGOperand {
154 string Namespace = namespace;
155
156 // RegType - Specify the list ValueType of the registers in this register
157 // class. Note that all registers in a register class must have the same
158 // ValueTypes. This is a list because some targets permit storing different
159 // types in same register, for example vector values with 128-bit total size,
160 // but different count/size of items, like SSE on x86.
161 //
162 list<ValueType> RegTypes = regTypes;
163
164 // Size - Specify the spill size in bits of the registers. A default value of
165 // zero lets tablgen pick an appropriate size.
166 int Size = 0;
167
168 // Alignment - Specify the alignment required of the registers when they are
169 // stored or loaded to memory.
170 //
171 int Alignment = alignment;
172
173 // CopyCost - This value is used to specify the cost of copying a value
174 // between two registers in this register class. The default value is one
175 // meaning it takes a single instruction to perform the copying. A negative
176 // value means copying is extremely expensive or impossible.
177 int CopyCost = 1;
178
179 // MemberList - Specify which registers are in this class. If the
180 // allocation_order_* method are not specified, this also defines the order of
181 // allocation used by the register allocator.
182 //
183 dag MemberList = regList;
184
185 // AltNameIndex - The alternate register name to use when printing operands
186 // of this register class. Every register in the register class must have
187 // a valid alternate name for the given index.
188 RegAltNameIndex altNameIndex = idx;
189
190 // isAllocatable - Specify that the register class can be used for virtual
191 // registers and register allocation. Some register classes are only used to
192 // model instruction operand constraints, and should have isAllocatable = 0.
193 bit isAllocatable = 1;
194
195 // AltOrders - List of alternative allocation orders. The default order is
196 // MemberList itself, and that is good enough for most targets since the
197 // register allocators automatically remove reserved registers and move
198 // callee-saved registers to the end.
199 list<dag> AltOrders = [];
200
201 // AltOrderSelect - The body of a function that selects the allocation order
202 // to use in a given machine function. The code will be inserted in a
203 // function like this:
204 //
205 // static inline unsigned f(const MachineFunction &MF) { ... }
206 //
207 // The function should return 0 to select the default order defined by
208 // MemberList, 1 to select the first AltOrders entry and so on.
209 code AltOrderSelect = [{}];
210
211 // Specify allocation priority for register allocators using a greedy
212 // heuristic. Classes with higher priority values are assigned first. This is
213 // useful as it is sometimes beneficial to assign registers to highly
214 // constrained classes first. The value has to be in the range [0,63].
215 int AllocationPriority = 0;
216 }
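RegTypes on line 162 lists the types that registers of this class support; several types may be supported, hence the list. MemberList on line 183 names the registers in the class and, by default, the order in which they are allocated (earlier entries are used first). For some processor families such as X86, however, the kinds and number of registers vary widely between CPUs, so MemberList suits only some of them and the others need a different sequence; that is the role of AltOrders. Something must then decide which order to use (returning 0 selects the MemberList order, returning 1 selects the first AltOrders entry, and so on), so AltOrderSelect holds the code fragment that forms the body of that selection function.
V7.0 adds the following fields: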
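The selection between MemberList and AltOrders can be modelled with a few lines of Python. This is a sketch only: the register names mirror the GR8_NOREX class from the X86 listing in this section, the Python function alt_order_select stands in for the C++ body stored in AltOrderSelect, and allocation_order is a hypothetical helper rather than an LLVM API.

```python
# Default order (MemberList) and one alternative order (AltOrders[0]),
# mirroring GR8_NOREX: the alternative drops the high-byte registers.
MEMBER_LIST = ["AL", "CL", "DL", "AH", "CH", "DH", "BL", "BH"]
HIGH_BYTE = {"AH", "BH", "CH", "DH"}
ALT_ORDERS = [[reg for reg in MEMBER_LIST if reg not in HIGH_BYTE]]


def alt_order_select(is_64bit: bool) -> int:
    """Stand-in for the AltOrderSelect body: 0 = MemberList, 1 = AltOrders[0]."""
    return int(is_64bit)


def allocation_order(is_64bit: bool) -> list:
    """Pick the allocation order the way the selector index dictates."""
    orders = [MEMBER_LIST] + ALT_ORDERS
    return orders[alt_order_select(is_64bit)]


print(allocation_order(False))
print(allocation_order(True))
```

Index 0 always means the default order, so a class without AltOrders never needs a selector.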
// The register size/alignment information, parameterized by a HW mode.
RegInfoByHwMode RegInfos;
string DiagnosticType = "";
string DiagnosticString = "";
RegInfoByHwMode is defined as follows:
class RegInfoByHwMode<list<HwMode> Ms = [], list<RegInfo> Ts = []>
: HwModeSelect<Ms> {
// The length of this list must be the same as the length of Ms.
list<RegInfo> Objects = Ts;
}
class HwModeSelect<list<HwMode> Ms> {
list<HwMode> Modes = Ms;
}
class HwMode<string FS> {
// A string representing subtarget features that turn on this HW mode.
// For example, "+feat1,-feat2" will indicate that the mode is active
// when "feat1" is enabled and "feat2" is disabled at the same time.
// Any other features are not checked.
// When multiple modes are used, they should be mutually exclusive,
// otherwise the results are unpredictable.
string Features = FS;
}
class RegInfo<int RS, int SS, int SA> {
int RegSize = RS; // Register size in bits.
int SpillSize = SS; // Spill slot size in bits.
int SpillAlignment = SA; // Spill slot alignment in bits.
}
RegInfoByHwMode specifies the register details that apply under each hardware mode (register size, spill size, and spill alignment).
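As a rough illustration of the idea, the following Python sketch picks a RegInfo according to which subtarget features are enabled, following the "+feat1,-feat2" convention described in the HwMode comment. It is an assumption-laden model, not LLVM's mechanism: the feature name feat64, the sizes, and the explicit default argument are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegInfo:
    reg_size: int         # register size in bits
    spill_size: int       # spill slot size in bits
    spill_alignment: int  # spill slot alignment in bits


def mode_is_active(features: str, enabled: set) -> bool:
    """Check a "+feat1,-feat2" string against the set of enabled features."""
    for item in features.split(","):
        item = item.strip()
        if not item:
            continue
        sign, name = item[0], item[1:]
        if sign == "+" and name not in enabled:
            return False
        if sign == "-" and name in enabled:
            return False
    return True


def select_reg_info(modes, objects, enabled, default):
    """Return the RegInfo of the first active mode, else the default."""
    assert len(modes) == len(objects)  # Objects must match Modes in length
    for features, info in zip(modes, objects):
        if mode_is_active(features, enabled):
            return info
    return default


# Hypothetical modes: a 64-bit mode and a 32-bit fallback.
MODES = ["+feat64"]
OBJECTS = [RegInfo(64, 64, 64)]
DEFAULT = RegInfo(32, 32, 32)

print(select_reg_info(MODES, OBJECTS, {"feat64"}, DEFAULT))
```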
The X86 target defines these RegisterClass instances:
def GR8 : RegisterClass<"X86", [i8], 8,
    (add AL, CL, DL, AH, CH, DH, BL, BH, SIL, DIL, BPL, SPL,
     R8B, R9B, R10B, R11B, R14B, R15B, R12B, R13B)> {
  let AltOrders = [(sub GR8, AH, BH, CH, DH)];
  let AltOrderSelect = [{
    return MF.getSubtarget<X86Subtarget>().is64Bit();
  }];
}

def GR16 : RegisterClass<"X86", [i16], 16,
    (add AX, CX, DX, SI, DI, BX, BP, SP,
     R8W, R9W, R10W, R11W, R14W, R15W, R12W, R13W)>;

def GR32 : RegisterClass<"X86", [i32], 32,
    (add EAX, ECX, EDX, ESI, EDI, EBX, EBP, ESP,
     R8D, R9D, R10D, R11D, R14D, R15D, R12D, R13D)>;

// GR64 - 64-bit GPRs. This oddly includes RIP, which isn't accurate, since
// RIP isn't really a register and it can't be used anywhere except in an
// address, but it doesn't cause trouble.
def GR64 : RegisterClass<"X86", [i64], 64,
    (add RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11,
     RBX, R14, R15, R12, R13, RBP, RSP, RIP)>;

// Segment registers for use by MOV instructions (and others) that have a
// segment register as one operand. Always contain a 16-bit segment
// descriptor.
def SEGMENT_REG : RegisterClass<"X86", [i16], 16, (add CS, DS, SS, ES, FS, GS)>;

// Debug registers.
def DEBUG_REG : RegisterClass<"X86", [i32], 32, (sequence "DR%u", 0, 7)>;

// Control registers.
def CONTROL_REG : RegisterClass<"X86", [i64], 64, (sequence "CR%u", 0, 15)>;

// GR8_ABCD_L, GR8_ABCD_H, GR16_ABCD, GR32_ABCD, GR64_ABCD - Subclasses of
// GR8, GR16, GR32, and GR64 which contain just the "a" "b", "c", and "d"
// registers. On x86-32, GR16_ABCD and GR32_ABCD are classes for registers
// that support 8-bit subreg operations. On x86-64, GR16_ABCD, GR32_ABCD,
// and GR64_ABCD are classes for registers that support 8-bit h-register
// operations.
def GR8_ABCD_L : RegisterClass<"X86", [i8], 8, (add AL, CL, DL, BL)>;
def GR8_ABCD_H : RegisterClass<"X86", [i8], 8, (add AH, CH, DH, BH)>;
def GR16_ABCD : RegisterClass<"X86", [i16], 16, (add AX, CX, DX, BX)>;
def GR32_ABCD : RegisterClass<"X86", [i32], 32, (add EAX, ECX, EDX, EBX)>;
def GR64_ABCD : RegisterClass<"X86", [i64], 64, (add RAX, RCX, RDX, RBX)>;
def GR32_TC : RegisterClass<"X86", [i32], 32, (add EAX, ECX, EDX)>;
def GR64_TC : RegisterClass<"X86", [i64], 64, (add RAX, RCX, RDX, RSI, RDI,
    R8, R9, R11, RIP)>;
def GR64_TCW64 : RegisterClass<"X86", [i64], 64, (add RAX, RCX, RDX,
    R8, R9, R11)>;

// GR8_NOREX - GR8 registers which do not require a REX prefix.
def GR8_NOREX : RegisterClass<"X86", [i8], 8,
    (add AL, CL, DL, AH, CH, DH, BL, BH)> {
  let AltOrders = [(sub GR8_NOREX, AH, BH, CH, DH)];
  let AltOrderSelect = [{
    return MF.getSubtarget<X86Subtarget>().is64Bit();
  }];
}
// GR16_NOREX - GR16 registers which do not require a REX prefix.
def GR16_NOREX : RegisterClass<"X86", [i16], 16,
    (add AX, CX, DX, SI, DI, BX, BP, SP)>;
// GR32_NOREX - GR32 registers which do not require a REX prefix.
def GR32_NOREX : RegisterClass<"X86", [i32], 32,
    (add EAX, ECX, EDX, ESI, EDI, EBX, EBP, ESP)>;
// GR64_NOREX - GR64 registers which do not require a REX prefix.
def GR64_NOREX : RegisterClass<"X86", [i64], 64,
    (add RAX, RCX, RDX, RSI, RDI, RBX, RBP, RSP, RIP)>;

// GR32_NOAX - GR32 registers except EAX. Used by AddRegFrm of XCHG32 in 64-bit
// mode to prevent encoding using the 0x90 NOP encoding. xchg %eax, %eax needs
// to clear upper 32-bits of RAX so is not a NOP.
def GR32_NOAX : RegisterClass<"X86", [i32], 32, (sub GR32, EAX)>;

// GR32_NOSP - GR32 registers except ESP.
def GR32_NOSP : RegisterClass<"X86", [i32], 32, (sub GR32, ESP)>;

// GR64_NOSP - GR64 registers except RSP (and RIP).
def GR64_NOSP : RegisterClass<"X86", [i64], 64, (sub GR64, RSP, RIP)>;

// GR32_NOREX_NOSP - GR32 registers which do not require a REX prefix except
// ESP.
def GR32_NOREX_NOSP : RegisterClass<"X86", [i32], 32,
    (and GR32_NOREX, GR32_NOSP)>;

// GR64_NOREX_NOSP - GR64_NOREX registers except RSP.
def GR64_NOREX_NOSP : RegisterClass<"X86", [i64], 64,
    (and GR64_NOREX, GR64_NOSP)>;

// A class to support the 'A' assembler constraint: EAX then EDX.
def GR32_AD : RegisterClass<"X86", [i32], 32, (add EAX, EDX)>;

// Scalar SSE2 floating point registers.
def FR32 : RegisterClass<"X86", [f32], 32, (sequence "XMM%u", 0, 15)>;

def FR64 : RegisterClass<"X86", [f64], 64, (add FR32)>;


// FIXME: This sets up the floating point register files as though they are f64
// values, though they really are f80 values. This will cause us to spill
// values as 64-bit quantities instead of 80-bit quantities, which is much much
// faster on common hardware. In reality, this should be controlled by a
// command line option or something.

def RFP32 : RegisterClass<"X86",[f32], 32, (sequence "FP%u", 0, 6)>;
def RFP64 : RegisterClass<"X86",[f64], 32, (add RFP32)>;
def RFP80 : RegisterClass<"X86",[f80], 32, (add RFP32)>;

// Floating point stack registers (these are not allocatable by the
// register allocator - the floating point stackifier is responsible
// for transforming FPn allocations to STn registers)
def RST : RegisterClass<"X86", [f80, f64, f32], 32, (sequence "ST%u", 0, 7)> {
  let isAllocatable = 0;
}

// Generic vector registers: VR64 and VR128.
def VR64: RegisterClass<"X86", [x86mmx], 64, (sequence "MM%u", 0, 7)>;
def VR128 : RegisterClass<"X86", [v16i8, v8i16, v4i32, v2i64, v4f32, v2f64],
    128, (add FR32)>;
def VR256 : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
    256, (sequence "YMM%u", 0, 15)>;

// Status flags registers.
def CCR : RegisterClass<"X86", [i32], 32, (add EFLAGS)> {
  let CopyCost = -1; // Don't allow copying of status registers.
  let isAllocatable = 0;
}
def FPCCR : RegisterClass<"X86", [i16], 16, (add FPSW)> {
  let CopyCost = -1; // Don't allow copying of status registers.
  let isAllocatable = 0;
}

// AVX-512 vector/mask registers.
def VR512 : RegisterClass<"X86", [v16f32, v8f64, v64i8, v32i16, v16i32, v8i64], 512,
    (sequence "ZMM%u", 0, 31)>;

// Scalar AVX-512 floating point registers.
def FR32X : RegisterClass<"X86", [f32], 32, (sequence "XMM%u", 0, 31)>;

def FR64X : RegisterClass<"X86", [f64], 64, (add FR32X)>;

// Extended VR128 and VR256 for AVX-512 instructions
def VR128X : RegisterClass<"X86", [v16i8, v8i16, v4i32, v2i64, v4f32, v2f64],
    128, (add FR32X)>;
def VR256X : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
    256, (sequence "YMM%u", 0, 31)>;

// Mask registers
def VK1 : RegisterClass<"X86", [i1], 8, (sequence "K%u", 0, 7)> {let Size = 8;}
def VK2 : RegisterClass<"X86", [v2i1], 8, (add VK1)> {let Size = 8;}
def VK4 : RegisterClass<"X86", [v4i1], 8, (add VK2)> {let Size = 8;}
def VK8 : RegisterClass<"X86", [v8i1], 8, (add VK4)> {let Size = 8;}
def VK16 : RegisterClass<"X86", [v16i1], 16, (add VK8)> {let Size = 16;}
def VK32 : RegisterClass<"X86", [v32i1], 32, (add VK16)> {let Size = 32;}
def VK64 : RegisterClass<"X86", [v64i1], 64, (add VK32)> {let Size = 64;}

def VK1WM : RegisterClass<"X86", [i1], 8, (sub VK1, K0)> {let Size = 8;}
def VK2WM : RegisterClass<"X86", [v2i1], 8, (sub VK2, K0)> {let Size = 8;}
def VK4WM : RegisterClass<"X86", [v4i1], 8, (sub VK4, K0)> {let Size = 8;}
def VK8WM : RegisterClass<"X86", [v8i1], 8, (sub VK8, K0)> {let Size = 8;}
def VK16WM : RegisterClass<"X86", [v16i1], 16, (add VK8WM)> {let Size = 16;}
def VK32WM : RegisterClass<"X86", [v32i1], 32, (add VK16WM)> {let Size = 32;}
def VK64WM : RegisterClass<"X86", [v64i1], 64, (add VK32WM)> {let Size = 64;}

// Bound registers
def BNDR : RegisterClass<"X86", [v2i64], 128, (sequence "BND%u", 0, 3)>;
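The TD language provides operators for simple set operations. In the listing above, add adds the given members to the current dag, and sub removes the given members from the given dag set. sequence conveniently generates a series of members and adds them to the current dag; for example, sequence "BND%u", 0, 3 in the last line creates a set containing BND0, BND1, BND2, and BND3. TableGen carries out these operations when it processes the operators.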
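The effect of these operators can be mimicked with ordered Python lists. The sketch below is a model for illustration, not TableGen's code; the function names add, sub, and_, and sequence are chosen to echo the dag operators, and the register lists are copied from the GR32 family in the listing above.

```python
def add(*parts):
    """Append registers (or whole lists of registers) in order, skipping duplicates."""
    result = []
    for part in parts:
        members = part if isinstance(part, list) else [part]
        for reg in members:
            if reg not in result:
                result.append(reg)
    return result


def sub(base, *removed):
    """Remove the given registers from base, keeping the remaining order."""
    drop = set(add(*removed))
    return [reg for reg in base if reg not in drop]


def and_(left, right):
    """Intersection, in the order of the left operand."""
    return [reg for reg in left if reg in right]


def sequence(fmt, first, last):
    """Expand a printf-style name pattern over an inclusive index range."""
    return [fmt % i for i in range(first, last + 1)]


GR32_NOREX = add("EAX", "ECX", "EDX", "ESI", "EDI", "EBX", "EBP", "ESP")
GR32 = add(GR32_NOREX, "R8D", "R9D", "R10D", "R11D", "R14D", "R15D", "R12D", "R13D")
GR32_NOSP = sub(GR32, "ESP")
GR32_NOREX_NOSP = and_(GR32_NOREX, GR32_NOSP)
BNDR = sequence("BND%u", 0, 3)

print(GR32_NOREX_NOSP)
print(BNDR)
```

Lists are used instead of Python sets because member order matters: it is the default allocation order.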
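Specifically, in 64-bit mode both GR8 and GR8_NOREX must exclude AH, BH, CH, and DH. The reason is that these registers cannot be encoded in an instruction that requires a REX prefix, while SIL, DIL, BPL, R8D, and so on do require a REX prefix. For example, addb %ah, %dil and movzbl %ah, %r8d cannot be encoded.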
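The constraint can be expressed as a small check. This Python sketch only looks at register operands and is deliberately simplified: other reasons an instruction may need a REX prefix (such as a 64-bit operand size) are ignored, and the helper name can_encode is hypothetical.

```python
# High-byte registers cannot appear in an instruction that carries a REX prefix.
HIGH_BYTE = {"AH", "BH", "CH", "DH"}

# Registers whose use forces a REX prefix: the new low-byte registers and R8-R15
# in all of their widths (R8B, R8W, R8D, R8, ...).
NEEDS_REX = {"SIL", "DIL", "BPL", "SPL"} | {
    f"R{i}{suffix}" for i in range(8, 16) for suffix in ("B", "W", "D", "")
}


def can_encode(operands) -> bool:
    """Return False when a high-byte register is mixed with a REX-only register."""
    ops = {op.upper() for op in operands}
    return not (ops & HIGH_BYTE and ops & NEEDS_REX)


print(can_encode(["ah", "dil"]))  # addb %ah, %dil
print(can_encode(["ah", "bl"]))   # addb %ah, %bl
```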
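Likewise, v7.0 adds register classes for SPH and the like, not all listed here.
2.2.2.1.4. The ARM Example
The X86 register description is not particularly complex. ARM is at the other extreme. The ARM architecture has 16 uniform 32-bit registers; in addition, its VFP and NEON features have an extra 16 × 64-bit registers. VFP and NEON can view these registers at different sizes (32, 64, 128, or 256 bits). Because the relationships are this complex, ARM does not spell them out piece by piece with SubRegIndex as X86 does; it relies instead on TableGen's more automated facilities.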
let Namespace = "ARM" in {
  def qqsub_0 : SubRegIndex<256>;
  def qqsub_1 : SubRegIndex<256, 256>;

  // Note: Code depends on these having consecutive numbers.
  def qsub_0 : SubRegIndex<128>;
  def qsub_1 : SubRegIndex<128, 128>;
  def qsub_2 : ComposedSubRegIndex<qqsub_1, qsub_0>; // offset 256, size 128
  def qsub_3 : ComposedSubRegIndex<qqsub_1, qsub_1>; // offset 384, size 128

  def dsub_0 : SubRegIndex<64>;
  def dsub_1 : SubRegIndex<64, 64>;
  def dsub_2 : ComposedSubRegIndex<qsub_1, dsub_0>; // offset 128, size 64
  def dsub_3 : ComposedSubRegIndex<qsub_1, dsub_1>; // offset 192, size 64
  def dsub_4 : ComposedSubRegIndex<qsub_2, dsub_0>; // offset 256, size 64
  def dsub_5 : ComposedSubRegIndex<qsub_2, dsub_1>; // offset 320, size 64
  def dsub_6 : ComposedSubRegIndex<qsub_3, dsub_0>; // offset 384, size 64
  def dsub_7 : ComposedSubRegIndex<qsub_3, dsub_1>; // offset 448, size 64

  def ssub_0 : SubRegIndex<32>;
  def ssub_1 : SubRegIndex<32, 32>;
  def ssub_2 : ComposedSubRegIndex<dsub_1, ssub_0>; // offset 64, size 32
  def ssub_3 : ComposedSubRegIndex<dsub_1, ssub_1>; // offset 96, size 32

  def gsub_0 : SubRegIndex<32>;
  def gsub_1 : SubRegIndex<32, 32>;
  // Let TableGen synthesize the remaining 12 ssub_* indices.
  // We don't need to name them.
}
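In the TD description, ARM's 16 Q registers (128 bits each, matching qsub_0 above) can be combined into tuples of up to 512 bits, whose two 256-bit halves are addressed by qqsub_0 and qqsub_1. The same storage can also be used as D registers (64 bits each, matching dsub_0), and different ARM versions see either 16 or 32 D registers. This is why the SubRegIndex definitions shown above are needed.
The ComposedSubRegIndex derived class used above describes the composition relationship between SubRegIndex instances.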
class ComposedSubRegIndex<SubRegIndex A, SubRegIndex B>
  : SubRegIndex<B.Size, !if(!eq(A.Offset, -1), -1,
                        !if(!eq(B.Offset, -1), -1,
                            !add(A.Offset, B.Offset)))> {
  // See SubRegIndex.
  let ComposedOf = [A, B];
}
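The size of the composed SubRegIndex equals the size of B, and its offset is the sum of the offsets of A and B (provided neither offset is -1; otherwise it is -1). The comments in the ARM code fragment above give the register slice that each of these SubRegIndex instances describes. These slices need corresponding Register definitions, so TableGen also has a helper class, RegisterTuples, that describes such registers in a more automated way.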
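The size and offset rule can be checked against the ARM indices with a short Python model. It is a sketch that mirrors the !if/!add expression of ComposedSubRegIndex; the Python names are illustrative, and only a subset of the ARM indices is reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubRegIndex:
    size: int        # size in bits
    offset: int = 0  # bit offset, or -1 when it is not fixed


def composed(a: SubRegIndex, b: SubRegIndex) -> SubRegIndex:
    """Size comes from B; offset is A.offset + B.offset unless either is -1."""
    if a.offset == -1 or b.offset == -1:
        offset = -1
    else:
        offset = a.offset + b.offset
    return SubRegIndex(b.size, offset)


qqsub_1 = SubRegIndex(256, 256)
qsub_0, qsub_1 = SubRegIndex(128), SubRegIndex(128, 128)
qsub_2 = composed(qqsub_1, qsub_0)
qsub_3 = composed(qqsub_1, qsub_1)
dsub_0, dsub_1 = SubRegIndex(64), SubRegIndex(64, 64)
dsub_7 = composed(qsub_3, dsub_1)
ssub_3 = composed(dsub_1, SubRegIndex(32, 32))

for name, idx in [("qsub_2", qsub_2), ("dsub_7", dsub_7), ("ssub_3", ssub_3)]:
    print(name, "offset", idx.offset, "size", idx.size)
```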
class RegisterTuples<list<SubRegIndex> Indices, list<dag> Regs> {
  // SubRegs - N lists of registers to be zipped up. Super-registers are
  // synthesized from the first element of each SubRegs list, the second
  // element and so on.
  list<dag> SubRegs = Regs;

  // SubRegIndices - N SubRegIndex instances. This provides the names of the
  // sub-registers in the synthesized super-registers.
  list<SubRegIndex> SubRegIndices = Indices;
}
The effect of RegisterTuples can be illustrated with the following code.
The definition def EvenOdd : RegisterTuples<[sube, subo], [(add R0, R2), (add R1, R3)]>; produces definitions equivalent to the code below:
let SubRegIndices = [sube, subo] in {
def R0_R1 : RegisterWithSubRegs<"", [R0, R1]>;
def R2_R3 : RegisterWithSubRegs<"", [R2, R3]>;
}
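The "zipping" performed by RegisterTuples can be modelled in Python as follows. This is an illustrative sketch, not TableGen's implementation; register_tuples is a hypothetical helper, and the generated names simply join the member names with underscores, as in R0_R1 above.

```python
def register_tuples(indices, reg_lists):
    """Zip N register lists into super-registers with N named sub-registers each."""
    assert len(indices) == len(reg_lists)
    tuples = []
    for members in zip(*reg_lists):  # first of each list, second of each list, ...
        name = "_".join(members)
        tuples.append((name, dict(zip(indices, members))))
    return tuples


even_odd = register_tuples(["sube", "subo"], [["R0", "R2"], ["R1", "R3"]])
for name, subregs in even_odd:
    print(name, subregs)
```

Each result pairs a synthesized super-register name with a mapping from sub-register index to member register, which is the information the equivalent let/def form spells out by hand.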
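ARM's own definitions are more complex than this example. For instance, ARM describes the registers of the D class (double-precision floating-point or general-purpose 64-bit vector registers) like this: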
def DPR : RegisterClass<"ARM", [f64, v8i8, v4i16, v2i32, v1i64, v2f32], 64,
    (sequence "D%u", 0, 31)> {
  // Allocate non-VFP2 registers D16-D31 first.
  let AltOrders = [(rotl DPR, 16)];
  let AltOrderSelect = [{ return 1; }];
}
By this definition, the registers in the DPR class are D0~D31. The super-registers formed by three consecutive D registers are then defined as:
def Tuples3D : RegisterTuples<[dsub_0, dsub_1, dsub_2],
                              [(shl DPR, 0),
                               (shl DPR, 1),
                               (shl DPR, 2)]>;
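(shl DPR, N) means removing the first N members. So (shl DPR, 0) yields D0~D31, (shl DPR, 1) yields D1~D31, and (shl DPR, 2) yields D2~D31. Zipping these lists produces the super-registers [D0, D1, D2], [D1, D2, D3], …, [D29, D30, D31], whose three parts are addressed by dsub_0, dsub_1, and dsub_2 respectively.
Clearly this is far more compact than defining the registers one by one, but TableGen's processing becomes correspondingly much more complex.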
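The expansion of Tuples3D can be reproduced with a small Python model. It is a sketch under the assumption, consistent with the result described above, that zipping stops at the shortest list; shl and rotl are Python stand-ins for the dag operators of the same name.

```python
def shl(regs, n):
    """Shift left: drop the first n members."""
    return regs[n:]


def rotl(regs, n):
    """Rotate left: move the first n members to the end."""
    n %= len(regs)
    return regs[n:] + regs[:n]


DPR = [f"D{i}" for i in range(32)]  # (sequence "D%u", 0, 31)

# Tuples3D: zip (shl DPR, 0), (shl DPR, 1), (shl DPR, 2).
lists = [shl(DPR, k) for k in range(3)]
tuples3d = [list(members) for members in zip(*lists)]

print(len(tuples3d))
print(tuples3d[0], tuples3d[-1])

# The alternative allocation order of DPR starts at D16.
print(rotl(DPR, 16)[:2])
```

The three input lists have 32, 31, and 30 members, so the zip yields 30 triples; that is why the last tuple starts at D29.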