LLVM Study Notes (3)

2.2.2. Operand Descriptions

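In an Instruction definition, OutOperandList and InOperandList are dags of the form (outs op1, op2, …) and (ins op1, op2, …) respectively. An operand may be a register, in which case a RegisterClass states what kind of register it is.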
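As a rough illustration of how these two dags look in a .td file, here is a minimal sketch for a made-up target. Everything named Toy (ToyReg, T0, T1, ToyGPR, TOY_ADD) is hypothetical and not part of LLVM; only Register, RegisterClass, Instruction, outs and ins are taken from Target.td. Each operand is written as RegisterClass:$name, so the register class tells the backend which registers may appear in that position.

```tablegen
include "llvm/Target/Target.td"

// Hypothetical registers and register class for a made-up "Toy" target.
class ToyReg<string n, bits<16> enc> : Register<n> {
  let Namespace = "Toy";
  let HWEncoding = enc;
}
def T0 : ToyReg<"t0", 0>;
def T1 : ToyReg<"t1", 1>;
def ToyGPR : RegisterClass<"Toy", [i32], 32, (add T0, T1)>;

// One result register and two source registers, all constrained to ToyGPR.
def TOY_ADD : Instruction {
  let Namespace      = "Toy";
  let OutOperandList = (outs ToyGPR:$dst);
  let InOperandList  = (ins ToyGPR:$lhs, ToyGPR:$rhs);
  let AsmString      = "add $dst, $lhs, $rhs";
}
```

Assuming an LLVM source tree is at hand, a file like this can be parsed with llvm-tblgen by passing the tree's include directory through -I; the default action simply prints the resulting records.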

2.2.2.1. Registers

2.2.2.1.1. Register

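Every target machine in practical use today has registers (machines that run purely on a stack have all but disappeared), so describing registers is an important part of an LLVM backend. The register definitions live in each target's <Target>RegisterInfo.td file (X86RegisterInfo.td for X86), and they all build on the following base class from Target.td.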

80        class Register<string n, list<string> altNames = []> {

81          string Namespace = "";

82          string AsmName = n;

83          list<string> AltNames = altNames;

84       

85          // Aliases - A list of registers that this register overlaps with.  A read or

86          // modification of this register can potentially read or modify the aliased

87          // registers.

88          list<Register> Aliases = [];

89       

90          // SubRegs - A list of registers that are parts of this register. Note these

91          // are "immediate" sub-registers and the registers within the list do not

92          // themselves overlap. e.g. For X86, EAX's SubRegs list contains only [AX],

93          // not [AX, AH, AL].

94          list<Register> SubRegs = [];

95       

96          // SubRegIndices - For each register in SubRegs, specify the SubRegIndex used

97          // to address it. Sub-sub-register indices are automatically inherited from

98          // SubRegs.

99          list<SubRegIndex> SubRegIndices = [];

100     

101        // RegAltNameIndices - The alternate name indices which are valid for this

102        // register.

103        list<RegAltNameIndex> RegAltNameIndices = [];

104     

105        // DwarfNumbers - Numbers used internally by gcc/gdb to identify the register.

106        // These values can be determined by locating the <target>.h file in the

107        // directory llvmgcc/gcc/config/<target>/ and looking for REGISTER_NAMES.  The

108        // order of these names correspond to the enumeration used by gcc.  A value of

109        // -1 indicates that the gcc number is undefined and -2 that register number

110        // is invalid for this mode/flavour.

111        list<int> DwarfNumbers = [];

112     

113        // CostPerUse - Additional cost of instructions using this register compared

114        // to other registers in its class. The register allocator will try to

115        // minimize the number of instructions using a register with a CostPerUse.

116        // This is used by the x86-64 and ARM Thumb targets where some registers

117        // require larger instruction encodings.

118        int CostPerUse = 0;

119     

120        // CoveredBySubRegs - When this bit is set, the value of this register is

121        // completely determined by the value of its sub-registers.  For example, the

122        // x86 register AX is covered by its sub-registers AL and AH, but EAX is not

123        // covered by its sub-register AX.

124        bit CoveredBySubRegs = 0;

125     

126        // HWEncoding - The target specific hardware encoding for this register.

127        bits<16> HWEncoding = 0;

128      }

V7.0 adds one field to the Register definition: bit isArtificial = 0;

Targets derive further classes from it as needed. For X86, the derived class is X86Reg (X86RegisterInfo.td):

16        class X86Reg<string n, bits<16> Enc, list<Register> subregs = []> : Register<n> {

17          let Namespace = "X86";

18          let HWEncoding = Enc;

19          let SubRegs = subregs;

20        }

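The Register definition is more complicated than one might expect, because some registers have individually addressable parts. On X86, for example, EAX, AX and AL can each be named in an instruction, and each has its own Register definition in the .td file. Physically, however, all three refer to the same register: EAX contains AX, and AX contains AL. The backend has to know these relationships to use the registers well, which is what SubRegs and SubRegIndices above are for. SubRegIndices is a list of SubRegIndex (Target.td). A SubRegIndex gives an offset and a size, which together uniquely identify an addressable part (a sub-register) of a register. Note that a sub-register index is independent of any particular register: a given offset and size needs only one index definition. SubRegs and SubRegIndices must correspond one to one.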

25        class SubRegIndex<int size, int offset = 0> {

26          string Namespace = "";

27       

28          // Size - Size (in bits) of the sub-registers represented by this index.

29          int Size = size;

30       

31          // Offset - Offset of the first bit that is part of this sub-register index.

32          // Set it to -1 if the same index is used to represent sub-registers that can

33          // be at different offsets (for example when using an index to access an

34          // element in a register tuple).

35          int Offset = offset;

36       

37          // ComposedOf - A list of two SubRegIndex instances, [A, B].

38          // This indicates that this SubRegIndex is the result of composing A and B.

39          // See ComposedSubRegIndex.

40          list<SubRegIndex> ComposedOf = [];

41       

42          // CoveringSubRegIndices - A list of two or more sub-register indexes that

43          // cover this sub-register.

44          //

45          // This field should normally be left blank as TableGen can infer it.

46          //

47          // TableGen automatically detects sub-registers that straddle the registers

48          // in the SubRegs field of a Register definition. For example:

49          //

50          //   Q0    = dsub_0 -> D0, dsub_1 -> D1

51          //   Q1    = dsub_0 -> D2, dsub_1 -> D3

52          //   D1_D2 = dsub_0 -> D1, dsub_1 -> D2

53          //   QQ0   = qsub_0 -> Q0, qsub_1 -> Q1

54          //

55          // TableGen will infer that D1_D2 is a sub-register of QQ0. It will be given

56          // the synthetic index dsub_1_dsub_2 unless some SubRegIndex is defined with

57          // CoveringSubRegIndices = [dsub_1, dsub_2].

58          list<SubRegIndex> CoveringSubRegIndices = [];

59        }
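To make the positional pairing of SubRegs and SubRegIndices concrete, the following sketch defines a small hypothetical target. All Toy names (ToyReg, sub_lo, sub_hi, L0, H0, L1, H1, W0, W1) are made up for illustration; the layout simply mirrors what X86Reg does. Note that the same two indices serve both W0 and W1, since an index only records an offset and a size.

```tablegen
include "llvm/Target/Target.td"

// Two indices for a hypothetical "Toy" target. Each index is only an
// (offset, size) pair, so every register of the same shape can share it.
let Namespace = "Toy" in {
  def sub_lo : SubRegIndex<16>;      // bits 0..15
  def sub_hi : SubRegIndex<16, 16>;  // bits 16..31
}

class ToyReg<string n, bits<16> enc, list<Register> subregs = []>
    : Register<n> {
  let Namespace = "Toy";
  let HWEncoding = enc;
  let SubRegs = subregs;
}

// 16-bit registers: the smallest addressable units, so no indices are needed.
def L0 : ToyReg<"l0", 0>;
def H0 : ToyReg<"h0", 1>;
def L1 : ToyReg<"l1", 2>;
def H1 : ToyReg<"h1", 3>;

// 32-bit registers. SubRegs[i] is addressed through SubRegIndices[i]:
// sub_lo selects L0 or L1, sub_hi selects H0 or H1.
let SubRegIndices = [sub_lo, sub_hi], CoveredBySubRegs = 1 in {
  def W0 : ToyReg<"w0", 0, [L0, H0]>;
  def W1 : ToyReg<"w1", 1, [L1, H1]>;
}
```

CoveredBySubRegs is set here because, in this made-up layout, the two halves together make up the whole 32-bit register.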

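TableGen can infer the relationships between sub-register indices automatically from the SubRegIndices declared in Register definitions. Some situations, however, are too complex to describe and infer this way (see the ARM example later). These need the ComposedOf and CoveringSubRegIndices fields of SubRegIndex (LLVM 3.6 does not actually use CoveringSubRegIndices yet, and neither does 7.0).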

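For example, suppose index Id1 addresses sub-register Rr of register R, and index Id2 addresses sub-register r of Rr. If index Id3 addresses the part of R that corresponds to r, then Id3 is equivalent to applying Id1 and then Id2 to R (this is called composition below). TableGen can usually infer such a relationship; when it cannot, the relationship must be stated explicitly in the ComposedOf field.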
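When a composition has to be spelled out, Target.td provides a helper class, ComposedSubRegIndex (the one the ComposedOf comment points to), which fills in ComposedOf. The sketch below is only an illustration: the index names sub_hi16, sub_hi8 and sub_top8 and the Toy namespace are hypothetical, and the offset noted in the comment assumes the helper adds the two offsets, as it does in the Target.td versions I have looked at.

```tablegen
include "llvm/Target/Target.td"

let Namespace = "Toy" in {
  // Id1: the upper 16 bits of a 32-bit register R (selects Rr).
  def sub_hi16 : SubRegIndex<16, 16>;

  // Id2: the upper 8 bits of a 16-bit register Rr (selects r).
  def sub_hi8  : SubRegIndex<8, 8>;

  // Id3: Id1 followed by Id2. ComposedSubRegIndex records
  // ComposedOf = [sub_hi16, sub_hi8]; the resulting index should describe
  // an 8-bit piece at offset 16 + 8 = 24 within R.
  def sub_top8 : ComposedSubRegIndex<sub_hi16, sub_hi8>;
}
```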

2.2.2.1.2. The X86 Example

For example, X86RegisterInfo.td defines the following SubRegIndex records for the X86 target:

23        let Namespace = "X86" in {

24          def sub_8bit    : SubRegIndex<8>;

25          def sub_8bit_hi : SubRegIndex<8, 8>;

26          def sub_16bit   : SubRegIndex<16>;

27          def sub_32bit   : SubRegIndex<32>;

28          def sub_xmm     : SubRegIndex<128>;

29          def sub_ymm     : SubRegIndex<256>;

30        }

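The registers available on the X86 target are then defined as in the listing further below. Lines 48 to 74 are the smallest addressable registers on X86; they need no index. The registers from AX on line 78 onward, however, all have addressable parts, and each such part gets an index together with its own definition. For example, the indices sub_8bit and sub_8bit_hi address AL and AH within AX, respectively.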

V7.0 adds these two SubRegIndex definitions:

def sub_8bit_hi_phony  : SubRegIndex<8, 8>;

def sub_16bit_hi : SubRegIndex<16, 16>;

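The reason is that the SubRegs declarations of the registers defined from line 83 of the listing below are incomplete. Taking SP on line 87 as an example, v7.0 now declares it as: def SP : X86Reg<"sp", 4, [SPL,SPH]>;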

SPH is defined as follows (v7.0 adds several similar definitions, which are not all listed here):

let isArtificial = 1 in {

     …

     def SPH   : X86Reg<"", -1>;

     …

}

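Even so, the description here is incomplete. Consider R8 → R8D → R8W → R8B: R8B is directly addressable within R8, yet the corresponding index is not defined directly. Deriving such indices is TableGen's job, as will be seen later.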

46        // 8-bit registers

47        // Low registers

48        def AL : X86Reg<"al", 0>;

49        def DL : X86Reg<"dl", 2>;

50        def CL : X86Reg<"cl", 1>;

51        def BL : X86Reg<"bl", 3>;

52       

53        // High registers. On x86-64, these cannot be used in any instruction

54        // with a REX prefix.

55        def AH : X86Reg<"ah", 4>;

56        def DH : X86Reg<"dh", 6>;

57        def CH : X86Reg<"ch", 5>;

58        def BH : X86Reg<"bh", 7>;

59       

60        // X86-64 only, requires REX.

61        let CostPerUse = 1 in {

62        def SIL  : X86Reg<"sil",   6>;

63        def DIL  : X86Reg<"dil",   7>;

64        def BPL  : X86Reg<"bpl",   5>;

65        def SPL  : X86Reg<"spl",   4>;

66        def R8B  : X86Reg<"r8b",   8>;

67        def R9B  : X86Reg<"r9b",   9>;

68        def R10B : X86Reg<"r10b", 10>;

69        def R11B : X86Reg<"r11b", 11>;

70        def R12B : X86Reg<"r12b", 12>;

71        def R13B : X86Reg<"r13b", 13>;

72        def R14B : X86Reg<"r14b", 14>;

73        def R15B : X86Reg<"r15b", 15>;

74        }

75       

76        // 16-bit registers

77        let SubRegIndices = [sub_8bit, sub_8bit_hi], CoveredBySubRegs = 1 in {

78        def AX : X86Reg<"ax", 0, [AL,AH]>;

79        def DX : X86Reg<"dx", 2, [DL,DH]>;

80        def CX : X86Reg<"cx", 1, [CL,CH]>;

81        def BX : X86Reg<"bx", 3, [BL,BH]>;

82        }

83        let SubRegIndices = [sub_8bit] in {

84        def SI : X86Reg<"si", 6, [SIL]>;

85        def DI : X86Reg<"di", 7, [DIL]>;

86        def BP : X86Reg<"bp", 5, [BPL]>;

87        def SP : X86Reg<"sp", 4, [SPL]>;

88        }

89        def IP : X86Reg<"ip", 0>;

90       

91        // X86-64 only, requires REX.

92        let SubRegIndices = [sub_8bit], CostPerUse = 1 in {

93        def R8W  : X86Reg<"r8w",   8, [R8B]>;

94        def R9W  : X86Reg<"r9w",   9, [R9B]>;

95        def R10W : X86Reg<"r10w", 10, [R10B]>;

96        def R11W : X86Reg<"r11w", 11, [R11B]>;

97        def R12W : X86Reg<"r12w", 12, [R12B]>;

98        def R13W : X86Reg<"r13w", 13, [R13B]>;

99        def R14W : X86Reg<"r14w", 14, [R14B]>;

100      def R15W : X86Reg<"r15w", 15, [R15B]>;

101      }

102     

103      // 32-bit registers

104      let SubRegIndices = [sub_16bit] in {

105      def EAX : X86Reg<"eax", 0, [AX]>, DwarfRegNum<[-2, 0, 0]>;

106      def EDX : X86Reg<"edx", 2, [DX]>, DwarfRegNum<[-2, 2, 2]>;

107      def ECX : X86Reg<"ecx", 1, [CX]>, DwarfRegNum<[-2, 1, 1]>;

108      def EBX : X86Reg<"ebx", 3, [BX]>, DwarfRegNum<[-2, 3, 3]>;

109      def ESI : X86Reg<"esi", 6, [SI]>, DwarfRegNum<[-2, 6, 6]>;

110      def EDI : X86Reg<"edi", 7, [DI]>, DwarfRegNum<[-2, 7, 7]>;

111      def EBP : X86Reg<"ebp", 5, [BP]>, DwarfRegNum<[-2, 4, 5]>;

112      def ESP : X86Reg<"esp", 4, [SP]>, DwarfRegNum<[-2, 5, 4]>;

113      def EIP : X86Reg<"eip", 0, [IP]>, DwarfRegNum<[-2, 8, 8]>;

114     

115      // X86-64 only, requires REX

116      let CostPerUse = 1 in {

117      def R8D  : X86Reg<"r8d",   8, [R8W]>;

118      def R9D  : X86Reg<"r9d",   9, [R9W]>;

119      def R10D : X86Reg<"r10d", 10, [R10W]>;

120      def R11D : X86Reg<"r11d", 11, [R11W]>;

121      def R12D : X86Reg<"r12d", 12, [R12W]>;

122      def R13D : X86Reg<"r13d", 13, [R13W]>;

123      def R14D : X86Reg<"r14d", 14, [R14W]>;

124      def R15D : X86Reg<"r15d", 15, [R15W]>;

125      }}

126     

127      // 64-bit registers, X86-64 only

128      let SubRegIndices = [sub_32bit] in {

129      def RAX : X86Reg<"rax", 0, [EAX]>, DwarfRegNum<[0, -2, -2]>;

130      def RDX : X86Reg<"rdx", 2, [EDX]>, DwarfRegNum<[1, -2, -2]>;

131      def RCX : X86Reg<"rcx", 1, [ECX]>, DwarfRegNum<[2, -2, -2]>;

132      def RBX : X86Reg<"rbx", 3, [EBX]>, DwarfRegNum<[3, -2, -2]>;

133      def RSI : X86Reg<"rsi", 6, [ESI]>, DwarfRegNum<[4, -2, -2]>;

134      def RDI : X86Reg<"rdi", 7, [EDI]>, DwarfRegNum<[5, -2, -2]>;

135      def RBP : X86Reg<"rbp", 5, [EBP]>, DwarfRegNum<[6, -2, -2]>;

136      def RSP : X86Reg<"rsp", 4, [ESP]>, DwarfRegNum<[7, -2, -2]>;

137     

138      // These also require REX.

139      let CostPerUse = 1 in {

140      def R8  : X86Reg<"r8",   8, [R8D]>,  DwarfRegNum<[ 8, -2, -2]>;

141      def R9  : X86Reg<"r9",   9, [R9D]>,  DwarfRegNum<[ 9, -2, -2]>;

142      def R10 : X86Reg<"r10", 10, [R10D]>, DwarfRegNum<[10, -2, -2]>;

143      def R11 : X86Reg<"r11", 11, [R11D]>, DwarfRegNum<[11, -2, -2]>;

144      def R12 : X86Reg<"r12", 12, [R12D]>, DwarfRegNum<[12, -2, -2]>;

145      def R13 : X86Reg<"r13", 13, [R13D]>, DwarfRegNum<[13, -2, -2]>;

146      def R14 : X86Reg<"r14", 14, [R14D]>, DwarfRegNum<[14, -2, -2]>;

147      def R15 : X86Reg<"r15", 15, [R15D]>, DwarfRegNum<[15, -2, -2]>;

148      def RIP : X86Reg<"rip",  0, [EIP]>,  DwarfRegNum<[16, -2, -2]>;

149      }}

150     

151      // MMX Registers. These are actually aliased to ST0 .. ST7

152      def MM0 : X86Reg<"mm0", 0>, DwarfRegNum<[41, 29, 29]>;

153      def MM1 : X86Reg<"mm1", 1>, DwarfRegNum<[42, 30, 30]>;

154      def MM2 : X86Reg<"mm2", 2>, DwarfRegNum<[43, 31, 31]>;

155      def MM3 : X86Reg<"mm3", 3>, DwarfRegNum<[44, 32, 32]>;

156      def MM4 : X86Reg<"mm4", 4>, DwarfRegNum<[45, 33, 33]>;

157      def MM5 : X86Reg<"mm5", 5>, DwarfRegNum<[46, 34, 34]>;

158      def MM6 : X86Reg<"mm6", 6>, DwarfRegNum<[47, 35, 35]>;

159      def MM7 : X86Reg<"mm7", 7>, DwarfRegNum<[48, 36, 36]>;

160     

161      // Pseudo Floating Point registers

162      def FP0 : X86Reg<"fp0", 0>;

163      def FP1 : X86Reg<"fp1", 0>;

164      def FP2 : X86Reg<"fp2", 0>;

165      def FP3 : X86Reg<"fp3", 0>;

166      def FP4 : X86Reg<"fp4", 0>;

167      def FP5 : X86Reg<"fp5", 0>;

168      def FP6 : X86Reg<"fp6", 0>;

169      def FP7 : X86Reg<"fp7", 0>;

170     

171      // XMM Registers, used by the various SSE instruction set extensions.

172      def XMM0: X86Reg<"xmm0", 0>, DwarfRegNum<[17, 21, 21]>;

173      def XMM1: X86Reg<"xmm1", 1>, DwarfRegNum<[18, 22, 22]>;

174      def XMM2: X86Reg<"xmm2", 2>, DwarfRegNum<[19, 23, 23]>;

175      def XMM3: X86Reg<"xmm3", 3>, DwarfRegNum<[20, 24, 24]>;

176      def XMM4: X86Reg<"xmm4", 4>, DwarfRegNum<[21, 25, 25]>;

177      def XMM5: X86Reg<"xmm5", 5>, DwarfRegNum<[22, 26, 26]>;

178      def XMM6: X86Reg<"xmm6", 6>, DwarfRegNum<[23, 27, 27]>;

179      def XMM7: X86Reg<"xmm7", 7>, DwarfRegNum<[24, 28, 28]>;

180     

181      // X86-64 only

182      let CostPerUse = 1 in {

183      def XMM8:  X86Reg<"xmm8",   8>, DwarfRegNum<[25, -2, -2]>;

184      def XMM9:  X86Reg<"xmm9",   9>, DwarfRegNum<[26, -2, -2]>;

185      def XMM10: X86Reg<"xmm10", 10>, DwarfRegNum<[27, -2, -2]>;

186      def XMM11: X86Reg<"xmm11", 11>, DwarfRegNum<[28, -2, -2]>;

187      def XMM12: X86Reg<"xmm12", 12>, DwarfRegNum<[29, -2, -2]>;

188      def XMM13: X86Reg<"xmm13", 13>, DwarfRegNum<[30, -2, -2]>;

189      def XMM14: X86Reg<"xmm14", 14>, DwarfRegNum<[31, -2, -2]>;

190      def XMM15: X86Reg<"xmm15", 15>, DwarfRegNum<[32, -2, -2]>;

191     

192      def XMM16:  X86Reg<"xmm16", 16>, DwarfRegNum<[60, -2, -2]>;

193      def XMM17:  X86Reg<"xmm17", 17>, DwarfRegNum<[61, -2, -2]>;

194      def XMM18:  X86Reg<"xmm18", 18>, DwarfRegNum<[62, -2, -2]>;

195      def XMM19:  X86Reg<"xmm19", 19>, DwarfRegNum<[63, -2, -2]>;

196      def XMM20:  X86Reg<"xmm20", 20>, DwarfRegNum<[64, -2, -2]>;

197      def XMM21:  X86Reg<"xmm21", 21>, DwarfRegNum<[65, -2, -2]>;

198      def XMM22:  X86Reg<"xmm22", 22>, DwarfRegNum<[66, -2, -2]>;

199      def XMM23:  X86Reg<"xmm23", 23>, DwarfRegNum<[67, -2, -2]>;

200      def XMM24:  X86Reg<"xmm24", 24>, DwarfRegNum<[68, -2, -2]>;

201      def XMM25:  X86Reg<"xmm25", 25>, DwarfRegNum<[69, -2, -2]>;

202      def XMM26:  X86Reg<"xmm26", 26>, DwarfRegNum<[70, -2, -2]>;

203      def XMM27:  X86Reg<"xmm27", 27>, DwarfRegNum<[71, -2, -2]>;

204      def XMM28:  X86Reg<"xmm28", 28>, DwarfRegNum<[72, -2, -2]>;

205      def XMM29:  X86Reg<"xmm29", 29>, DwarfRegNum<[73, -2, -2]>;

206      def XMM30:  X86Reg<"xmm30", 30>, DwarfRegNum<[74, -2, -2]>;

207      def XMM31:  X86Reg<"xmm31", 31>, DwarfRegNum<[75, -2, -2]>;

208     

209      } // CostPerUse

210     

211      // YMM0-15 registers, used by AVX instructions and

212      // YMM16-31 registers, used by AVX-512 instructions.

213      let SubRegIndices = [sub_xmm] in {

214        foreach  Index = 0-31 in {

215          def YMM#Index : X86Reg<"ymm"#Index, Index, [!cast<X86Reg>("XMM"#Index)]>,

216                          DwarfRegAlias<!cast<X86Reg>("XMM"#Index)>;

217        }

218      }

219     

220      // ZMM Registers, used by AVX-512 instructions.

221      let SubRegIndices = [sub_ymm] in {

222        foreach  Index = 0-31 in {

223          def ZMM#Index : X86Reg<"zmm"#Index, Index, [!cast<X86Reg>("YMM"#Index)]>,

224                          DwarfRegAlias<!cast<X86Reg>("XMM"#Index)>;

225        }

226      }

227     

228        // Mask Registers, used by AVX-512 instructions.

229        def K0 : X86Reg<"k0", 0>, DwarfRegNum<[118, -2, -2]>;

230        def K1 : X86Reg<"k1", 1>, DwarfRegNum<[119, -2, -2]>;

231        def K2 : X86Reg<"k2", 2>, DwarfRegNum<[120, -2, -2]>;

232        def K3 : X86Reg<"k3", 3>, DwarfRegNum<[121, -2, -2]>;

233        def K4 : X86Reg<"k4", 4>, DwarfRegNum<[122, -2, -2]>;

234        def K5 : X86Reg<"k5", 5>, DwarfRegNum<[123, -2, -2]>;

235        def K6 : X86Reg<"k6", 6>, DwarfRegNum<[124, -2, -2]>;

236        def K7 : X86Reg<"k7", 7>, DwarfRegNum<[125, -2, -2]>;

237     

238      // Floating point stack registers. These don't map one-to-one to the FP

239      // pseudo registers, but we still mark them as aliasing FP registers. That

240      // way both kinds can be live without exceeding the stack depth. ST registers

241      // are only live around inline assembly.

242      def ST0 : X86Reg<"st(0)", 0>, DwarfRegNum<[33, 12, 11]>;

243      def ST1 : X86Reg<"st(1)", 1>, DwarfRegNum<[34, 13, 12]>;

244      def ST2 : X86Reg<"st(2)", 2>, DwarfRegNum<[35, 14, 13]>;

245      def ST3 : X86Reg<"st(3)", 3>, DwarfRegNum<[36, 15, 14]>;

246      def ST4 : X86Reg<"st(4)", 4>, DwarfRegNum<[37, 16, 15]>;

247      def ST5 : X86Reg<"st(5)", 5>, DwarfRegNum<[38, 17, 16]>;

248      def ST6 : X86Reg<"st(6)", 6>, DwarfRegNum<[39, 18, 17]>;

249      def ST7 : X86Reg<"st(7)", 7>, DwarfRegNum<[40, 19, 18]>;

250     

251      // Floating-point status word

252      def FPSW : X86Reg<"fpsw", 0>;

253     

254      // Status flags register

255      def EFLAGS : X86Reg<"flags", 0>;

256     

257      // Segment registers

258      def CS : X86Reg<"cs", 1>;

259      def DS : X86Reg<"ds", 3>;

260      def SS : X86Reg<"ss", 2>;

261      def ES : X86Reg<"es", 0>;

262      def FS : X86Reg<"fs", 4>;

263      def GS : X86Reg<"gs", 5>;

264     

265      // Debug registers

266      def DR0  : X86Reg<"dr0",   0>;

267      def DR1  : X86Reg<"dr1",   1>;

268      def DR2  : X86Reg<"dr2",   2>;

269      def DR3  : X86Reg<"dr3",   3>;

270      def DR4  : X86Reg<"dr4",   4>;

271      def DR5  : X86Reg<"dr5",   5>;

272      def DR6  : X86Reg<"dr6",   6>;

273      def DR7  : X86Reg<"dr7",   7>;

274      def DR8  : X86Reg<"dr8",   8>;

275      def DR9  : X86Reg<"dr9",   9>;

276      def DR10 : X86Reg<"dr10", 10>;

277      def DR11 : X86Reg<"dr11", 11>;

278      def DR12 : X86Reg<"dr12", 12>;

279      def DR13 : X86Reg<"dr13", 13>;

280      def DR14 : X86Reg<"dr14", 14>;

281      def DR15 : X86Reg<"dr15", 15>;

282     

283      // Control registers

284      def CR0  : X86Reg<"cr0",   0>;

285      def CR1  : X86Reg<"cr1",   1>;

286      def CR2  : X86Reg<"cr2",   2>;

287      def CR3  : X86Reg<"cr3",   3>;

288      def CR4  : X86Reg<"cr4",   4>;

289      def CR5  : X86Reg<"cr5",   5>;

290      def CR6  : X86Reg<"cr6",   6>;

291      def CR7  : X86Reg<"cr7",   7>;

292      def CR8  : X86Reg<"cr8",   8>;

293      def CR9  : X86Reg<"cr9",   9>;

294      def CR10 : X86Reg<"cr10", 10>;

295      def CR11 : X86Reg<"cr11", 11>;

296      def CR12 : X86Reg<"cr12", 12>;

297      def CR13 : X86Reg<"cr13", 13>;

298      def CR14 : X86Reg<"cr14", 14>;

299      def CR15 : X86Reg<"cr15", 15>;

300     

301      // Pseudo index registers

302      def EIZ : X86Reg<"eiz", 4>;

303      def RIZ : X86Reg<"riz", 4>;

304     

305      // Bound registers, used in MPX instructions

306      def BND0 : X86Reg<"bnd0",   0>;

307      def BND1 : X86Reg<"bnd1",   1>;

308      def BND2 : X86Reg<"bnd2",   2>;

309      def BND3 : X86Reg<"bnd3",   3>;

These definitions can be checked against Table 3-2 in the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1 (all of these registers are accessible to the programmer).

Register type | Without REX prefix | With REX prefix
Byte registers | AL, BL, CL, DL, AH, BH, CH, DH | AL, BL, CL, DL, DIL, SIL, BPL, SPL, R8L-R15L
Word registers | AX, BX, CX, DX, DI, SI, BP, SP | AX, BX, CX, DX, DI, SI, BP, SP, R8W-R15W
Doubleword registers | EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP | EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP, R8D-R15D
Quadword registers | N.A. | RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, R8-R15

V7.0 adds definitions for these two registers:

// The direction flag.

def DF : X86Reg<"dirflag", 0>;

// CET registers - Shadow Stack Pointer

def SSP : X86Reg<"ssp", 0>;

2.2.2.1.3. RegisterClass

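Registers have types too: a floating-point value, for instance, cannot be stored in a general-purpose register such as EBX. This is not quite an ordinary type system, though. An MMX register, for example, can hold either integer or floating-point values and can be used to hold integers when the general-purpose registers run short, but not the other way round. To describe what registers are used for, LLVM therefore defines RegisterClass. Registers with the same use are placed in the same RegisterClass, and registers within one RegisterClass are interchangeable.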

151      class RegisterClass<string namespace, list<ValueType> regTypes, int alignment,

152                          dag regList, RegAltNameIndex idx = NoRegAltName>

153        : DAGOperand {

154        string Namespace = namespace;

155     

156        // RegType - Specify the list ValueType of the registers in this register

157        // class.  Note that all registers in a register class must have the same

158        // ValueTypes.  This is a list because some targets permit storing different

159        // types in same register, for example vector values with 128-bit total size,

160        // but different count/size of items, like SSE on x86.

161        //

162        list<ValueType> RegTypes = regTypes;

163     

164        // Size - Specify the spill size in bits of the registers.  A default value of

165        // zero lets tablgen pick an appropriate size.

166        int Size = 0;

167     

168        // Alignment - Specify the alignment required of the registers when they are

169        // stored or loaded to memory.

170        //

171        int Alignment = alignment;

172     

173        // CopyCost - This value is used to specify the cost of copying a value

174        // between two registers in this register class. The default value is one

175        // meaning it takes a single instruction to perform the copying. A negative

176        // value means copying is extremely expensive or impossible.

177        int CopyCost = 1;

178     

179        // MemberList - Specify which registers are in this class.  If the

180        // allocation_order_* method are not specified, this also defines the order of

181        // allocation used by the register allocator.

182        //

183        dag MemberList = regList;

184     

185        // AltNameIndex - The alternate register name to use when printing operands

186        // of this register class. Every register in the register class must have

187        // a valid alternate name for the given index.

188        RegAltNameIndex altNameIndex = idx;

189     

190        // isAllocatable - Specify that the register class can be used for virtual

191        // registers and register allocation.  Some register classes are only used to

192        // model instruction operand constraints, and should have isAllocatable = 0.

193        bit isAllocatable = 1;

194     

195        // AltOrders - List of alternative allocation orders. The default order is

196        // MemberList itself, and that is good enough for most targets since the

197        // register allocators automatically remove reserved registers and move

198        // callee-saved registers to the end.

199        list<dag> AltOrders = [];

200     

201        // AltOrderSelect - The body of a function that selects the allocation order

202        // to use in a given machine function. The code will be inserted in a

203        // function like this:

204        //

205        //   static inline unsigned f(const MachineFunction &MF) { ... }

206        //

207        // The function should return 0 to select the default order defined by

208        // MemberList, 1 to select the first AltOrders entry and so on.

209        code AltOrderSelect = [{}];

210     

211        // Specify allocation priority for register allocators using a greedy

212        // heuristic. Classes with higher priority values are assigned first. This is

213        // useful as it is sometimes beneficial to assign registers to highly

214        // constrained classes first. The value has to be in the range [0,63].

215        int AllocationPriority = 0;

216      }

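RegTypes on line 162 lists the value types that registers of this class support; there may be several, hence the list. MemberList on line 183 lists the registers in the class and, by default, also gives their allocation order (earlier entries are used first). For some processor families such as X86, however, the kinds and number of registers differ considerably between CPUs, so the MemberList order suits only some of them and the others need a different sequence; these are the AltOrders. A method is then needed to say which order to use: it returns 0 to select the MemberList order, 1 to select the first AltOrders entry, and so on. AltOrderSelect holds the code fragment that becomes the body of this selection function.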
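As a sketch of how these three fields fit together, the hypothetical class below lists eight registers and offers one alternative order that starts from R4. ToyReg, R0 to R7, ToyGPR, ToySubtarget and prefersHighRegs() are all invented for this example; a real target would put its own subtarget query into AltOrderSelect, as X86 does with is64Bit().

```tablegen
include "llvm/Target/Target.td"

class ToyReg<string n, bits<16> enc> : Register<n> {
  let Namespace = "Toy";
  let HWEncoding = enc;
}

// R0 .. R7 of a made-up "Toy" target.
foreach I = 0-7 in {
  def R#I : ToyReg<"r"#I, I>;
}

def ToyGPR : RegisterClass<"Toy", [i32], 32,
                           (add R0, R1, R2, R3, R4, R5, R6, R7)> {
  // AltOrders[0]: the same registers, but allocation starts at R4.
  let AltOrders = [(add R4, R5, R6, R7, R0, R1, R2, R3)];

  // Body of the selection function. The returned value is 0 for the
  // MemberList order and 1 for AltOrders[0]; a bool result converts
  // to exactly those two values.
  // ToySubtarget and prefersHighRegs() are hypothetical.
  let AltOrderSelect = [{
    return MF.getSubtarget<ToySubtarget>().prefersHighRegs();
  }];
}
```

With more than one entry in AltOrders, the selection function would return 2 for the second entry, 3 for the third, and so on.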

V7.0 adds the following fields:

// The register size/alignment information, parameterized by a HW mode.

RegInfoByHwMode RegInfos;

string DiagnosticType = "";

string DiagnosticString = "";

RegInfoByHwMode is defined as follows:

class RegInfoByHwMode<list<HwMode> Ms = [], list<RegInfo> Ts = []>

    : HwModeSelect<Ms> {

  // The length of this list must be the same as the length of Ms.

  list<RegInfo> Objects = Ts;

}

class HwModeSelect<list<HwMode> Ms> {

  list<HwMode> Modes = Ms;

}

class HwMode<string FS> {

  // A string representing subtarget features that turn on this HW mode.

  // For example, "+feat1,-feat2" will indicate that the mode is active

  // when "feat1" is enabled and "feat2" is disabled at the same time.

  // Any other features are not checked.

  // When multiple modes are used, they should be mutually exclusive,

  // otherwise the results are unpredictable.

  string Features = FS;

}

class RegInfo<int RS, int SS, int SA> {

  int RegSize = RS;         // Register size in bits.

  int SpillSize = SS;       // Spill slot size in bits.

  int SpillAlignment = SA;  // Spill slot alignment in bits.

}

RegInfoByHwMode describes the register details under different hardware modes (register size, spill size, and spill alignment).
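A minimal sketch of how the V7.0 classes above might be combined is shown here. The mode names, the "wide-spill" feature string and all Toy records are assumptions made for illustration; whether a real backend accepts exactly this set of modes is not checked. The two lists passed to RegInfoByHwMode are matched by position, one RegInfo per mode.

```tablegen
include "llvm/Target/Target.td"

// Two mutually exclusive modes keyed on a hypothetical subtarget
// feature named "wide-spill".
def NarrowSpill : HwMode<"-wide-spill">;
def WideSpill   : HwMode<"+wide-spill">;

// RegInfo<RegSize, SpillSize, SpillAlignment>, all in bits, one per mode.
def ToySpillInfo : RegInfoByHwMode<
    [NarrowSpill, WideSpill],
    [RegInfo<32, 32, 32>, RegInfo<32, 64, 64>]>;

class ToyReg<string n, bits<16> enc> : Register<n> {
  let Namespace = "Toy";
  let HWEncoding = enc;
}
def T0 : ToyReg<"t0", 0>;
def T1 : ToyReg<"t1", 1>;

def ToyGPR : RegisterClass<"Toy", [i32], 32, (add T0, T1)> {
  // Per-mode register size and spill information for this class.
  let RegInfos = ToySpillInfo;
}
```

In this made-up setup the registers stay 32 bits wide in both modes; only the spill slot size and alignment change when the feature is enabled.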

The X86 target defines these RegisterClass records:

328      def GR8 : RegisterClass<"X86", [i8],  8,

329                              (add AL, CL, DL, AH, CH, DH, BL, BH, SIL, DIL, BPL, SPL,

330                                   R8B, R9B, R10B, R11B, R14B, R15B, R12B, R13B)> {

331        let AltOrders = [(sub GR8, AH, BH, CH, DH)];

332        let AltOrderSelect = [{

333          return MF.getSubtarget<X86Subtarget>().is64Bit();

334        }];

335      }

336     

337      def GR16 : RegisterClass<"X86", [i16], 16,

338                               (add AX, CX, DX, SI, DI, BX, BP, SP,

339                                    R8W, R9W, R10W, R11W, R14W, R15W, R12W, R13W)>;

340     

341      def GR32 : RegisterClass<"X86", [i32], 32,

342                               (add EAX, ECX, EDX, ESI, EDI, EBX, EBP, ESP,

343                                    R8D, R9D, R10D, R11D, R14D, R15D, R12D, R13D)>;

344     

345      // GR64 - 64-bit GPRs. This oddly includes RIP, which isn't accurate, since

346      // RIP isn't really a register and it can't be used anywhere except in an

347      // address, but it doesn't cause trouble.

348      def GR64 : RegisterClass<"X86", [i64], 64,

349                               (add RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11,

350                                    RBX, R14, R15, R12, R13, RBP, RSP, RIP)>;

351     

352      // Segment registers for use by MOV instructions (and others) that have a

353      //   segment register as one operand.  Always contain a 16-bit segment

354      //   descriptor.

355      def SEGMENT_REG : RegisterClass<"X86", [i16], 16, (add CS, DS, SS, ES, FS, GS)>;

356     

357      // Debug registers.

358      def DEBUG_REG : RegisterClass<"X86", [i32], 32, (sequence "DR%u", 0, 7)>;

359     

360      // Control registers.

361      def CONTROL_REG : RegisterClass<"X86", [i64], 64, (sequence "CR%u", 0, 15)>;

362     

363      // GR8_ABCD_L, GR8_ABCD_H, GR16_ABCD, GR32_ABCD, GR64_ABCD - Subclasses of

364      // GR8, GR16, GR32, and GR64 which contain just the "a" "b", "c", and "d"

365      // registers. On x86-32, GR16_ABCD and GR32_ABCD are classes for registers

366      // that support 8-bit subreg operations. On x86-64, GR16_ABCD, GR32_ABCD,

367      // and GR64_ABCD are classes for registers that support 8-bit h-register

368      // operations.

369      def GR8_ABCD_L : RegisterClass<"X86", [i8], 8, (add AL, CL, DL, BL)>;

370      def GR8_ABCD_H : RegisterClass<"X86", [i8], 8, (add AH, CH, DH, BH)>;

371      def GR16_ABCD : RegisterClass<"X86", [i16], 16, (add AX, CX, DX, BX)>;

372      def GR32_ABCD : RegisterClass<"X86", [i32], 32, (add EAX, ECX, EDX, EBX)>;

373      def GR64_ABCD : RegisterClass<"X86", [i64], 64, (add RAX, RCX, RDX, RBX)>;

374      def GR32_TC   : RegisterClass<"X86", [i32], 32, (add EAX, ECX, EDX)>;

375      def GR64_TC   : RegisterClass<"X86", [i64], 64, (add RAX, RCX, RDX, RSI, RDI,

376                                                           R8, R9, R11, RIP)>;

377      def GR64_TCW64 : RegisterClass<"X86", [i64], 64, (add RAX, RCX, RDX,

378                                                            R8, R9, R11)>;

379     

380      // GR8_NOREX - GR8 registers which do not require a REX prefix.

381      def GR8_NOREX : RegisterClass<"X86", [i8], 8,

382                                    (add AL, CL, DL, AH, CH, DH, BL, BH)> {

383        let AltOrders = [(sub GR8_NOREX, AH, BH, CH, DH)];

384        let AltOrderSelect = [{

385          return MF.getSubtarget<X86Subtarget>().is64Bit();

386        }];

387      }

388      // GR16_NOREX - GR16 registers which do not require a REX prefix.

389      def GR16_NOREX : RegisterClass<"X86", [i16], 16,

390                                     (add AX, CX, DX, SI, DI, BX, BP, SP)>;

391      // GR32_NOREX - GR32 registers which do not require a REX prefix.

392      def GR32_NOREX : RegisterClass<"X86", [i32], 32,

393                                     (add EAX, ECX, EDX, ESI, EDI, EBX, EBP, ESP)>;

394      // GR64_NOREX - GR64 registers which do not require a REX prefix.

395      def GR64_NOREX : RegisterClass<"X86", [i64], 64,

396                                  (add RAX, RCX, RDX, RSI, RDI, RBX, RBP, RSP, RIP)>;

397     

398      // GR32_NOAX - GR32 registers except EAX. Used by AddRegFrm of XCHG32 in 64-bit

399      // mode to prevent encoding using the 0x90 NOP encoding. xchg %eax, %eax needs

400      // to clear upper 32-bits of RAX so is not a NOP.

401      def GR32_NOAX : RegisterClass<"X86", [i32], 32, (sub GR32, EAX)>;

402     

403      // GR32_NOSP - GR32 registers except ESP.

404      def GR32_NOSP : RegisterClass<"X86", [i32], 32, (sub GR32, ESP)>;

405     

// GR64_NOSP - GR64 registers except RSP (and RIP).
def GR64_NOSP : RegisterClass<"X86", [i64], 64, (sub GR64, RSP, RIP)>;

// GR32_NOREX_NOSP - GR32 registers which do not require a REX prefix except
// ESP.
def GR32_NOREX_NOSP : RegisterClass<"X86", [i32], 32,
                                    (and GR32_NOREX, GR32_NOSP)>;

// GR64_NOREX_NOSP - GR64_NOREX registers except RSP.
def GR64_NOREX_NOSP : RegisterClass<"X86", [i64], 64,
                                    (and GR64_NOREX, GR64_NOSP)>;

// A class to support the 'A' assembler constraint: EAX then EDX.
def GR32_AD : RegisterClass<"X86", [i32], 32, (add EAX, EDX)>;

// Scalar SSE2 floating point registers.
def FR32 : RegisterClass<"X86", [f32], 32, (sequence "XMM%u", 0, 15)>;

def FR64 : RegisterClass<"X86", [f64], 64, (add FR32)>;


// FIXME: This sets up the floating point register files as though they are f64
// values, though they really are f80 values.  This will cause us to spill
// values as 64-bit quantities instead of 80-bit quantities, which is much much
// faster on common hardware.  In reality, this should be controlled by a
// command line option or something.

def RFP32 : RegisterClass<"X86",[f32], 32, (sequence "FP%u", 0, 6)>;
def RFP64 : RegisterClass<"X86",[f64], 32, (add RFP32)>;
def RFP80 : RegisterClass<"X86",[f80], 32, (add RFP32)>;

// Floating point stack registers (these are not allocatable by the
// register allocator - the floating point stackifier is responsible
// for transforming FPn allocations to STn registers)
def RST : RegisterClass<"X86", [f80, f64, f32], 32, (sequence "ST%u", 0, 7)> {
  let isAllocatable = 0;
}

// Generic vector registers: VR64 and VR128.
def VR64: RegisterClass<"X86", [x86mmx], 64, (sequence "MM%u", 0, 7)>;
def VR128 : RegisterClass<"X86", [v16i8, v8i16, v4i32, v2i64, v4f32, v2f64],
                          128, (add FR32)>;
def VR256 : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
                          256, (sequence "YMM%u", 0, 15)>;

// Status flags registers.
def CCR : RegisterClass<"X86", [i32], 32, (add EFLAGS)> {
  let CopyCost = -1;  // Don't allow copying of status registers.
  let isAllocatable = 0;
}
def FPCCR : RegisterClass<"X86", [i16], 16, (add FPSW)> {
  let CopyCost = -1;  // Don't allow copying of status registers.
  let isAllocatable = 0;
}

// AVX-512 vector/mask registers.
def VR512 : RegisterClass<"X86", [v16f32, v8f64, v64i8, v32i16, v16i32, v8i64], 512,
    (sequence "ZMM%u", 0, 31)>;

// Scalar AVX-512 floating point registers.
def FR32X : RegisterClass<"X86", [f32], 32, (sequence "XMM%u", 0, 31)>;

def FR64X : RegisterClass<"X86", [f64], 64, (add FR32X)>;

// Extended VR128 and VR256 for AVX-512 instructions
def VR128X : RegisterClass<"X86", [v16i8, v8i16, v4i32, v2i64, v4f32, v2f64],
                          128, (add FR32X)>;
def VR256X : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
                          256, (sequence "YMM%u", 0, 31)>;

// Mask registers
def VK1     : RegisterClass<"X86", [i1],    8,  (sequence "K%u", 0, 7)> {let Size = 8;}
def VK2     : RegisterClass<"X86", [v2i1],  8,  (add VK1)> {let Size = 8;}
def VK4     : RegisterClass<"X86", [v4i1],  8,  (add VK2)> {let Size = 8;}
def VK8     : RegisterClass<"X86", [v8i1],  8,  (add VK4)> {let Size = 8;}
def VK16    : RegisterClass<"X86", [v16i1], 16, (add VK8)> {let Size = 16;}
def VK32    : RegisterClass<"X86", [v32i1], 32, (add VK16)> {let Size = 32;}
def VK64    : RegisterClass<"X86", [v64i1], 64, (add VK32)> {let Size = 64;}

def VK1WM   : RegisterClass<"X86", [i1],    8,  (sub VK1, K0)> {let Size = 8;}
def VK2WM   : RegisterClass<"X86", [v2i1],  8,  (sub VK2, K0)> {let Size = 8;}
def VK4WM   : RegisterClass<"X86", [v4i1],  8,  (sub VK4, K0)> {let Size = 8;}
def VK8WM   : RegisterClass<"X86", [v8i1],  8,  (sub VK8, K0)> {let Size = 8;}
def VK16WM  : RegisterClass<"X86", [v16i1], 16, (add VK8WM)>   {let Size = 16;}
def VK32WM  : RegisterClass<"X86", [v32i1], 32, (add VK16WM)> {let Size = 32;}
def VK64WM  : RegisterClass<"X86", [v64i1], 64, (add VK32WM)> {let Size = 64;}

// Bound registers
def BNDR : RegisterClass<"X86", [v2i64], 128, (sequence "BND%u", 0, 3)>;

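The TD language provides operators for simple set manipulation on dags. In the listing above, add appends the given members (individual registers or whole register classes) to the set being built, and sub removes the given members from a set. The and operator, used for GR32_NOREX_NOSP, keeps only the members common to both sets. sequence generates a run of numbered members and adds them to the current dag; for example, (sequence "BND%u", 0, 3) in the last line creates a set containing BND0, BND1, BND2 and BND3. TableGen evaluates these operators when it expands the member list of a register class.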
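To make the set operators concrete, here is a minimal Python sketch that models them on plain lists of register names. It is an illustration only, not TableGen's implementation: the function names sequence, add, sub and and_ are chosen here to mirror the operators, and the sketch assumes that a set is an ordered list without duplicates.

```python
def sequence(fmt, first, last):
    """Model of (sequence "FMT%u", first, last): an inclusive run of names."""
    return [fmt.replace("%u", str(i)) for i in range(first, last + 1)]


def add(*sets):
    """Model of (add ...): ordered union that keeps the first occurrence."""
    out = []
    for members in sets:
        for reg in members:
            if reg not in out:
                out.append(reg)
    return out


def sub(base, *removed):
    """Model of (sub base, ...): base without any of the removed members."""
    gone = {reg for members in removed for reg in members}
    return [reg for reg in base if reg not in gone]


def and_(a, b):
    """Model of (and a, b): members of a that also appear in b."""
    return [reg for reg in a if reg in b]


# A single register is written as a one-element list in this sketch.
BNDR = sequence("BND%u", 0, 3)
VK1 = sequence("K%u", 0, 7)
VK1WM = sub(VK1, ["K0"])          # mirrors (sub VK1, K0)
FR32 = sequence("XMM%u", 0, 15)
FR64 = add(FR32)                  # mirrors (add FR32): same members as FR32

print(BNDR)
print(VK1WM)
```

Under these assumptions, FR64 ends up with exactly the members of FR32, which is why the listing can define one class in terms of another with a bare (add FR32).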

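Specifically, in 64-bit mode both GR8 and GR8_NOREX have to exclude AH, BH, CH and DH. The reason is that these four registers cannot be encoded in an instruction that requires a REX prefix, while SIL, DIL, BPL, R8D and so on do require one. For example, addb %ah, %dil and movzbl %ah, %r8d cannot be encoded.

Likewise, v7.0 added register classes for SPH and the other registers of that kind; they are not listed one by one here.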

2.2.2.1.4. The ARM example

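The X86 register description is not especially complex; ARM is at the other extreme. The ARM architecture has 16 uniform 32-bit registers, and its VFP and NEON features add a further bank of 64-bit registers (16 or 32 of them, depending on the variant). VFP and NEON can treat these registers as having different sizes (32, 64, 128 or 256 bits). Because the relationships are this complex, ARM does not spell out every sub-register relationship piece by piece with SubRegIndex as X86 does; instead it relies on more automated TableGen facilities.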

let Namespace = "ARM" in {
def qqsub_0 : SubRegIndex<256>;
def qqsub_1 : SubRegIndex<256, 256>;

// Note: Code depends on these having consecutive numbers.
def qsub_0 : SubRegIndex<128>;
def qsub_1 : SubRegIndex<128, 128>;
def qsub_2 : ComposedSubRegIndex<qqsub_1, qsub_0>; // offset 256, size 128
def qsub_3 : ComposedSubRegIndex<qqsub_1, qsub_1>; // offset 384, size 128

def dsub_0 : SubRegIndex<64>;
def dsub_1 : SubRegIndex<64, 64>;
def dsub_2 : ComposedSubRegIndex<qsub_1, dsub_0>; // offset 128, size 64
def dsub_3 : ComposedSubRegIndex<qsub_1, dsub_1>; // offset 192, size 64
def dsub_4 : ComposedSubRegIndex<qsub_2, dsub_0>; // offset 256, size 64
def dsub_5 : ComposedSubRegIndex<qsub_2, dsub_1>; // offset 320, size 64
def dsub_6 : ComposedSubRegIndex<qsub_3, dsub_0>; // offset 384, size 64
def dsub_7 : ComposedSubRegIndex<qsub_3, dsub_1>; // offset 448, size 64

def ssub_0  : SubRegIndex<32>;
def ssub_1  : SubRegIndex<32, 32>;
def ssub_2  : ComposedSubRegIndex<dsub_1, ssub_0>; // offset 64, size 32
def ssub_3  : ComposedSubRegIndex<dsub_1, ssub_1>; // offset 96, size 32

def gsub_0  : SubRegIndex<32>;
def gsub_1  : SubRegIndex<32, 32>;
// Let TableGen synthesize the remaining 12 ssub_* indices.
// We don't need to name them.
}

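In the TD description, ARM's 16 Q registers (128 bits each) can also be used as D registers (64 bits each), and different ARM versions see either 16 or 32 D registers; the low D registers can in turn be used as S registers (32 bits each). Consecutive Q registers are further grouped into tuples of up to 512 bits. The SubRegIndex definitions shown above describe these layers: qqsub_* selects a 256-bit half, qsub_* a 128-bit Q register, dsub_* a 64-bit D register, and ssub_* a 32-bit S register.

The definitions above that derive from ComposedSubRegIndex describe how SubRegIndex values compose with one another.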

class ComposedSubRegIndex<SubRegIndex A, SubRegIndex B>
  : SubRegIndex<B.Size, !if(!eq(A.Offset, -1), -1,
                        !if(!eq(B.Offset, -1), -1,
                            !add(A.Offset, B.Offset)))> {
  // See SubRegIndex.
  let ComposedOf = [A, B];
}

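The Size of the composed SubRegIndex equals B's Size, and its Offset is the sum of A's and B's Offsets, provided neither of them is -1; otherwise it is -1. The comments in the ARM snippet above record the register slice that each of these SubRegIndex definitions describes. These slices need matching Register definitions, so TableGen also offers a helper class that describes registers in a more automated way.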
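The composition rule can be checked with a small Python model. This is a sketch for illustration only: the SubRegIndex dataclass and the compose function are hypothetical stand-ins for the TableGen classes, with the default offset of 0 assumed from the way SubRegIndex<256> and SubRegIndex<128> are used in the ARM snippet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubRegIndex:
    size: int        # width of the slice in bits
    offset: int = 0  # bit offset inside the super-register, -1 if unknown


def compose(a, b):
    """Model of ComposedSubRegIndex<A, B>: apply index A, then index B."""
    if a.offset == -1 or b.offset == -1:
        offset = -1
    else:
        offset = a.offset + b.offset
    return SubRegIndex(b.size, offset)


qqsub_1 = SubRegIndex(256, 256)
qsub_0 = SubRegIndex(128)
qsub_1 = SubRegIndex(128, 128)
dsub_0 = SubRegIndex(64)
dsub_1 = SubRegIndex(64, 64)

qsub_2 = compose(qqsub_1, qsub_0)
qsub_3 = compose(qqsub_1, qsub_1)
dsub_5 = compose(qsub_2, dsub_1)
dsub_7 = compose(qsub_3, dsub_1)

for name, idx in [("qsub_2", qsub_2), ("qsub_3", qsub_3),
                  ("dsub_5", dsub_5), ("dsub_7", dsub_7)]:
    print(name, "offset", idx.offset, "size", idx.size)
```

The values produced by the model agree with the offset and size comments attached to qsub_2, qsub_3, dsub_5 and dsub_7 in the ARM snippet.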

class RegisterTuples<list<SubRegIndex> Indices, list<dag> Regs> {
  // SubRegs - N lists of registers to be zipped up. Super-registers are
  // synthesized from the first element of each SubRegs list, the second
  // element and so on.
  list<dag> SubRegs = Regs;

  // SubRegIndices - N SubRegIndex instances. This provides the names of the
  // sub-registers in the synthesized super-registers.
  list<SubRegIndex> SubRegIndices = Indices;
}

The effect of RegisterTuples can be illustrated with the following code.

The definition def EvenOdd : RegisterTuples<[sube, subo], [(add R0, R2), (add R1, R3)]>; produces definitions equivalent to:

let SubRegIndices = [sube, subo] in {
  def R0_R1 : RegisterWithSubRegs<"", [R0, R1]>;
  def R2_R3 : RegisterWithSubRegs<"", [R2, R3]>;
}
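The zipping behaviour of the EvenOdd example can be mimicked in a few lines of Python. This is only a sketch: register_tuples is a hypothetical helper, registers are plain strings, and joining the member names with an underscore is assumed from the R0_R1 and R2_R3 names in the example.

```python
def register_tuples(indices, sub_reg_lists):
    """Zip N register lists into super-registers with N sub-registers each."""
    result = []
    for members in zip(*sub_reg_lists):
        name = "_".join(members)
        # Map each sub-register index to the member it addresses.
        result.append((name, dict(zip(indices, members))))
    return result


even_odd = register_tuples(["sube", "subo"], [["R0", "R2"], ["R1", "R3"]])
for name, sub_regs in even_odd:
    print(name, sub_regs)
```

The first element of each list forms the first super-register and the second element of each list forms the second, which matches the R0_R1 and R2_R3 definitions.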

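ARM's own definitions are more involved than this example. For instance, ARM describes the D register class (double-precision floating-point or generic 64-bit vector registers) like this: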

def DPR : RegisterClass<"ARM", [f64, v8i8, v4i16, v2i32, v1i64, v2f32], 64,
                        (sequence "D%u", 0, 31)> {
  // Allocate non-VFP2 registers D16-D31 first.
  let AltOrders = [(rotl DPR, 16)];
  let AltOrderSelect = [{ return 1; }];
}
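The AltOrders line in the DPR definition relies on the rotl operator, which rotates the member list to the left. A minimal Python sketch of that rotation follows; it assumes the class members are held in an ordered list, and rotl here is a hypothetical helper rather than TableGen's own code.

```python
def rotl(regs, n):
    """Rotate the list left by n positions: the first n members move to the end."""
    n %= len(regs)
    return regs[n:] + regs[:n]


DPR = [f"D{i}" for i in range(32)]
alt_order = rotl(DPR, 16)  # mirrors (rotl DPR, 16)

print(alt_order[:4], alt_order[-4:])
```

In this model the alternative order starts with D16 and ends with D15, which is consistent with the comment about allocating D16-D31 first.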

By this definition, the registers in the DPR class are D0~D31. The super-registers made of three consecutive D registers are then defined as:

def Tuples3D : RegisterTuples<[dsub_0, dsub_1, dsub_2],
                              [(shl DPR, 0),
                               (shl DPR, 1),
                               (shl DPR, 2)]>;

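(shl DPR, N) means dropping the first N members. So (shl DPR, 0) yields D0~D31, (shl DPR, 1) yields D1~D31, and (shl DPR, 2) yields D2~D31. Zipping the three lists together, and stopping when the shortest list runs out, generates the super-registers [D0, D1, D2], [D1, D2, D3], … [D29, D30, D31], whose sub-registers are addressed by dsub_0, dsub_1 and dsub_2 respectively.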
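The Tuples3D expansion can be reproduced with a short Python sketch. It is an illustration under simple assumptions: shl is a hypothetical helper that models the operator on a list of names, and the built-in zip stands in for the zipping that RegisterTuples performs.

```python
def shl(regs, n):
    """Model of (shl regs, n): drop the first n members."""
    return regs[n:]


DPR = [f"D{i}" for i in range(32)]

# One list per sub-register index: dsub_0, dsub_1, dsub_2.
lists = [shl(DPR, 0), shl(DPR, 1), shl(DPR, 2)]

# zip stops at the shortest list, so no partial tuple is produced.
tuples3d = list(zip(*lists))

print(len(tuples3d))
print(tuples3d[0], tuples3d[-1])
```

Because the shortest list (D2~D31) has 30 members, the model yields 30 tuples, starting at (D0, D1, D2) and ending at (D29, D30, D31).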

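Clearly this is far more compact than defining each register one by one, but the processing TableGen has to do becomes correspondingly much more complex.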
