LLVM Simple Register Coalescing

Simple Introduction

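This pass is similar in spirit to copy propagation. In short, it merges the live intervals of a copy's source and destination registers whenever that is possible, so that the copy itself becomes redundant and can be deleted. It works hand in hand with PHI elimination: it deletes the redundant COPY instructions that PHI elimination leaves behind.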

Example

The following is the machine instruction (MI) sequence before the pass.

# *** IR Dump Before Simple Register Coalescing (simple-register-coalescing) ***:
# Machine code for function mult: NoPHIs, TracksLiveness, TiedOpsRewritten
Function Live Ins: $edi in %3

0B	bb.0.entry:
	  successors: %bb.1(0x50000000), %bb.2(0x30000000); %bb.1(62.50%), %bb.2(37.50%)
	  liveins: $edi
16B	[[13]]  %3:gr32 = COPY $edi
32B	[[2]]  %4:gr32 = MOV32r0 implicit-def dead $eflags
48B	[[3]]  %5:gr8 = COPY %4.sub_8bit:gr32
64B	[[4]]  TEST8rr %5:gr8, %5:gr8, implicit-def $eflags
80B	[[5]]  JCC_1 %bb.2, 5, implicit killed $eflags
96B	[[6]]  JMP_1 %bb.1

112B	bb.1.if.then:
	; predecessors: %bb.0
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

128B	[[8]]  %0:gr32 = MOV32ri 2
144B	[[15]]  %7:gr32 = COPY %0:gr32
160B	[[9]]  JMP_1 %bb.3

176B	bb.2.if.else:
	; predecessors: %bb.0
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

192B	[[17]]  %1:gr32 = COPY %3:gr32
208B	[[7]]  %1:gr32 = nsw ADD32ri8 %1:gr32(tied-def 0), 2, implicit-def dead $eflags
224B	[[16]]  %7:gr32 = COPY %1:gr32

240B	bb.3.if.end:
	; predecessors: %bb.2, %bb.1

256B	[[14]]  %2:gr32 = COPY %7:gr32
272B	[[18]]  %6:gr32 = COPY %2:gr32
288B	[[10]]  %6:gr32 = nsw INC32r %6:gr32(tied-def 0), implicit-def dead $eflags
304B	[[11]]  $eax = COPY %6:gr32
320B	[[12]]  RET 0, killed $eax

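After the pass, the sequence becomes the following. The dump is taken just before the next pass, Rename Disconnected Subregister Components.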

# *** IR Dump Before Rename Disconnected Subregister Components (rename-independent-subregs) ***:
# Machine code for function mult: NoPHIs, TracksLiveness, TiedOpsRewritten
Function Live Ins: $edi in %3

0B	bb.0.entry:
	  successors: %bb.1(0x50000000), %bb.2(0x30000000); %bb.1(62.50%), %bb.2(37.50%)
	  liveins: $edi
16B	[[13]]  %7:gr32 = COPY $edi
32B	[[2]]  %4:gr32 = MOV32r0 implicit-def dead $eflags
64B	[[4]]  TEST8rr %4.sub_8bit:gr32, %4.sub_8bit:gr32, implicit-def $eflags
80B	[[5]]  JCC_1 %bb.2, 5, implicit killed $eflags
96B	[[6]]  JMP_1 %bb.1

112B	bb.1.if.then:
	; predecessors: %bb.0
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

128B	[[8]]  %7:gr32 = MOV32ri 2
160B	[[9]]  JMP_1 %bb.3

176B	bb.2.if.else:
	; predecessors: %bb.0
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

208B	[[7]]  %7:gr32 = nsw ADD32ri8 %7:gr32(tied-def 0), 2, implicit-def dead $eflags

240B	bb.3.if.end:
	; predecessors: %bb.2, %bb.1

288B	[[10]]  %7:gr32 = nsw INC32r %7:gr32(tied-def 0), implicit-def dead $eflags
304B	[[11]]  $eax = COPY %7:gr32
320B	[[12]]  RET 0, killed $eax
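
As a rough way to read the two dumps side by side, the sketch below groups the virtual registers of the "before" dump into classes connected by COPY instructions, using a plain union-find. It is only a toy under strong assumptions: `CopyClasses`, `copiesInDump` and `classifyExample` are hypothetical names, not LLVM APIs, and the conflict checks that the real pass performs are left out completely. The representative the toy picks is arbitrary and does not predict which register name survives.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Toy union-find over virtual register numbers (hypothetical helper, not an
// LLVM class). It ignores live-interval conflicts entirely.
struct CopyClasses {
  std::map<int, int> Parent;

  // Returns the representative of R's class, compressing the path on the way.
  int find(int R) {
    std::map<int, int>::iterator It = Parent.find(R);
    if (It == Parent.end() || It->second == R)
      return R;
    int Root = find(It->second);
    Parent[R] = Root;
    return Root;
  }

  // Puts Dst and Src into one class, as if "Dst = COPY Src" were coalesced.
  void join(int Dst, int Src) {
    int DstRoot = find(Dst);
    int SrcRoot = find(Src);
    Parent[SrcRoot] = DstRoot;
  }

  bool sameClass(int A, int B) { return find(A) == find(B); }
};

// The virtual-to-virtual copies of the "before" dump as (dst, src) pairs:
// [[3]], [[15]], [[17]], [[16]], [[14]], [[18]]. Copies from or to physical
// registers ($edi, $eax) are not included.
std::vector<std::pair<int, int>> copiesInDump() {
  std::vector<std::pair<int, int>> Copies;
  Copies.push_back(std::make_pair(5, 4)); // [[3]]  %5 = COPY %4.sub_8bit
  Copies.push_back(std::make_pair(7, 0)); // [[15]] %7 = COPY %0
  Copies.push_back(std::make_pair(1, 3)); // [[17]] %1 = COPY %3
  Copies.push_back(std::make_pair(7, 1)); // [[16]] %7 = COPY %1
  Copies.push_back(std::make_pair(2, 7)); // [[14]] %2 = COPY %7
  Copies.push_back(std::make_pair(6, 2)); // [[18]] %6 = COPY %2
  return Copies;
}

// Builds the copy-connected classes for the example function.
CopyClasses classifyExample() {
  CopyClasses CC;
  std::vector<std::pair<int, int>> Copies = copiesInDump();
  for (std::size_t I = 0; I < Copies.size(); ++I)
    CC.join(Copies[I].first, Copies[I].second);
  return CC;
}
```

Calling `classifyExample()` leaves two classes, one holding %0, %1, %2, %3, %6 and %7 and one holding %4 and %5. That is consistent with the "after" dump, where only %7 and %4 remain.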

The following log shows the merging process. The [[n]] tag on each entry identifies the COPY being processed.

join-begin
Self interval:
32r::48r  %4  [[3]]  // If %4 were merged into %5, instruction [[2]] would have to be split so that it defines two destinations, %4 and %5. So although the copy's original destination is %5, the pass merges %5 into %4.
Other interval:
48r::64r  %5
join-end
join-begin
Self interval:
144r::176B  %7  [[15]]
224r::240B
240B::256r
Other interval:
128r::144r  %0
join-end
join-begin
Self interval:
192r::208r  %1  [[17]]  
208r::224r
Other interval:
16r::112B  %3
176B::192r
join-end
join-begin
Self interval:
128r::176B  {%0, %7}  [[16]]
224r::240B
240B::256r
Other interval:
16r::112B  {%1, %3}
176B::208r
208r::224r
join-end
join-begin
Self interval:
16r::112B  {%0, %1, %3, %7}  [[14]]  // Although the destination is %2, after several rounds of joinCopy the source's live interval has grown larger than the destination's. So %2 is merged into the source and deleted.
128r::176B
176B::208r
208r::240B
240B::256r
Other interval:
256r::272r  {%2}
join-end
join-begin
Self interval:
16r::112B  {%0, %1, %2, %3, %7}  [[18]]
128r::176B
176B::208r
208r::240B
240B::272r
Other interval:
272r::288r  {%6}
288r::304r
join-end
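
The bookkeeping in the log above can be imitated with a small toy: two sorted segment lists are joined only if they do not overlap, and the interval with more segments is kept as the destination, as in the note on [[14]]. Everything here is an assumption made for illustration: `Segment`, `ToyInterval`, `slotB`, `slotR`, `overlaps` and `tryJoin` are hypothetical names, "larger" is taken to mean "has more segments", and any overlap is treated as a conflict, whereas the real pass analyzes conflicts value by value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Half-open range [Start, End) over encoded slot indexes.
struct Segment {
  int Start;
  int End;
};

// Toy stand-in for a live interval: a register number plus sorted,
// non-overlapping segments. Value numbers are not modeled.
struct ToyInterval {
  int Reg;
  std::vector<Segment> Segs;
};

// Assumed encoding of the "<N>B" and "<N>r" positions printed in the dumps,
// chosen only so that N B < N r < (N + 16) B.
constexpr int slotB(int N) { return N * 4; }
constexpr int slotR(int N) { return N * 4 + 2; }

// True if any segment of A intersects any segment of B.
bool overlaps(const ToyInterval &A, const ToyInterval &B) {
  std::size_t I = 0, J = 0;
  while (I < A.Segs.size() && J < B.Segs.size()) {
    if (A.Segs[I].End <= B.Segs[J].Start)
      ++I;
    else if (B.Segs[J].End <= A.Segs[I].Start)
      ++J;
    else
      return true;
  }
  return false;
}

// Tries to join the two intervals of "Dst = COPY Src". On success the union
// ends up in Dst, Src is emptied, and Dst.Reg names the surviving register.
bool tryJoin(ToyInterval &Dst, ToyInterval &Src) {
  if (overlaps(Dst, Src))
    return false; // simplification: every overlap counts as a conflict
  if (Src.Segs.size() > Dst.Segs.size())
    std::swap(Dst, Src); // keep the interval with more segments
  std::vector<Segment> Sorted;
  std::merge(Dst.Segs.begin(), Dst.Segs.end(), Src.Segs.begin(),
             Src.Segs.end(), std::back_inserter(Sorted),
             [](const Segment &L, const Segment &R) {
               return L.Start < R.Start;
             });
  std::vector<Segment> Fused;
  for (const Segment &S : Sorted) {
    assert(Fused.empty() || Fused.back().End <= S.Start);
    if (!Fused.empty() && Fused.back().End == S.Start)
      Fused.back().End = S.End; // touching segments are fused
    else
      Fused.push_back(S);
  }
  Dst.Segs = std::move(Fused);
  Src.Segs.clear();
  return true;
}
```

Joining %7 (144r::176B, 224r::240B, 240B::256r) with %0 (128r::144r) in this toy keeps %7 and yields a first segment of 128r::176B, as in the log. The log keeps some touching segments apart, presumably because they carry different value numbers; the toy has no value numbers and fuses them, so its segment counts can be smaller than the ones printed above.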

When the pass meets a copy whose source and destination are different registers, it tries to merge the live intervals of the two registers. If either of the two intervals is too large, the pass gives up on that copy. If the two intervals conflict, the pass usually gives up as well, with two exceptions. If the conflict is caused by a coalescable copy, the intervals can still be merged. If the conflict is caused by a non-coalescable copy, the pass walks back along the copy chain of both values; if the chain of the destination and the chain of the source end at the same value, the intervals can be merged. The following example illustrates this last case.

x4 = y1 add y2
x3 copy x4
x1 copy x2
src copy x3
dest copy x1   (at this point both dest and src are equal to x4, so this MI can be erased, and analyzeValue returns CR_Erase)
y3 = src and src
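
The chain walk in this example can be sketched as follows. This is a toy, not the code of the pass: `CopyMap`, `walkCopyChain`, `sameOrigin` and `postExample` are hypothetical names. The example above does not show how x2 is defined, so the sketch assumes that x2 is itself a copy of x4, which is what the stated conclusion (dest equals x4) requires.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Maps each value defined by a plain copy to the value it copies from.
// Values that are not keys are treated as real (non-copy) definitions.
typedef std::map<std::string, std::string> CopyMap;

// Follows copies backwards from V until a non-copy definition is reached.
// Returns an empty string if the chain turns out to be cyclic.
std::string walkCopyChain(const CopyMap &Copies, std::string V) {
  for (std::size_t Steps = 0; Steps <= Copies.size(); ++Steps) {
    CopyMap::const_iterator It = Copies.find(V);
    if (It == Copies.end())
      return V; // V is not defined by a copy: end of the chain
    V = It->second;
  }
  return std::string(); // more steps than copies: must be a cycle
}

// True if both values trace back to the same non-copy definition.
bool sameOrigin(const CopyMap &Copies, const std::string &A,
                const std::string &B) {
  std::string OA = walkCopyChain(Copies, A);
  std::string OB = walkCopyChain(Copies, B);
  return !OA.empty() && OA == OB;
}

// The copies of the example. "x2 copy x4" is an ASSUMPTION; it is not shown
// in the original listing.
CopyMap postExample() {
  CopyMap M;
  M["x3"] = "x4";
  M["x2"] = "x4"; // assumed
  M["x1"] = "x2";
  M["src"] = "x3";
  M["dest"] = "x1";
  return M;
}
```

With these copies, `sameOrigin(postExample(), "src", "dest")` holds because both chains end at x4. Without the assumed x2 entry the chain of dest stops at x2, and the two values would no longer be recognized as identical.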

There are, of course, other conditions to consider. They are described in the body of JoinVals::analyzeValue.
