LLVM Simple Register Coalescing
Simple Introduction
This pass is similar to copy propagation. In short, it merges the live intervals of a copy's source and destination whenever that is possible. It also cooperates with PHI elimination by deleting the redundant COPY instructions that PHI elimination leaves behind.
Example
The following is the MI sequence before the pass.
# *** IR Dump Before Simple Register Coalescing (simple-register-coalescing) ***:
# Machine code for function mult: NoPHIs, TracksLiveness, TiedOpsRewritten
Function Live Ins: $edi in %3
0B bb.0.entry:
successors: %bb.1(0x50000000), %bb.2(0x30000000); %bb.1(62.50%), %bb.2(37.50%)
liveins: $edi
16B [[13]] %3:gr32 = COPY $edi
32B [[2]] %4:gr32 = MOV32r0 implicit-def dead $eflags
48B [[3]] %5:gr8 = COPY %4.sub_8bit:gr32
64B [[4]] TEST8rr %5:gr8, %5:gr8, implicit-def $eflags
80B [[5]] JCC_1 %bb.2, 5, implicit killed $eflags
96B [[6]] JMP_1 %bb.1
112B bb.1.if.then:
; predecessors: %bb.0
successors: %bb.3(0x80000000); %bb.3(100.00%)
128B [[8]] %0:gr32 = MOV32ri 2
144B [[15]] %7:gr32 = COPY %0:gr32
160B [[9]] JMP_1 %bb.3
176B bb.2.if.else:
; predecessors: %bb.0
successors: %bb.3(0x80000000); %bb.3(100.00%)
192B [[17]] %1:gr32 = COPY %3:gr32
208B [[7]] %1:gr32 = nsw ADD32ri8 %1:gr32(tied-def 0), 2, implicit-def dead $eflags
224B [[16]] %7:gr32 = COPY %1:gr32
240B bb.3.if.end:
; predecessors: %bb.2, %bb.1
256B [[14]] %2:gr32 = COPY %7:gr32
272B [[18]] %6:gr32 = COPY %2:gr32
288B [[10]] %6:gr32 = nsw INC32r %6:gr32(tied-def 0), implicit-def dead $eflags
304B [[11]] $eax = COPY %6:gr32
320B [[12]] RET 0, killed $eax
After the pass, the sequence is as follows. The dump is printed before the next pass, so its header names that pass instead.
# *** IR Dump Before Rename Disconnected Subregister Components (rename-independent-subregs) ***:
# Machine code for function mult: NoPHIs, TracksLiveness, TiedOpsRewritten
Function Live Ins: $edi in %3
0B bb.0.entry:
successors: %bb.1(0x50000000), %bb.2(0x30000000); %bb.1(62.50%), %bb.2(37.50%)
liveins: $edi
16B [[13]] %7:gr32 = COPY $edi
32B [[2]] %4:gr32 = MOV32r0 implicit-def dead $eflags
64B [[4]] TEST8rr %4.sub_8bit:gr32, %4.sub_8bit:gr32, implicit-def $eflags
80B [[5]] JCC_1 %bb.2, 5, implicit killed $eflags
96B [[6]] JMP_1 %bb.1
112B bb.1.if.then:
; predecessors: %bb.0
successors: %bb.3(0x80000000); %bb.3(100.00%)
128B [[8]] %7:gr32 = MOV32ri 2
160B [[9]] JMP_1 %bb.3
176B bb.2.if.else:
; predecessors: %bb.0
successors: %bb.3(0x80000000); %bb.3(100.00%)
208B [[7]] %7:gr32 = nsw ADD32ri8 %7:gr32(tied-def 0), 2, implicit-def dead $eflags
240B bb.3.if.end:
; predecessors: %bb.2, %bb.1
288B [[10]] %7:gr32 = nsw INC32r %7:gr32(tied-def 0), implicit-def dead $eflags
304B [[11]] $eax = COPY %7:gr32
320B [[12]] RET 0, killed $eax
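One way to see why only %4 and %7 survive is to treat every coalesced COPY between two virtual registers as a union operation. The sketch below is a toy union-find, not the data structure the pass itself uses. `RegClasses` and `replayCopies` are hypothetical names, and the (dest, src) pairs are read off the first dump ([[3]], [[15]], [[17]], [[16]], [[14]], [[18]]). The copies [[13]] and [[11]] involve physical registers and are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Toy model only: a union-find over virtual register numbers. The real
// coalescer merges live intervals and rewrites operands instead.
class RegClasses {
public:
  explicit RegClasses(std::size_t numRegs) : parent_(numRegs) {
    std::iota(parent_.begin(), parent_.end(), 0u);
  }

  // Representative of the class containing reg, with path halving.
  unsigned find(unsigned reg) {
    assert(reg < parent_.size() && "register number out of range");
    while (parent_[reg] != reg) {
      parent_[reg] = parent_[parent_[reg]];
      reg = parent_[reg];
    }
    return reg;
  }

  // Record that a copy between a and b was coalesced.
  void join(unsigned a, unsigned b) { parent_[find(a)] = find(b); }

  // Number of distinct classes, i.e. registers that remain after merging.
  std::size_t countClasses() {
    std::size_t n = 0;
    for (unsigned r = 0; r < parent_.size(); ++r)
      if (find(r) == r)
        ++n;
    return n;
  }

private:
  std::vector<unsigned> parent_;
};

// Applies the six virtual-register copies of the first dump as (dest, src).
inline RegClasses replayCopies() {
  RegClasses classes(8); // %0 .. %7
  const std::pair<unsigned, unsigned> copies[] = {
      {5, 4}, {7, 0}, {1, 3}, {7, 1}, {2, 7}, {6, 2}};
  for (const auto &c : copies)
    classes.join(c.first, c.second);
  return classes;
}
```

Replaying the copies leaves two classes, {%4, %5} and {%0, %1, %2, %3, %6, %7}, which matches the two virtual registers that remain in the second dump.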
The following shows the merging process.
join-begin
Self interval:
32r::48r %4 [[3]] // If %4 were merged into %5, [[2]] would have to be split so that it defines both %4 and %5. So although the dest of the copy is %5, the pass merges %5 into %4.
Other interval:
48r::64r %5
join-end
join-begin
Self interval:
144r::176B %7 [[15]]
224r::240B
240B::256r
Other interval:
128r::144r %0
join-end
join-begin
Self interval:
192r::208r %1 [[17]]
208r::224r
Other interval:
16r::112B %3
176B::192r
join-end
join-begin
Self interval:
128r::176B {%0, %7} [[16]]
224r::240B
240B::256r
Other interval:
16r::112B {%1, %3}
176B::208r
208r::224r
join-end
join-begin
Self interval:
16r::112B {%0, %1, %3, %7} [[14]] // Although the dest is %2, after several rounds of joinCopy the src's live interval is larger than the dest's, so %2 is merged into the src and deleted.
128r::176B
176B::208r
208r::240B
240B::256r
Other interval:
256r::272r {%2}
join-end
join-begin
Self interval:
16r::112B {%0, %1, %2, %3, %7} [[18]]
128r::176B
176B::208r
208r::240B
240B::272r
Other interval:
272r::288r {%6}
288r::304r
join-end
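In every join of the trace above, the two intervals only touch and never overlap, which is the simplest case for merging. The check can be sketched as follows, assuming segments are half-open ranges over plain integers, so the B and r slot suffixes of the trace are dropped. `Segment`, `Interval`, `overlaps` and `joinIntervals` are hypothetical names and not LLVM API. The real pass also tracks value numbers and can resolve some overlaps, which this sketch does not attempt.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical simplification: a segment is a half-open range [start, end).
struct Segment {
  unsigned start;
  unsigned end;
};

// An interval is a list of segments sorted by start and pairwise disjoint.
using Interval = std::vector<Segment>;

// True if any segment of a overlaps any segment of b (two-pointer sweep).
inline bool overlaps(const Interval &a, const Interval &b) {
  std::size_t i = 0, j = 0;
  while (i < a.size() && j < b.size()) {
    if (a[i].end <= b[j].start)
      ++i; // a[i] lies entirely before b[j]
    else if (b[j].end <= a[i].start)
      ++j; // b[j] lies entirely before a[i]
    else
      return true;
  }
  return false;
}

// Union of two conflict-free intervals, kept sorted by start.
// Touching segments are left separate because value numbers are not modeled.
inline Interval joinIntervals(const Interval &a, const Interval &b) {
  assert(!overlaps(a, b) && "this sketch only joins conflict-free intervals");
  Interval out;
  std::merge(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out),
             [](const Segment &x, const Segment &y) { return x.start < y.start; });
  return out;
}
```

With %7 as {[144,176), [224,240), [240,256)} and %0 as {[128,144)} from the [[15]] join, `overlaps` returns false and `joinIntervals` yields four segments starting at 128. The trace shows 128r::176B as a single segment after that join, while the sketch keeps the two touching segments apart.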
When the pass encounters a copy whose src and dest are different registers, it tries to merge the live intervals of the two registers. If either interval is too large, the pass gives up on that copy. If the two intervals conflict, the pass usually gives up as well, with two exceptions. If the conflict is caused by a coalescable copy, the intervals can still be merged. If the conflict is caused by a copy that is not coalescable, the pass walks back along the copy chains of both values, and if the chain of dst and the chain of src end at the same definition, the intervals can still be merged. The following example illustrates the second case.
x4 = y1 add y2
x3 copy x4
x1 copy x2
src copy x3
dest copy x1 (at this point dest and src both hold the value of x4, so this MI can be erased, and analyzeValue returns CR_Erase)
y3 = src and src
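The copy-chain walk in this example can be sketched as below. This is a simplified model and not the code of analyzeValue. `CopyMap`, `chainEnd` and `sameChainEnd` are hypothetical names, and registers are plain strings. A register defined by a copy maps to its source, and a register defined by any other instruction (such as x4) has no entry and ends the chain.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical model: copy-defined register -> register it copies from.
using CopyMap = std::map<std::string, std::string>;

// Follows copies backwards from reg until a non-copy definition is reached.
inline std::string chainEnd(const CopyMap &copies, std::string reg) {
  // An acyclic chain has at most copies.size() links, so the loop is bounded.
  for (std::size_t step = 0; step < copies.size(); ++step) {
    auto it = copies.find(reg);
    if (it == copies.end())
      break;
    reg = it->second;
  }
  // Still on a copy after the bounded walk means the map has a cycle.
  assert(copies.find(reg) == copies.end() && "copy chain contains a cycle");
  return reg;
}

// True if both registers trace back to the same defining register.
inline bool sameChainEnd(const CopyMap &copies, const std::string &dst,
                         const std::string &src) {
  return chainEnd(copies, dst) == chainEnd(copies, src);
}
```

For the listing above, the map would hold x3 -> x4, x1 -> x2, src -> x3 and dest -> x1. The listing does not show where x2 comes from. Under the assumption that x2 is itself a copy of x4, `sameChainEnd(copies, "dest", "src")` is true, which corresponds to the case in which the copy into dest can be erased.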
There are, of course, other conditions to consider. They are described in the body of JoinVals::analyzeValue.