Exploring ARM64 runtime patching alternatives

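This article introduces the Linux kernel's alternatives framework, which patches code at boot time according to the CPU's features, either to take advantage of hardware capabilities or to work around bugs. Using the ARM64 CRC32 instructions as an example, it explains how the macros implement code replacement and demonstrates the patching process with QEMU and GDB.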

https://blogs.oracle.com/linux/post/exploring-arm64-runtime-patching-alternatives

Contents

Introduction

Background

Linux Alternatives Framework

Syntax of the Framework's Macro

Examining the code with QEMU and GDB

Conclusion


Introduction

Some modern CPUs come with dedicated instructions that optimize specific operations. For example, ARMv8 has CRC32 instructions to accelerate CRC calculations. The problem is that those instructions can only be executed by a processor that supports them. Although the CPU has a feature register that identifies its capabilities, checking that register every time before executing such an instruction is time-consuming. Fortunately, the Linux kernel has a set of macros and functions, known as the Linux Alternatives Framework, that helps solve this problem. This blog post gives an overview of the framework.

Background

Building and running the Linux kernel involves compiling the source code into an image file, loading the image file into memory, and then initiating execution. The image file is in Executable and Linkable Format (ELF). An ELF file comprises multiple sections: the ".text" section stores the executable code, the ".data" section stores initialized data, and other sections store different types of data. Usually, the code is executed without modification. In some cases, portions of the code need to be replaced (patched), either to take advantage of hardware features or to work around bugs.

 

Linux Alternatives Framework

The Linux Alternatives Framework is a set of macros that kernel developers can use to prepare their code for boot time patching. It is available for multiple CPU architectures, including x86, ARM64, S390, and PA-RISC. The alternative macro stores the default (original) code in the .text 0 subsection and the replacement code in the .text 1 subsection. The macro also creates an 'alt_instr' structure containing the offsets of both code sequences, their lengths, and the CPU feature bit. The structure is stored in the .altinstructions section.

struct alt_instr {
    s32 orig_offset; /* offset to original instruction          */
    s32 alt_offset;  /* offset to replacement instruction       */
    u16 cpufeature;  /* cpufeature bit set for replacement      */
    u8 orig_len;     /* size of original instruction(s)         */
    u8 alt_len;      /* size of new instruction(s), <= orig_len */
};
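
The two offset fields hold the distance from the field itself to the code, not an absolute address. The following user-space sketch shows how such a self-relative offset can be stored and resolved. It is only an illustration: struct demo_image, set_rel() and rel_ptr() are hypothetical names that do not come from the kernel, and the kernel types are replaced with their <stdint.h> equivalents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space copy of the layout above, using <stdint.h> types. */
struct alt_instr {
    int32_t  orig_offset;
    int32_t  alt_offset;
    uint16_t cpufeature;
    uint8_t  orig_len;
    uint8_t  alt_len;
};

/*
 * Hypothetical image: default code, replacement code and one entry live
 * in a single object, so pointer arithmetic between them is well defined.
 */
struct demo_image {
    uint32_t text[4];
    uint32_t alt_text[4];
    struct alt_instr entry;
};

/* Store "target minus the address of the field" in the field. */
static void set_rel(int32_t *field, const void *target)
{
    ptrdiff_t delta = (const char *)target - (const char *)field;

    assert(delta >= INT32_MIN && delta <= INT32_MAX);
    *field = (int32_t)delta;
}

/* Resolve a self-relative offset back to an absolute address. */
static void *rel_ptr(int32_t *field)
{
    return (char *)field + *field;
}
```

For example, after set_rel(&img.entry.orig_offset, &img.text[1]) on a struct demo_image named img, the call rel_ptr(&img.entry.orig_offset) yields &img.text[1] again. Because the entry is placed after the code in this sketch, the stored offset is negative.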

At boot time, the Linux kernel walks through the .altinstructions section and compares each 'alt_instr' structure with the running CPU's features. If the machine does not have the specific feature, the default code remains unchanged. Otherwise, the kernel replaces the default code with the replacement code using the information available in the 'alt_instr' structure.
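
The boot-time walk can be pictured with a small user-space simulation. This is a sketch under stated assumptions, not the kernel's implementation: struct demo_alt, demo_apply_alternatives(), DEMO_HAS_CRC32 and INSN_B are hypothetical, plain array indices stand in for the relative offsets, and everything else a real kernel must take care of when it rewrites its own code (for example instruction cache maintenance) is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_HAS_CRC32 0            /* hypothetical feature bit number      */
#define INSN_B         0x14000010u  /* stand-in for "b crc32_le_base"       */
#define INSN_NOP       0xd503201fu  /* AArch64 NOP encoding                 */

/* Simplified entry: array indices instead of self-relative offsets. */
struct demo_alt {
    size_t   orig_idx;    /* first original instruction in text[]    */
    size_t   alt_idx;     /* first replacement instruction in repl[] */
    uint16_t cpufeature;  /* feature bit that enables the patch      */
    uint8_t  orig_len;    /* length of the original code in bytes    */
    uint8_t  alt_len;     /* length of the replacement in bytes      */
};

/*
 * Walk all entries and patch those whose feature bit is set in
 * cpu_features. Returns the number of patched entries, or -1 if an
 * entry has mismatched lengths.
 */
static int demo_apply_alternatives(uint32_t *text, const uint32_t *repl,
                                   const struct demo_alt *alts, size_t n,
                                   uint64_t cpu_features)
{
    int patched = 0;

    for (size_t i = 0; i < n; i++) {
        const struct demo_alt *a = &alts[i];

        if (!(cpu_features & (UINT64_C(1) << a->cpufeature)))
            continue;               /* feature absent: keep the default code */
        if (a->alt_len != a->orig_len)
            return -1;              /* lengths must match                    */
        memcpy(&text[a->orig_idx], &repl[a->alt_idx], a->alt_len);
        patched++;
    }
    return patched;
}
```

With one entry describing a 4-byte branch at text[0] and a NOP at repl[0], calling the function with cpu_features set to 0 leaves text[0] untouched, while calling it with bit DEMO_HAS_CRC32 set overwrites text[0] with INSN_NOP.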

Syntax of the Framework's Macro

The macro syntax is similar to an if-then-else statement and is prefixed with the word alternative_. For example, the alternative_if is similar to the if statement, the alternative_if_not is similar to the if not, the alternative_else is similar to an else statement, and so on. The if macro marks the beginning of a code section, and the else macro starts a new code section. Finally, an endif macro ends the clause.

Let's pick the following 'crc32_le' function in arch/arm64/lib/crc32.S as an example. By default, the function assumes that the machine does not have the specific hardware capability and branches to a routine that uses the software CRC function (b crc32_le_base). When the code runs on a machine with the hardware capability, the alternative macro causes the branch to be replaced by a NOP, so execution falls through to the code that uses the hardware CRC instructions.

Original code segment                  Prefix removed
-----------------------------------    ------------------------------------------------------------------------------------------
SYM_FUNC_START(crc32_le)               SYM_FUNC_START(crc32_le)    /* start of the function                                    */
alternative_if_not ARM64_HAS_CRC32     if_not ARM64_HAS_CRC32      /* assuming the runtime machine has no hardware CRC feature */
    b crc32_le_base                        b crc32_le_base         /* default branch to software CRC routine                   */
alternative_else_nop_endif             else_nop_endif              /* patch with nop if machine has hardware CRC feature       */
    __crc32                                __crc32                 /* a macro which uses hardware CRC instructions             */
SYM_FUNC_END(crc32_le)                 SYM_FUNC_END(crc32_le)      /* end of the function                                      */

After expanding the example macro, we can see how it creates the 'alt_instr' structure and stores the replacement code in a separate section. You can refer to the following code block for a line-by-line explanation. In summary, the macro uses multiple assembler directives to calculate the offsets of the original and replacement code. The replacement code is then stored in .text subsection 1. Once an 'alt_instr' is created, the kernel can use it to patch the code at boot time.

// SYM_FUNC_START(crc32_le)
// alternative_if_not ARM64_HAS_CRC32
//     b crc32_le_base
// ....................................................................................
crc32_le:                            // function name
  .set .Lasm_alt_mode, 0             // set asm_alt_mode to 0. The asm_alt_mode controls which section
                                     // to use in the else statement at label 662.
                                     // mode 0 = the code after the else statement stores in .text 1
                                     // mode 1 = the code after the else statement stores in .text 0
  .pushsection .altinstructions, "a" // append following data to .altinstructions
  .word 661f - .                     // offset to original instruction
  .word 663f - .                     // offset to replacement instruction
  .hword ARM64_HAS_CRC32             // cpufeature bit set for replacement
  .byte 662f-661f                    // size of the original instruction(s)
  .byte 664f-663f                    // size of the new instruction(s)
  .popsection                        // restore to .text 0 section
661:                                 // start of the original instruction
  b    crc32_le_base                 // original instruction (software CRC)

// alternative_else_nop_endif
//     __crc32
// SYM_FUNC_END(crc32_le)
// ....................................................................................
662:                                 // end of the original instruction
    .if .Lasm_alt_mode==0            // if mode == 0 then stores the following code in .text 1
    .subsection 1                    // stores following inst in .text 1
    .else
    .previous
    .endif
663:                                 // start of the replacement code
   nops (662b-661b) / AARCH64_INSN_SIZE // emits as many nops as there are original
                                     // instruction(s), i.e., the length of the replacement
                                     // code must be the same as the original code
664:                                 // end of the replacement code
    .if .Lasm_alt_mode==0
    .previous                        // restore to .text 0
    .endif
    .org . - (664b-663b) + (662b-661b) // This is a build time check to make sure the replacement
                                       // code has the same length as the original code.
                                       // (664b - 663b) is the length of the replacement code
                                       // (662b - 661b) is the length of the original code
                                       // .org cannot move the location counter backwards, so
                                       // - (664b-663b) + (662b-661b) must be 0.
                                       // If it is < 0 (replacement longer), this line causes a build time error.
                                       // If it is > 0 (replacement shorter), the following line causes a build time error.
    .org . - (662b-661b) + (664b-663b)
    __crc32                          // Use hardware CRC
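
The pair of .org directives relies on the assembler refusing to move the location counter backwards. The arithmetic can be replayed in a few lines of C. This is only a model of the check: org_lengths_ok() is a hypothetical helper, and the starting value of the location counter is arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the two .org directives. Each ".org target" is accepted only
 * if target is not below the current location counter; on success the
 * counter moves to target.
 */
static bool org_lengths_ok(int64_t orig_len, int64_t alt_len)
{
    int64_t lc = 0x1000;    /* arbitrary current location counter */
    int64_t target;

    /* .org . - (664b-663b) + (662b-661b) */
    target = lc - alt_len + orig_len;
    if (target < lc)
        return false;       /* replacement longer than original */
    lc = target;

    /* .org . - (662b-661b) + (664b-663b) */
    target = lc - orig_len + alt_len;
    if (target < lc)
        return false;       /* replacement shorter than original */

    return true;            /* only reached when both lengths are equal */
}
```

With orig_len = 4 and alt_len = 4 both directives leave the counter where it is. With orig_len = 4 and alt_len = 8 the first directive would have to move backwards by 4 bytes, and with orig_len = 8 and alt_len = 4 the second one would, so either mismatch is rejected.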

Examining the code with QEMU and GDB

Now that we have some idea of how the macro works, we can use QEMU and GDB to see how the Linux kernel performs the patching. On an ARMv8 host machine, start QEMU with -cpu host and use -S to cause QEMU to wait for the GDB connection.

qemu-system-aarch64 -machine virt,gic-version=3 \
     -cpu host -m 8192 -nographic \
     -gdb tcp::1234 -kernel Image -S

On a separate host terminal, start GDB and connect to the guest. Disassemble crc32_le before and after the patching.

# start gdb and load symbols from vmlinux
gdb vmlinux

# basic gdb setup and connect to the remote target
(gdb) set multiple-symbols ask
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0x0000000040000000 in ?? ()

# set breakpoints before patching.
(gdb) hbreak arch/arm64/kernel/alternative.c:apply_alternatives_all
Hardware assisted breakpoint 1 at 0xffff8000113b4ea4: file arch/arm64/kernel/alternative.c, line 229.

# set a breakpoint after the patching. The breakpoint location in this example is
# arch/arm64/kernel/alternative.c, line 229 + 1. That is,
# arch/arm64/kernel/alternative.c, line 230
(gdb) hbreak arch/arm64/kernel/alternative.c:230
Hardware assisted breakpoint 2 at 0xffff8000113b4ec0: file arch/arm64/kernel/alternative.c, line 230.

# set breakpoints at crc32_le
(gdb) hbreak crc32_le
[0] cancel
[1] all
[2] arch/arm64/lib/crc32.S:crc32_le
[3] lib/crc32.c:crc32_le
> 2
# make a note of the crc32_le address 0xffff800010590d60; we need it to disassemble the function
Hardware assisted breakpoint 3 at 0xffff800010590d60: file arch/arm64/lib/crc32.S, line 90.

# continue until we hit the first breakpoint
(gdb) continue
Continuing.

Breakpoint 1, apply_alternatives_all () at arch/arm64/kernel/alternative.c:229
229             stop_machine(__apply_alternatives_multi_stop, NULL, cpu_online_mask);

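# disassemble crc32_le before the patching.
# we should see the branch to the software CRC routine ("b crc32_le_base" in the source),
# which is the normal non-optimized way of calculating CRC. GDB labels the branch target
# <crc32_le> here, but its address differs from that of the function being disassembled.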
(gdb) disassemble 0xffff800010590d60,+12
Dump of assembler code from 0xffff800010590d60 to 0xffff800010590d6c:
   0xffff800010590d60 <crc32_le+0>:     b       0xffff8000105ac928 <crc32_le> <== BEFORE PATCHING
   0xffff800010590d64 <crc32_le+4>:     cmp     x2, #0x10
   0xffff800010590d68 <crc32_le+8>:     b.lt    0xffff800010590e08 <crc32_le+168>
End of assembler dump.

# delete the old breakpoint
(gdb) delete 1

# continue until the 2nd breakpoint
(gdb) continue
Continuing.

Breakpoint 2, apply_alternatives_all () at arch/arm64/kernel/alternative.c:230
230     }

# disassemble crc32_le after the patching. we should see that the original
# branch is patched with a nop, which causes the code to use the dedicated
# CRC instructions later in the code path.
(gdb) disassemble 0xffff800010590d60,+52
Dump of assembler code from 0xffff800010590d60 to 0xffff800010590d94:
   0xffff800010590d60 <crc32_le+0>:     nop  <== AFTER PATCHING
   0xffff800010590d64 <crc32_le+4>:     cmp     x2, #0x10
   0xffff800010590d68 <crc32_le+8>:     b.lt    0xffff800010590e08 <crc32_le+168>
   0xffff800010590d6c <crc32_le+12>:    and     x7, x2, #0x1f
   0xffff800010590d70 <crc32_le+16>:    and     x2, x2, #0xffffffffffffffe0
   0xffff800010590d74 <crc32_le+20>:    cbz     x7, 0xffff800010590de4 <crc32_le+132>
   0xffff800010590d78 <crc32_le+24>:    and     x8, x7, #0xf
   0xffff800010590d7c <crc32_le+28>:    ldp     x3, x4, [x1]
   0xffff800010590d80 <crc32_le+32>:    add     x8, x8, x1
   0xffff800010590d84 <crc32_le+36>:    add     x1, x1, x7
   0xffff800010590d88 <crc32_le+40>:    ldp     x5, x6, [x8]
   0xffff800010590d8c <crc32_le+44>:    tst     x7, #0x8
   0xffff800010590d90 <crc32_le+48>:    crc32x  w8, w0, x3 <== dedicated ARMv8 CRC instruction
End of assembler dump.

Conclusion

In this blog post, we looked at the Linux Alternatives Framework. We discussed how this framework enables CPU-specific instructions without incurring the runtime penalty of checking the feature register on every use, and we walked through a real-world example of the framework in operation.
