Using the Stack in AArch32 and AArch64

Original article: Using the Stack in AArch32 and AArch64 (Architectures and Processors blog, Arm Community): https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/using-the-stack-in-aarch32-and-aarch64

Jacob Bramley

November 19, 2015

When reading assembly-level code for any of the AArch32 or AArch64 instruction sets, you may have noticed that the stack pointer has various alignment and usage restrictions. These restrictions are part of the procedure-call standard – the set of common rules that allow functions to call one another. However, some of the rules apply even if you aren't actually handling function calls. The stack is shared between an application, any libraries that it uses, and signal handlers, so it is important that these components agree on how the stack should behave.

If you're just writing C code, the compiler will sort this all out for you, but you'll need to understand the rules if you're dealing with any assembly code that needs to interact with the stack.

This article assumes that your platform uses ARM's AAPCS (for AArch32) or AAPCS64 (for AArch64). This is the case on Linux and Android, but other systems may define their own standards.

Shared Stack-Usage Rules

For both AArch32 and AArch64:

- The stack is full-descending, meaning that sp – the stack pointer – points to the most recently pushed object on the stack, and it grows downwards, towards lower addresses.
- sp must point to a valid address in the memory allocated for the stack.
  - Formally, sp must lie in the range stack_limit < sp <= stack_base, though the values of stack_limit and stack_base are often inaccessible.
- The memory below sp (but above stack_limit) must not be accessed by your code.
  - In practice, signal handlers use this memory, so it can be corrupted unexpectedly and without warning.
- At public interfaces, the alignment of sp must be two times the pointer size.
  - For AArch32 that's 8 bytes, and for AArch64 it's 16 bytes.
  - A "public interface" is typically a function that is visible to some other, separately-compiled code. The exact definition depends upon the language and the toolchain, and is out of scope of this article. It's reasonable to assume that any C or C++ functions that you interact with using assembly are treated as public interfaces.
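
The range and alignment rules above can be written down as simple predicates. The following is only a minimal sketch in C: the function names are hypothetical, and stack_limit and stack_base are passed in as plain integers even though, as noted above, real code often cannot obtain them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Required sp alignment at a public interface: two times the pointer size.
 * (Hypothetical helper name; pointer_size is 4 for AArch32, 8 for AArch64.) */
static size_t public_sp_alignment(size_t pointer_size)
{
    assert(pointer_size == 4 || pointer_size == 8);
    return 2 * pointer_size;   /* 8 bytes for AArch32, 16 bytes for AArch64 */
}

/* True if `sp` satisfies the public-interface alignment rule. */
static bool sp_ok_at_public_interface(uint64_t sp, size_t pointer_size)
{
    return sp % public_sp_alignment(pointer_size) == 0;
}

/* True if `sp` lies in the range stack_limit < sp <= stack_base. */
static bool sp_in_stack(uint64_t sp, uint64_t stack_limit, uint64_t stack_base)
{
    return stack_limit < sp && sp <= stack_base;
}
```

For example, an sp value ending in 0x8 passes the AArch32 check but not the AArch64 one, and sp equal to stack_base (an empty stack) is in range while sp equal to stack_limit is not.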

Rules Specific to AArch32

For AArch32 (ARM or Thumb), sp must be at least 4-byte aligned at all times. As long as you only push and pop whole registers, this restriction will never be broken.

Rules Specific to AArch64

For AArch64, sp must be 16-byte aligned whenever it is used as the base address of a memory access. This is enforced by AArch64 hardware.

- This means that it is difficult to implement a generic push or pop operation for AArch64. There are no push or pop aliases like there are for ARM and Thumb.
- The hardware checks can be disabled by privileged code, but they're enabled in at least Linux and Android.

C compilers will typically reserve stack space at the start of the function, then leave sp alone until the end, so the restriction is not as awkward as it first seems. However, you must be aware of it when handling assembly code, and it can be tricky for simple compilers (such as stack-based JIT compilers).

Note that unlike AArch32, arbitrarily-aligned values can be stored in sp, as long as the previously-described rules are followed for memory accesses and public interfaces. This is useful for allocating variable-length arrays of small values, for example:

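// Allocate a variable-length array of bytes on the stack.
  sub sp, sp, x0                    // x0 holds the length; sp may now be misaligned.
  mov x9, sp                        // `and` (immediate) cannot read sp as a source, so copy it to a scratch register (x9 here).
  and sp, x9, #0xfffffffffffffff0   // Align sp down to 16 bytes.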
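
The arithmetic in that snippet can be modelled in C to see which sp value results. This is a sketch under the assumption that sp is treated as a plain 64-bit integer; the function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Models the allocation above: subtract `length` bytes from sp, then clear
 * the low four bits so that the result is 16-byte aligned. */
static uint64_t alloc_bytes_on_stack(uint64_t sp, uint64_t length)
{
    uint64_t unaligned = sp - length;                 /* sub: may be misaligned    */
    return unaligned & UINT64_C(0xfffffffffffffff0);  /* and: round down to 16     */
}
```

Because the stack grows downwards, rounding down can only make the allocation larger, so the array always gets at least `length` bytes. Starting from 0x8000, a 5-byte request gives 0x7ff0 and a 17-byte request gives 0x7fe0.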

Push and Pop on AArch64

The alignment-check-on-memory-access means that AArch64 cannot have general-purpose push- or pop-like operations.

For example:

// Broken AArch64 implementation of `push {x1}; push {x0};`.
  str   x1, [sp, #-8]!  // This works, but leaves `sp` with 8-byte alignment ...
  str   x0, [sp, #-8]!  // ... so the second `str` will fail.

In this particular case, the stores could be combined:

// AArch64 implementation of `push {x0, x1}`.
  stp   x0, x1, [sp, #-16]!

However, in a simple compiler, it is not always easy to combine instructions in that way.
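
To make the failure mode concrete, here is a small C model of a pre-indexed store through sp. It is a sketch, not a description of any real emulator: the function name is hypothetical, the "fault" is just a false return value, and it assumes that the alignment check is applied to the value of sp used as the base, before the offset is added (which matches the behaviour of the two `str` instructions described above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a pre-indexed store through sp, such as
 * `str x1, [sp, #-8]!`. Returns false to model an sp alignment fault. */
static bool store_pre_index(uint64_t *sp, int64_t offset)
{
    if (*sp % 16 != 0)
        return false;           /* base sp is misaligned: the access faults  */
    *sp += (uint64_t)offset;    /* writeback; unsigned wraparound handles negative offsets */
    return true;
}
```

With sp starting at 0x8000, a first call with offset -8 succeeds and leaves sp at 0x7ff8; a second call with offset -8 then fails, while a single call with offset -16 (the `stp` form) keeps sp aligned.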

If you're handling w registers, the problem will be even more apparent: these have to be pushed in sets of four to maintain stack pointer alignment, and since this isn't possible in a single instruction, the code can become difficult to follow. This is what VIXL generates, for example:

// AArch64 implementation of `push {w0, w1, w2, w3}`.
  stp   w0, w1, [sp, #-16]!   // Allocate four words and store w0 and w1 at the lower addresses.
  stp   w2, w3, [sp, #8]      // Store w2 and w3 at the upper addresses.
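
The resulting memory layout can be checked with a toy stack in C. This is only an illustrative model: the `toy_stack` type and the helper names are invented here, and sp is a byte offset into a small array rather than a real address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A toy full-descending stack: `sp` is a byte offset into `mem` and starts
 * at sizeof mem when the stack is empty. */
struct toy_stack {
    uint8_t  mem[64];
    uint32_t sp;
};

/* Store two 32-bit words at byte offset `addr` (models `stp wA, wB, [addr]`). */
static void store_pair_w(struct toy_stack *s, uint32_t addr, uint32_t a, uint32_t b)
{
    memcpy(&s->mem[addr], &a, sizeof a);
    memcpy(&s->mem[addr + 4], &b, sizeof b);
}

/* Models the two-instruction `push {w0, w1, w2, w3}` sequence shown above. */
static void push_four_words(struct toy_stack *s, const uint32_t w[4])
{
    assert(s->sp % 16 == 0 && s->sp >= 16);
    s->sp -= 16;                              /* stp w0, w1, [sp, #-16]! */
    store_pair_w(s, s->sp, w[0], w[1]);
    store_pair_w(s, s->sp + 8, w[2], w[3]);   /* stp w2, w3, [sp, #8]    */
}

/* Read back the 32-bit word at [sp + offset]. */
static uint32_t load_w(const struct toy_stack *s, uint32_t offset)
{
    uint32_t v;
    memcpy(&v, &s->mem[s->sp + offset], sizeof v);
    return v;
}
```

After one push, sp has moved down by exactly 16 bytes and the four words sit at offsets 0, 4, 8 and 12 from the new sp, in register order.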

If you're dealing with hand-written AArch64 assembly code, you'll have to be aware of these patterns.

Many JIT compilers have a tricky situation, though: such compilers are built around a simple stack machine, and expect to be able to push and pop in an ad-hoc fashion. Managing this on AArch64 requires an inventive approach, and I'll describe a few possibilities in a follow-up article.

[1] On the 8-byte alignment required at AArch32 public interfaces: some time ago I was told that this restriction exists to allow the use of instructions such as ldrexd and strexd, which require an 8-byte-aligned address. Without a guarantee that a function will be entered with proper alignment, these instructions would be awkward to use on stack variables. There may also be other reasons, but I don't know what they are, and AAPCS doesn't document them.