[Assembly Optimization] A Summary of ARM32 and AArch64 Instruction Set Optimization


walkingMa



Article link: https://blog.csdn.net/listener51/article/details/82856001


Preface

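The earlier posts 《arm64》 and 《arm32》 covered the basics of ARM and AArch64 optimization. This post focuses on the points that are easy to confuse, or that need extra care, during optimization.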

1. Instruction encoding length

1.1 aarch32

		A32 (the ARM instruction set): every instruction has a fixed 32-bit encoding
		T32 (the Thumb instruction set): an instruction can be encoded in either 16 bits or 32 bits
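To make the variable-length T32 case concrete, here is a minimal sketch of how a decoder can tell a 16-bit encoding from a 32-bit one by looking only at the first halfword. The rule used (bits [15:11] equal to 0b11101, 0b11110, or 0b11111 mark the first half of a 32-bit instruction) comes from the ARM architecture manual; the function name `t32_instruction_length` is just an illustrative name, not part of any real tool.

```python
def t32_instruction_length(first_halfword: int) -> int:
    """Return the length in bytes (2 or 4) of a T32 instruction.

    first_halfword is the first 16-bit unit of the instruction.
    (Hypothetical helper name, for illustration only.)
    """
    if not 0 <= first_halfword <= 0xFFFF:
        raise ValueError("first_halfword must fit in 16 bits")
    top5 = first_halfword >> 11  # bits [15:11]
    # These three prefixes mark the first halfword of a 32-bit encoding;
    # every other prefix is a complete 16-bit instruction.
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


# A32 and A64 need no such check: every instruction is 4 bytes long.
A32_INSTRUCTION_LENGTH = 4
A64_INSTRUCTION_LENGTH = 4

if __name__ == "__main__":
    # 0x4770 is the 16-bit "bx lr"; 0xF000 starts a 32-bit "bl".
    for hw in (0x4770, 0xF000):
        print(hex(hw), t32_instruction_length(hw), "bytes")
```

This is why code that walks a Thumb instruction stream has to decode lengths as it goes, while A32 and A64 code can simply step by 4 bytes.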

1.2 aarch64

		Every instruction has a fixed 32-bit encoding

Reference: https://static.docs.arm.com/ddi0487/ca/DDI0487C_a_armv8_arm.pdf A1.3.2 The ARM instruction sets

2. Address of the current instruction

2.1 aarch32

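In ARM (A32) state, the address of the currently executing instruction is usually pc-8; in Thumb state it is usually pc-4. In other words, reading pc returns the current instruction's address plus 8 in ARM state and plus 4 in Thumb state. Reference: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0013d/index.html Program Counter (pc)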
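The offsets matter most when hand-writing pc-relative loads such as `ldr r0, [pc, #imm]`. The sketch below computes the value an instruction sees when it reads pc, and the address a pc-relative literal load would access. The helper names `pc_read_value` and `literal_load_address` are made up for this example; the word alignment of the base follows the Align(PC, 4) step in the architecture manual's description of literal loads.

```python
# Offset added to the instruction's own address when it reads pc.
PC_READ_OFFSET = {"A32": 8, "T32": 4}


def pc_read_value(instr_addr: int, state: str) -> int:
    """Value observed when the instruction at instr_addr reads pc."""
    return instr_addr + PC_READ_OFFSET[state]


def literal_load_address(instr_addr: int, imm: int, state: str) -> int:
    """Address accessed by a pc-relative literal load with offset imm.

    The base is the pc value rounded down to a multiple of 4, which only
    makes a difference for Thumb instructions at halfword-aligned addresses.
    """
    base = pc_read_value(instr_addr, state) & ~0x3
    return base + imm


if __name__ == "__main__":
    print(hex(pc_read_value(0x8000, "A32")))            # instruction address + 8
    print(hex(pc_read_value(0x8000, "T32")))            # instruction address + 4
    print(hex(literal_load_address(0x8002, 8, "T32")))  # aligned base + 8
```

When porting such code between A32 and T32, the immediate in the instruction usually has to change even though the literal itself has not moved.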
　
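　Question:
　An instruction is 32 bits (4 bytes) long, so why is the current instruction at pc-8 in ARM mode?
　Take the three-stage pipeline (fetch, decode, execute) of classic ARM7 cores as an example, as shown in the figure. Suppose the add instruction is fetched at address pc1. While add is in decode, the next instruction, sub, enters fetch, so pc2 = pc1 + 4. While add is in execute, the cmp instruction after sub enters fetch, so pc = pc2 + 4. The address of add at the moment it executes is therefore pc1 = pc - 8. The offset is part of the architecture, so cores with deeper pipelines still report the same value.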
[Figure: three-stage pipeline (fetch, decode, execute) showing the add, sub, and cmp instructions]
　Reference: https://blog.csdn.net/lee244868149/article/details/49488575/
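The pipeline argument above can be checked with a small simulation. This is an idealized sketch only: one instruction enters fetch per cycle, there are no stalls or branches, and the base address 0x1000 and the function name `simulate_three_stage` are arbitrary choices for the example.

```python
def simulate_three_stage(program, base=0x1000, width=4):
    """Model an idealized fetch/decode/execute pipeline.

    Returns one tuple per instruction:
    (mnemonic, address of the instruction in execute, address in fetch).
    """
    rows = []
    count = len(program)
    for cycle in range(count + 2):
        exec_idx = cycle - 2          # fetched two cycles ago, executing now
        if 0 <= exec_idx < count:
            fetch_addr = base + width * cycle      # what "pc" points at
            exec_addr = base + width * exec_idx    # instruction in execute
            rows.append((program[exec_idx], exec_addr, fetch_addr))
    return rows


if __name__ == "__main__":
    for name, exec_addr, pc in simulate_three_stage(["add", "sub", "cmp"]):
        print(f"{name}: executing at {exec_addr:#x}, pc = {pc:#x}, "
              f"difference = {pc - exec_addr}")
```

For every instruction the difference between the fetch address and the executing instruction's address is two instruction widths, which is 8 bytes for 4-byte A32 instructions.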
　

2.2 aarch64

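In AArch64 state, the address of the currently executing instruction is usually pc itself, with no offset. The ARM documentation puts it this way: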

Program counter
　The current Program Counter (PC) cannot be referred to by number as if part of the general register file and therefore cannot be used as the source or destination of arithmetic instructions, or as the base, index or transfer register of load and store instructions.
　The only instructions that read the PC are those whose function it is to compute a PC-relative address (ADR, ADRP, literal load, and direct branches), and the branch-and-link instructions that store a return address in the link register (BL and BLR). The only way to modify the program counter is using branch, exception generation and exception return instructions.
　Where the PC is read by an instruction to compute a PC-relative address, then its value is the address of that instruction. Unlike A32 and T32, there is no implied offset of 4 or 8 bytes.
Reference: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/ch05s01s03.html 5.1.3. Registers
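As a small illustration of the "no implied offset" rule, the sketch below computes the results of the two A64 address-generation instructions named in the quote. ADR adds a signed byte offset to the instruction's own address, and ADRP adds a signed count of 4 KiB pages to the instruction's address with its low 12 bits cleared. The function names `adr_target` and `adrp_target` are illustrative, and the range limits of the immediates are not checked here.

```python
MASK64 = (1 << 64) - 1  # results are 64-bit addresses


def adr_target(instr_addr: int, imm: int) -> int:
    """Result of ADR: the instruction's own address plus a byte offset."""
    return (instr_addr + imm) & MASK64


def adrp_target(instr_addr: int, page_imm: int) -> int:
    """Result of ADRP: the 4 KiB page of the instruction plus page_imm pages."""
    page_base = instr_addr & ~0xFFF
    return (page_base + (page_imm << 12)) & MASK64


if __name__ == "__main__":
    print(hex(adr_target(0x400100, 0x20)))   # own address + 0x20
    print(hex(adrp_target(0x400ABC, 1)))     # next 4 KiB page
```

Compared with the A32 and T32 case, no +8 or +4 correction appears anywhere: the base is simply the address of the ADR or ADRP instruction.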