目录
1 ARM发展
ARM是Advanced RISC Machine的缩写,即进阶精简指令集机器。arm更早称为Acorn RISC Machine,是一个32位精简指令集(RISC)处理器架构。也有基于ARM设计的派生产品,主要产品包括Marvell的XScale架构和和德州仪器的OMAP系列。ARM家族中32位嵌入式处理器占比达75%,由于ARM的低功耗特性,被广泛反应于移动通信领域、便携式设备等领域。
1983年Acorn电脑公司(Acorn Computers Ltd)开始开发一颗主要用于路由器的Conexant ARM处理器,由Roger Wilson和Steve Furber带领团队,着手开发一种新架构,类似进阶的MOS Technology 6502处理器。Acorn有一大堆建构在6502架构上的电脑。该团队在1985年时开发出ARM1 Sample版,并于次年量产了ARM2,ARM2具有32位的数据总线、26位的寻址空间,并提供64 Mbyte的寻址范围与16个32-bit的暂存器。
在1980年代晚期,苹果电脑开始与Acorn合作开发新版的ARM核心。1990年将设计团队另组成一间名为安谋国际科技(Advanced RISC Machines Ltd.)的新公司,。1991年首版ARM6出样,然后苹果电脑使用ARM6架构的ARM 610来当作他们Apple Newton PDA的基础。在1994年,Acorn使用ARM 610做为他们Risc PC电脑内的CPU。
ARM是一家微处理器行业的知名企业,该企业设计了大量高性能、廉价、耗能低的RISC (精简指令集)处理器,它只设计芯片而不生产。ARM的经营模式在于出售其知识产权核(IP core),将技术授权给世界上许多著名的半导体、软件和OEM厂商,并提供技术服务。
ARM的版本分为两类,一个是内核版本,一个处理器版本。内核版本也就是ARM架构,如ARMv1、ARMv2、ARMv3、ARMv4、ARMv5、ARMv6、ARMv7、ARMv8等。处理器版本也就是ARM处理器,如ARM1、ARM9、ARM11、ARM Cortex-A(A7、A9、A15),ARM Cortex-M(M1、M3、M4)、ARM Cortex-R,这个也是我们通常意义上所指的ARM版本。
2 ARM版本
ARM版本信息简化表如下表所示。
内核(架构)版本 |
处理器版本 |
ARMv1 |
ARM1 |
ARMv2 |
ARM2、ARM3 |
ARMv3 |
ARM6、ARM7 |
ARMv4 |
StrongARM、ARM7TDMI、ARM9TDMI |
ARMv5 |
ARM7EJ、ARM9E、ARM10E、XScale |
ARMv6 |
ARM11、ARM Cortex-M |
ARMv7 |
ARM Cortex-A、ARM Cortex-M、ARM Cortex-R |
ARMv8 |
ARM Cortex-A30、ARM Cortex-A50、ARM Cortex-A70 |
ARM版本信息详细表如下表所示。(参考https://en.wikipedia.org/wiki/List_of_ARM_microarchitectures)
ARM family | ARM architecture | ARM core | Feature | Cache (I / D), MMU | Typical MIPS @ MHz | Reference |
---|---|---|---|---|---|---|
ARM1 | ARMv1 | ARM1 | First implementation | None | ||
ARM2 | ARMv2 | ARM2 | ARMv2 added the MUL (multiply) instruction | None | 4 MIPS @ 8 MHz 0.33 DMIPS/MHz |
|
ARMv2a | ARM250 | Integrated MEMC (MMU), graphics and I/O processor. ARMv2a added the SWP and SWPB (swap) instructions | None, MEMC1a | 7 MIPS @ 12 MHz | ||
ARM3 | ARMv2a | ARM3 | First integrated memory cache | 4 KB unified | 12 MIPS @ 25 MHz 0.50 DMIPS/MHz |
|
ARM6 | ARMv3 | ARM60 | ARMv3 first to support 32-bit memory address space (previously 26-bit). ARMv3M first added long multiply instructions (32x32=64). |
None | 10 MIPS @ 12 MHz | |
ARM600 | As ARM60, cache and coprocessor bus (for FPA10 floating-point unit) | 4 KB unified | 28 MIPS @ 33 MHz | |||
ARM610 | As ARM60, cache, no coprocessor bus | 4 KB unified | 17 MIPS @ 20 MHz 0.65 DMIPS/MHz |
[4] | ||
ARM7 | ARMv3 | ARM700 | 8 KB unified | 40 MHz | ||
ARM710 | As ARM700, no coprocessor bus | 8 KB unified | 40 MHz | [5] | ||
ARM710a | As ARM710 | 8 KB unified | 40 MHz 0.68 DMIPS/MHz |
|||
ARM7T | ARMv4T | ARM7TDMI(-S) | 3-stage pipeline, Thumb, ARMv4 first to drop legacy ARM 26-bit addressing | None | 15 MIPS @ 16.8 MHz 63 DMIPS @ 70 MHz |
|
ARM710T | As ARM7TDMI, cache | 8 KB unified, MMU | 36 MIPS @ 40 MHz | |||
ARM720T | As ARM7TDMI, cache | 8 KB unified, MMU with FCSE (Fast Context Switch Extension) | 60 MIPS @ 59.8 MHz | |||
ARM740T | As ARM7TDMI, cache | MPU | ||||
ARM7EJ | ARMv5TEJ | ARM7EJ-S | 5-stage pipeline, Thumb, Jazelle DBX, enhanced DSP instructions | None | ||
ARM8 | ARMv4 | ARM810 | 5-stage pipeline, static branch prediction, double-bandwidth memory | 8 KB unified, MMU | 84 MIPS @ 72 MHz 1.16 DMIPS/MHz |
[6][7] |
ARM9T | ARMv4T | ARM9TDMI | 5-stage pipeline, Thumb | None | ||
ARM920T | As ARM9TDMI, cache | 16 KB / 16 KB, MMU with FCSE (Fast Context Switch Extension) | 200 MIPS @ 180 MHz | [8] | ||
ARM922T | As ARM9TDMI, caches | 8 KB / 8 KB, MMU | ||||
ARM940T | As ARM9TDMI, caches | 4 KB / 4 KB, MPU | ||||
ARM9E | ARMv5TE | ARM946E-S | Thumb, enhanced DSP instructions, caches | Variable, tightly coupled memories, MPU | ||
ARM966E-S | Thumb, enhanced DSP instructions | No cache, TCMs | ||||
ARM968E-S | As ARM966E-S | No cache, TCMs | ||||
ARMv5TEJ | ARM926EJ-S | Thumb, Jazelle DBX, enhanced DSP instructions | Variable, TCMs, MMU | 220 MIPS @ 200 MHz | ||
ARMv5TE | ARM996HS | Clockless processor, as ARM966E-S | No caches, TCMs, MPU | |||
ARM10E | ARMv5TE | ARM1020E | 6-stage pipeline, Thumb, enhanced DSP instructions, (VFP) | 32 KB / 32 KB, MMU | ||
ARM1022E | As ARM1020E | 16 KB / 16 KB, MMU | ||||
ARMv5TEJ | ARM1026EJ-S | Thumb, Jazelle DBX, enhanced DSP instructions, (VFP) | Variable, MMU or MPU | |||
ARM11 | ARMv6 | ARM1136J(F)-S | 8-stage pipeline, SIMD, Thumb, Jazelle DBX, (VFP), enhanced DSP instructions, unaligned memory access | Variable, MMU | 740 @ 532–665 MHz (i.MX31 SoC), 400–528 MHz | [9] |
ARMv6T2 | ARM1156T2(F)-S | 9-stage pipeline, SIMD, Thumb-2, (VFP), enhanced DSP instructions | Variable, MPU | [10] | ||
ARMv6Z | ARM1176JZ(F)-S | As ARM1136EJ(F)-S | Variable, MMU + TrustZone | 965 DMIPS @ 772 MHz, up to 2,600 DMIPS with four processors | [11] | |
ARMv6K | ARM11MPCore | As ARM1136EJ(F)-S, 1–4 core SMP | Variable, MMU | |||
SecurCore | ARMv6-M | SC000 | 0.9 DMIPS/MHz | |||
ARMv4T | SC100 | |||||
ARMv7-M | SC300 | 1.25 DMIPS/MHz | ||||
Cortex-M | ARMv6-M | Cortex-M0[12] | Microcontroller profile, most Thumb + some Thumb-2,[13] hardware multiply instruction (optional small), optional system timer, optional bit-banding memory | Optional cache, no TCM, no MPU | 0.84 DMIPS/MHz | |
Cortex-M0+[14] | Microcontroller profile, most Thumb + some Thumb-2,[13] hardware multiply instruction (optional small), optional system timer, optional bit-banding memory | Optional cache, no TCM, optional MPU with 8 regions | 0.93 DMIPS/MHz | |||
Cortex-M1[15] | Microcontroller profile, most Thumb + some Thumb-2,[13] hardware multiply instruction (optional small), OS option adds SVC / banked stack pointer, optional system timer, no bit-banding memory | Optional cache, 0–1024 KB I-TCM, 0–1024 KB D-TCM, no MPU | 136 DMIPS @ 170 MHz,[16] (0.8 DMIPS/MHz FPGA-dependent)[17] | |||
ARMv7-M | Cortex-M3[18] | Microcontroller profile, Thumb / Thumb-2, hardware multiply and divide instructions, optional bit-banding memory | Optional cache, no TCM, optional MPU with 8 regions | 1.25 DMIPS/MHz | ||
ARMv7E-M | Cortex-M4[19] | Microcontroller profile, Thumb / Thumb-2 / DSP / optional VFPv4-SP single-precision FPU, hardware multiply and divide instructions, optional bit-banding memory | Optional cache, no TCM, optional MPU with 8 regions | 1.25 DMIPS/MHz (1.27 w/FPU) | ||
Cortex-M7[20] | Microcontroller profile, Thumb / Thumb-2 / DSP / optional VFPv5 single and double precision FPU, hardware multiply and divide instructions | 0−64 KB I-cache, 0−64 KB D-cache, 0–16 MB I-TCM, 0–16 MB D-TCM (all these w/optional ECC), optional MPU with 8 or 16 regions | 2.14 DMIPS/MHz | |||
ARMv8-M | Cortex-M23[21] | Microcontroller profile, Thumb-1 (most), Thumb-2 (some), Divide, TrustZone | Optional cache, no TCM, optional MPU with 16 regions | 0.99 DMIPS/MHz | ||
Cortex-M33[22] | Microcontroller profile, Thumb-1, Thumb-2, Saturated, DSP, Divide, FPU (SP), TrustZone, Co-processor | Optional cache, no TCM, optional MPU with 16 regions | 1.50 DMIPS/MHz | |||
Cortex-M35P[23] | Microcontroller profile, Thumb-1, Thumb-2, Saturated, DSP, Divide, FPU (SP), TrustZone, Co-processor | Built-in cache (with option 2–16 KB), I-cache, no TCM, optional MPU with 16 regions | 1.50 DMIPS/MHz |