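Preface:
These notes summarize the RISC-V DSP extension instruction set document, P-ext-proposal.pdf. Its key content is as follows:
The document describes the RISC-V P extension instruction set and its details.
It begins with an overview of the P extension instructions and lists the instructions that are duplicated in other extensions.
It then describes the subsets of the P extension in detail, including the instructions of the Zbpbo and Zpn extensions (which apply to both RV32 and RV64).
It also gives detailed descriptions of the RV64-only instructions.
It further introduces the new user control and status registers and provides the instruction encoding tables. Finally, it lists the instructions that were removed because they overlap with RVB.
The document therefore covers the instruction descriptions, their encodings, and their relationship to other extensions, which makes it useful for understanding, developing, and optimizing RISC-V based systems. It also points out the compatibility and optimization issues to watch for when using the P extension.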
Previous part: 【RISC-V 指令集】RISC-V DSP 扩展指令集介绍(一) (CSDN blog)
3.2. Partial-SIMD Data Processing Instructions
3.2.1. 16-bit Packing Instructions
Table 12. 16-bit Packing Instructions
No. | Instruction | Description |
1 | PKBB16 rd, rs1, rs2 | Pack two 16-bit data from Bottoms |
2 | PKBT16 rd, rs1, rs2 | Pack two 16-bit data Bottom & Top |
3 | PKTB16 rd, rs1, rs2 | Pack two 16-bit data Top & Bottom |
4 | PKTT16 rd, rs1, rs2 | Pack two 16-bit data from Tops |
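The four packing variants in Table 12 differ only in which half of each source is selected. The following is a minimal RV32 sketch, not the specification's own pseudocode. It assumes that the half taken from rs1 lands in the upper 16 bits of rd and the half taken from rs2 in the lower 16 bits. The function and helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: select one 16-bit half (idx 0 = bottom, 1 = top). */
static uint32_t half(uint32_t reg, int idx) {
    assert(idx == 0 || idx == 1);
    return (reg >> (16 * idx)) & 0xFFFFu;
}

/* Assumption: the rs1 half goes to rd[31:16], the rs2 half to rd[15:0]. */
uint32_t pkbb16(uint32_t rs1, uint32_t rs2) { return (half(rs1, 0) << 16) | half(rs2, 0); }
uint32_t pkbt16(uint32_t rs1, uint32_t rs2) { return (half(rs1, 0) << 16) | half(rs2, 1); }
uint32_t pktb16(uint32_t rs1, uint32_t rs2) { return (half(rs1, 1) << 16) | half(rs2, 0); }
uint32_t pktt16(uint32_t rs1, uint32_t rs2) { return (half(rs1, 1) << 16) | half(rs2, 1); }
```

For example, with rs1 = 0x11112222 and rs2 = 0x33334444, this model of PKBT16 combines the bottom half of rs1 (0x2222) with the top half of rs2 (0x3333).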
3.2.2. Most Significant Word “32x32” Multiply and Add Instructions
Table 13. Signed MSW 32x32 Multiply and Add Instructions
No. | Instruction | Description |
1 | SMMUL rd, rs1, rs2 | MSW “32 x 32” Signed Multiplication (MSW 32 = 32x32) |
2 | SMMUL.u rd, rs1, rs2 | MSW “32 x 32” Signed Multiplication with Rounding (MSW 32 = 32x32) |
3 | KMMAC rd, rs1, rs2 | MSW “32 x 32” Signed Multiplication and Saturating Addition (MSW 32 = 32 + 32x32) |
4 | KMMAC.u rd, rs1, rs2 | MSW “32 x 32” Signed Multiplication and Saturating Addition with Rounding (MSW 32 = 32 + 32x32) |
5 | KMMSB rd, rs1, rs2 | MSW “32 x 32” Signed Multiplication and Saturating Subtraction (MSW 32 = 32 - 32x32) |
6 | KMMSB.u rd, rs1, rs2 | MSW “32 x 32” Signed Multiplication and Saturating Subtraction with Rounding (MSW 32 = 32 - 32x32) |
7 | KWMMUL rd, rs1, rs2 | MSW “32 x 32” Signed Multiplication & Double (MSW 32 = 32x32 << 1) |
8 | KWMMUL.u rd, rs1, rs2 | MSW “32 x 32” Signed Multiplication & Double with Rounding (MSW 32 = 32x32 << 1) |
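The "MSW 32 = 32x32" notation in Table 13 means that only the upper 32 bits of the 64-bit product are kept. The sketch below models a few rows on RV32 under that reading. It is an illustration, not the specification's pseudocode: the function names are hypothetical, rounding is assumed to add 2^31 before truncation, and the overflow flag is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift right. Checks the platform assumption that >> on a
   negative signed value sign-extends (implementation-defined in C). */
static int64_t asr64(int64_t v, unsigned n) {
    assert((-1 >> 1) == -1);
    return v >> n;
}

/* Clamp a 64-bit intermediate to the signed 32-bit range. */
static int32_t sat32(int64_t v) {
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* SMMUL: upper 32 bits of the signed 64-bit product. */
int32_t smmul(int32_t a, int32_t b) { return (int32_t)asr64((int64_t)a * b, 32); }

/* SMMUL.u: add half of the discarded range (2^31) before truncating. */
int32_t smmul_u(int32_t a, int32_t b) {
    return (int32_t)asr64((int64_t)a * b + (INT64_C(1) << 31), 32);
}

/* KMMAC / KMMSB: accumulate or subtract the MSW with saturation. */
int32_t kmmac(int32_t acc, int32_t a, int32_t b) { return sat32((int64_t)acc + smmul(a, b)); }
int32_t kmmsb(int32_t acc, int32_t a, int32_t b) { return sat32((int64_t)acc - smmul(a, b)); }

/* KWMMUL: MSW of the doubled product. Only INT32_MIN * INT32_MIN overflows. */
int32_t kwmmul(int32_t a, int32_t b) {
    if (a == INT32_MIN && b == INT32_MIN) return INT32_MAX;
    return (int32_t)asr64((int64_t)a * b, 31);
}
```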
3.2.3. Most Significant Word “32x16” Multiply and Add Instructions
Table 14. Signed MSW 32x16 Multiply and Add Instructions
No. | Instruction | Description |
1 | SMMWB rd, rs1, rs2 | MSW “32 x Bottom 16” Signed Multiplication (MSW 32 = 32x16) |
2 | SMMWB.u rd, rs1, rs2 | MSW “32 x Bottom 16” Signed Multiplication with Rounding (MSW 32 = 32x16) |
3 | SMMWT rd, rs1, rs2 | MSW “32 x Top 16” Signed Multiplication (MSW 32 = 32x16) |
4 | SMMWT.u rd, rs1, rs2 | MSW “32 x Top 16” Signed Multiplication with Rounding (MSW 32 = 32x16) |
5 | KMMAWB rd, rs1, rs2 | MSW “32 x Bottom 16” Signed Multiplication and Saturating Addition (MSW 32 = 32 + 32x16) |
6 | KMMAWB.u rd, rs1, rs2 | MSW “32 x Bottom 16” Signed Multiplication and Saturating Addition with Rounding (MSW 32 = 32 + 32x16) |
7 | KMMAWT rd, rs1, rs2 | MSW “32 x Top 16” Signed Multiplication and Saturating Addition (MSW 32 = 32 + 32x16) |
8 | KMMAWT.u rd, rs1, rs2 | MSW “32 x Top 16” Signed Multiplication and Saturating Addition with Rounding (MSW 32 = 32 + 32x16) |
9 | KMMWB2 rd, rs1, rs2 | MSW “32 x Bottom 16” Saturating Signed Multiplication and double (MSW 32 = (32x16) << 1) |
10 | KMMWB2.u rd, rs1, rs2 | MSW “32 x Bottom 16” Saturating Signed Multiplication and double with Rounding (MSW 32 = (32x16) << 1) |
11 | KMMWT2 rd, rs1, rs2 | MSW “32 x Top 16” Saturating Signed Multiplication and double (MSW 32 = (32x16) << 1) |
12 | KMMWT2.u rd, rs1, rs2 | MSW “32 x Top 16” Saturating Signed Multiplication and double with Rounding (MSW 32 = (32x16) << 1) |
13 | KMMAWB2 rd, rs1, rs2 | MSW “32 x Bottom 16” Signed Multiplication & double and Saturating Addition (MSW 32 = 32 + (32x16)<<1) |
14 | KMMAWB2.u rd, rs1, rs2 | MSW “32 x Bottom 16” Signed Multiplication & double and Saturating Addition with Rounding (MSW 32 = 32 + (32x16)<<1) |
15 | KMMAWT2 rd, rs1, rs2 | MSW “32 x Top 16” Signed Multiplication & double and Saturating Addition (MSW 32 = 32 + (32x16)<<1) |
16 | KMMAWT2.u rd, rs1, rs2 | MSW “32 x Top 16” Signed Multiplication & double and Saturating Addition with Rounding (MSW 32 = 32 + (32x16)<<1) |
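For the 32x16 forms in Table 14, the product is 48 bits wide and the result is its upper 32 bits. The sketch below models SMMWB, SMMWT, the rounded SMMWB.u, and KMMAWB on RV32 under that reading. The names are hypothetical, rounding is assumed to add 2^15 before truncation, and the overflow flag is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift right. Checks that >> sign-extends on this platform. */
static int64_t asr64(int64_t v, unsigned n) {
    assert((-1 >> 1) == -1);
    return v >> n;
}

/* Hypothetical helper: one 16-bit half as a signed value (0 = bottom, 1 = top). */
static int64_t shalf(uint32_t reg, int idx) {
    assert(idx == 0 || idx == 1);
    return (int16_t)(uint16_t)(reg >> (16 * idx));
}

static int32_t sat32(int64_t v) {
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* SMMWB / SMMWT: bits [47:16] of rs1 * (bottom or top half of rs2). */
int32_t smmwb(int32_t rs1, uint32_t rs2) { return (int32_t)asr64(rs1 * shalf(rs2, 0), 16); }
int32_t smmwt(int32_t rs1, uint32_t rs2) { return (int32_t)asr64(rs1 * shalf(rs2, 1), 16); }

/* SMMWB.u: same, with 2^15 added before the truncation. */
int32_t smmwb_u(int32_t rs1, uint32_t rs2) {
    return (int32_t)asr64(rs1 * shalf(rs2, 0) + (INT64_C(1) << 15), 16);
}

/* KMMAWB: saturating accumulate of the SMMWB result. */
int32_t kmmawb(int32_t acc, int32_t rs1, uint32_t rs2) {
    return sat32((int64_t)acc + smmwb(rs1, rs2));
}
```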
3.2.4. Signed 16-bit Multiply with 32-bit Add/Subtract Instructions
Table 15. Signed 16-bit Multiply 32-bit Add/Subtract Instructions
No. | Instruction | Description |
1 | SMBB16 rd, rs1, rs2 | Signed Multiply Bottom 16 & Bottom 16 (32 = 16x16) |
2 | SMBT16 rd, rs1, rs2 | Signed Multiply Bottom 16 & Top 16 (32 = 16x16) |
3 | SMTT16 rd, rs1, rs2 | Signed Multiply Top 16 & Top 16 (32 = 16x16) |
4 | KMDA rd, rs1, rs2 | Two “16x16” and Signed Addition (32 = 16x16 + 16x16) |
5 | KMXDA rd, rs1, rs2 | Two Crossed “16x16” and Signed Addition (32 = 16x16 + 16x16) |
6 | SMDS rd, rs1, rs2 | Two “16x16” and Signed Subtraction (32 = 16x16 - 16x16) |
7 | SMDRS rd, rs1, rs2 | Two “16x16” and Signed Reversed Subtraction (32 = 16x16 - 16x16) |
8 | SMXDS rd, rs1, rs2 | Two Crossed “16x16” and Signed Subtraction (32 = 16x16 - 16x16) |
9 | KMABB rd, rs1, rs2 | “Bottom 16 x Bottom 16” with 32-bit Signed Addition (32 = 32 + 16x16) |
10 | KMABT rd, rs1, rs2 | “Bottom 16 x Top 16” with 32-bit Signed Addition (32 = 32 + 16x16) |
11 | KMATT rd, rs1, rs2 | “Top 16 x Top 16” with 32-bit Signed Addition (32 = 32 + 16x16) |
12 | KMADA rd, rs1, rs2 | Two “16x16” with 32-bit Signed Double Addition (32 = 32 + 16x16 + 16x16) |
13 | KMAXDA rd, rs1, rs2 | Two Crossed “16x16” with 32-bit Signed Double Addition (32 = 32 + 16x16 + 16x16) |
14 | KMADS rd, rs1, rs2 | Two “16x16” with 32-bit Signed Addition and Subtraction (32 = 32 + 16x16 - 16x16) |
15 | KMADRS rd, rs1, rs2 | Two “16x16” with 32-bit Signed Addition and Reversed Subtraction (32 = 32 + 16x16 - 16x16) |
16 | KMAXDS rd, rs1, rs2 | Two Crossed “16x16” with 32-bit Signed Addition and Subtraction (32 = 32 + 16x16 - 16x16) |
17 | KMSDA rd, rs1, rs2 | Two “16x16” with 32-bit Signed Double Subtraction (32 = 32 - 16x16 - 16x16) |
18 | KMSXDA rd, rs1, rs2 | Two Crossed “16x16” with 32-bit Signed Double Subtraction (32 = 32 - 16x16 - 16x16) |
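The rows of Table 15 combine one or two 16x16 products with an optional 32-bit accumulator. The sketch below models a representative subset on RV32. It assumes that "crossed" means top-by-bottom pairing, that SMDS computes top product minus bottom product, and that SMDRS reverses that order. All names are hypothetical, and the overflow flag is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: one 16-bit half as a signed value (0 = bottom, 1 = top). */
static int64_t shalf(uint32_t reg, int idx) {
    assert(idx == 0 || idx == 1);
    return (int16_t)(uint16_t)(reg >> (16 * idx));
}

static int32_t sat32(int64_t v) {
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Single products (32 = 16x16): these cannot overflow 32 bits. */
int32_t smbb16(uint32_t a, uint32_t b) { return (int32_t)(shalf(a, 0) * shalf(b, 0)); }
int32_t smbt16(uint32_t a, uint32_t b) { return (int32_t)(shalf(a, 0) * shalf(b, 1)); }
int32_t smtt16(uint32_t a, uint32_t b) { return (int32_t)(shalf(a, 1) * shalf(b, 1)); }

/* Two products added; only 0x8000 in all four halves needs saturation. */
int32_t kmda(uint32_t a, uint32_t b)  { return sat32(shalf(a, 1) * shalf(b, 1) + shalf(a, 0) * shalf(b, 0)); }
int32_t kmxda(uint32_t a, uint32_t b) { return sat32(shalf(a, 1) * shalf(b, 0) + shalf(a, 0) * shalf(b, 1)); }

/* Two products subtracted (assumed order: top minus bottom, reversed for SMDRS). */
int32_t smds(uint32_t a, uint32_t b)  { return (int32_t)(shalf(a, 1) * shalf(b, 1) - shalf(a, 0) * shalf(b, 0)); }
int32_t smdrs(uint32_t a, uint32_t b) { return (int32_t)(shalf(a, 0) * shalf(b, 0) - shalf(a, 1) * shalf(b, 1)); }

/* Accumulating forms saturate the 32-bit result. */
int32_t kmabb(int32_t acc, uint32_t a, uint32_t b) { return sat32(acc + shalf(a, 0) * shalf(b, 0)); }
int32_t kmada(int32_t acc, uint32_t a, uint32_t b) {
    return sat32(acc + shalf(a, 1) * shalf(b, 1) + shalf(a, 0) * shalf(b, 0));
}
```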
3.2.5. Signed 16-bit Multiply with 64-bit Add/Subtract Instructions
Table 16. Signed 16-bit Multiply 64-bit Add/Subtract Instructions
No. | Instruction | Description |
1 | SMAL rd, rs1, rs2 | “16 x 16” with 64-bit Signed Addition (64 = 64 + 16x16) |
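As a rough illustration of the "64 = 64 + 16x16" form, the sketch below models SMAL on RV32. It assumes that the 64-bit operand comes from a register pair (passed here as a single `int64_t`) and that the two 16-bit factors are the top and bottom halves of rs2. The function name and the wrap-around (non-saturating) addition are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* SMAL sketch: acc + (top half of rs2) * (bottom half of rs2), wrapping. */
int64_t smal(int64_t acc, uint32_t rs2) {
    int64_t top    = (int16_t)(uint16_t)(rs2 >> 16);
    int64_t bottom = (int16_t)(uint16_t)rs2;
    /* Unsigned arithmetic gives well-defined wrap-around; the conversion
       back to int64_t is assumed to wrap as well (checked here). */
    assert((int64_t)UINT64_MAX == -1);
    return (int64_t)((uint64_t)acc + (uint64_t)(top * bottom));
}
```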
3.2.6. Miscellaneous Instructions
Table 17. Partial-SIMD Miscellaneous Instructions
No. | Instruction | Description |
1 | SCLIP32 rd, rs1, imm5u | Signed Clip Value |
2 | UCLIP32 rd, rs1, imm5u | Unsigned Clip Value |
3 | CLRS32 rd, rs1 | 32-bit Count Leading Redundant Sign |
4 | CLZ32 rd, rs1 | 32-bit Count Leading Zero |
5 | PBSAD rd, rs1, rs2 | Parallel Byte Sum of Absolute Difference |
6 | PBSADA rd, rs1, rs2 | Parallel Byte Sum of Absolute Difference Accumulation |
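The following sketch models the instructions of Table 17 on RV32. It assumes that SCLIP32 clips to [-2^imm, 2^imm - 1], that UCLIP32 clips to [0, 2^imm - 1], and that PBSAD treats the four bytes of each operand as unsigned. The function names are hypothetical, and the overflow flag is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* SCLIP32: clip to the signed range [-2^imm, 2^imm - 1]. */
int32_t sclip32(int32_t x, unsigned imm) {
    assert(imm < 32);
    int32_t hi = (int32_t)((UINT32_C(1) << imm) - 1);
    int32_t lo = -hi - 1;
    return x > hi ? hi : (x < lo ? lo : x);
}

/* UCLIP32: clip to the unsigned range [0, 2^imm - 1]. */
uint32_t uclip32(int32_t x, unsigned imm) {
    assert(imm < 32);
    int32_t hi = (int32_t)((UINT32_C(1) << imm) - 1);
    if (x < 0) return 0;
    return (uint32_t)(x > hi ? hi : x);
}

/* CLZ32: number of leading zero bits (32 for an all-zero word). */
unsigned clz32(uint32_t x) {
    unsigned n = 0;
    for (uint32_t m = UINT32_C(1) << 31; m != 0 && (x & m) == 0; m >>= 1) n++;
    return n;
}

/* CLRS32: number of bits below the sign bit that equal the sign bit. */
unsigned clrs32(int32_t x) {
    uint32_t u = (uint32_t)x;
    if (u >> 31) u = ~u;          /* make the sign bit zero */
    return clz32(u) - 1;
}

/* PBSAD: sum of absolute differences of the four unsigned bytes. */
uint32_t pbsad(uint32_t a, uint32_t b) {
    uint32_t sum = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t x = (a >> (8 * i)) & 0xFFu;
        uint32_t y = (b >> (8 * i)) & 0xFFu;
        sum += x > y ? x - y : y - x;
    }
    return sum;
}

/* PBSADA: PBSAD accumulated into the destination. */
uint32_t pbsada(uint32_t acc, uint32_t a, uint32_t b) { return acc + pbsad(a, b); }
```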
3.2.7. 8-bit Multiply with 32-bit Add Instructions
Table 18. 8-bit Multiply with 32-bit Add Instructions
No. | Instruction | Description |
1 | SMAQA rd, rs1, rs2 | Four signed “8x8” with 32-bit Signed Addition (32 = 32 + 8x8 + 8x8 + 8x8 + 8x8) |
2 | UMAQA rd, rs1, rs2 | Four unsigned “8x8” with 32-bit Unsigned Addition (32 = 32 + 8x8 + 8x8 + 8x8 + 8x8) |
3 | SMAQA.SU rd, rs1, rs2 | Four “signed 8 x unsigned 8” with 32-bit Signed Addition (32 = 32 + 8x8 + 8x8 + 8x8 + 8x8) |
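The three rows of Table 18 are byte-wise dot products accumulated into a 32-bit value. The sketch below models them on RV32. It assumes a wrap-around (non-saturating) accumulation and that SMAQA.SU takes the signed bytes from rs1 and the unsigned bytes from rs2. The names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: byte i of a register as a signed or unsigned value. */
static int32_t sbyte(uint32_t reg, int i) { assert(i >= 0 && i < 4); return (int8_t)(uint8_t)(reg >> (8 * i)); }
static int32_t ubyte(uint32_t reg, int i) { assert(i >= 0 && i < 4); return (uint8_t)(reg >> (8 * i)); }

/* SMAQA: acc + sum of four signed 8x8 products (unsigned math for wrap-around). */
int32_t smaqa(int32_t acc, uint32_t rs1, uint32_t rs2) {
    uint32_t sum = (uint32_t)acc;
    for (int i = 0; i < 4; i++) sum += (uint32_t)(sbyte(rs1, i) * sbyte(rs2, i));
    return (int32_t)sum;
}

/* UMAQA: acc + sum of four unsigned 8x8 products. */
uint32_t umaqa(uint32_t acc, uint32_t rs1, uint32_t rs2) {
    for (int i = 0; i < 4; i++) acc += (uint32_t)(ubyte(rs1, i) * ubyte(rs2, i));
    return acc;
}

/* SMAQA.SU: signed bytes of rs1 times unsigned bytes of rs2 (assumed order). */
int32_t smaqa_su(int32_t acc, uint32_t rs1, uint32_t rs2) {
    uint32_t sum = (uint32_t)acc;
    for (int i = 0; i < 4; i++) sum += (uint32_t)(sbyte(rs1, i) * ubyte(rs2, i));
    return (int32_t)sum;
}
```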
3.3 64-bit Data Computation Instructions
3.3.1 64-bit Add/Subtract Instructions
Table 19. 64-bit Add/Subtract Instructions
No. | Instruction | Description |
1 | ADD64 rd, rs1, rs2 | 64-bit Addition |
2 | RADD64 rd, rs1, rs2 | 64-bit Signed Halving Addition |
3 | URADD64 rd, rs1, rs2 | 64-bit Unsigned Halving Addition |
4 | KADD64 rd, rs1, rs2 | 64-bit Signed Saturating Addition |
5 | UKADD64 rd, rs1, rs2 | 64-bit Unsigned Saturating Addition |
6 | SUB64 rd, rs1, rs2 | 64-bit Subtraction |
7 | RSUB64 rd, rs1, rs2 | 64-bit Signed Halving Subtraction |
8 | URSUB64 rd, rs1, rs2 | 64-bit Unsigned Halving Subtraction |
9 | KSUB64 rd, rs1, rs2 | 64-bit Signed Saturating Subtraction |
10 | UKSUB64 rd, rs1, rs2 | 64-bit Unsigned Saturating Subtraction |
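Table 19 contrasts three behaviours: plain wrap-around, halving (the result is the exact sum or difference shifted right by one), and saturation. The sketch below models the halving and saturating additions and subtractions without a 128-bit type. The names are hypothetical, and the overflow flag is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* RADD64: (a + b) >> 1 computed without overflowing 64 bits.
   Relies on >> sign-extending negative values (checked at run time). */
int64_t radd64(int64_t a, int64_t b) {
    assert((-1 >> 1) == -1);
    return (a >> 1) + (b >> 1) + (a & b & 1);
}

/* URADD64: the unsigned counterpart. */
uint64_t uradd64(uint64_t a, uint64_t b) { return (a >> 1) + (b >> 1) + (a & b & 1); }

/* KADD64 / KSUB64: clamp to [INT64_MIN, INT64_MAX] instead of wrapping. */
int64_t kadd64(int64_t a, int64_t b) {
    if (b > 0 && a > INT64_MAX - b) return INT64_MAX;
    if (b < 0 && a < INT64_MIN - b) return INT64_MIN;
    return a + b;
}
int64_t ksub64(int64_t a, int64_t b) {
    if (b < 0 && a > INT64_MAX + b) return INT64_MAX;
    if (b > 0 && a < INT64_MIN + b) return INT64_MIN;
    return a - b;
}

/* UKADD64 / UKSUB64: clamp to [0, UINT64_MAX]. */
uint64_t ukadd64(uint64_t a, uint64_t b) {
    uint64_t sum = a + b;
    return sum < a ? UINT64_MAX : sum;
}
uint64_t uksub64(uint64_t a, uint64_t b) { return a < b ? 0 : a - b; }
```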
3.3.2 32-bit Multiply with 64-bit Add/Subtract Instructions
Table 20. 32-bit Multiply 64-bit Add/Subtract Instructions
No. | Instruction | Description |
1 | SMAR64 rd, rs1, rs2 | 32x32 with 64-bit Signed Addition |
2 | SMSR64 rd, rs1, rs2 | 32x32 with 64-bit Signed Subtraction |
3 | UMAR64 rd, rs1, rs2 | 32x32 with 64-bit Unsigned Addition |
4 | UMSR64 rd, rs1, rs2 | 32x32 with 64-bit Unsigned Subtraction |
5 | KMAR64 rd, rs1, rs2 | 32x32 with Saturating 64-bit Signed Addition |
6 | KMSR64 rd, rs1, rs2 | 32x32 with Saturating 64-bit Signed Subtraction |
7 | UKMAR64 rd, rs1, rs2 | 32x32 with Saturating 64-bit Unsigned Addition |
8 | UKMSR64 rd, rs1, rs2 | 32x32 with Saturating 64-bit Unsigned Subtraction |
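The instructions in Table 20 multiply two 32-bit words and add the 64-bit product to, or subtract it from, a 64-bit accumulator. The sketch below models a few of them on RV32, with the accumulator passed as a single 64-bit value rather than a register pair. The names are hypothetical, the non-K forms are assumed to wrap, and the overflow flag is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* SMAR64: acc + a*b with wrap-around (unsigned math avoids signed overflow). */
int64_t smar64(int64_t acc, int32_t a, int32_t b) {
    assert((int64_t)UINT64_MAX == -1);   /* wrapping conversion assumed */
    return (int64_t)((uint64_t)acc + (uint64_t)((int64_t)a * b));
}

/* UMAR64: unsigned variant, wraps modulo 2^64. */
uint64_t umar64(uint64_t acc, uint32_t a, uint32_t b) { return acc + (uint64_t)a * b; }

/* KMAR64: signed variant with saturation. */
int64_t kmar64(int64_t acc, int32_t a, int32_t b) {
    int64_t p = (int64_t)a * b;
    if (p > 0 && acc > INT64_MAX - p) return INT64_MAX;
    if (p < 0 && acc < INT64_MIN - p) return INT64_MIN;
    return acc + p;
}

/* UKMAR64 / UKMSR64: unsigned variants with saturation. */
uint64_t ukmar64(uint64_t acc, uint32_t a, uint32_t b) {
    uint64_t sum = acc + (uint64_t)a * b;
    return sum < acc ? UINT64_MAX : sum;
}
uint64_t ukmsr64(uint64_t acc, uint32_t a, uint32_t b) {
    uint64_t p = (uint64_t)a * b;
    return acc < p ? 0 : acc - p;
}
```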
3.3.3 Signed 16-bit Multiply with 64-bit Add/Subtract Instructions
Table 21. Signed 16-bit Multiply 64-bit Add/Subtract Instructions
No. | Instruction | Description |
1 | SMALBB rd, rs1, rs2 | “Bottom 16 x Bottom 16” with 64-bit Signed Addition (64 = 64 + 16x16) |
2 | SMALBT rd, rs1, rs2 | “Bottom 16 x Top 16” with 64-bit Signed Addition (64 = 64 + 16x16) |
3 | SMALTT rd, rs1, rs2 | “Top 16 x Top 16” with 64-bit Signed Addition (64 = 64 + 16x16) |
4 | SMALDA rd, rs1, rs2 | Two “16x16” with 64-bit Signed Double Addition (64 = 64 + 16x16 + 16x16) |
5 | SMALXDA rd, rs1, rs2 | Two Crossed “16x16” with 64-bit Signed Double Addition (64 = 64 + 16x16 + 16x16) |
6 | SMALDS rd, rs1, rs2 | Two “16x16” with 64-bit Signed Addition and Subtraction (64 = 64 + 16x16 - 16x16) |
7 | SMALDRS rd, rs1, rs2 | Two “16x16” with 64-bit Signed Addition and Reversed Subtraction (64 = 64 + 16x16 - 16x16) |
8 | SMALXDS rd, rs1, rs2 | Two Crossed “16x16” with 64-bit Signed Addition and Subtraction (64 = 64 + 16x16 - 16x16) |
9 | SMSLDA rd, rs1, rs2 | Two “16x16” with 64-bit Signed Double Subtraction (64 = 64 - 16x16 - 16x16) |
10 | SMSLXDA rd, rs1, rs2 | Two Crossed “16x16” with 64-bit Signed Double Subtraction (64 = 64 - 16x16 - 16x16) |
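The rows of Table 21 follow the same pattern as the 32-bit accumulating forms, but with a 64-bit accumulator. The sketch below models several of them on RV32, with the accumulator passed as one 64-bit value. It assumes wrap-around accumulation, top-by-bottom pairing for the crossed forms, and top product minus bottom product for SMALDS (reversed for SMALDRS). All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: one 16-bit half as a signed value (0 = bottom, 1 = top). */
static int64_t shalf(uint32_t reg, int idx) {
    assert(idx == 0 || idx == 1);
    return (int16_t)(uint16_t)(reg >> (16 * idx));
}

/* Wrap-around 64-bit addition via unsigned arithmetic. */
static int64_t wrap_add(int64_t acc, int64_t v) {
    return (int64_t)((uint64_t)acc + (uint64_t)v);
}

int64_t smalbb(int64_t acc, uint32_t a, uint32_t b) { return wrap_add(acc, shalf(a, 0) * shalf(b, 0)); }
int64_t smalbt(int64_t acc, uint32_t a, uint32_t b) { return wrap_add(acc, shalf(a, 0) * shalf(b, 1)); }
int64_t smaltt(int64_t acc, uint32_t a, uint32_t b) { return wrap_add(acc, shalf(a, 1) * shalf(b, 1)); }

/* Two products added (straight and crossed pairing). */
int64_t smalda(int64_t acc, uint32_t a, uint32_t b)  { return wrap_add(acc, shalf(a, 1) * shalf(b, 1) + shalf(a, 0) * shalf(b, 0)); }
int64_t smalxda(int64_t acc, uint32_t a, uint32_t b) { return wrap_add(acc, shalf(a, 1) * shalf(b, 0) + shalf(a, 0) * shalf(b, 1)); }

/* One product added, one subtracted (assumed order, reversed for SMALDRS). */
int64_t smalds(int64_t acc, uint32_t a, uint32_t b)  { return wrap_add(acc, shalf(a, 1) * shalf(b, 1) - shalf(a, 0) * shalf(b, 0)); }
int64_t smaldrs(int64_t acc, uint32_t a, uint32_t b) { return wrap_add(acc, shalf(a, 0) * shalf(b, 0) - shalf(a, 1) * shalf(b, 1)); }

/* Both products subtracted. */
int64_t smslda(int64_t acc, uint32_t a, uint32_t b)  { return wrap_add(acc, -(shalf(a, 1) * shalf(b, 1) + shalf(a, 0) * shalf(b, 0))); }
```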
3.4 Non-SIMD Instructions
3.4.1 Q15 Saturation Instructions
Table 22. Non-SIMD Q15 saturation ALU Instructions
No. | Instruction | Description |
1 | KADDH rd, rs1, rs2 | Add with Q15 saturation |
2 | KSUBH rd, rs1, rs2 | Subtract with Q15 saturation |
3 | KHMBB rd, rs1, rs2 | Multiply the first 16-bit Q15 elements of two registers and transform the Q30 result into a saturated Q15 number. |
4 | KHMBT rd, rs1, rs2 | Multiply the first 16-bit Q15 element of one register with the second 16-bit Q15 element of another register and transform the Q30 result into a saturated Q15 number. |
5 | KHMTT rd, rs1, rs2 | Multiply the second 16-bit Q15 elements of two registers and transform the Q30 result into a saturated Q15 number. |
6 | UKADDH rd, rs1, rs2 | Add with I16 saturation |
7 | UKSUBH rd, rs1, rs2 | Subtract with I16 saturation |
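A Q15 number is a 16-bit signed fixed-point value with 15 fractional bits, so the product of two Q15 numbers is a Q30 value that has to be shifted right by 15 to return to Q15. The sketch below models KADDH, KSUBH, and the KHM* multiplies on RV32 under that reading. The names are hypothetical, the result is returned sign-extended to 32 bits, and the overflow flag is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp to the Q15 range [-32768, 32767]. */
static int32_t sat_q15(int64_t v) {
    if (v > 32767) return 32767;
    if (v < -32768) return -32768;
    return (int32_t)v;
}

/* Hypothetical helper: one 16-bit half as a signed value (0 = first, 1 = second). */
static int32_t shalf(uint32_t reg, int idx) {
    assert(idx == 0 || idx == 1);
    return (int16_t)(uint16_t)(reg >> (16 * idx));
}

/* KADDH / KSUBH: 32-bit add or subtract, result saturated to Q15. */
int32_t kaddh(int32_t a, int32_t b) { return sat_q15((int64_t)a + b); }
int32_t ksubh(int32_t a, int32_t b) { return sat_q15((int64_t)a - b); }

/* Q15 x Q15 -> Q30 -> Q15. Only (-1.0) * (-1.0) is out of range. */
static int32_t q15_mul(int32_t x, int32_t y) {
    assert((-1 >> 1) == -1);              /* arithmetic right shift assumed */
    if (x == -32768 && y == -32768) return 32767;
    return (x * y) >> 15;
}

int32_t khmbb(uint32_t a, uint32_t b) { return q15_mul(shalf(a, 0), shalf(b, 0)); }
int32_t khmbt(uint32_t a, uint32_t b) { return q15_mul(shalf(a, 0), shalf(b, 1)); }
int32_t khmtt(uint32_t a, uint32_t b) { return q15_mul(shalf(a, 1), shalf(b, 1)); }
```

For example, 0x4000 is 0.5 in Q15, and this model of KHMBB returns 0x2000 (0.25) for 0.5 times 0.5.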
3.4.2 Q31 Saturation Instructions
Table 23. Non-SIMD Q31 saturation ALU Instructions
No. | Instruction | Description |
1 | KADDW rd, rs1, rs2 | Add with Q31 saturation |
2 | UKADDW rd, rs1, rs2 | Unsigned Add with U32 saturation |
3 | KSUBW rd, rs1, rs2 | Subtract with Q31 saturation |
4 | UKSUBW rd, rs1, rs2 | Unsigned Subtract with U32 saturation |
5 | KDMBB rd, rs1, rs2 | Multiply the first 16-bit Q15 elements of two registers and transform the Q30 result into a saturated Q31 number. |
6 | KDMBT rd, rs1, rs2 | Multiply the first 16-bit Q15 element of one register with the second 16-bit Q15 element of another register and transform the Q30 result into a saturated Q31 number. |
7 | KDMTT rd, rs1, rs2 | Multiply the second 16-bit Q15 elements of two registers and transform the Q30 result into a saturated Q31 number. |
8 | KSLRAW rd, rs1, rs2 | Shift Left Logical with Q31 Saturation or Shift Right Arithmetic |
9 | KSLRAW.u rd, rs1, rs2 | Shift Left Logical with Q31 Saturation or Rounding Shift Right Arithmetic |
10 | KSLLW rd, rs1, rs2 | Saturating Shift Left Logical for 32-bit Word |
11 | KSLLIW rd, rs1, imm5u | Saturating Shift Left Logical Immediate for 32-bit Word |
12 | KDMABB rd, rs1, rs2 | Multiply the first 16-bit Q15 elements of two registers and transform the Q30 result into a saturated Q31 number. Add the Q31 number with a 32-bit accumulator. |
13 | KDMABT rd, rs1, rs2 | Multiply the first 16-bit Q15 element of one register with the second 16-bit Q15 element of another register and transform the Q30 result into a saturated Q31 number. Add the Q31 number with a 32-bit accumulator. |
14 | KDMATT rd, rs1, rs2 | Multiply the second 16-bit Q15 elements of two registers and transform the Q30 result into a saturated Q31 number. Add the Q31 number with a 32-bit accumulator. |
15 | KABSW rd, rs1 | 32-bit Absolute Value (scalar version) |
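The Q31 instructions in Table 23 saturate to the full signed 32-bit range, and the KDM* multiplies turn a Q30 product into Q31 by doubling it. The sketch below models several rows on RV32. The names are hypothetical, KSLLW is assumed to use only the low 5 bits of the shift amount, and the overflow flag is not modelled.

```c
#include <assert.h>
#include <stdint.h>

static int32_t sat32(int64_t v) {
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* KADDW / KSUBW: 32-bit add or subtract with Q31 saturation. */
int32_t kaddw(int32_t a, int32_t b) { return sat32((int64_t)a + b); }
int32_t ksubw(int32_t a, int32_t b) { return sat32((int64_t)a - b); }

/* KDMBB: Q15 x Q15 -> Q30, doubled to Q31. Only (-1.0) * (-1.0) saturates. */
int32_t kdmbb(uint32_t a, uint32_t b) {
    int32_t x = (int16_t)(uint16_t)a;
    int32_t y = (int16_t)(uint16_t)b;
    if (x == -32768 && y == -32768) return INT32_MAX;
    return x * y * 2;
}

/* KDMABB: KDMBB result added to a 32-bit accumulator with saturation. */
int32_t kdmabb(int32_t acc, uint32_t a, uint32_t b) { return sat32((int64_t)acc + kdmbb(a, b)); }

/* KSLLW: saturating left shift. A multiply avoids shifting a negative value. */
int32_t ksllw(int32_t x, unsigned sh) {
    sh &= 31u;
    assert(sh < 32);
    return sat32((int64_t)x * (INT64_C(1) << sh));
}

/* KABSW: absolute value, with |INT32_MIN| saturated to INT32_MAX. */
int32_t kabsw(int32_t x) {
    if (x == INT32_MIN) return INT32_MAX;
    return x < 0 ? -x : x;
}
```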
3.4.3 32-bit Computation Instructions
Table 24. 32-bit Computation Instructions
No. | Instruction | Description |
1 | RADDW rd, rs1, rs2 | 32-bit Signed Halving Addition |
2 | URADDW rd, rs1, rs2 | 32-bit Unsigned Halving Addition |
3 | RSUBW rd, rs1, rs2 | 32-bit Signed Halving Subtraction |
4 | URSUBW rd, rs1, rs2 | 32-bit Unsigned Halving Subtraction |
5 | MULR64 rd, rs1, rs2 | Multiply Word Unsigned to 64-bit data |
6 | MULSR64 rd, rs1, rs2 | Multiply Word Signed to 64-bit data |
7 | MSUBR32 rd, rs1, rs2 | Multiply and Subtract from 32-bit Word |
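The halving and widening-multiply rows of Table 24 can be modelled with a 64-bit intermediate. The sketch below does so on RV32. The names are hypothetical, and the halving forms are assumed to compute the exact 33-bit sum or difference and then drop the lowest bit.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift right by one. Checks that >> sign-extends on this platform. */
static int64_t asr1(int64_t v) {
    assert((-1 >> 1) == -1);
    return v >> 1;
}

/* RADDW / RSUBW: signed halving add and subtract. */
int32_t raddw(int32_t a, int32_t b) { return (int32_t)asr1((int64_t)a + b); }
int32_t rsubw(int32_t a, int32_t b) { return (int32_t)asr1((int64_t)a - b); }

/* URADDW / URSUBW: unsigned halving; bits [32:1] of the 33-bit result. */
uint32_t uraddw(uint32_t a, uint32_t b) { return (uint32_t)(((uint64_t)a + b) >> 1); }
uint32_t ursubw(uint32_t a, uint32_t b) { return (uint32_t)(((uint64_t)a - b) >> 1); }

/* MULR64 / MULSR64: 32x32 -> 64-bit unsigned and signed products. */
uint64_t mulr64(uint32_t a, uint32_t b) { return (uint64_t)a * b; }
int64_t  mulsr64(int32_t a, int32_t b)  { return (int64_t)a * b; }
```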
3.4.4 Overflow/Saturation Status Manipulation Instructions
Table 25. OV (Overflow) flag Set/Clear Instructions
No. | Instruction | Description |
1 | RDOV rd | Read vxsat.OV to rd. |
2 | CLROV | Clear vxsat.OV flag |
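The OV flag is typically used as a sticky indicator: saturating instructions set it, and software reads and clears it explicitly. The sketch below illustrates that usage pattern. The `p_state` struct and function names are hypothetical, placing OV at bit 0 of vxsat is an assumption of this sketch, and a saturating 32-bit add stands in for any instruction that can set the flag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical machine state: only the vxsat register is modelled. */
typedef struct { uint32_t vxsat; } p_state;

#define VXSAT_OV UINT32_C(1)   /* assumption: OV is bit 0 of vxsat */

/* Saturating add that records saturation in vxsat.OV (sticky). */
int32_t kaddw(p_state *s, int32_t a, int32_t b) {
    assert(s != 0);
    int64_t sum = (int64_t)a + b;
    if (sum > INT32_MAX) { s->vxsat |= VXSAT_OV; return INT32_MAX; }
    if (sum < INT32_MIN) { s->vxsat |= VXSAT_OV; return INT32_MIN; }
    return (int32_t)sum;
}

/* RDOV: read the OV flag. */
uint32_t rdov(const p_state *s) { return s->vxsat & VXSAT_OV; }

/* CLROV: clear the OV flag and leave the other bits untouched. */
void clrov(p_state *s) { s->vxsat &= ~VXSAT_OV; }
```

A typical sequence clears the flag, runs a block of saturating arithmetic, and then reads the flag once to learn whether any step saturated.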
3.4.5 Miscellaneous Instructions
Table 26. Non-SIMD Miscellaneous Instructions
No. | Instruction | Description |
1 | AVE rd, rs1, rs2 | Average with rounding |
2 | SRA.u rd, rs1, rs2 | Rounding Shift Right Arithmetic |
3 | SRAI.u rd, rs1, imm5u/imm6u | Rounding Shift Right Arithmetic Immediate |
4 | BITREV rd, rs1, rs2 | Bit Reverse |
5 | BITREVI rd, rs1, imm5u/imm6u | Bit Reverse Immediate |
6 | WEXT rd, rs1, rs2 | Extract 32-bit from a 64-bit value |
7 | WEXTI rd, rs1, imm5u | Extract 32-bit from a 64-bit value Immediate |
8 | CMIX rd, rs2, rs1, rs3 | Conditional Mix |
9 | INSB rd, rs1, imm2u/imm3u | Insert Byte |
10 | MADDR32 rd, rs1, rs2 | Multiply and Add to 32-bit Word |
11 | MSUBR32 rd, rs1, rs2 | Multiply and Subtract from 32-bit Word |
12 | MAX rd, rs1, rs2 | Signed Word Maximum |
13 | MIN rd, rs1, rs2 | Signed Word Minimum |
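Several of the miscellaneous instructions are easiest to understand from a small model. The sketch below covers AVE, SRA.u, BITREV, CMIX, INSB, and WEXT on RV32. It is an illustration under stated assumptions: AVE computes (a + b + 1) >> 1, BITREV reverses bits [msb:0] and zeroes the rest, CMIX uses its first source as the bit selector, and WEXT takes its 64-bit input as one value rather than a register pair. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift right. Checks that >> sign-extends on this platform. */
static int64_t asr64(int64_t v, unsigned n) {
    assert((-1 >> 1) == -1);
    return v >> n;
}

/* AVE: average of two signed words, rounded up on a tie. */
int32_t ave(int32_t a, int32_t b) { return (int32_t)asr64((int64_t)a + b + 1, 1); }

/* SRA.u: arithmetic shift right that rounds using the last bit shifted out. */
int32_t sra_u(int32_t x, unsigned sa) {
    sa &= 31u;
    if (sa == 0) return x;
    return (int32_t)asr64(asr64(x, sa - 1) + 1, 1);
}

/* BITREV: reverse bits [msb:0] of x, zero-extended. */
uint32_t bitrev(uint32_t x, unsigned msb) {
    uint32_t r = 0;
    msb &= 31u;
    for (unsigned i = 0; i <= msb; i++)
        if ((x >> i) & 1u) r |= UINT32_C(1) << (msb - i);
    return r;
}

/* CMIX: per bit, take a where sel is 1 and c where sel is 0. */
uint32_t cmix(uint32_t sel, uint32_t a, uint32_t c) { return (a & sel) | (c & ~sel); }

/* INSB: replace byte idx of rd with the lowest byte of rs1. */
uint32_t insb(uint32_t rd, uint32_t rs1, unsigned idx) {
    unsigned sh = 8u * (idx & 3u);
    return (rd & ~(UINT32_C(0xFF) << sh)) | ((rs1 & 0xFFu) << sh);
}

/* WEXT: extract 32 bits from a 64-bit value starting at bit lsb (0..31). */
uint32_t wext(uint64_t value, unsigned lsb) { return (uint32_t)(value >> (lsb & 31u)); }
```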
RISC-V DSP extension instruction set document: