Customizing the RISC-V Instruction Set with Gem5
Environment setup
Building Gem5 depends on several libraries. On Ubuntu, the following command installs them:
sudo apt install build-essential git m4 scons zlib1g zlib1g-dev libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev python-dev python
Clone the Gem5 repository:
git clone https://github.com/gem5/gem5
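Build with scons. You can choose the ISA model yourself (X86, ARM, RISCV, etc.); the build directory name is case-sensitive and must match the ISA name:
scons build/RISCV/gem5.opt -j16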
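Once the build finishes, do a trial run. The example below uses an X86 build, so build/X86/gem5.opt must have been built as well:
build/X86/gem5.opt configs/tutorial/part1/simple.py
The output should look like the following. If the build or the run fails (typically because the machine runs out of memory), increase Ubuntu's swap space.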
gem5 Simulator System. http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.
gem5 version 21.0.0.0
gem5 compiled May 17 2021 18:05:59
gem5 started May 17 2021 22:05:20
gem5 executing on amarillo, pid 75197
command line: build/X86/gem5.opt configs/tutorial/part1/simple.py
Global frequency set at 1000000000000 ticks per second
warn: No dot file generated. Please install pydot to generate the dot file and pdf.
warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
0: system.remote_gdb: listening for remote gdb on port 7005
Beginning simulation!
info: Entering event queue @ 0. Starting simulation...
Hello world!
Exiting @ tick 490394000 because exiting with last active thread context
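Setting up the RISC-V build environment
Set up the cross-compilation toolchain (on Ubuntu). The Linux cross toolchain is named riscv64-unknown-linux-gnu; note that the plain make used below builds the bare-metal newlib toolchain riscv64-unknown-elf, which is the one invoked later in this article.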
git clone https://github.com/riscv/riscv-gnu-toolchain
sudo apt-get install autoconf automake autotools-dev curl python3 python3-pip libmpc-dev libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev ninja-build git cmake libglib2.0-dev libslirp-dev
cd riscv-gnu-toolchain
git submodule update --init --recursive
Build:
./configure --prefix=/opt/riscv
make -j16
Download and install riscv-tools:
git clone https://github.com/riscv/riscv-tools.git
cd riscv-tools
git submodule update --init --recursive
export RISCV=/path/to/install/riscv/toolchain
./build.sh
With the RISC-V toolchain built, we now add a modulo instruction to the ISA, with the following semantics:
mod r1, r2, r3
Semantics:
R[r1] = R[r2] % R[r3]
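The semantics above can be pinned down with a small reference model. This is only a sketch, assuming that mod behaves like C's % on signed 64-bit operands (the remainder takes the sign of the dividend), which is what the Rs1_sd % Rs2_sd expression used later in the Gem5 decoder computes. The names to_signed64 and mod_reference are hypothetical, and the divide-by-zero behaviour is an assumption, since the source does not define it.

```python
def to_signed64(value: int) -> int:
    """Interpret the low 64 bits of value as a two's-complement integer."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >> 63 else value


def mod_reference(rs1: int, rs2: int) -> int:
    """Truncating remainder, like C's % (Python's own % floors instead)."""
    a, b = to_signed64(rs1), to_signed64(rs2)
    if b == 0:
        # Assumption: the article does not say what mod by zero should do.
        raise ZeroDivisionError("mod by zero is not defined in this sketch")
    magnitude = abs(a) % abs(b)
    return -magnitude if a < 0 else magnitude


print(mod_reference(5, 2))   # 1, the case tested by modulus.c
print(mod_reference(-5, 2))  # -1 (Python's -5 % 2 would give 1)
```

A model like this is handy for generating expected values when testing the instruction with negative operands.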
In riscv-tools, open the file riscv-opcodes/opcodes. Here you can see the opcodes and instruction bits assigned to each instruction.
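The mod line below uses 31..25=1, 14..12=0 and 6..2=0x1A; these fixed fields distinguish it from the existing instructions and make the ISA easy to extend. This is the encoding that yields the MATCH_MOD value 0x200006b used in the next steps.
sra rd rs1 rs2 31..25=32 14..12=5 6..2=0x0C 1..0=3
or rd rs1 rs2 31..25=0 14..12=6 6..2=0x0C 1..0=3
and rd rs1 rs2 31..25=0 14..12=7 6..2=0x0C 1..0=3
mod rd rs1 rs2 31..25=1 14..12=0 6..2=0x1A 1..0=3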
addiw rd rs1 imm12 14..12=0 6..2=0x06 1..0=3
slliw rd rs1 31..25=0 shamtw 14..12=1 6..2=0x06 1..0=3
srliw rd rs1 31..25=0 shamtw 14..12=5 6..2=0x06 1..0=3
sraiw rd rs1 31..25=32 shamtw 14..12=5 6..2=0x06 1..0=3
Run the following command. It concatenates the five opcode files, pipes the result into ./parse-opcodes -c, and saves that program's output to temp.h in your home directory:
cat opcodes-pseudo opcodes opcodes-rvc opcodes-rvc-pseudo opcodes-custom | ./parse-opcodes -c > ~/temp.h
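Open temp.h and you will find the following two lines. They are used to decode the instruction: an instruction word is a mod when (word & MASK_MOD) == MATCH_MOD.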
#define MATCH_MOD 0x200006b
#define MASK_MOD 0xfe00707f
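To see where these two constants come from, the field notation of the opcodes file can be turned into MATCH/MASK values by hand. The sketch below is not the real parse-opcodes script; encode_fields is a hypothetical helper that only handles the hi..lo=value tokens shown in this article. It shows that 0x200006b corresponds to the fields 31..25=1 14..12=0 6..2=0x1A 1..0=3, while the fields 31..25=2 14..12=0 6..2=0x0C 1..0=3 give the 0x4000033 value that appears later for Gem5.

```python
def encode_fields(spec: str) -> tuple[int, int]:
    """Return (match, mask) for a line in riscv-opcodes field notation."""
    match = mask = 0
    for token in spec.split():
        if "=" not in token:
            continue  # names such as mod, rd, rs1, rs2 carry no fixed bits
        bit_range, value = token.split("=")
        hi, lo = (int(x) for x in bit_range.split(".."))
        field_mask = ((1 << (hi - lo + 1)) - 1) << lo
        match |= (int(value, 0) << lo) & field_mask
        mask |= field_mask
    return match, mask


match, mask = encode_fields("mod rd rs1 rs2 31..25=1 14..12=0 6..2=0x1A 1..0=3")
print(hex(match), hex(mask))  # 0x200006b 0xfe00707f

match, mask = encode_fields("mod rd rs1 rs2 31..25=2 14..12=0 6..2=0x0C 1..0=3")
print(hex(match), hex(mask))  # 0x4000033 0xfe00707f
```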
In riscv-gnu-toolchain, open the file binutils/opcodes/riscv-opc.c and add your instruction to the opcode table:
const struct riscv_opcode riscv_opcodes[] =
{
/* name, xlen, isa, operands, match, mask, match_func, pinfo. */
{"mod", 0, INSN_CLASS_I, "d,s,t", MATCH_MOD, MASK_MOD, match_opcode, 0 },
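This line defines the instruction name, its operands, and the MATCH/MASK constants. The two constants are defined in the next step and are used to recognize the instruction.
Next, open binutils/include/opcode/riscv-opc.h and add the MATCH/MASK definitions: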
#define MATCH_MOD 0x200006b
#define MASK_MOD 0xfe00707f
In the section where instructions are declared, add the declaration:
DECLARE_INSN(mod,MATCH_MOD,MASK_MOD)
The custom instruction is now added, so we can rebuild the toolchain:
make clean
./configure --prefix=/opt/riscv
make -j16
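Compiling a program
Although the instruction has been added, it exists only in the RISC-V assembler, not in the C language itself. The example program therefore has to emit it through inline assembly embedded in ordinary C code. Create a .c file in the riscv-gnu-toolchain directory. This is modulus.c: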
#include <stdio.h>
int main(){
int a,b,c;
a = 5;
b = 2;
asm volatile
(
"mod %[z], %[x], %[y]\n\t"
: [z] "=r" (c)
: [x] "r" (a), [y] "r" (b)
);
if ( c != 1 ){
printf("\n[[FALSE]]\n");
return -1;
}
printf("\n[[TRUE]]\n");
return 0;
}
Then run source ~/.bashrc to refresh the terminal environment.
Now we can use the new RISC-V instruction:
/opt/riscv/bin/riscv64-unknown-elf-gcc modulus.c -o modulus.o
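If this succeeds, you will see a file named modulus.o. Despite the .o suffix it is a fully linked executable, because the command has no -c flag. It is a binary, so we cannot read it directly; disassemble it to take a look: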
/opt/riscv/bin/riscv64-unknown-elf-objdump -D modulus.o
000000000001019c <main>:
1019c: 1101 addi sp,sp,-32
1019e: ec06 sd ra,24(sp)
101a0: e822 sd s0,16(sp)
101a2: 1000 addi s0,sp,32
101a4: 4795 li a5,5
101a6: fef42623 sw a5,-20(s0)
101aa: 4789 li a5,2
101ac: fef42423 sw a5,-24(s0)
101b0: fec42783 lw a5,-20(s0)
101b4: fe842703 lw a4,-24(s0)
101b8: 02e787eb mod a5,a5,a4
101bc: fef42223 sw a5,-28(s0)
101c0: fe442783 lw a5,-28(s0)
101c4: 0007871b sext.w a4,a5
101c8: 4785 li a5,1
101ca: 00f70963 beq a4,a5,101dc <main+0x40>
101ce: 67c9 lui a5,0x12
101d0: 60878513 addi a0,a5,1544 # 12608 <__errno+0x8>
101d4: 392000ef jal 10566 <puts>
101d8: 57fd li a5,-1
101da: a039 j 101e8 <main+0x4c>
101dc: 67c9 lui a5,0x12
101de: 61878513 addi a0,a5,1560 # 12618 <__errno+0x18>
101e2: 384000ef jal 10566 <puts>
101e6: 4781 li a5,0
101e8: 853e mv a0,a5
101ea: 60e2 ld ra,24(sp)
101ec: 6442 ld s0,16(sp)
101ee: 6105 addi sp,sp,32
101f0: 8082 ret
We can see that the instruction word 02e787eb is correctly disassembled as our mod instruction. Now let's run it.
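The disassembler's result can be double-checked by splitting the word 02e787eb into the standard R-type fields. This is a minimal sketch; decode_rtype and the ABI_NAMES table are helpers written for this illustration, not part of binutils.

```python
MATCH_MOD = 0x200006B
MASK_MOD = 0xFE00707F

# ABI names of integer registers x0..x31.
ABI_NAMES = (
    "zero ra sp gp tp t0 t1 t2 s0 s1 a0 a1 a2 a3 a4 a5 a6 a7 "
    "s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 t3 t4 t5 t6"
).split()


def decode_rtype(word: int) -> dict:
    """Split a 32-bit R-type instruction word into its fields."""
    return {
        "opcode": word & 0x7F,
        "rd": ABI_NAMES[(word >> 7) & 0x1F],
        "funct3": (word >> 12) & 0x7,
        "rs1": ABI_NAMES[(word >> 15) & 0x1F],
        "rs2": ABI_NAMES[(word >> 20) & 0x1F],
        "funct7": (word >> 25) & 0x7F,
    }


word = 0x02E787EB
print(decode_rtype(word))
# {'opcode': 107, 'rd': 'a5', 'funct3': 0, 'rs1': 'a5', 'rs2': 'a4', 'funct7': 1}
print((word & MASK_MOD) == MATCH_MOD)  # True: the word matches mod
```

The fields agree with the listing: mod a5,a5,a4 with opcode 0x6b (107) and funct7 = 1.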
Running the custom instruction in Gem5
First, in riscv-gnu-toolchain, compile modulus.c into an ELF binary for the RISC-V architecture:
/opt/riscv/bin/riscv64-unknown-elf-gcc -o modulus.elf modulus.c
Then check it:
file modulus.elf
If the result looks like the following, the binary is correct:
modulus.elf: ELF 64-bit LSB executable, UCB RISC-V, RVC, double-float ABI, version 1 (SYSV), statically linked, not stripped
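The key facts reported by file (64-bit, little-endian, executable, RISC-V) live in the first 20 bytes of the ELF header, so they can also be checked from a script before handing the binary to Gem5. This is a sketch under stated assumptions: describe_elf is a hypothetical helper, the field offsets and EM_RISCV = 243 come from the ELF specification, and the demonstration uses a synthetic header so that it runs without the real binary.

```python
import struct

EM_RISCV = 243  # e_machine value assigned to RISC-V
ET_EXEC = 2     # e_type value for an executable file


def describe_elf(header: bytes) -> dict:
    """Summarize the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    little_endian = header[5] == 1           # EI_DATA
    order = "<" if little_endian else ">"
    e_type, e_machine = struct.unpack_from(order + "HH", header, 16)
    return {
        "bits": 64 if header[4] == 2 else 32,  # EI_CLASS
        "little_endian": little_endian,
        "executable": e_type == ET_EXEC,
        "riscv": e_machine == EM_RISCV,
    }


# Synthetic header standing in for modulus.elf (an assumption for the demo).
fake = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", ET_EXEC, EM_RISCV)
print(describe_elf(fake))
# {'bits': 64, 'little_endian': True, 'executable': True, 'riscv': True}
```

For the real file, pass the first 20 bytes read in binary mode, for example open("modulus.elf", "rb").read(20).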
In Gem5, add the instruction to src/arch/riscv/isa/decoder.isa (this step is critical):
0x0c: decode FUNCT3 { // 0x0c: the opcode
format ROp {
0x0: decode KFUNCT5 { // 0x0: the FUNCT3 value; KFUNCT5: a 5-bit function code field
0x00: decode BS {
0x0: add({{
Rd = rvSext(Rs1_sd + Rs2_sd);
}});
0x1: sub({{
Rd = rvSext(Rs1_sd - Rs2_sd);
}});
}
0x2: mod({{ // 0x2: the funct7 value chosen for mod
Rd = Rs1_sd % Rs2_sd; // the added part
}});
// Following the online tutorials, the opcode here would be 0x33:
0x33: decode FUNCT3 {
format ROp {
0x0: decode FUNCT7 {
0x1: mul(, IntMultOp);
}
// However, although the opcode is 0x33, Gem5 shifts out the low two bits, which are always 0x3, so the opcode seen in the simulator is 0x0c (0x33 >> 2).
0x01: decode BS {
0x0: mul({{
Rd = rvSext(Rs1_sd * Rs2_sd);
}}, IntMultOp);
}
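Every earlier step must also be changed to this encoding and the toolchain rebuilt before simulating again (this can be skipped if you used this encoding from the start):
mod rd rs1 rs2 31..25=2 14..12=0 6..2=0x0C 1..0=3
#define MATCH_MOD 0x4000033
#define MASK_MOD 0xfe00707f
{"mod", 0, INSN_CLASS_I, "d,s,t", MATCH_MOD, MASK_MOD, match_opcode, 0 },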
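The path that the value 0x4000033 takes through the decoder entry above can be traced with plain bit extraction. This is a sketch under assumptions: the bit positions used for KFUNCT5 (29..25) and BS (31..30) reflect my reading of Gem5's RISC-V bitfield definitions and should be checked against src/arch/riscv/isa/bitfields.isa in your Gem5 version; the helper bits is hypothetical.

```python
def bits(word: int, hi: int, lo: int) -> int:
    """Extract the inclusive bit range hi..lo from word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


word = 0x4000033  # MATCH_MOD, i.e. mod with rd = rs1 = rs2 = x0

print(hex(bits(word, 6, 0)))    # 0x33: the full 7-bit opcode
print(hex(bits(word, 6, 2)))    # 0xc:  the opcode as the Gem5 decoder sees it
print(hex(bits(word, 14, 12)))  # 0x0:  FUNCT3
print(hex(bits(word, 29, 25)))  # 0x2:  KFUNCT5 (assumed to be bits 29..25)
print(hex(bits(word, 31, 30)))  # 0x0:  BS (assumed to be bits 31..30)
```

Under these assumptions the word selects 0x0c, then FUNCT3 0x0, then KFUNCT5 0x2, which is exactly the slot where mod was added.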
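Then adapt the hello world script from the Gem5 tutorial yourself. The modulus.elf binary generated earlier must be placed in the directory the script points to. Here is an example: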
import os
import m5
from m5.objects import *
system = System()
system.clk_domain = SrcClockDomain()
system.clk_domain.clock = "1GHz"
system.clk_domain.voltage_domain = VoltageDomain()
system.mem_mode = "timing"
system.mem_ranges = [AddrRange("512MB")]
system.cpu = RiscvTimingSimpleCPU()
system.membus = SystemXBar()
system.cpu.icache_port = system.membus.cpu_side_ports
system.cpu.dcache_port = system.membus.cpu_side_ports
system.cpu.createInterruptController()
system.mem_ctrl = MemCtrl()
system.mem_ctrl.dram = DDR3_1600_8x8()
system.mem_ctrl.dram.range = system.mem_ranges[0]
system.mem_ctrl.port = system.membus.mem_side_ports
system.system_port = system.membus.cpu_side_ports
thispath = os.path.dirname(os.path.realpath(__file__))
modulus = os.path.join(
thispath,
"../../../",
"configs/learning_gem5/part1/modulus.elf",
)
system.workload = SEWorkload.init_compatible(modulus)
process = Process()
process.cmd = [modulus]
system.cpu.workload = process
system.cpu.createThreads()
root = Root(full_system=False, system=system)
m5.instantiate()
print("Beginning simulation!")
exit_event = m5.simulate()
print(f"Exiting @ tick {m5.curTick()} because {exit_event.getCause()}")
Then rebuild Gem5 so that the decoder change takes effect, and run the script. The final result is shown below:
parallels@ubuntu-linux-22-04-02-desktop:~/Desktop/gem5$ build/RISCV/gem5.opt configs/learning_gem5/part1/simple-riscv.py
gem5 Simulator System. https://www.gem5.org
gem5 is copyrighted software; use the --copyright option for details.
gem5 version 23.1.0.0
gem5 compiled May 27 2024 18:21:34
gem5 started May 27 2024 18:23:32
gem5 executing on ubuntu-linux-22-04-02-desktop, pid 1250271
command line: build/RISCV/gem5.opt configs/learning_gem5/part1/simple-riscv.py
Global frequency set at 1000000000000 ticks per second
warn: No dot file generated. Please install pydot to generate the dot file and pdf.
src/mem/dram_interface.cc:690: warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
src/arch/riscv/isa.cc:275: info: RVV enabled, VLEN = 256 bits, ELEN = 64 bits
src/arch/riscv/linux/se_workload.cc:60: warn: Unknown operating system; assuming Linux.
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
system.remote_gdb: Listening for connections on port 7001
Beginning simulation!
src/sim/simulate.cc:199: info: Entering event queue@ 0. Starting simulation...
[[TRUE]]
Exiting @ tick 165023000 because exiting with last active thread context
Debugging
1. Add --debug-flags=Decode,Exec,Fetch when running; the option goes after gem5.opt and before the configuration script.
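The placement of the option matters, because anything after the script name is passed to the script rather than to Gem5. The sketch below only assembles and prints the command line; gem5_command is a hypothetical helper, and the binary and script paths are the ones from the run shown earlier.

```python
import shlex


def gem5_command(binary: str, script: str, debug_flags=()) -> list[str]:
    """Build a gem5 command line with the debug flags before the script."""
    cmd = [binary]
    if debug_flags:
        cmd.append("--debug-flags=" + ",".join(debug_flags))
    cmd.append(script)
    return cmd


cmd = gem5_command(
    "build/RISCV/gem5.opt",
    "configs/learning_gem5/part1/simple-riscv.py",
    debug_flags=("Decode", "Exec", "Fetch"),
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run from the Gem5 source directory if you want to drive the simulation from a script.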
References
https://junningwu.haawking.com/tech/2019/11/28/%E4%BD%BF%E7%94%A8Gem5%E8%87%AA%E5%AE%9A%E4%B9%89RISC-V%E6%8C%87%E4%BB%A4%E9%9B%86-%E6%8C%81%E7%BB%AD%E6%9B%B4%E6%96%B0/
https://fleker.medium.com/extending-gem5-with-custom-risc-v-commands-653eeefe83b8
https://nitish2112.github.io/post/adding-instruction-riscv/
https://pcotret.gitlab.io/riscv-custom/sw_toolchain.html#id6