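Preface
In the previous post we implemented the add, subtract, and compare instructions. This post implements the multiply instructions.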
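MUL instruction format (rs × rt => rd; only the low 32 bits are kept):
OP      rs     rt     rd     shamt  FUNC
011100  xxxxx  xxxxx  xxxxx  00000  000010
MULT instruction format (rs × rt => {HI,LO}; the high 32 bits of the result go to the HI register and the low 32 bits to the LO register):
OP      rs     rt     rd     shamt  FUNC
000000  xxxxx  xxxxx  00000  00000  011000
MULTU instruction format (rs × rt => {HI,LO}; unsigned multiply, whereas the two instructions above are signed):
OP      rs     rt     rd     shamt  FUNC
000000  xxxxx  xxxxx  00000  00000  011001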
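The three encodings above differ only in a few fixed bit fields, so recognizing them comes down to slicing the instruction word. Below is a minimal standalone sketch of that check. It is not the ID module of this series: the module name mul_decode_sketch, its output names, and the small testbench are hypothetical and exist only for illustration.

```verilog
// Hypothetical helper (not part of the article's design):
// flags whether a 32-bit instruction word is MUL, MULT or MULTU.
module mul_decode_sketch(
    input  [31:0] inst,
    output        is_mul,
    output        is_mult,
    output        is_multu
);
    wire [5:0] op    = inst[31:26];   // opcode, top 6 bits
    wire [4:0] rd    = inst[15:11];   // destination register field
    wire [4:0] shamt = inst[10:6];    // shift amount field (zero for all three)
    wire [5:0] func  = inst[5:0];     // function code, low 6 bits

    // MULT and MULTU share opcode 000000 and have rd = shamt = 0
    wire special_hilo = (op == 6'b000000) && (rd == 5'd0) && (shamt == 5'd0);

    // MUL uses opcode 011100 with func 000010
    assign is_mul   = (op == 6'b011100) && (shamt == 5'd0) && (func == 6'b000010);
    assign is_mult  = special_hilo && (func == 6'b011000);
    assign is_multu = special_hilo && (func == 6'b011001);
endmodule

// Small testbench: prints the flags for three example instruction words.
module mul_decode_tb;
    reg  [31:0] inst;
    wire        is_mul, is_mult, is_multu;

    mul_decode_sketch dut(
        .inst(inst), .is_mul(is_mul), .is_mult(is_mult), .is_multu(is_multu)
    );

    initial begin
        inst = 32'h70221802; #1;   // MUL   (rs = 1, rt = 2, rd = 3)
        $display("%h mul=%b mult=%b multu=%b", inst, is_mul, is_mult, is_multu);
        inst = 32'h00220018; #1;   // MULT  (rs = 1, rt = 2)
        $display("%h mul=%b mult=%b multu=%b", inst, is_mul, is_mult, is_multu);
        inst = 32'h00220019; #1;   // MULTU (rs = 1, rt = 2)
        $display("%h mul=%b mult=%b multu=%b", inst, is_mul, is_mult, is_multu);
        $finish;
    end
endmodule
```

Running mul_decode_tb in any Verilog simulator should raise exactly one flag per word. In the actual design the same field checks are spread across the op and func case statements of the ID module.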
Instruction definitions
`define EXE_MUL         6'b000010      // rs × rt -> rd (low 32 bits of the result)
`define EXE_MULT        6'b011000      // rs × rt -> {HI,LO}
`define EXE_MULTU       6'b011001      // rs × rt -> {HI,LO}, unsigned
`define EXE_MULT_FUNC   8'b00_011000
`define EXE_MULTU_FUNC  8'b00_011001
`define EXE_MUL_FUNC    8'b01_000010
Modifying the ID module
`include "defines.v"
// Decode stage: decodes the instruction passed in from if_id and separates the operands from the opcode
module id(
    input rst,
    input [`InstAddrBus] pc,
    input [`InstDataBus] inst,
    // general-purpose register read ports
    input [`RegDataBus] reg1_data,
    input [`RegDataBus] reg2_data,
    output reg reg1_rden,
    output reg reg2_rden,
    output reg [`RegAddrBus] reg1_addr,    // address of source operand 1
    output reg [`RegAddrBus] reg2_addr,    // address of source operand 2
    // values sent to the EX stage
    output reg [`RegDataBus] reg1,         // source operand 1, 32 bits
    output reg [`RegDataBus] reg2,         // source operand 2, 32 bits
    output reg reg_wb,                     // destination register write-back flag
    output reg [`RegAddrBus] reg_wb_addr,  // destination register address
    output reg [`AluOpBus] aluop,          // operation code
    // hazard with the adjacent instruction: data forwarded from the EX stage
    input ex_wr_en,                        // whether the instruction in EX writes a destination register
    input [`RegDataBus] ex_wr_data,
    input [`RegAddrBus] ex_wr_addr,
    // hazard with the instruction one slot further back: data forwarded from the MEM stage
    input mem_wr_en,                       // whether the instruction in MEM writes a destination register
    input [`RegDataBus] mem_wr_data,
    input [`RegAddrBus] mem_wr_addr
);

    wire [5:0] op    = inst[31:26];  // opcode: top 6 bits of the instruction
    wire [5:0] func  = inst[5:0];    // function code: low 6 bits, selects the instruction type
    wire [4:0] shmat = inst[10:6];   // some shifts take the shift amount from this field instead of a register
    reg [`RegDataBus] imm;           // immediate value

    always @ (*) begin
        if (rst) begin
            reg1_rden   <= 1'd0;
            reg2_rden   <= 1'd0;
            reg1_addr   <= 5'd0;
            reg2_addr   <= 5'd0;
            imm         <= 32'd0;
            reg_wb      <= 1'd0;
            reg_wb_addr <= 5'd0;
            aluop       <= 7'd0;
        end else begin
            reg1_rden   <= 1'd0;
            reg2_rden   <= 1'd0;
            reg1_addr   <= inst[25:21];  // by default, operand 1 address comes from the rs field
            reg2_addr   <= inst[20:16];  // by default, operand 2 address comes from the rt field
            imm         <= 32'd0;
            reg_wb      <= 1'd0;
            reg_wb_addr <= inst[15:11];  // by default, the result goes to rd
            aluop       <= 7'd0;         // operation type
            if (op == `EXE_SPECIAL) begin
                reg1_rden <= 1'd1;
                reg2_rden <= 1'd1;
                reg_wb    <= 1'd1;
                case (func)
                    `EXE_AND:  begin aluop <= `EXE_AND_FUNC;  end
                    `EXE_OR:   begin aluop <= `EXE_OR_FUNC;   end
                    `EXE_XOR:  begin aluop <= `EXE_XOR_FUNC;  end
                    `EXE_NOR:  begin aluop <= `EXE_NOR_FUNC;  end
                    `EXE_SLLV: begin aluop <= `EXE_SLL_FUNC;  end
                    `EXE_SRLV: begin aluop <= `EXE_SRLV_FUNC; end
                    `EXE_SRAV: begin aluop <= `EXE_SRAV_FUNC; end
                    `EXE_SLL: begin
                        reg1_rden <= 1'd0;
                        imm[4:0]  <= shmat;
                        aluop     <= `EXE_SLL_FUNC;
                    end
                    `EXE_SRL: begin
                        reg1_rden <= 1'd0;
                        imm[4:0]  <= shmat;
                        aluop     <= `EXE_SRL_FUNC;
                    end
                    `EXE_SRA: begin
                        reg1_rden <= 1'd0;
                        imm[4:0]  <= shmat;
                        aluop     <= `EXE_SRA_FUNC;
                    end
                    `EXE_MOVN: begin
                        if (reg2 == 32'd0) begin
                            reg_wb <= 1'b0;
                        end else begin
                            reg_wb <= 1'b1;
                            aluop  <= `EXE_MOVN_FUNC;
                        end
                    end
                    `EXE_MOVZ: begin
                        if (reg2 == 32'd0) begin
                            reg_wb <= 1'b1;
                            aluop  <= `EXE_MOVZ_FUNC;
                        end else begin
                            reg_wb <= 1'b0;
                        end
                    end
                    `EXE_MFHI: begin
                        reg1_rden <= 1'b0;
                        reg2_rden <= 1'b0;
                        aluop     <= `EXE_MFHI_FUNC;
                    end
                    `EXE_MFLO: begin
                        reg1_rden <= 1'b0;
                        reg2_rden <= 1'b0;
                        aluop     <= `EXE_MFLO_FUNC;
                    end
                    `EXE_MTHI: begin
                        reg2_rden <= 1'b0;
                        reg_wb    <= 1'b0;
                        aluop     <= `EXE_MTHI_FUNC;
                    end
                    `EXE_MTLO: begin
                        reg2_rden <= 1'b0;
                        reg_wb    <= 1'b0;
                        aluop     <= `EXE_MTLO_FUNC;
                    end
                    `EXE_ADD:  begin aluop <= `EXE_ADD_FUNC;  end
                    `EXE_ADDU: begin aluop <= `EXE_ADDU_FUNC; end
                    `EXE_SUB:  begin aluop <= `EXE_SUB_FUNC;  end
                    `EXE_SUBU: begin aluop <= `EXE_SUBU_FUNC; end
                    `EXE_SLT:  begin aluop <= `EXE_SLT_FUNC;  end
                    `EXE_SLTU: begin aluop <= `EXE_SLTU_FUNC; end
                    `EXE_MULT: begin
                        reg_wb <= 1'b0;  // the result goes to HI/LO, not to a general-purpose register
                        aluop  <= `EXE_MULT_FUNC;
                    end
                    `EXE_MULTU: begin
                        reg_wb <= 1'b0;  // the result goes to HI/LO, not to a general-purpose register
                        aluop  <= `EXE_MULTU_FUNC;
                    end
                    default: aluop <= 7'd0;
                endcase
            end else if (op == `EXE_SPECIAL2) begin
                // instructions with opcode 011100, selected by the FUNC field
                reg1_rden <= 1'd1;
                reg2_rden <= 1'd1;
                reg_wb    <= 1'd1;
                case (func)
                    `EXE_CLZ: begin aluop <= `EXE_CLZ_FUNC; end
                    `EXE_CLO: begin aluop <= `EXE_CLO_FUNC; end
                    `EXE_MUL: begin aluop <= `EXE_MUL_FUNC; end
                    default:  begin aluop <= 7'd0; end
                endcase
            end else begin
                reg1_rden   <= 1'd1;  // operand 1 is read from the rs register
                reg2_rden   <= 1'd0;  // operand 2 is not read from the rt register
                imm         <= {16'h0, inst[15:0]};
                reg_wb      <= 1'd1;
                reg_wb_addr <= inst[20:16];
                case (op)
                    `EXE_ORI: begin
                        // OR immediate: rs is operand 1, imm is operand 2, the result goes to rt
                        aluop <= `EXE_ORI_OP;
                    end
                    `EXE_ANDI: begin aluop <= `EXE_ANDI_OP; end
                    `EXE_XORI: begin aluop <= `EXE_XORI_OP; end
                    `EXE_LUI: begin
                        reg1_rden <= 1'b0;
                        aluop     <= `EXE_LUI_OP;
                    end
                    `EXE_ADDI:  begin aluop <= `EXE_ADDI_OP;  end
                    `EXE_ADDIU: begin aluop <= `EXE_ADDIU_OP; end
                    `EXE_SLTI:  begin aluop <= `EXE_SLTI_OP;  end
                    `EXE_SLTIU: begin aluop <= `EXE_SLTIU_OP; end
                    default: aluop <= 7'd0;
                endcase
            end
        end
    end

    always @ (*) begin
        if (rst) begin
            reg1 <= 32'd0;
        end else if (reg1_rden == 1'd1 && ex_wr_en == 1'd1 && ex_wr_addr == reg1_addr) begin
            // forwarding from the EX stage
            reg1 <= ex_wr_data;
        end else if (reg1_rden == 1'd1 && mem_wr_en == 1'd1 && mem_wr_addr == reg1_addr) begin
            // forwarding from the MEM stage
            reg1 <= mem_wr_data;
        end else if (reg1_rden == 1'd1) begin
            // operand comes from the general-purpose register file
            reg1 <= reg1_data;
        end else if (reg1_rden == 1'd0) begin
            // operand comes from the instruction itself
            reg1 <= imm;
        end else begin
            reg1 <= 32'd0;
        end
    end

    always @ (*) begin
        if (rst) begin
            reg2 <= 32'd0;
        end else if (reg2_rden == 1'd1 && ex_wr_en == 1'd1 && ex_wr_addr == reg2_addr) begin
            // forwarding from the EX stage
            reg2 <= ex_wr_data;
        end else if (reg2_rden == 1'd1 && mem_wr_en == 1'd1 && mem_wr_addr == reg2_addr) begin
            // forwarding from the MEM stage
            reg2 <= mem_wr_data;
        end else if (reg2_rden == 1'd1) begin
            // operand comes from the general-purpose register file
            reg2 <= reg2_data;
        end else if (reg2_rden == 1'd0) begin
            // operand comes from the instruction itself
            reg2 <= imm;
        end else begin
            reg2 <= 32'd0;
        end
    end
endmodule
As in the previous posts, this module only passes along our custom aluop code; the instruction itself is handled centrally in the EX module.
Modifying the EX module
`include "defines.v"
// Execute stage: computes the result from the operation code and operands produced by the decode stage
module ex(
    input rst,
    input [`AluOpBus] aluop,
    input [`RegDataBus] reg1,
    input [`RegDataBus] reg2,
    input reg_wb_i,
    input [`RegAddrBus] reg_wb_addr_i,
    output reg reg_wb_o,
    output reg [`RegAddrBus] reg_wb_addr_o,
    output reg [`RegDataBus] reg_wb_data,  // data written back to the destination register
    // HI/LO registers
    input [`RegDataBus] hi_reg_i,          // value read from HI
    input [`RegDataBus] lo_reg_i,          // value read from LO
    output reg [`RegDataBus] hi_reg_o,     // value to write to HI
    output reg [`RegDataBus] lo_reg_o,     // value to write to LO
    output reg hi_wren,                    // HI write enable
    output reg lo_wren,                    // LO write enable
    // HI/LO forwarding
    input [`RegDataBus] wb_hi_i,
    input [`RegDataBus] wb_lo_i,
    input wb_hi_wren_i,                    // an instruction writes HI; forwarded from WB (one instruction in between)
    input wb_lo_wren_i,                    // an instruction writes LO; forwarded from WB (one instruction in between)
    input [`RegDataBus] mem_hi_i,
    input [`RegDataBus] mem_lo_i,
    input mem_hi_wren_i,                   // an instruction writes HI; forwarded from MEM (previous instruction)
    input mem_lo_wren_i                    // an instruction writes LO; forwarded from MEM (previous instruction)
);

    wire [31:0] mfhi_res = mem_hi_wren_i ? mem_hi_i : wb_hi_wren_i ? wb_hi_i : hi_reg_i;
    wire [31:0] mflo_res = mem_lo_wren_i ? mem_lo_i : wb_lo_wren_i ? wb_lo_i : lo_reg_i;

    // instructions whose second operand must be negated (two's complement)
    wire [31:0] reg2_mux = ((aluop==`EXE_SUB_FUNC)||(aluop==`EXE_SUBU_FUNC)||(aluop==`EXE_SLT_FUNC)) ? (~reg2+1) : reg2;
    wire [31:0] res = reg1 + reg2_mux;

    // Overflow check for add/subtract (subtraction is turned into addition).
    // Overflow flag: positive + positive = negative, or negative + negative = positive.
    wire of = ((!reg1[31]&&!reg2_mux[31]&&res[31])||(reg1[31]&&reg2_mux[31]&&!res[31]));

    // clz and clo results, 0 to 32
    wire [31:0] clz_res = reg1[31] ? 0  : reg1[30] ? 1  : reg1[29] ? 2  : reg1[28] ? 3  :
                          reg1[27] ? 4  : reg1[26] ? 5  : reg1[25] ? 6  : reg1[24] ? 7  :
                          reg1[23] ? 8  : reg1[22] ? 9  : reg1[21] ? 10 : reg1[20] ? 11 :
                          reg1[19] ? 12 : reg1[18] ? 13 : reg1[17] ? 14 : reg1[16] ? 15 :
                          reg1[15] ? 16 : reg1[14] ? 17 : reg1[13] ? 18 : reg1[12] ? 19 :
                          reg1[11] ? 20 : reg1[10] ? 21 : reg1[9]  ? 22 : reg1[8]  ? 23 :
                          reg1[7]  ? 24 : reg1[6]  ? 25 : reg1[5]  ? 26 : reg1[4]  ? 27 :
                          reg1[3]  ? 28 : reg1[2]  ? 29 : reg1[1]  ? 30 : reg1[0]  ? 31 : 32;
    wire [31:0] reg1_i = ~reg1;
    wire [31:0] clo_res = reg1_i[31] ? 0  : reg1_i[30] ? 1  : reg1_i[29] ? 2  : reg1_i[28] ? 3  :
                          reg1_i[27] ? 4  : reg1_i[26] ? 5  : reg1_i[25] ? 6  : reg1_i[24] ? 7  :
                          reg1_i[23] ? 8  : reg1_i[22] ? 9  : reg1_i[21] ? 10 : reg1_i[20] ? 11 :
                          reg1_i[19] ? 12 : reg1_i[18] ? 13 : reg1_i[17] ? 14 : reg1_i[16] ? 15 :
                          reg1_i[15] ? 16 : reg1_i[14] ? 17 : reg1_i[13] ? 18 : reg1_i[12] ? 19 :
                          reg1_i[11] ? 20 : reg1_i[10] ? 21 : reg1_i[9]  ? 22 : reg1_i[8]  ? 23 :
                          reg1_i[7]  ? 24 : reg1_i[6]  ? 25 : reg1_i[5]  ? 26 : reg1_i[4]  ? 27 :
                          reg1_i[3]  ? 28 : reg1_i[2]  ? 29 : reg1_i[1]  ? 30 : reg1_i[0]  ? 31 : 32;

    // reg1 < reg2: two's complement comparison for the signed variants
    wire reg1_lt_reg2 = ((aluop == `EXE_SLT_FUNC) || (aluop == `EXE_SLTI_OP)) ?
                        ((reg1[31] && !reg2[31]) ||
                         (!reg1[31] && !reg2[31] && res[31]) ||
                         (reg1[31] && reg2[31] && res[31])) :
                        (reg1 < reg2);

    // Multiplication
    wire mul_signed = (aluop == `EXE_MUL_FUNC) || (aluop == `EXE_MULT_FUNC);
    wire [31:0] mul_reg1 = (mul_signed && reg1[31]) ? (~reg1 + 1) : reg1;  // magnitude of a negative operand
    wire [31:0] mul_reg2 = (mul_signed && reg2[31]) ? (~reg2 + 1) : reg2;
    wire [63:0] mul_temp = mul_reg1 * mul_reg2;                            // multiply the magnitudes
    // A signed product is negative when the ORIGINAL operands differ in sign.
    // The sign bits of mul_reg1/mul_reg2 cannot be used here: they were cleared by the conversion above.
    wire [63:0] mul_res = (mul_signed && (reg1[31] ^ reg2[31])) ? (~mul_temp + 1) : mul_temp;

    always @ (*) begin
        if (rst) begin
            reg_wb_o      <= 1'd0;
            reg_wb_addr_o <= 5'd0;
            reg_wb_data   <= 32'd0;
            hi_reg_o      <= 32'd0;
            lo_reg_o      <= 32'd0;
            hi_wren       <= 1'b0;
            lo_wren       <= 1'b0;
        end else begin
            reg_wb_o      <= reg_wb_i;
            reg_wb_addr_o <= reg_wb_addr_i;
            reg_wb_data   <= 32'd0;
            hi_wren       <= 1'b0;
            lo_wren       <= 1'b0;
            hi_reg_o      <= 32'd0;
            lo_reg_o      <= 32'd0;
            case (aluop)
                `EXE_ORI_OP,`EXE_OR_FUNC: begin
                    reg_wb_data <= reg1 | reg2;
                end
                `EXE_ANDI_OP,`EXE_AND_FUNC: begin
                    reg_wb_data <= reg1 & reg2;
                end
                `EXE_XORI_OP,`EXE_XOR_FUNC: begin
                    reg_wb_data <= reg1 ^ reg2;
                end
                `EXE_LUI_OP: begin
                    reg_wb_data <= {reg2[15:0],reg2[31:16]};
                end
                `EXE_NOR_FUNC: begin
                    reg_wb_data <= ~(reg1 | reg2);
                end
                `EXE_SLL_FUNC,`EXE_SLLV_FUNC: begin
                    reg_wb_data <= reg2 << reg1[4:0];
                end
                `EXE_SRL_FUNC,`EXE_SRLV_FUNC: begin
                    reg_wb_data <= reg2 >> reg1[4:0];
                end
                `EXE_SRA_FUNC,`EXE_SRAV_FUNC: begin
                    // an arithmetic shift could also use >>> directly
                    reg_wb_data <= ({32{reg2[31]}} << (6'd32 - {1'b0,reg1[4:0]})) | reg2 >> reg1[4:0];
                end
                `EXE_MOVN_FUNC,`EXE_MOVZ_FUNC: begin
                    reg_wb_data <= reg1;
                end
                `EXE_MFHI_FUNC: begin
                    reg_wb_data <= mfhi_res;
                end
                `EXE_MFLO_FUNC: begin
                    reg_wb_data <= mflo_res;
                end
                `EXE_MTHI_FUNC: begin
                    hi_wren  <= 1'b1;
                    hi_reg_o <= reg1;
                    lo_reg_o <= lo_reg_i;
                end
                `EXE_MTLO_FUNC: begin
                    lo_wren  <= 1'b1;
                    lo_reg_o <= reg1;
                    hi_reg_o <= hi_reg_i;
                end
                `EXE_ADD_FUNC,`EXE_SUB_FUNC, `EXE_ADDI_OP: begin
                    // addition and subtraction are both implemented as addition
                    reg_wb_data <= res;
                    reg_wb_o    <= of ? 0 : 1;
                end
                `EXE_ADDU_FUNC,`EXE_SUBU_FUNC, `EXE_ADDIU_OP: begin
                    // unsigned: no overflow check, the result is simply truncated
                    reg_wb_data <= res;
                end
                `EXE_CLZ_FUNC: begin
                    reg_wb_data <= clz_res;
                end
                `EXE_CLO_FUNC: begin
                    reg_wb_data <= clo_res;
                end
                `EXE_SLT_FUNC,`EXE_SLTU_FUNC, `EXE_SLTI_OP,`EXE_SLTIU_OP: begin
                    reg_wb_data <= reg1_lt_reg2;
                end
                `EXE_MUL_FUNC: begin
                    reg_wb_data <= mul_res[31:0];
                end
                `EXE_MULT_FUNC,`EXE_MULTU_FUNC: begin
                    hi_wren  <= 1'b1;
                    lo_wren  <= 1'b1;
                    hi_reg_o <= mul_res[63:32];
                    lo_reg_o <= mul_res[31:0];
                end
                default: begin
                    reg_wb_o      <= 1'd0;
                    reg_wb_addr_o <= 5'd0;
                    reg_wb_data   <= 32'd0;
                    hi_reg_o      <= 32'd0;
                    lo_reg_o      <= 32'd0;
                    hi_wren       <= 1'b0;
                    lo_wren       <= 1'b0;
                end
            endcase
        end
    end
endmodule
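The main change in the EX module is that mul_reg1 and mul_reg2 preprocess the operands: for a signed multiply, a negative operand is first converted from two's complement to its magnitude, and the magnitudes are multiplied. The result mul_res gets the matching treatment: if the signed product is negative (the two operands have different signs), it is converted back to two's complement before being stored. None of this applies to the unsigned MULTU.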
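The magnitude-based scheme described above can be checked in isolation. The following is a minimal sketch under stated assumptions, not the EX module itself: signed_mul_sketch, its is_signed input, and the testbench are hypothetical names introduced here. The testbench compares the sketch against a reference product formed by sign-extending (or zero-extending) both operands to 64 bits and multiplying them directly.

```verilog
// Hypothetical standalone model of the sign/magnitude multiply scheme.
module signed_mul_sketch(
    input  [31:0] a,
    input  [31:0] b,
    input         is_signed,   // 1: treat a and b as two's complement
    output [63:0] product
);
    // Step 1: take the magnitude of negative operands (signed mode only)
    wire [31:0] mag_a = (is_signed && a[31]) ? (~a + 32'd1) : a;
    wire [31:0] mag_b = (is_signed && b[31]) ? (~b + 32'd1) : b;
    // Step 2: unsigned 32 x 32 -> 64 multiply of the magnitudes
    wire [63:0] mag_p = {32'd0, mag_a} * {32'd0, mag_b};
    // Step 3: negate the product when the ORIGINAL signs differ
    wire negate = is_signed && (a[31] ^ b[31]);
    assign product = negate ? (~mag_p + 64'd1) : mag_p;
endmodule

module signed_mul_tb;
    reg  [31:0] a, b;
    reg         is_signed;
    wire [63:0] product;
    reg  [63:0] expected;
    integer     errors;

    signed_mul_sketch dut(.a(a), .b(b), .is_signed(is_signed), .product(product));

    // Applies one vector and compares against the reference product.
    task check(input [31:0] x, input [31:0] y, input s);
        begin
            a = x; b = y; is_signed = s;
            #1;
            if (s)   // sign-extend to 64 bits, multiply modulo 2^64
                expected = {{32{x[31]}}, x} * {{32{y[31]}}, y};
            else     // zero-extend to 64 bits
                expected = {32'd0, x} * {32'd0, y};
            if (product !== expected) begin
                errors = errors + 1;
                $display("MISMATCH a=%h b=%h signed=%b got=%h want=%h",
                         x, y, s, product, expected);
            end
        end
    endtask

    initial begin
        errors = 0;
        check(32'h00000001, 32'hffff0000, 1'b1);  // 1 x (-65536)
        check(32'h00000001, 32'hffff0000, 1'b0);  // same bits, unsigned
        check(32'hffffffff, 32'hffffffff, 1'b1);  // (-1) x (-1)
        check(32'h80000000, 32'h80000000, 1'b1);  // most negative value squared
        check(32'h80000000, 32'h00000001, 1'b1);  // most negative value x 1
        check(32'h00000000, 32'hffff0000, 1'b1);  // zero x negative
        if (errors == 0) $display("all multiply checks passed");
        else             $display("%0d multiply checks failed", errors);
        $finish;
    end
endmodule
```

The detail worth noting is step 3: the negation is decided from the sign bits of the original operands. After step 1 the magnitudes no longer carry the sign, so testing mag_a[31] ^ mag_b[31] instead would leave a negative product as a positive value.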
Simulation results
Test program:
34010001: ORI reg0 | 0001 => reg1, sets reg1 = 0000,0001
3c02ffff: LUI, sets reg2 = ffff,0000
70221802: MUL reg1 × reg2 => reg3
00220018: MULT reg1 × reg2 => {HI,LO}
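00220019: MULTU reg1 × reg2 => {HI,LO}
In the simulation waveform, the yellow cursor marks the EX stage of the second multiply instruction (MULT). The signed result shown there is 00010000, with 00000000 stored to HI and 00010000 stored to LO. The third multiply instruction is the unsigned MULTU: its result is ffff0000, and it is stored to HI and LO as expected.
One caveat on the signed case: reg2 = ffff0000 is −65536 as a two's complement value, so the full signed product 1 × (−65536) is ffffffff_ffff0000, that is, HI = ffffffff and LO = ffff0000. A reading of 00010000 is the magnitude of the product, which is what appears when the final negation is decided from the sign bits of the already-converted magnitudes rather than from reg1[31] ^ reg2[31] of the original operands, so the sign test in EX is worth checking against these values.
The first multiply instruction (MUL) writes the low 32 bits of the signed result to rd, as shown in the figure below:
For these operands the low 32 bits should be ffff0000. That completes the simulation of the three multiply instructions.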
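When reading the waveforms, it helps to have reference values for the operands used in the test program (reg1 = 00000001, reg2 = ffff0000). The sketch below is a small standalone testbench, not part of the CPU; the module name mul_expected_tb is hypothetical. It prints the values the three instructions should produce, computed directly from 64-bit extended operands.

```verilog
// Hypothetical reference calculation for the test program's operands.
module mul_expected_tb;
    reg [31:0] reg1, reg2;
    reg [63:0] signed_p, unsigned_p;

    initial begin
        reg1 = 32'h0000_0001;   // value written by the ORI instruction
        reg2 = 32'hffff_0000;   // value written by the LUI instruction

        // Signed product: sign-extend both operands to 64 bits first
        signed_p   = {{32{reg1[31]}}, reg1} * {{32{reg2[31]}}, reg2};
        // Unsigned product: zero-extend both operands to 64 bits first
        unsigned_p = {32'd0, reg1} * {32'd0, reg2};

        $display("MUL   rd = %h", signed_p[31:0]);
        $display("MULT  HI = %h LO = %h", signed_p[63:32], signed_p[31:0]);
        $display("MULTU HI = %h LO = %h", unsigned_p[63:32], unsigned_p[31:0]);
        $finish;
    end
endmodule
```

The printed values can be compared line by line with the HI, LO, and rd values in the simulation waveform.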
The next post will cover multi-cycle multiply and divide instructions!