Computer Organization (Lab 3): Custom MIPS Functional Processor Design (custom_cpu)


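Preface

This article describes a Verilog implementation of a custom functional processor based on MIPS.
A total of 45 MIPS instructions must be implemented; for details, see Computer Organization (Lab 2): Simple Functional Processor Design (simple_cpu). This article shares my thinking during the lab. I am also a beginner, so the code may contain unknown problems, and corrections are welcome. It is for reference only; please do not reuse it!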


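1. Interface Definition

| Signal | I/O | Description |
| --- | --- | --- |
| rst | Input | Active-high reset, synchronous to the processor clock |
| clk | Input | Processor clock |
| PC[31:0] | Output | Program counter; its value after reset is 32'd0 |
| inst_req_valid | Output | Handshake signal of the instruction request channel; high means the request issued by the sender is valid |
| inst_req_ready | Input | Handshake signal of the instruction request channel; high means the receiver can accept the sender's request |
| Instruction[31:0] | Input | Instruction read from memory into the processor |
| inst_valid | Input | Handshake signal of the instruction response channel; high means the response issued by the sender is valid |
| inst_ready | Output | Handshake signal of the instruction response channel; high means the receiver can accept the sender's response |
| Address[31:0] | Output | Memory address used by data access (load/store) instructions |
| MemWrite | Output | Memory write enable (active high) |
| Write_data[31:0] | Output | Data for memory writes |
| Write_strb[3:0] | Output | Byte enables for memory writes (supports 32/16/8-bit writes); Write_strb[i] == 1 means bits Write_data[8 × (i + 1) - 1 : 8 × i] are written to the corresponding memory address |
| MemRead | Output | Memory read enable (active high) |
| Read_data[31:0] | Input | Data read from memory |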
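The Write_strb semantics described in the table above can be sketched in C. This is only an illustrative model, assuming aligned accesses and little-endian byte lanes (byte 0 of the word on bits 7:0); the function names store_strobe and store_data are hypothetical and do not appear in the lab code.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-enable mask for an aligned store of `size` bytes (1, 2 or 4)
 * at byte address `addr`. Bit i set means byte lane i is written. */
uint8_t store_strobe(uint32_t addr, unsigned size)
{
    unsigned lane = addr & 3u;             /* byte position inside the word */
    assert(size == 1 || size == 2 || size == 4);
    assert(lane % size == 0);              /* this sketch handles aligned stores only */
    return (uint8_t)((((1u << size) - 1u) << lane) & 0xFu);
}

/* Move the low `size` bytes of `value` onto the enabled byte lanes. */
uint32_t store_data(uint32_t addr, unsigned size, uint32_t value)
{
    unsigned lane = addr & 3u;
    uint32_t mask;
    assert(size == 1 || size == 2 || size == 4);
    mask = (size == 4) ? 0xFFFFFFFFu : ((1u << (8u * size)) - 1u);
    return (value & mask) << (8u * lane);
}
```

For a byte store to an address ending in 3, for example, only lane 3 is enabled and the byte is placed on bits 31:24 of the write data.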

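2. Improvements for the Custom MIPS Functional Processor

In Lab 2 we implemented a simple functional processor, but it used ideal memory. It now has to be upgraded to support real memory, that is, a real memory access path must be built. Accessing real memory has to follow a memory access protocol, and going through the memory controller adds extra cycles.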

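1) Reworking the memory access path and its interface

The previous memory access path looked like this:

| Channel | Description |
| --- | --- |
| Instruction request channel | The program counter (PC) serves as the memory read address |
| Instruction response channel | The instruction (Instruction) is read in from memory |
| Data request channel | Consists of the data access address, the read/write control signals, and the write data |
| Data response channel | Data (Read_data) is read in from outside |

What we need to do is add a Valid-Ready handshake signal pair to each channel; a channel transfers if and only if its Valid and Ready are both high. Valid high means the request or response issued by the sender is valid, while Ready high means the receiver can accept the sender's request or response, as already mentioned in the interface definition. To simplify the rework, the Valid of the data request channel is taken over by MemWrite and MemRead.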

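2) Three-segment state machine

While waiting for the receiver to raise Ready, the sender must hold Valid high. At the first rising clock edge after the receiver has raised Ready, the sender releases all control signals of that channel, including Valid; at that edge the content transferred on the channel is valid, i.e. the handshake has succeeded. For the receiver: once it is ready, i.e. it has raised Ready, it waits for the sender to present valid content, i.e. to raise Valid, and releases its signal after the handshake succeeds. A state machine is a very convenient way to drive control signals like these, since their values can be changed by defining different states. The three-segment state machine is a typical, fairly standard way of writing one.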
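The handshake rule described above can be modelled in a few lines of C. This is a minimal sketch rather than part of the lab code: it assumes the sender holds Valid high from cycle 0, and the names handshake_fires and first_handshake_cycle are made up for illustration.

```c
#include <assert.h>

/* A transfer happens in a cycle exactly when Valid and Ready are both high. */
int handshake_fires(int valid, int ready)
{
    return valid && ready;
}

/* The sender keeps Valid high from cycle 0; ready_trace[i] is the
 * receiver's Ready in cycle i. Returns the index of the cycle in which
 * the handshake succeeds, or -1 if Ready never rises within n cycles. */
int first_handshake_cycle(const int *ready_trace, int n)
{
    assert(ready_trace != 0 && n >= 0);
    for (int i = 0; i < n; i++) {
        if (handshake_fires(1, ready_trace[i]))
            return i;
    }
    return -1;
}
```

Every cycle before the returned index is a stall cycle in which the sender's state must not advance, which is what the self-loops in the state machine express.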

Definitions of the states and signals

localparam INIT	= 9'b000000001,
		   IF	= 9'b000000010,
		   IW	= 9'b000000100,
		   ID	= 9'b000001000,
		   EX	= 9'b000010000,
		   ST	= 9'b000100000,
		   WB	= 9'b001000000,
		   LD	= 9'b010000000,
		   RDW	= 9'b100000000;
reg [8:0]	current_state;
reg [8:0]	next_state;
reg [31:0]	current_PC;
reg [31:0]	Valid_Instruction;
reg [31:0]	Valid_Read_data;

The state transition diagram is as follows: [Figure: state transition diagram]

State machine

	//Segment 1: state register, updated on the rising clock edge
	always@(posedge clk) begin
		if (rst)
			current_state <= INIT;
		else
			current_state <= next_state;
	end
	//Segment 2: combinational next-state logic (blocking assignments)
	always@(*) begin
		case (current_state)
			INIT:	next_state = IF;//unconditional
			IF:	begin
				if (Inst_Req_Ready) next_state = IW;//Inst_Req_Ready
				else next_state = IF;
			end
			IW:	begin
				if (Inst_Valid) next_state = ID;//Inst_Valid
				else next_state = IW;
			end
			ID:	begin
				if (Valid_Instruction != 32'b0) next_state = EX;//not a NOP
				else next_state = IF;
			end
			EX:	begin
				if (opcode == `REGIMM || opcode[5:2] == 4'b0001 || opcode == 6'b000010)
					next_state = IF;//REGIMM / I-Type branch instructions / J instruction
				else if (opcode == `SPECIAL || opcode[5:3] == 3'b001 || opcode == 6'b000011)
					next_state = WB;//R-Type instructions / I-Type computational instructions / JAL instruction
				else if (opcode[5] && ~opcode[3]) next_state = LD;//load instructions
				else if (opcode[5] && opcode[3]) next_state = ST;//store instructions
				else next_state = EX;
			end
			LD:	begin
				if (Mem_Req_Ready) next_state = RDW;//Mem_Req_Ready
				else next_state = LD;
			end
			ST:	begin
				if (Mem_Req_Ready) next_state = IF;//Mem_Req_Ready
				else next_state = ST;
			end
			WB:	next_state = IF;//unconditional
			RDW:	begin
				if (Read_data_Valid) next_state = WB;//Read_data_Valid
				else next_state = RDW;
			end
			default:
				next_state = current_state;
		endcase
	end
	
	assign PC_4 = PC + 4;
	always@(posedge clk) begin
		if (rst) begin 
			PC <= 32'b0;
		end
		else if (current_state == EX) begin
			PC <= Jump ? Jump_addr : (Branch ? Branch_addr : PC_4);
		end
		else if (Instruction == 32'b0 && current_state == IW && Inst_Ready && Inst_Valid) begin
			PC <= PC_4;
		end
		else begin
			PC <= PC;
		end
	end
	always @(posedge clk) begin
		current_PC <= (current_state == IF) ? PC : current_PC;
	end
	assign Inst_Req_Valid = (current_state == IF) ? 1 : 0;
	assign Inst_Ready = (current_state == INIT || current_state == IW) ? 1 : 0;
	always@(posedge clk) begin
		Valid_Instruction <= (Inst_Ready && Inst_Valid) ? Instruction : Valid_Instruction;
	end
	assign Read_data_Ready = (current_state == INIT || current_state == RDW) ? 1 : 0;
	always@(posedge clk) begin
		Valid_Read_data <= (Read_data_Ready && Read_data_Valid) ? Read_data : Valid_Read_data;
	end
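As a cross-check for the next-state logic above, here is a minimal software reference model in C. The enum names, the Inputs struct, and the coarse Kind classification are assumptions made for this sketch; the real design decodes opcode bits instead.

```c
#include <assert.h>

typedef enum { S_INIT, S_IF, S_IW, S_ID, S_EX, S_ST, S_WB, S_LD, S_RDW } State;

/* Instruction classes that the EX state distinguishes. */
typedef enum { K_BRANCH_OR_J, K_WRITEBACK, K_LOAD, K_STORE, K_OTHER } Kind;

typedef struct {
    int inst_req_ready;   /* Inst_Req_Ready  */
    int inst_valid;       /* Inst_Valid      */
    int mem_req_ready;    /* Mem_Req_Ready   */
    int read_data_valid;  /* Read_data_Valid */
    int is_nop;           /* latched instruction equals 32'b0 */
    Kind kind;            /* class of the latched instruction */
} Inputs;

/* Mirrors the case statement: returns the state after one clock edge. */
State next_state(State s, const Inputs *in)
{
    assert(in != 0);
    switch (s) {
    case S_INIT: return S_IF;
    case S_IF:   return in->inst_req_ready ? S_IW : S_IF;
    case S_IW:   return in->inst_valid ? S_ID : S_IW;
    case S_ID:   return in->is_nop ? S_IF : S_EX;
    case S_EX:
        switch (in->kind) {
        case K_BRANCH_OR_J: return S_IF;
        case K_WRITEBACK:   return S_WB;
        case K_LOAD:        return S_LD;
        case K_STORE:       return S_ST;
        case K_OTHER:       return S_EX;  /* unrecognized opcode: stay in EX */
        }
        return S_EX;
    case S_LD:   return in->mem_req_ready ? S_RDW : S_LD;
    case S_ST:   return in->mem_req_ready ? S_IF : S_ST;
    case S_RDW:  return in->read_data_valid ? S_WB : S_RDW;
    case S_WB:   return S_IF;
    }
    return s;
}
```

With all handshake inputs held high, a load walks INIT, IF, IW, ID, EX, LD, RDW, WB and back to IF, so comparing such a trace against a simulation waveform is one way to spot a mistyped transition.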

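3. Accessing the UART Controller and Implementing Printing

The puts function does not print s itself; it has to send the string s to the UART controller. First it checks whether the UART transmit FIFO is full by reading the STATUS register, located at the byte offset UART_STATUS from the base address, and masking the value with UART_TX_FIFO_FULL (the fourth-lowest bit): a non-zero result means the FIFO is full. It waits until that bit clears and then writes s[i] to the transmit FIFO register at the byte offset UART_TX_FIFO from the base address. The offsets are in bytes, not bits. A byte offset can be applied either by adding offset/4 directly to the unsigned int pointer (adding one advances it by four bytes), or, as the code below does, by first casting the pointer to char *, adding the byte offset, and casting back to an unsigned int pointer. The volatile qualifier tells the compiler that the value at this address may change at any time, so accesses to it must not be optimized away and it must be re-read from memory every time it is used. Without volatile, the compiler could read STATUS once and reuse the stale value, so the polling loop might never observe the FIFO draining.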

int
puts(const char *s)
{
	//TODO: Add your driver code here
	int i = 0;
	while (s[i] != '\0') {
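		//busy-wait while the TX FIFO is full; the volatile read forces a fresh load on every iteration
		while ((*(volatile unsigned int *)((char *)uart + UART_STATUS)) & UART_TX_FIFO_FULL);
		//volatile store so the compiler cannot merge or drop the writes to the FIFO register
		*(volatile char *)((char *)uart + UART_TX_FIFO) = s[i++];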
	}
	return i;
}
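The pointer arithmetic used in puts can be tried out on a plain array standing in for the UART registers. This is a sketch under stated assumptions: the offset DEMO_STATUS_OFFSET and the mask DEMO_TX_FULL_MASK are placeholder values chosen for the example (the real ones come from the lab's UART header), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder register layout, for illustration only. */
#define DEMO_STATUS_OFFSET 8u          /* byte offset of the status word */
#define DEMO_TX_FULL_MASK  (1u << 3)   /* "TX FIFO full" flag            */

/* Read a 32-bit register at a byte offset by casting through char *. */
uint32_t read_reg_bytes(volatile uint32_t *base, unsigned byte_off)
{
    assert(byte_off % 4 == 0);
    return *(volatile uint32_t *)((volatile char *)base + byte_off);
}

/* Read the same register by indexing in 32-bit words:
 * adding 1 to a uint32_t pointer advances it by 4 bytes. */
uint32_t read_reg_words(volatile uint32_t *base, unsigned byte_off)
{
    assert(byte_off % 4 == 0);
    return base[byte_off / 4];
}

/* Non-zero when the simulated transmit FIFO cannot accept another byte. */
int tx_fifo_full(volatile uint32_t *base)
{
    return (read_reg_bytes(base, DEMO_STATUS_OFFSET) & DEMO_TX_FULL_MASK) != 0;
}
```

Both read helpers return the same word, which is why the driver may use either form of the offset calculation.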

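4. Implementing the Performance Counters

The performance counters are fairly open-ended, and you can choose what to measure. In this lab I implemented six metrics: processor cycles, instructions executed, memory access instructions, memory access delay, and the numbers of taken and not-taken branches.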

	//Processor run cycles
	reg [31:0]	Cycle_cnt;
	always@(posedge clk) begin
		if (rst == 1'b1)
			Cycle_cnt <= 32'b0;
		else
			Cycle_cnt <= Cycle_cnt + 32'b1;
	end
	assign cpu_perf_cnt_0 = Cycle_cnt;

	//Instructions executed (counted in EX, so NOPs that skip EX are not included)
	reg [31:0]	Inst_cnt;
	always@(posedge clk) begin
		if (rst == 1'b1)
			Inst_cnt <= 32'b0;
		else if (current_state == EX)
			Inst_cnt <= Inst_cnt + 32'b1;
		else
			Inst_cnt <= Inst_cnt;
	end
	assign cpu_perf_cnt_1 = Inst_cnt;

	//Memory access instructions
	reg [31:0]	MemVisit_cnt;
	always@(posedge clk) begin
		if (rst == 1'b1)
			MemVisit_cnt <= 32'b0;
		else if ((current_state == LD || current_state == ST) && Mem_Req_Ready)
			MemVisit_cnt <= MemVisit_cnt + 32'b1;
		else
			MemVisit_cnt <= MemVisit_cnt;
	end
	assign cpu_perf_cnt_2 = MemVisit_cnt;

	//Memory access delay (cycles spent waiting for Mem_Req_Ready or Read_data_Valid)
	reg [31:0]	MemDelay_cnt;
	always@(posedge clk) begin
		if (rst == 1'b1)
			MemDelay_cnt <= 32'b0;
		else if (((current_state == ST || current_state == LD) && !Mem_Req_Ready) || (current_state == RDW && !Read_data_Valid))
			MemDelay_cnt <= MemDelay_cnt + 32'b1;
		else
			MemDelay_cnt <= MemDelay_cnt;
	end
	assign cpu_perf_cnt_3 = MemDelay_cnt;

	//Branches taken
	reg [31:0]	Branch_cnt;
	always@(posedge clk) begin
		if (rst == 1'b1)
			Branch_cnt <= 32'b0;
		else if (current_state == EX && Branch)
			Branch_cnt <= Branch_cnt + 32'b1;
		else
			Branch_cnt <= Branch_cnt;
	end
	assign cpu_perf_cnt_4 = Branch_cnt;

	//Branches not taken (counts every instruction in EX whose Branch signal is low)
	reg [31:0]	NotBra_cnt;
	always@(posedge clk) begin
		if (rst == 1'b1)
			NotBra_cnt <= 32'b0;
		else if (current_state == EX && !Branch)
			NotBra_cnt <= NotBra_cnt + 32'b1;
		else
			NotBra_cnt <= NotBra_cnt;
	end
	assign cpu_perf_cnt_5 = NotBra_cnt;

In addition, the software side must be changed. First, define the six interfaces we use in the header file:

#define cpu_perf_cnt_0 0x60010000
#define cpu_perf_cnt_1 0x60010008
#define cpu_perf_cnt_2 0x60011000
#define cpu_perf_cnt_3 0x60011008
#define cpu_perf_cnt_4 0x60012000
#define cpu_perf_cnt_5 0x60012008

typedef struct Result {
	int pass;
	unsigned long msec;
	unsigned long Inst;
	unsigned long MemVisit;
	unsigned long MemDelay;
	unsigned long Branch;
	unsigned long NotBra;
} Result;

Then modify the functions in perf_cnt.c:

unsigned long _uptime() {
  // TODO [COD]
  //   You can use this function to access performance counter related with time or cycle.
  volatile unsigned long *Cycle_cnt = (unsigned long *)cpu_perf_cnt_0;
  return *Cycle_cnt;
}

unsigned long _upInst() {
  // TODO [COD]
  //   You can use this function to access performance counter related with time or cycle.
  volatile unsigned long *Inst_cnt = (unsigned long *)cpu_perf_cnt_1;
  return *Inst_cnt;
}

unsigned long _upMemVisit() {
  // TODO [COD]
  //   You can use this function to access performance counter related with time or cycle.
  volatile unsigned long *MemVisit_cnt = (unsigned long *)cpu_perf_cnt_2;
  return *MemVisit_cnt;
}

unsigned long _upMemDelay() {
  // TODO [COD]
  //   You can use this function to access performance counter related with time or cycle.
  volatile unsigned long *MemDelay_cnt = (unsigned long *)cpu_perf_cnt_3;
  return *MemDelay_cnt;
}

unsigned long _upBranch() {
  // TODO [COD]
  //   You can use this function to access performance counter related with time or cycle.
  volatile unsigned long *Branch_cnt = (unsigned long *)cpu_perf_cnt_4;
  return *Branch_cnt;
}

unsigned long _upNotBra() {
  // TODO [COD]
  //   You can use this function to access performance counter related with time or cycle.
  volatile unsigned long *NotBra_cnt = (unsigned long *)cpu_perf_cnt_5;
  return *NotBra_cnt;
}

void bench_prepare(Result *res) {
  // TODO [COD]
  //   Add preprocess code, record performance counters' initial states.
  //   You can communicate between bench_prepare() and bench_done() through
  //   static variables or add additional fields in `struct Result`
  res->msec     = _uptime();
  res->Inst     = _upInst();
  res->MemVisit = _upMemVisit();
  res->MemDelay = _upMemDelay();
  res->Branch   = _upBranch();
  res->NotBra   = _upNotBra();
}

void bench_done(Result *res) {
  // TODO [COD]
  //  Add postprocess code, record performance counters' current states.
  res->msec     = _uptime()     - res->msec;
  res->Inst     = _upInst()     - res->Inst;
  res->MemVisit = _upMemVisit() - res->MemVisit;
  res->MemDelay = _upMemDelay() - res->MemDelay;
  res->Branch   = _upBranch()   - res->Branch;
  res->NotBra   = _upNotBra()   - res->NotBra;
}
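The subtraction in bench_done stays correct even if a free-running counter wraps around between the two samples, because unsigned arithmetic is modulo 2^32. The sketch below illustrates this, and also derives cycles per instruction from two of the deltas. It assumes 32-bit counters; counter_delta and cpi_x100 are hypothetical helpers, not part of the lab framework. Since the instruction counter in this design increments in EX, NOPs that skip EX are not counted, so the resulting CPI is only an approximation.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed count between two samples of a free-running 32-bit counter.
 * Unsigned subtraction wraps modulo 2^32, so one wrap-around is harmless. */
uint32_t counter_delta(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Cycles per instruction scaled by 100, using integer arithmetic only. */
uint32_t cpi_x100(uint32_t cycles, uint32_t insts)
{
    assert(insts != 0);
    return (uint32_t)(((uint64_t)cycles * 100u) / insts);
}
```

A result of 350 from cpi_x100, for instance, means 3.50 cycles per counted instruction.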

bench.c must also be modified so that the performance counter values are printed during the run:

int main() {
  int pass = 1;

  _Static_assert(ARR_SIZE(benchmarks) > 0, "non benchmark");

  for (int i = 0; i < ARR_SIZE(benchmarks); i ++) {
    Benchmark *bench = &benchmarks[i];
    current = bench;
    setting = &bench->settings[SETTING];
    const char *msg = bench_check(bench);
    printk("[%s] %s: ", bench->name, bench->desc);
    if (msg != NULL) {
      printk("Ignored %s\n", msg);
    } else {
      unsigned long msec = ULONG_MAX;
      unsigned long Inst = ULONG_MAX;
      unsigned long MemVisit = ULONG_MAX;
      unsigned long MemDelay = ULONG_MAX;
      unsigned long Branch = ULONG_MAX;
      unsigned long NotBra = ULONG_MAX;
      int succ = 1;
      for (int i = 0; i < REPEAT; i ++) {
        Result res;
        run_once(bench, &res);
        printk(res.pass ? "*" : "X");
        succ &= res.pass;
        if (res.msec < msec) msec = res.msec;
        if (res.Inst < Inst) Inst = res.Inst;
        if (res.MemVisit < MemVisit) MemVisit = res.MemVisit;
        if (res.MemDelay < MemDelay) MemDelay = res.MemDelay;
        if (res.Branch < Branch) Branch = res.Branch;
        if (res.NotBra < NotBra) NotBra = res.NotBra;
      }

      if (succ) printk(" Passed.\n");
      else printk(" Failed.\n");

      pass &= succ;

      // TODO [COD]
      //   A benchmark is finished here, you can use printk to output some information.
      //   `msec' is intended to indicate the time (or cycle),
      //   you can ignore it according to your performance counters' semantics.
      printk("Time Cycle: %u\n", msec);
      printk("Instruction Number: %u\n", Inst);
      printk("Memory Visit: %u\n", MemVisit);
      printk("Memory Delay: %u\n", MemDelay);
      printk("Branch Number: %u\n", Branch);
      printk("Not Branch Number: %u\n", NotBra);
    }
  }

  printk("benchmark finished\n");

  if(pass)
	  hit_good_trap();
  else
	  nemu_assert(0);

  return 0;
}

5. Appendix: custom_cpu.v

`timescale 10ns / 1ns

`define DATA_WIDTH 32
`define ADDR_WIDTH 5

//	OPCODE:		6-bit
`define SPECIAL		6'b000000
`define REGIMM		6'b000001
`define ADDIU		6'b001001
`define LUI		6'b001111
`define LB		6'b100000
`define LH		6'b100001
`define LBU		6'b100100
`define LHU		6'b100101
`define LWL		6'b100010
`define LWR		6'b100110
`define SB		6'b101000
`define SH		6'b101001
`define SW		6'b101011
`define SWL		6'b101010
`define SWR		6'b101110
`define J		6'b000010
`define JAL		6'b000011

//	FUNC:		6-bit
`define JR		6'b001000
`define JALR		6'b001001
`define MOVZ		6'b001010
`define MOVN		6'b001011

module custom_cpu(
	input         clk,
	input         rst,

	//Instruction request channel
	output reg [31:0] PC,
	output        Inst_Req_Valid,
	input         Inst_Req_Ready,

	//Instruction response channel
	input  [31:0] Instruction,
	input         Inst_Valid,
	output        Inst_Ready,

	//Memory request channel
	output [31:0] Address,
	output        MemWrite,
	output [31:0] Write_data,
	output [ 3:0] Write_strb,
	output        MemRead,
	input         Mem_Req_Ready,

	//Memory data response channel
	input  [31:0] Read_data,
	input         Read_data_Valid,
	output        Read_data_Ready,

	input         intr,

	output [31:0] cpu_perf_cnt_0,
	output [31:0] cpu_perf_cnt_1,
	output [31:0] cpu_perf_cnt_2,
	output [31:0] cpu_perf_cnt_3,
	output [31:0] cpu_perf_cnt_4,
	output [31:0] cpu_perf_cnt_5,
	output [31:0] cpu_perf_cnt_6,
	output [31:0] cpu_perf_cnt_7,
	output [31:0] cpu_perf_cnt_8,
	output [31:0] cpu_perf_cnt_9,
	output [31:0] cpu_perf_cnt_10,
	output [31:0] cpu_perf_cnt_11,
	output [31:0] cpu_perf_cnt_12,
	output [31:0] cpu_perf_cnt_13,
	output [31:0] cpu_perf_cnt_14,
	output [31:0] cpu_perf_cnt_15,

	output [69:0] inst_retire
);

/* The following signal is leveraged for behavioral simulation, 
* which is delivered to testbench.
*
* STUDENTS MUST CONTROL LOGICAL BEHAVIORS of THIS SIGNAL.
*
* inst_retired (70-bit): detailed information of the retired instruction,
* mainly including (in order) 
* { 
*   reg_file write-back enable  (69:69,  1-bit),
*   reg_file write-back address (68:64,  5-bit), 
*   reg_file write-back data    (63:32, 32-bit),  
*   retired PC                  (31: 0, 32-bit)
* }
*
*/
	wire [69:0] inst_retire;

// TODO: Please add your custom CPU code here
	localparam INIT	= 9'b000000001,
		   IF	= 9'b000000010,
		   IW	= 9'b000000100,
		   ID	= 9'b000001000,
		   EX	= 9'b000010000,
		   ST	= 9'b000100000,
		   WB	= 9'b001000000,
		   LD	= 9'b010000000,
		   RDW	= 9'b100000000;
	reg [8:0]	current_state;
	reg [8:0]	next_state;
	reg [31:0]	current_PC;
	reg [31:0]	Valid_Instruction;
	reg [31:0]	Valid_Read_data;

	wire				RF_wen;
	wire [`ADDR_WIDTH - 1:0]	RF_waddr;
	wire [`DATA_WIDTH - 1:0]	RF_wdata;
	wire [`DATA_WIDTH - 1:0]	RF_rdata1;
	wire [`DATA_WIDTH - 1:0]	RF_rdata2;

	wire [5:0]			opcode;
	wire [`ADDR_WIDTH - 1:0]	rs;
	wire [`ADDR_WIDTH - 1:0]	rt;
	wire [`ADDR_WIDTH - 1:0]	rd;
	wire [4:0]			sa;
	wire [5:0]			func;

	wire [`DATA_WIDTH - 1:0]	zero_extend;
	wire [`DATA_WIDTH - 1:0]	signed_extend;
	wire [`DATA_WIDTH - 1:0]	shift_signed_extend;

	wire [2:0]			ALU_control;
	wire [`DATA_WIDTH - 1:0]	ALU_result;
	wire [`DATA_WIDTH - 1:0]	ALU_num1;
	wire [`DATA_WIDTH - 1:0]	ALU_num2;
	wire				Zero;

	wire [4:0]			Shift_num;
	wire [1:0]			Shift_op;
	wire [`DATA_WIDTH - 1:0]	Shift_result;

	wire				Jump;
	wire [`DATA_WIDTH - 1:0]	Jump_addr;

	wire				Branch;
	wire [`DATA_WIDTH - 1:0]	Branch_addr;

	wire [`DATA_WIDTH - 1:0]	load_data;
	wire [7:0]			byte_data;
	wire [15:0]			half_data;
	wire [`DATA_WIDTH - 1:0]	lwl_data;
	wire [`DATA_WIDTH - 1:0]	lwr_data;
	
	wire [`DATA_WIDTH - 1:0]	PC_4;

	assign	opcode 	= Valid_Instruction[31:26];
	assign	rs 	= Valid_Instruction[25:21];
	assign	rt 	= Valid_Instruction[20:16];
	assign	rd 	= Valid_Instruction[15:11];
	assign	sa	= Valid_Instruction[10:6];
	assign	func 	= Valid_Instruction[5:0];
	assign	zero_extend		= {16'b0, Valid_Instruction[15:0]};
	assign	signed_extend		= Valid_Instruction[15] ? {{16{1'b1}}, Valid_Instruction[15:0]} : {{16{1'b0}}, Valid_Instruction[15:0]};
	assign	shift_signed_extend	= Valid_Instruction[15] ? {{14{1'b1}}, Valid_Instruction[15:0], 2'b00} : {{14{1'b0}}, Valid_Instruction[15:0], 2'b00};

	assign 	ALU_control 	= (opcode == `SPECIAL && func[3:2] == 2'b00) ? {func[1], 2'b10}//ADD/SUB: R-Type arithmetic instructions ADDU/SUBU
				: (opcode == `SPECIAL && func[3:2] == 2'b01) ? {func[1], 1'b0, func[0]}//AND/OR/XOR/NOR: R-Type arithmetic instructions AND/OR/XOR/NOR
				: (opcode == `SPECIAL && func[3:2] == 2'b10) ? {~func[0], 2'b11}//SLT/SLTU: R-Type arithmetic instructions SLT/SLTU
				: (opcode == `REGIMM || opcode[5:1] == 5'b00011) ? 3'b111//SLT: REGIMM instructions / I-Type branch instructions BLEZ/BGTZ
				: (opcode[5:1] == 5'b00010) ? 3'b110//SUB: I-Type branch instructions BEQ/BNE
				: (opcode[5:3] == 3'b001 && opcode[2:1] == 2'b00) ? {opcode[1], 2'b10}//ADD: I-Type computational instructions ADDI/ADDIU
				: (opcode[5:3] == 3'b001 && opcode[2] == 1'b1 && opcode[1:0] != 2'b11) ? {opcode[1], 1'b0, opcode[0]}//AND/OR/XOR: I-Type computational instructions ANDI/ORI/XORI
				: (opcode[5:3] == 3'b001 && opcode[2:1] == 2'b01) ? {~opcode[0], 2'b11}//SLT/SLTU: I-Type computational instructions SLTI/SLTIU
				: (opcode[5]) ? 3'b010//ADD: I-Type memory access instructions
				: 3'bXXX;//none
	assign	ALU_num1	= (opcode[5:1] == 5'b00011) ? 0 : RF_rdata1;//I-Type branch instructions BLEZ/BGTZ : other instructions
	assign	ALU_num2	= (opcode == `REGIMM) ? 32'b0//REGIMM instructions
				: (opcode[5:1] == 5'b00011) ? RF_rdata1//I-Type branch instructions BLEZ/BGTZ
				: (opcode[5:3] == 3'b001 && opcode[2]) ? zero_extend//I-Type logic instructions ANDI/ORI/XORI (LUI does not use the ALU result)
				: (opcode[5] == 1 || opcode[5:3] == 3'b001) ? signed_extend//I-Type memory access instructions / ADDIU/SLTI/SLTIU, whose immediates MIPS sign-extends
				: RF_rdata2;//other instructions
	assign	Shift_num	= (func[2] == 0) ? sa : RF_rdata1[4:0];
	assign	Shift_op	= (opcode == `SPECIAL && func[5:3] == 3'b000) ? func[1:0] : 2'bXX;

	assign	Jump		= ((opcode == `SPECIAL && {func[5:3], func[1]} == 4'b0010) || opcode[5:1] == 5'b00001) ? 1//R-Type jump instructions / J-Type instructions
				: 0;
	assign	Jump_addr	= (opcode == `SPECIAL && {func[5:3], func[1]} == 4'b0010) ? {RF_rdata1}//R-Type jump instructions
				: {PC_4[31:28], Valid_Instruction[25:0], 2'b00};//J-Type instructions
	assign	Branch		= ((opcode == `REGIMM && (rt[0]^ALU_result[0])) || (opcode[5:2] == 4'b0001 && (opcode[0] ^ Zero))) ? 1 : 0;//REGIMM instructions / I-Type branch instructions
	assign	Branch_addr	= shift_signed_extend + PC_4;

	assign	RF_wen		= (opcode == `REGIMM || opcode[5:2] == 4'b0001 || (opcode[5] && opcode[3])) ? 0//REGIMM instructions / I-Type branch instructions / I-Type store instructions
				: (opcode == `SPECIAL && {func[5:3], func[1]} == 4'b0011) ? func[0]^(RF_rdata2 == 32'b0)//R-Type move instructions
				: (opcode == `SPECIAL && func == `JR) ? 0//R-Type jump instruction JR
				: (opcode == `J) ? 0//J-Type: J
				: (opcode == `SPECIAL && func == `JALR && current_state == EX) ? 1//R-Type jump instruction JALR, when state = EX
				: (opcode == `JAL && current_state == EX) ? 1//J-Type: JAL, when state = EX
				: (current_state == WB) ? 1//state = WB
				: 0;
	assign	RF_waddr	= (opcode[5:3] == 3'b001 || opcode[5] & ((~opcode[3]))) ? rt//I-Type computational instructions / I-Type load instructions
				: (opcode[5:1] == 5'b00001 || (opcode == `SPECIAL && func == `JALR && rd == 0)) ? 31//J-Type instructions / R-Type jump instruction JALR (rd not specified)
				: rd;
	assign	RF_wdata	= (opcode == `SPECIAL && ((func == `MOVZ && RF_rdata2 == 32'b0) || (func == `MOVN && RF_rdata2 != 32'b0))) ? RF_rdata1//R-Type move instructions
				: (opcode == `LUI) ? {Valid_Instruction[15:0], 16'b0}//I-Type computational instruction LUI
				: ((opcode == `SPECIAL && func[5] == 1'b1) || (opcode[5:3] == 3'b001)) ? ALU_result//R-Type arithmetic instructions / I-Type computational instructions
				: (opcode == `SPECIAL && func[5:3] == 3'b000) ? Shift_result//R-Type shift instructions
				: ((opcode == `SPECIAL && {func[5:3], func[1]} == 4'b0010) || opcode[5:1] == 5'b00001) ? (current_PC + 8)//R-Type jump instructions / J-Type instructions
				: (opcode[5] && (~opcode[3])) ? load_data//I-Type load instructions
				: 32'bx;

	assign	MemRead		= (current_state == LD) ? 1 : 0;//I-Type load instructions
	assign	load_data	= (opcode == `LB) ? (byte_data[7] ? {{24{1'b1}}, byte_data} : {{24{1'b0}}, byte_data})//LB
				: (opcode == `LH) ? (half_data[15] ? {{16{1'b1}}, half_data} : {{16{1'b0}}, half_data})//LH
				: (opcode == `LBU) ? {{24{1'b0}}, byte_data}//LBU
				: (opcode == `LHU) ? {{16{1'b0}}, half_data}//LHU
				: (opcode == `LWL) ? lwl_data//LWL
				: (opcode == `LWR) ? lwr_data//LWR
				: Valid_Read_data;//LW
	assign	byte_data	= (ALU_result[1] & ALU_result[0]) ? Valid_Read_data[31:24]
				: (ALU_result[1] & ~ALU_result[0]) ? Valid_Read_data[23:16]
				: (~ALU_result[1] & ALU_result[0]) ? Valid_Read_data[15:8]
				: Valid_Read_data[7:0];
	assign	half_data	= (~ALU_result[1] & ~ALU_result[0]) ? Valid_Read_data[15:0] : Valid_Read_data[31:16];
	assign	lwl_data	= (ALU_result[1] & ALU_result[0]) ? Valid_Read_data[31:0]
				: (ALU_result[1] & ~ALU_result[0]) ? {Valid_Read_data[23:0], RF_rdata2[7:0]}
				: (~ALU_result[1] & ALU_result[0]) ? {Valid_Read_data[15:0], RF_rdata2[15:0]}
				: {Valid_Read_data[7:0], RF_rdata2[23:0]};
	assign	lwr_data	= (ALU_result[1] & ALU_result[0]) ? {RF_rdata2[31:8], Valid_Read_data[31:24]}
				: (ALU_result[1] & ~ALU_result[0]) ? {RF_rdata2[31:16], Valid_Read_data[31:16]}
				: (~ALU_result[1] & ALU_result[0]) ? {RF_rdata2[31:24], Valid_Read_data[31:8]}
				: Valid_Read_data[31:0];
	assign	Address		= {ALU_result[31:2], 2'b00};
	assign	MemWrite	= (current_state == ST) ? 1 : 0;//I-Type store instructions
	assign	Write_data	= (opcode == `SB) ? (Write_strb[3] ? {RF_rdata2[7:0], 24'b0}
						  : Write_strb[2] ? {8'b0, RF_rdata2[7:0], 16'b0}
						  : Write_strb[1] ? {16'b0, RF_rdata2[7:0], 8'b0}
						  : {24'b0, RF_rdata2[7:0]})//SB
				: (opcode == `SH) ? ((Write_strb[3] && Write_strb[2]) ? {RF_rdata2[15:0], 16'b0}
						  : {16'b0, RF_rdata2[15:0]})//SH
				: (opcode == `SWL) ? (Write_strb[3] ? RF_rdata2
						   : Write_strb[2] ? {8'b0, RF_rdata2[31:8]}
						   : Write_strb[1] ? {16'b0, RF_rdata2[31:16]}
						   : {24'b0, RF_rdata2[31:24]})
				: (opcode == `SWR) ? (Write_strb[0] ? RF_rdata2
						   : Write_strb[1] ? {RF_rdata2[23:0], 8'b0}
						   : Write_strb[2] ? {RF_rdata2[15:0], 16'b0}
						   : {RF_rdata2[7:0], 24'b0})
				: RF_rdata2;
	assign	Write_strb	= (opcode[1:0] == 2'b00) ? (4'b1000 >> (~ALU_result[1:0]))//SB
				: (opcode[1:0] == 2'b01) ? {{2{ALU_result[1]}}, {2{~ALU_result[1]}}}//SH
				: (opcode[1:0] == 2'b11) ? 4'b1111//SW
				: (opcode[2:0] == 3'b010) ? {ALU_result[1]&ALU_result[0], ALU_result[1], ALU_result[1]|ALU_result[0], 1'b1}//SWL
				: {1'b1, (~ALU_result[1]) | (~ALU_result[0]), (~ALU_result[1]), (~ALU_result[1]) & (~ALU_result[0])};//SWR

	reg_file reg_file_module(
		.clk(clk),
		.waddr(RF_waddr),
		.raddr1(rs),
		.raddr2(rt),
		.wen(RF_wen),
		.wdata(RF_wdata),
		.rdata1(RF_rdata1),
		.rdata2(RF_rdata2)
	);
	alu alu_module(
		.A(ALU_num1),
		.B(ALU_num2),
		.ALUop(ALU_control),
		.Result(ALU_result),
		.Overflow(),
		.CarryOut(),
		.Zero(Zero)
	);
	shifter shifter_module(
		.A(RF_rdata2),
		.B(Shift_num),
		.Shiftop(Shift_op),
		.Result(Shift_result)
	);

	always@(posedge clk) begin
		if (rst)
			current_state <= INIT;
		else
			current_state <= next_state;
	end
	always@(*) begin
		case (current_state)
			INIT:	next_state = IF;//unconditional
			IF:	begin
				if (Inst_Req_Ready) next_state = IW;//Inst_Req_Ready
				else next_state = IF;
			end
			IW:	begin
				if (Inst_Valid) next_state = ID;//Inst_Valid
				else next_state = IW;
			end
			ID:	begin
				if (Valid_Instruction != 32'b0) next_state = EX;//not a NOP
				else next_state = IF;
			end
			EX:	begin
				if (opcode == `REGIMM || opcode[5:2] == 4'b0001 || opcode == 6'b000010)
					next_state = IF;//REGIMM / I-Type branch instructions / J instruction
				else if (opcode == `SPECIAL || opcode[5:3] == 3'b001 || opcode == 6'b000011)
					next_state = WB;//R-Type instructions / I-Type computational instructions / JAL instruction
				else if (opcode[5] && ~opcode[3]) next_state = LD;//load instructions
				else if (opcode[5] && opcode[3]) next_state = ST;//store instructions
				else next_state = EX;
			end
			LD:	begin
				if (Mem_Req_Ready) next_state = RDW;//Mem_Req_Ready
				else next_state = LD;
			end
			ST:	begin
				if (Mem_Req_Ready) next_state = IF;//Mem_Req_Ready
				else next_state = ST;
			end
			WB:	next_state = IF;//unconditional
			RDW:	begin
				if (Read_data_Valid) next_state = WB;//Read_data_Valid
				else next_state = RDW;
			end
			default:
				next_state = current_state;
		endcase
	end
	
	assign PC_4 = PC + 4;
	always@(posedge clk) begin
		if (rst) begin 
			PC <= 32'b0;
		end
		else if (current_state == EX) begin
			PC <= Jump ? Jump_addr : (Branch ? Branch_addr : PC_4);
		end
		else if (Instruction == 32'b0 && current_state == IW && Inst_Ready && Inst_Valid) begin
			PC <= PC_4;
		end
		else begin
			PC <= PC;
		end
	end
	always @(posedge clk) begin
		current_PC <= (current_state == IF) ? PC : current_PC;
	end
	assign Inst_Req_Valid = (current_state == IF) ? 1 : 0;
	assign Inst_Ready = (current_state == INIT || current_state == IW) ? 1 : 0;
	always@(posedge clk) begin
		Valid_Instruction <= (Inst_Ready && Inst_Valid) ? Instruction : Valid_Instruction;
	end
	assign Read_data_Ready = (current_state == INIT || current_state == RDW) ? 1 : 0;
	always@(posedge clk) begin
		Valid_Read_data <= (Read_data_Ready && Read_data_Valid) ? Read_data : Valid_Read_data;
	end

	//Processor run cycles
	reg [31:0]	Cycle_cnt;
	always@(posedge clk) begin
		if (rst == 1'b1)
			Cycle_cnt <= 32'b0;
		else
			Cycle_cnt <= Cycle_cnt + 32'b1;
	end
	assign cpu_perf_cnt_0 = Cycle_cnt;

	//Instructions executed (counted in EX, so NOPs that skip EX are not included)
	reg [31:0]	Inst_cnt;
	always@(posedge clk) begin
		if (rst == 1'b1)
			Inst_cnt <= 32'b0;
		else if (current_state == EX)
			Inst_cnt <= Inst_cnt + 32'b1;
		else
			Inst_cnt <= Inst_cnt;
	end
	assign cpu_perf_cnt_1 = Inst_cnt;

	//Memory access instructions
	reg [31:0]	MemVisit_cnt;
	always@(posedge clk) begin
		if (rst == 1'b1)
			MemVisit_cnt <= 32'b0;
		else if ((current_state == LD || current_state == ST) && Mem_Req_Ready)
			MemVisit_cnt <= MemVisit_cnt + 32'b1;
		else
			MemVisit_cnt <= MemVisit_cnt;
	end
	assign cpu_perf_cnt_2 = MemVisit_cnt;

	//Memory access delay (cycles spent waiting for Mem_Req_Ready or Read_data_Valid)
	reg [31:0]	MemDelay_cnt;
	always@(posedge clk) begin
		if (rst == 1'b1)
			MemDelay_cnt <= 32'b0;
		else if (((current_state == ST || current_state == LD) && !Mem_Req_Ready) || (current_state == RDW && !Read_data_Valid))
			MemDelay_cnt <= MemDelay_cnt + 32'b1;
		else
			MemDelay_cnt <= MemDelay_cnt;
	end
	assign cpu_perf_cnt_3 = MemDelay_cnt;

	//Branches taken
	reg [31:0]	Branch_cnt;
	always@(posedge clk) begin
		if (rst == 1'b1)
			Branch_cnt <= 32'b0;
		else if (current_state == EX && Branch)
			Branch_cnt <= Branch_cnt + 32'b1;
		else
			Branch_cnt <= Branch_cnt;
	end
	assign cpu_perf_cnt_4 = Branch_cnt;

	//Branches not taken (counts every instruction in EX whose Branch signal is low)
	reg [31:0]	NotBra_cnt;
	always@(posedge clk) begin
		if (rst == 1'b1)
			NotBra_cnt <= 32'b0;
		else if (current_state == EX && !Branch)
			NotBra_cnt <= NotBra_cnt + 32'b1;
		else
			NotBra_cnt <= NotBra_cnt;
	end
	assign cpu_perf_cnt_5 = NotBra_cnt;
endmodule