Common FPGA Interfaces and Their Logic Implementation (Part 3): SPI

1. Introduction to the SPI Protocol

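SPI stands for Serial Peripheral Interface, a synchronous serial interface. Compared with UART and I2C, covered earlier in this series, SPI is much faster and can reach clock rates of around 100 MHz. The protocol itself is simple, so rather than describing it in full, this section focuses on how it differs from UART and I2C. SPI is a single-master, multi-slave protocol in which each slave is connected to the master by its own chip-select line, so it has no advantage over I2C in the number of interface wires. In return, SPI is full duplex: it has two data lines and can read and write at the same time. There are also higher-bandwidth variants, Dual SPI and Quad SPI, which transfer data over two and four data lines at once, although DSPI and QSPI are generally half duplex, so reads and writes cannot happen simultaneously. Compared with UART, the biggest difference is that SPI is synchronous: data is sent along with the bus clock, so the receiver does not have to match a baud rate, which makes SPI more flexible to use.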

2. Approach to Implementing SPI in Verilog

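SPI timing is simple. When a transfer starts, the master pulls the chip-select line of the target slave low and then shifts data onto the bus one bit at a time in step with the bus clock. Unlike UART, SPI usually sends the most significant bit first. As soon as one byte has been sent the next one follows, with no start or stop bits, until the master pulls chip select high again, which ends the operation.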

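To implement SPI, the first step is to generate the SPI clock. It is usually derived by dividing the system clock, and the division ratio is parameterized so that the SPI rate can be set flexibly.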

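Next come the four SPI modes, which are selected by two parameters, clock polarity (CPOL) and clock phase (CPHA). Modes 0 and 3 sample on the rising edge, and modes 1 and 2 sample on the falling edge. In modes 0 and 1 the clock idles low, and in modes 2 and 3 it idles high. These two distinctions are the ones worth remembering.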
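
The two distinctions can be written as a pair of constant expressions. The following is only an illustrative sketch: the module name `spi_mode_info` and its port names are hypothetical and do not appear in the design later in this article. It assumes the usual numbering in which the mode number is {CPOL, CPHA}.

```verilog
// Illustrative helper: derive the two properties that matter from CPOL/CPHA.
// Mode number = {CPOL, CPHA}: mode 0 = 00, mode 1 = 01, mode 2 = 10, mode 3 = 11.
module spi_mode_info #(
    parameter CLK_POL = 0,
    parameter CLK_PHA = 0
)(
    output sample_on_rising,    // 1: data is sampled on the rising edge of SCK
    output sck_idle             // SCK level while the bus is idle
);
    // Modes 0 and 3 (CPOL == CPHA) sample on the rising edge,
    // modes 1 and 2 (CPOL != CPHA) sample on the falling edge.
    assign sample_on_rising = (CLK_POL == CLK_PHA);
    // Modes 0 and 1 idle low, modes 2 and 3 idle high.
    assign sck_idle = (CLK_POL != 0);
endmodule
```

Because both outputs depend only on parameters, a synthesis tool should reduce them to constants.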

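The previous article on I2C ended by controlling the I2C interface through registers. The SPI design here uses the same method, so the ports needed by the various registers are reserved in advance.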

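With the parameters and ports planned, the remaining work is the timing itself. Clock division is naturally done with a counter, and while counting it is easy to generate pulses at the clock transitions for later use. Each SPI clock period corresponds to one bit, so the bits are counted as well, and every eight bits make one byte. The data lines are handled exactly as in the UART design: the bit selected by the bit counter is sent or received.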
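
As a rough illustration of this counter scheme, the sketch below divides the system clock and produces one pulse at mid-bit and one at the end of each bit, together with a bit index. The module `bit_timer` and all of its names are hypothetical, and it assumes an even division ratio of at least 2; the full master later in the article folds the same idea into its own counters.

```verilog
// Sketch of a divide-by-DIV bit timer with half-bit and full-bit pulses.
module bit_timer #(
    parameter DIV = 4                   // clk cycles per SPI bit (assumed even, >= 2)
)(
    input            clk,
    input            rst_n,             // synchronous reset, active low
    input            run,               // count only while a transfer is active
    output           half_tick,         // high for one clk cycle at mid-bit
    output           full_tick,         // high for one clk cycle at the end of a bit
    output reg [2:0] bit_idx            // index of the bit currently on the bus
);
    reg [7:0] cnt;

    assign half_tick = run & (cnt == DIV/2 - 1);
    assign full_tick = run & (cnt == DIV - 1);

    always @(posedge clk) begin
        if (!rst_n) begin
            cnt     <= 8'd0;
            bit_idx <= 3'd0;
        end else if (run) begin
            cnt <= full_tick ? 8'd0 : cnt + 8'd1;
            if (full_tick)
                bit_idx <= bit_idx + 3'd1;  // wraps from 7 to 0 after one byte
        end
    end
endmodule
```

Toggling SCK on both pulses gives a clock of frequency f_clk / DIV, and the mid-bit pulse is a natural place to sample the input line.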

The timing diagram drawn from this approach:

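In the diagram, chip select goes low one clock period before the transfer, but this is not actually required. It is enough for chip select to stay low for the duration of the read or write.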

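Everything above concerns the SPI master. For SPI slaves, the first things that come to mind are SPI flash, SPI displays, and so on. Unlike I2C slaves, SPI slaves have no common implementation pattern, so it is hard to write a truly generic slave module. The essentials do not change, though: whatever else it does, a slave has to receive the data sent by the master. For the slave I therefore implement only a reasonably compatible module that can receive data and send data back. Specific functionality can then be built by modifying this module, which should save a good deal of work.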

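As for the slave's parameters, the main requirement is compatibility with the master's four modes. SPI flash devices typically support both mode 0 and mode 3, or both mode 1 and mode 2, and as noted earlier, what modes 0 and 3 have in common is rising-edge sampling. So the sampling edge is made a parameter: with rising-edge sampling the slave is compatible with modes 0 and 3, otherwise with modes 1 and 2. The two sampling edges obviously need two different sets of code, which is where the generate keyword comes in, producing different logic for each case.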
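
A minimal sketch of edge selection with generate is shown below. It is not the slave from this article, only a one-bit illustration; the module name `edge_sel_reg` and the output `bit_q` are hypothetical.

```verilog
// Sketch: pick the sampling edge at elaboration time with a generate-if.
module edge_sel_reg #(
    parameter SAMPLE_EDGE = "rise"      // "rise" or "fall"
)(
    input       sck,
    input       cs_n,                   // chip select, also used as asynchronous reset
    input       mosi,
    output reg  bit_q                   // most recently sampled bit
);
generate
    if (SAMPLE_EDGE == "rise") begin : G_RISE
        // Modes 0 and 3: capture on the rising edge of SCK
        always @(posedge sck or posedge cs_n)
            if (cs_n) bit_q <= 1'b0;
            else      bit_q <= mosi;
    end else begin : G_FALL
        // Modes 1 and 2: capture on the falling edge of SCK
        always @(negedge sck or posedge cs_n)
            if (cs_n) bit_q <= 1'b0;
            else      bit_q <= mosi;
    end
endgenerate
endmodule
```

Only one of the two always blocks exists after elaboration, so `bit_q` has a single driver even though it is assigned in both branches.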

3. Verilog Implementation of the SPI Master

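The main difficulty in the master is supporting all four modes. Descriptions of the modes are usually phrased in terms of what happens on the first clock transition and what happens on the second, which I find too abstract: hard to remember and awkward to turn into logic. For modes 0 and 3, data is sampled on the rising edge and shifted out on the falling edge regardless of which transition comes first; that much is fixed. The idle level of the clock is equally clear, since it is the clock polarity defined by the protocol. The question of the first transition then reduces to the initial clock level during a transfer: whether the clock idles high or low, each bit starts from a low level. If the idle level is high this naturally produces a falling edge first, and if it is low the clock simply stays low, so the first real transition is a rising edge. Modes 1 and 2 are the mirror image: each bit starts from a high level and data is sampled on the falling edge.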

Putting this together, the SPI master code is as follows:

`timescale 1ns / 1ps

module spi_master#(
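    parameter CLK_PHA = 0,                  // SPI clock phase (CPHA)
    parameter CLK_POL = 0,                  // SPI clock polarity (CPOL)
    parameter SCK_DIV_CNT = 4               // SCK divider: clk cycles per bit
)(
    input                       clk,        // system clock
    input                       rst_n,      // synchronous reset, active low

    input                       op_start,   // operation start (rising edge)
    output                      op_busy,    // operation busy flag
    input   [7:0]               op_len,     // operation length in bytes (must be at least 1)
    input   [7:0]               cs_ctrl,    // chip-select pattern driven onto cs_n while busy

    output                      txc,        // TX data request
    input   [7:0]               txd,        // TX data in
    output                      rxv,        // RX data valid
    output  [7:0]               rxd,        // RX data out

    output                      sck,        // SPI clock
    output                      mosi,       // master out, slave in
    input                       miso,       // master in, slave out
    output  [7:0]               cs_n        // SPI chip selects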
    );

// Parameter and variable declarations
    // SPI clock idle level
    localparam [0:0] SCK_IDLE = CLK_POL;
    // SPI clock level at the start of each bit
    localparam [0:0] SCK_INIT = CLK_PHA ? ~CLK_POL : CLK_POL;

    // Registers
    reg                         spi_clk;
    reg                         master_out;

    reg     [3:0]               clk_cnt;
    reg     [3:0]               bit_cnt;
    reg     [7:0]               byte_cnt;

    reg                         spi_busy;

    reg                         data_req;
    reg                         data_valid;
    reg     [7:0]               data_out;

    reg                         start_ff1;
    reg                         start_ff2;
    reg                         start_flag;

    reg                         sck_r;
    reg                         mosi_r;
    reg     [7:0]               cs_n_r;

// Combinational logic
    wire    half_bit = clk_cnt == SCK_DIV_CNT/2 - 1;
    wire    one_bit = clk_cnt == SCK_DIV_CNT - 1;
    wire    one_byte = bit_cnt == 7;
    wire    one_op = byte_cnt == (op_len - 1) & one_byte & one_bit;

// Module output assignments
    assign op_busy = spi_busy;
    assign txc = data_req;
    assign rxv = data_valid;
    assign rxd = data_out;
    assign sck = sck_r;
    assign mosi = mosi_r;
    assign cs_n = cs_n_r;

// Sequential logic
    // SPI master interface outputs
    always @(posedge clk) begin
        if(spi_busy) begin
            sck_r <= spi_clk;
            mosi_r <= master_out;
            cs_n_r <= cs_ctrl;
        end else begin
            sck_r <= SCK_IDLE;
            mosi_r <= 1'b0;
            cs_n_r <= 8'hff;
        end
    end

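    // Register the start signal twice for rising-edge detection
    always @(posedge clk) begin
        start_ff1 <= op_start;
        start_ff2 <= start_ff1;
    end

    // Start flag: set on a rising edge of op_start, cleared once the operation is busy
    always @(posedge clk) begin
        if(!rst_n)
            start_flag <= 1'b0;
        else if(start_ff1 & ~start_ff2)
            start_flag <= 1'b1;
        else if(spi_busy)
            start_flag <= 1'b0;
    end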

    // Generate the SPI clock: while busy, toggle it every half bit period
    always @(posedge clk) begin
        if(!rst_n)
            spi_clk <= SCK_INIT;
        else if(spi_busy & (half_bit | one_bit))
            spi_clk <= ~spi_clk;
    end

    // Busy flag: set on the start flag, cleared after op_len bytes have been transferred
    always @(posedge clk) begin
        if(!rst_n)
            spi_busy <= 0;
        else if(start_flag)
            spi_busy <= 1;
        else if(one_op)
            spi_busy <= 0;
    end

    // Clock-cycle counter: counts while busy and clears after one full bit
    always @(posedge clk) begin
        if(!rst_n)
            clk_cnt <= 0;
        else if(spi_busy) begin
            if(one_bit)
                clk_cnt <= 0;
            else
                clk_cnt <= clk_cnt + 1;
        end
    end

    // Bit counter: clears after one full byte
    always @(posedge clk) begin
        if(!rst_n)
            bit_cnt <= 0;
        else if(spi_busy & one_bit) begin
            if(one_byte)
                bit_cnt <= 0;
            else
                bit_cnt <= bit_cnt + 1;
        end
    end

    // Sample the input line at the middle of each bit
    always @(posedge clk) begin
        if(!rst_n)
            data_out <= 0;
        else if(spi_busy & half_bit) begin
            case (bit_cnt)
                0:data_out[7] <= miso;
                1:data_out[6] <= miso;
                2:data_out[5] <= miso;
                3:data_out[4] <= miso;
                4:data_out[3] <= miso;
                5:data_out[2] <= miso;
                6:data_out[1] <= miso;
                7:data_out[0] <= miso;
                default: data_out <= data_out;
            endcase
        end
    end

    // Drive each bit onto the output line in turn, MSB first
    always @(posedge clk) begin
        if(!rst_n)
            master_out <= 0;
        else if(start_flag & !spi_busy)
            master_out <= txd[7];
        else if(spi_busy & one_bit) begin
            case (bit_cnt)
                0:master_out <= txd[6];
                1:master_out <= txd[5];
                2:master_out <= txd[4];
                3:master_out <= txd[3];
                4:master_out <= txd[2];
                5:master_out <= txd[1];
                6:master_out <= txd[0];
                7:master_out <= txd[7];
                default: master_out <= master_out;
            endcase
        end
    end

    // Raise the data request one bit before the end of each byte
    always @(posedge clk) begin
        if(!rst_n)
            data_req <= 0;
        else if((bit_cnt == 6) & one_bit)
            data_req <= 1;
        else 
            data_req <= 0;
    end

    // Raise data valid at the end of the last bit of each byte
    always @(posedge clk) begin
        if(!rst_n)
            data_valid <= 0;
        else if(one_byte & one_bit)
            data_valid <= 1;
        else 
            data_valid <= 0;
    end

    // Increment the byte counter after each byte; clear it once op_len bytes are done
    always @(posedge clk) begin
        if(!rst_n)
            byte_cnt <= 0;
        else if(one_byte & one_bit) begin
            if(one_op)
                byte_cnt <= 0;
            else
                byte_cnt <= byte_cnt + 1;
        end
    end
    
endmodule

This code closely resembles the UART design, with some register configuration ports added. The SPI register code is as follows:

`timescale 1ns / 1ps

module spi_reg(
    input               clk,
    input               en,
    input               we,
    input   [7:0]       din,
    output  [7:0]       dout,
    input   [7:0]       addr,

    output              op_start,
    input               op_busy,
    output  [7:0]       op_len,
    output  [7:0]       cs_ctrl,
    input               txc,
    output  [7:0]       txd,
    input               rxv,
    input   [7:0]       rxd
    );

    reg     [7:0]       r_data_out;

    reg     [7:0]       r_tx_buffer [0:31];     // 0x00 - 0x1f write only
    reg     [7:0]       r_rx_buffer [0:31];     // 0x20 - 0x3f read only
    
    // bit 4-0: tx_buffer ptr
    reg     [7:0]       r_tx_ctrl = 0;

    // bit 4-0: rx_buffer ptr
    reg     [7:0]       r_rx_ctrl = 0;

    // bit 7: tx_buffer reset, self-clearing
    reg     [7:0]       r_tx_rst = 0;

    // bit 7: rx_buffer reset, self-clearing
    reg     [7:0]       r_rx_rst = 0;

    // bit 7: operation start, self-clearing
    reg     [7:0]       r_op_startup = 0;

    // bit 0: operation busy flag, read only
    reg     [7:0]       r_op_status = 0;

    // bit 7-0: operate length
    reg     [7:0]       r_op_length = 0;

    // bit 7-0: chip select control
    reg     [7:0]       r_cs_control = 0;

    reg     [7:0]       r_reserve_0 = 0;
    reg     [7:0]       r_reserve_1 = 0;
    reg     [7:0]       r_reserve_2 = 0;
    reg     [7:0]       r_reserve_3 = 0;
    reg     [7:0]       r_reserve_4 = 0;
    reg     [7:0]       r_reserve_5 = 0;
    reg     [7:0]       r_reserve_6 = 0;
    reg     [7:0]       r_reserve_7 = 0;        // 0x40 - 0x4f

    reg     [7:0]       start_cnt;
    reg     [7:0]       txrst_cnt;
    reg     [7:0]       rxrst_cnt;

    assign dout = r_data_out;
    assign op_start = r_op_startup[7];
    assign op_len = r_op_length;
    assign cs_ctrl = r_cs_control;
    assign txd = r_tx_buffer[r_tx_ctrl[4:0]];

    always @(posedge clk) begin:READ_REGISTER
        if(en) begin
            case (addr)
                8'h20: r_data_out <= r_rx_buffer[0];
                8'h21: r_data_out <= r_rx_buffer[1];
                8'h22: r_data_out <= r_rx_buffer[2];
                8'h23: r_data_out <= r_rx_buffer[3];
                8'h24: r_data_out <= r_rx_buffer[4];
                8'h25: r_data_out <= r_rx_buffer[5];
                8'h26: r_data_out <= r_rx_buffer[6];
                8'h27: r_data_out <= r_rx_buffer[7];
                8'h28: r_data_out <= r_rx_buffer[8];
                8'h29: r_data_out <= r_rx_buffer[9];
                8'h2a: r_data_out <= r_rx_buffer[10];
                8'h2b: r_data_out <= r_rx_buffer[11];
                8'h2c: r_data_out <= r_rx_buffer[12];
                8'h2d: r_data_out <= r_rx_buffer[13];
                8'h2e: r_data_out <= r_rx_buffer[14];
                8'h2f: r_data_out <= r_rx_buffer[15];
                8'h30: r_data_out <= r_rx_buffer[16];
                8'h31: r_data_out <= r_rx_buffer[17];
                8'h32: r_data_out <= r_rx_buffer[18];
                8'h33: r_data_out <= r_rx_buffer[19];
                8'h34: r_data_out <= r_rx_buffer[20];
                8'h35: r_data_out <= r_rx_buffer[21];
                8'h36: r_data_out <= r_rx_buffer[22];
                8'h37: r_data_out <= r_rx_buffer[23];
                8'h38: r_data_out <= r_rx_buffer[24];
                8'h39: r_data_out <= r_rx_buffer[25];
                8'h3a: r_data_out <= r_rx_buffer[26];
                8'h3b: r_data_out <= r_rx_buffer[27];
                8'h3c: r_data_out <= r_rx_buffer[28];
                8'h3d: r_data_out <= r_rx_buffer[29];
                8'h3e: r_data_out <= r_rx_buffer[30];
                8'h3f: r_data_out <= r_rx_buffer[31];
                8'h40: r_data_out <= r_tx_ctrl;
                8'h41: r_data_out <= r_rx_ctrl;
                8'h42: r_data_out <= r_tx_rst;
                8'h43: r_data_out <= r_rx_rst;
                8'h44: r_data_out <= r_op_startup;
                8'h45: r_data_out <= r_op_status;
                8'h46: r_data_out <= r_op_length;
                8'h47: r_data_out <= r_cs_control;
                8'h48: r_data_out <= r_reserve_0;
                8'h49: r_data_out <= r_reserve_1;
                8'h4a: r_data_out <= r_reserve_2;
                8'h4b: r_data_out <= r_reserve_3;
                8'h4c: r_data_out <= r_reserve_4;
                8'h4d: r_data_out <= r_reserve_5;
                8'h4e: r_data_out <= r_reserve_6;
                8'h4f: r_data_out <= r_reserve_7;
                default: r_data_out <= r_data_out;
            endcase
        end
    end

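    always @(posedge clk) begin:TX_BUFFER
        integer i;
        // Only addresses 0x00 - 0x1f belong to the TX buffer; without this check a
        // write to any other register would index the 32-entry array out of range
        if(en & we & (addr[7:5] == 3'b000)) begin
            r_tx_buffer[addr[4:0]] <= din;
        end else if(r_tx_rst[7]) begin
            for (i = 0;i < 32;i = i + 1) begin
                r_tx_buffer[i] <= 0;
            end
        end
    end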

    always @(posedge clk) begin:RX_BUFFER
        integer j;
        if(rxv)
            r_rx_buffer[r_rx_ctrl[4:0]] <= rxd;
        else if(r_rx_rst[7]) begin
            for (j = 0;j < 32;j = j + 1) begin
                r_rx_buffer[j] <= 0;
            end
        end
    end

    always @(posedge clk) begin
        if(en & we & addr == 8'h40)
            r_tx_ctrl <= din;
        else if(r_tx_rst[7])
            r_tx_ctrl <= 0;
        else if(txc)
            r_tx_ctrl <= (r_tx_ctrl != 8'h1f) ? r_tx_ctrl + 1 : 0;
    end

    always @(posedge clk) begin
        if(en & we & addr == 8'h41)
            r_rx_ctrl <= din;
        else if(r_rx_rst[7])
            r_rx_ctrl <= 0;
        else if(rxv)
            r_rx_ctrl <= (r_rx_ctrl != 8'h1f) ? r_rx_ctrl + 1 : 0;
    end

    always @(posedge clk) begin
        if(en & we & addr == 8'h42)
            r_tx_rst <= din;
        else if(&txrst_cnt)
            r_tx_rst <= r_tx_rst & 8'b0111_1111;
    end

    always @(posedge clk) begin
        if(en & we & addr == 8'h43)
            r_rx_rst <= din;
        else if(&rxrst_cnt)
            r_rx_rst <= r_rx_rst & 8'b0111_1111;
    end

    always @(posedge clk) begin
        if(en & we & addr == 8'h44)
            r_op_startup <= din;
        else if(&start_cnt)
            r_op_startup <= r_op_startup & 8'b0111_1111;
    end

    always @(posedge clk) begin
        r_op_status <= {7'b0000000,op_busy};
    end

    always @(posedge clk) begin
        if(en & we) begin
            case(addr)
                8'h46:r_op_length <= din;
                8'h47:r_cs_control <= din;
                8'h48:r_reserve_0 <= din;
                8'h49:r_reserve_1 <= din;
                8'h4a:r_reserve_2 <= din;
                8'h4b:r_reserve_3 <= din;
                8'h4c:r_reserve_4 <= din;
                8'h4d:r_reserve_5 <= din;
                8'h4e:r_reserve_6 <= din;
                8'h4f:r_reserve_7 <= din;
            endcase
        end
    end

    initial begin:TX_BUF_INIT
        integer n;
        for(n = 0;n < 32;n = n + 1) begin
            r_tx_buffer[n] = 0;
        end
    end

    initial begin:RX_BUF_INIT
        integer m;
        for(m = 0;m < 32;m = m + 1) begin
            r_rx_buffer[m] = 0;
        end
    end

    always @(posedge clk) begin
        if(r_op_startup[7])
            start_cnt <= (&start_cnt) ? start_cnt : start_cnt + 1;
        else
            start_cnt <= 0;
    end

    always @(posedge clk) begin
        if(r_tx_rst[7])
            txrst_cnt <= (&txrst_cnt) ? txrst_cnt : txrst_cnt + 1;
        else
            txrst_cnt <= 0;
    end

    always @(posedge clk) begin
        if(r_rx_rst[7])
            rxrst_cnt <= (&rxrst_cnt) ? rxrst_cnt : rxrst_cnt + 1;
        else
            rxrst_cnt <= 0;
    end

endmodule
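
To make the register map concrete, here is a small testbench sketch that writes the registers needed to launch one operation. It is an illustration under assumptions, not part of the original design: the testbench name `tb_spi_reg_usage`, the helper task `reg_write`, and the data values are hypothetical, and the master-side inputs of `spi_reg` are simply tied off.

```verilog
`timescale 1ns / 1ps

// Sketch: drive the spi_reg bus port to set up and start one operation.
module tb_spi_reg_usage;
    reg         clk = 1'b0;
    reg         en = 1'b0, we = 1'b0;
    reg  [7:0]  din = 8'h00, addr = 8'h00;
    wire [7:0]  dout, op_len, cs_ctrl, txd;
    wire        op_start;

    always #5 clk = ~clk;

    // Master-side inputs are tied off; only the register side is exercised.
    spi_reg u_reg (
        .clk(clk), .en(en), .we(we), .din(din), .dout(dout), .addr(addr),
        .op_start(op_start), .op_busy(1'b0), .op_len(op_len), .cs_ctrl(cs_ctrl),
        .txc(1'b0), .txd(txd), .rxv(1'b0), .rxd(8'h00)
    );

    // Single-cycle register write (hypothetical helper task).
    task reg_write(input [7:0] a, input [7:0] d);
        begin
            @(posedge clk);
            en <= 1'b1; we <= 1'b1; addr <= a; din <= d;
            @(posedge clk);
            en <= 1'b0; we <= 1'b0;
        end
    endtask

    initial begin
        reg_write(8'h00, 8'h9f);    // TX buffer[0]: first byte to send (arbitrary value)
        reg_write(8'h46, 8'd4);     // operation length: 4 bytes
        reg_write(8'h47, 8'hfe);    // chip-select pattern: only cs_n[0] low while busy
        reg_write(8'h44, 8'h80);    // bit 7 starts the operation and clears itself later
        repeat (4) @(posedge clk);
        $display("op_start=%b op_len=%0d cs_ctrl=%h txd=%h",
                 op_start, op_len, cs_ctrl, txd);
        $finish;
    end
endmodule
```

The start bit is written last so that the length, chip-select pattern, and first data byte are already stable when the master sees the rising edge of op_start.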

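The SPI master above is the most common kind, with one output line and one input line. Real projects also often use the QSPI interface with four data lines. A recent project of mine happened to need QSPI, so I modified the SPI master above into a QSPI master module. This module is mainly used for writing a data stream, so it does not handle bidirectional transfers; supporting four-line reads as well as writes would require a few small changes. The code is as follows: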

`timescale 1ns / 1ps

module qspi_master#(
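    parameter CLK_PHA = 0,                  // SPI clock phase (CPHA)
    parameter CLK_POL = 0,                  // SPI clock polarity (CPOL)
    parameter SCK_DIV_CNT = 4               // SCK divider: clk cycles per bit
)(
    input                       clk,        // system clock
    input                       rst_n,      // synchronous reset, active low

    input                       empty_n,    // TX buffer not empty (low when the buffer is empty)
    input   [1:0]               wire_mode,  // line mode    0: single 1: dual 2: quad

    output                      txc,        // TX data request
    input   [7:0]               txd,        // TX data in
    output                      rxv,        // RX data valid
    output  [7:0]               rxd,        // RX data out

    output                      sck,        // SPI clock
    output                      cs_n,       // SPI chip select
    output                      sd_0,       // MOSI in single mode, output line in dual and quad modes
    inout                       sd_1,       // MISO in single mode, output line in dual and quad modes
    output                      sd_2,       // output line in quad mode
    output                      sd_3        // output line in quad mode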
    );

// Parameter and variable declarations
    // SPI clock idle level
    localparam [0:0] SCK_IDLE = CLK_POL;
    // SPI clock level at the start of each bit
    localparam [0:0] SCK_INIT = CLK_PHA ? ~CLK_POL : CLK_POL;

    // Registers
    reg                         spi_clk;

    reg     [3:0]               clk_cnt;
    reg     [3:0]               bit_cnt;

    reg                         spi_busy;

    reg                         data_req;
    reg                         data_valid;
    reg     [7:0]               data_out;

    reg                         start_ff1;
    reg                         start_ff2;
    reg                         start_flag;

    reg                         sck_r;
    reg                         cs_n_r;
    reg                         sd_0_r;
    reg                         sd_1_r;
    reg                         sd_2_r;
    reg                         sd_3_r;
    reg                         out_0;
    reg                         out_1;
    reg                         out_2;
    reg                         out_3;

// Combinational logic
    wire    miso = sd_1;
    wire    half_bit = clk_cnt == SCK_DIV_CNT/2 - 1;
    wire    one_bit = clk_cnt == SCK_DIV_CNT - 1;
    wire    one_byte = 
        wire_mode == 0 ? bit_cnt == 7 :
        wire_mode == 1 ? bit_cnt == 3 :
        wire_mode == 2 ? bit_cnt == 1 : 1'b0;
    wire    nxt_byte = 
        wire_mode == 0 ? bit_cnt == 6 :
        wire_mode == 1 ? bit_cnt == 2 :
        wire_mode == 2 ? bit_cnt == 0 : 1'b0;

// Module output assignments
    assign txc = data_req;
    assign rxv = data_valid;
    assign rxd = data_out;
    assign sck = sck_r;
    assign cs_n = cs_n_r;
    assign sd_0 = sd_0_r;
    assign sd_1 = ((wire_mode == 1)|(wire_mode == 2)) ? sd_1_r : 1'bz;
    assign sd_2 = sd_2_r;
    assign sd_3 = sd_3_r;

// Sequential logic
    // SPI master interface outputs
    always @(posedge clk) begin
        if(spi_busy) begin
            sck_r <= spi_clk;
            cs_n_r <= 1'b0;
            sd_0_r <= out_0;
            sd_1_r <= out_1;
            sd_2_r <= out_2;
            sd_3_r <= out_3;
        end else begin
            sck_r <= SCK_IDLE;
            cs_n_r <= 1'b1;
            sd_0_r <= 1'b0;
            sd_1_r <= 1'b0;
            sd_2_r <= 1'b0;
            sd_3_r <= 1'b0;
        end
    end

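    // Register the start signal twice for rising-edge detection
    always @(posedge clk) begin
        start_ff1 <= empty_n;
        start_ff2 <= start_ff1;
    end

    // Start flag: set on a rising edge of empty_n, cleared once the operation is busy
    always @(posedge clk) begin
        if(!rst_n)
            start_flag <= 1'b0;
        else if(start_ff1 & ~start_ff2)
            start_flag <= 1'b1;
        else if(spi_busy)
            start_flag <= 1'b0;
    end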

    // Generate the SPI clock: while busy, toggle it every half bit period
    always @(posedge clk) begin
        if(!rst_n)
            spi_clk <= SCK_INIT;
        else if(spi_busy & (half_bit | one_bit))
            spi_clk <= ~spi_clk;
    end

    // Busy flag: set on the start flag, cleared at a byte boundary once the TX buffer is empty
    always @(posedge clk) begin
        if(!rst_n)
            spi_busy <= 0;
        else if(start_flag)
            spi_busy <= 1;
        else if(one_bit & one_byte & !empty_n)
            spi_busy <= 0;
    end

    // Clock-cycle counter: counts while busy and clears after one full bit
    always @(posedge clk) begin
        if(!rst_n)
            clk_cnt <= 0;
        else if(spi_busy) begin
            if(one_bit)
                clk_cnt <= 0;
            else
                clk_cnt <= clk_cnt + 1;
        end
    end

    // Bit counter: clears after one full byte
    always @(posedge clk) begin
        if(!rst_n)
            bit_cnt <= 0;
        else if(spi_busy & one_bit) begin
            if(one_byte)
                bit_cnt <= 0;
            else
                bit_cnt <= bit_cnt + 1;
        end
    end

    // Sample the input line at the middle of each bit (single-line mode only)
    always @(posedge clk) begin
        if(!rst_n)
            data_out <= 0;
        else if(wire_mode == 0 & spi_busy & half_bit) begin
            case (bit_cnt)
                0:data_out[7] <= miso;
                1:data_out[6] <= miso;
                2:data_out[5] <= miso;
                3:data_out[4] <= miso;
                4:data_out[3] <= miso;
                5:data_out[2] <= miso;
                6:data_out[1] <= miso;
                7:data_out[0] <= miso;
                default: data_out <= data_out;
            endcase
        end
    end

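    // Drive each bit onto the output lines in turn.
    // Note: sd_0 carries the most significant bit of each group here, whereas many
    // SPI flash devices put the MSB on the highest-numbered I/O line, so check the
    // lane order against the target device.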
    always @(posedge clk) begin
        if(!rst_n)
            out_0 <= 0;
        else if(start_flag & !spi_busy)
            out_0 <= txd[7];
        else if(wire_mode == 0 & spi_busy & one_bit) begin
            case (bit_cnt)
                0:out_0 <= txd[6];
                1:out_0 <= txd[5];
                2:out_0 <= txd[4];
                3:out_0 <= txd[3];
                4:out_0 <= txd[2];
                5:out_0 <= txd[1];
                6:out_0 <= txd[0];
                7:out_0 <= txd[7];
                default: out_0 <= out_0;
            endcase
        end else if(wire_mode == 1 & spi_busy & one_bit) begin
            case (bit_cnt)
                0:out_0 <= txd[5];
                1:out_0 <= txd[3];
                2:out_0 <= txd[1];
                3:out_0 <= txd[7];
                default: out_0 <= out_0;
            endcase
        end else if(wire_mode == 2 & spi_busy & one_bit) begin
            case (bit_cnt)
                0:out_0 <= txd[3];
                1:out_0 <= txd[7];
                default: out_0 <= out_0;
            endcase
        end
    end

    always @(posedge clk) begin
        if(!rst_n)
            out_1 <= 0;
        else if(wire_mode != 0 & start_flag & !spi_busy)
            out_1 <= txd[6];
        else if(wire_mode == 1 & spi_busy & one_bit) begin
            case (bit_cnt)
                0:out_1 <= txd[4];
                1:out_1 <= txd[2];
                2:out_1 <= txd[0];
                3:out_1 <= txd[6];
                default: out_1 <= out_1;
            endcase
        end else if(wire_mode == 2 & spi_busy & one_bit) begin
            case (bit_cnt)
                0:out_1 <= txd[2];
                1:out_1 <= txd[6];
                default: out_1 <= out_1;
            endcase
        end
    end

    always @(posedge clk) begin
        if(!rst_n)
            out_2 <= 0;
        else if(wire_mode == 2 & start_flag & !spi_busy)
            out_2 <= txd[5];
        else if(wire_mode == 2 & spi_busy & one_bit) begin
            case (bit_cnt)
                0:out_2 <= txd[1];
                1:out_2 <= txd[5];
                default: out_2 <= out_2;
            endcase
        end
    end

    always @(posedge clk) begin
        if(!rst_n)
            out_3 <= 0;
        else if(wire_mode == 2 & start_flag & !spi_busy)
            out_3 <= txd[4];
        else if(wire_mode == 2 & spi_busy & one_bit) begin
            case (bit_cnt)
                0:out_3 <= txd[0];
                1:out_3 <= txd[4];
                default: out_3 <= out_3;
            endcase
        end
    end

    // Raise the data request one bit before the end of each byte
    always @(posedge clk) begin
        if(!rst_n)
            data_req <= 0;
        else if(nxt_byte & one_bit)
            data_req <= 1;
        else 
            data_req <= 0;
    end

    // Raise data valid at the end of the last bit of each byte
    always @(posedge clk) begin
        if(!rst_n)
            data_valid <= 0;
        else if(one_byte & one_bit)
            data_valid <= 1;
        else 
            data_valid <= 0;
    end
    
endmodule

That concludes the master. Next comes the slave implementation.

4. Verilog Implementation of the SPI Slave

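There are two ways to implement the slave. The first uses SCK directly as the clock. The second oversamples SCK with a fast clock to detect its edges. The first yields a more compact design, and the second a more reliable one. For I2C, where SCL runs at a few MHz at most, the second approach is easy. SPI is different: SCK may well run at tens of MHz or even around 100 MHz, in which case the sampling clock would need to be more than 200 MHz, which is already a very high system frequency for an FPGA and rather defeats the purpose. For the SPI slave we therefore choose the first approach.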
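
For comparison, the second approach could be sketched roughly as follows. This is only an illustration of oversampled edge detection and is not used by the slave in this article; the module name `sck_edge_detect` and its ports are hypothetical, and it assumes a system clock several times faster than SCK.

```verilog
// Sketch: detect SCK edges by oversampling with a fast system clock.
module sck_edge_detect (
    input   clk,            // fast system clock, several times faster than SCK
    input   sck,            // asynchronous SPI clock from the master
    output  sck_rise,       // one-clk pulse after a rising edge of SCK
    output  sck_fall        // one-clk pulse after a falling edge of SCK
);
    reg [2:0] sck_sync = 3'b000;

    // Two flip-flops synchronize SCK into the clk domain;
    // the third keeps the previous synchronized value.
    always @(posedge clk)
        sck_sync <= {sck_sync[1:0], sck};

    // sck_sync[2] is the older sample, sck_sync[1] the newer one
    assign sck_rise = (sck_sync[2:1] == 2'b01);
    assign sck_fall = (sck_sync[2:1] == 2'b10);
endmodule
```

The pulses arrive a couple of system clock cycles after the actual edge, which is why this scheme needs a system clock well above the SCK frequency.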

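As for functionality, this slave mainly imitates a flash device: it receives a few bytes and then returns data. I also preloaded the first few bytes of the internal RAM so that the returned data can be checked. The code is as follows: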

module spi_slave#(
    parameter SAMPLE_EDGE = "rise"  // "rise" or "fall"; data is updated on the opposite edge
)(
    input   wire        sck,        // SPI serial clock
    input   wire        cs_n,       // SPI chip select
    input   wire        mosi,       // SPI slave input
    output  wire        miso        // SPI slave output
);

    localparam RX_BYTE_CNT = 4;     // number of bytes to receive, typically command + address

    reg     [7:0]       ram [0:255];    // internal RAM
    reg     [7:0]       addr;           // RAM address being accessed

    reg                 rx_valid;       // received data valid
    reg     [7:0]       rx_buffer;      // receive buffer
    reg     [7:0]       des_reg;        // destination register for received data

    reg     [2:0]       bit_cnt;        // bit counter
    reg     [3:0]       byte_cnt;       // byte counter

    reg                 slave_out;      // slave output

    reg                 state;          // state    0: receive 1: transmit

    // a full byte has been counted
    wire        one_byte = &bit_cnt;
    // receive phase complete
    wire        rx_done = byte_cnt == RX_BYTE_CNT - 1;
    // receiving
    wire        rx_busy = state == 1'b0;
    // transmitting
    wire        tx_busy = state == 1'b1;
    // transmit buffer, points into the internal RAM
    wire [7:0]  tx_buffer = ram[addr];

    assign miso = slave_out;

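    // Starts in the receive state and switches to transmit once reception completes.
    // The state register is clocked on the sampling edge, like the counters, so that
    // the last received bit is still sampled before the switch. Clocking it on the
    // rising edge in "fall" mode would end reception half a clock period too early.
generate
    if(SAMPLE_EDGE == "rise") begin:STATE_RISE
        always @(posedge sck or posedge cs_n) begin
            if(cs_n)
                state <= 1'b0;
            else if(rx_done & one_byte)
                state <= 1'b1;
        end
    end else begin:STATE_FALL
        always @(negedge sck or posedge cs_n) begin
            if(cs_n)
                state <= 1'b0;
            else if(rx_done & one_byte)
                state <= 1'b1;
        end
    end
endgenerate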

generate
    // Rising-edge sampling covers modes 0 and 3; the rest of the logic is simple and left uncommented
    if(SAMPLE_EDGE == "rise") begin:MODE_0_3

        always @(posedge sck or posedge cs_n) begin
            if(cs_n)
                addr <= 1'b0;
            else if(tx_busy & one_byte)
                addr <= addr + 1;
        end

        always @(posedge sck or posedge cs_n) begin
            if(cs_n)
                des_reg <= 1'b0;
            else if(rx_valid)
                des_reg <= rx_buffer;
        end

        always @(posedge sck or posedge cs_n) begin
            if(cs_n)
                bit_cnt <= 0;
            else
                bit_cnt <= bit_cnt + 1;
        end

        always @(posedge sck or posedge cs_n) begin
            if(cs_n)
                byte_cnt <= 0;
            else if(one_byte)
                byte_cnt <= (rx_done) ? byte_cnt : byte_cnt + 1;
        end

        always @(posedge sck or posedge cs_n) begin
            if(cs_n) begin
                rx_valid <= 0;
                rx_buffer <= 0;
            end else if(rx_busy) begin
                rx_valid <= one_byte;
                case (bit_cnt)
                    0:rx_buffer[7] <= mosi;
                    1:rx_buffer[6] <= mosi;
                    2:rx_buffer[5] <= mosi;
                    3:rx_buffer[4] <= mosi;
                    4:rx_buffer[3] <= mosi;
                    5:rx_buffer[2] <= mosi;
                    6:rx_buffer[1] <= mosi;
                    7:rx_buffer[0] <= mosi;
                    default:rx_buffer <= rx_buffer;
                endcase
            end
        end

        always @(negedge sck or posedge cs_n) begin
            if(cs_n)
                slave_out <= 0;
            else if(tx_busy) begin
                case (bit_cnt)
                    0:slave_out <= tx_buffer[7];
                    1:slave_out <= tx_buffer[6];
                    2:slave_out <= tx_buffer[5];
                    3:slave_out <= tx_buffer[4];
                    4:slave_out <= tx_buffer[3];
                    5:slave_out <= tx_buffer[2];
                    6:slave_out <= tx_buffer[1];
                    7:slave_out <= tx_buffer[0];
                    default:slave_out <= slave_out;
                endcase
            end
        end

    // Falling-edge sampling covers modes 1 and 2
    end else if(SAMPLE_EDGE == "fall") begin:MODE_1_2

        always @(negedge sck or posedge cs_n) begin
            if(cs_n)
                addr <= 1'b0;
            else if(tx_busy & one_byte)
                addr <= addr + 1;
        end

        always @(negedge sck or posedge cs_n) begin
            if(cs_n)
                des_reg <= 1'b0;
            else if(rx_valid)   // as in the rise branch: wait until the last bit is in rx_buffer
                des_reg <= rx_buffer;
        end

        always @(negedge sck or posedge cs_n) begin
            if(cs_n)
                bit_cnt <= 0;
            else
                bit_cnt <= bit_cnt + 1;
        end

        always @(negedge sck or posedge cs_n) begin
            if(cs_n)
                byte_cnt <= 0;
            else if(one_byte)
                byte_cnt <= (rx_done) ? byte_cnt : byte_cnt + 1;
        end

        always @(negedge sck or posedge cs_n) begin
            if(cs_n) begin
                rx_valid <= 0;
                rx_buffer <= 0;
            end else if(rx_busy) begin
                rx_valid <= one_byte;
                case (bit_cnt)
                    0:rx_buffer[7] <= mosi;
                    1:rx_buffer[6] <= mosi;
                    2:rx_buffer[5] <= mosi;
                    3:rx_buffer[4] <= mosi;
                    4:rx_buffer[3] <= mosi;
                    5:rx_buffer[2] <= mosi;
                    6:rx_buffer[1] <= mosi;
                    7:rx_buffer[0] <= mosi;
                    default:rx_buffer <= rx_buffer;
                endcase
            end
        end

        always @(posedge sck or posedge cs_n) begin
            if(cs_n)
                slave_out <= 0;
            else if(tx_busy) begin
                case (bit_cnt)
                    0:slave_out <= tx_buffer[7];
                    1:slave_out <= tx_buffer[6];
                    2:slave_out <= tx_buffer[5];
                    3:slave_out <= tx_buffer[4];
                    4:slave_out <= tx_buffer[3];
                    5:slave_out <= tx_buffer[2];
                    6:slave_out <= tx_buffer[1];
                    7:slave_out <= tx_buffer[0];
                    default:slave_out <= slave_out;
                endcase
            end
        end

    end
endgenerate

    // Initialize RAM contents
    initial begin:ram_initialize
        integer i;
        ram[0] = 8'h53;
        ram[1] = 8'h8b;
        ram[2] = 8'h9c;
        ram[3] = 8'hea;
        for (i = 4;i < 256;i = i + 1) begin
            ram[i] = 0;
        end
    end

endmodule

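All received data is written to des_reg. In a real design, simply replace des_reg with another register or memory, selected as required, and the data will be stored correctly.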

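I normally prefer synchronous resets, but because this slave is clocked by SCK, which does not run while the bus is idle, a synchronous reset is not possible. The chip-select signal is therefore used as an asynchronous reset for the whole module.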

With both master and slave implemented, the next step is to simulate the two modules.
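
Before bringing in the registers, the two modules can also be wired together directly. The testbench below is a minimal sketch under assumptions rather than the simulation setup used in the next section: the name `tb_spi_loopback`, the constant TX byte, and the operation length of 8 bytes (4 for the slave to receive, 4 for it to return) are all hypothetical choices.

```verilog
`timescale 1ns / 1ps

// Sketch: connect spi_master directly to spi_slave in mode 0 and print received bytes.
module tb_spi_loopback;
    reg         clk = 1'b0;
    reg         rst_n = 1'b0;
    reg         op_start = 1'b0;
    wire        op_busy, txc, rxv, sck, mosi, miso;
    wire [7:0]  rxd, cs_n;

    always #5 clk = ~clk;

    spi_master #(.CLK_PHA(0), .CLK_POL(0), .SCK_DIV_CNT(4)) u_master (
        .clk(clk), .rst_n(rst_n),
        .op_start(op_start), .op_busy(op_busy),
        .op_len(8'd8), .cs_ctrl(8'hfe),     // 8 bytes, select slave 0
        .txc(txc), .txd(8'ha5),             // constant TX byte (arbitrary)
        .rxv(rxv), .rxd(rxd),
        .sck(sck), .mosi(mosi), .miso(miso), .cs_n(cs_n)
    );

    // Mode 0 samples on the rising edge, so the slave uses "rise".
    spi_slave #(.SAMPLE_EDGE("rise")) u_slave (
        .sck(sck), .cs_n(cs_n[0]), .mosi(mosi), .miso(miso)
    );

    // Print every byte the master reports as valid.
    always @(posedge clk)
        if (rxv) $display("%0t: rxd = 0x%02h", $time, rxd);

    initial begin
        repeat (4) @(posedge clk);
        rst_n <= 1'b1;
        repeat (4) @(posedge clk);
        op_start <= 1'b1;           // rising edge launches the operation
        @(posedge op_busy);
        op_start <= 1'b0;
        @(negedge op_busy);         // wait for the operation to finish
        repeat (4) @(posedge clk);
        $finish;
    end
endmodule
```

The bytes printed during the second half of the operation can be compared against the values preloaded into the slave's RAM.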

5. Simulating the SPI Master and Slave

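Like the I2C master in the previous article, the SPI master is controlled through registers. The module could be operated as before by reading and writing each register through a VIO, but that is tedious. Register access is normally left to the PS side, so we can write a simple MCU-like module that reads and writes registers under instruction control. Each time the MCU is reset it executes the prepared instructions from start to finish, so the registers no longer need to be operated one by one.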
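
To give an idea of what such a prepared instruction sequence might look like, here is a sketch of a small instruction ROM. It relies on the 24-bit instruction layout ([23:16] opcode, [15:8] data, [7:0] address) and the RAM_WR opcode value from the MCU code in this section; everything else is an assumption: the module name `mcu_rom`, the combinational lookup from pc to ir, the data values, and the wiring of the MCU's en/we/din/addr port to the SPI register module.

```verilog
// Sketch: a hypothetical instruction ROM that programs the SPI registers.
module mcu_rom (
    input       [7:0]   pc,     // program counter from the MCU
    output reg  [23:0]  ir      // instruction: {opcode, data, addr}
);
    localparam RAM_WR = 8'b0000_1001;   // register write, value taken from the MCU code
    localparam NOP    = 8'b0000_0000;   // assumption: an unlisted opcode hits the MCU's default branch

    always @(*) begin
        case (pc)
            8'd0:    ir = {RAM_WR, 8'h9f, 8'h00};   // TX buffer[0] = 0x9f (arbitrary value)
            8'd1:    ir = {RAM_WR, 8'h04, 8'h46};   // operation length = 4 bytes
            8'd2:    ir = {RAM_WR, 8'hfe, 8'h47};   // chip-select pattern: slave 0
            8'd3:    ir = {RAM_WR, 8'h80, 8'h44};   // set the start bit
            default: ir = {NOP,    8'h00, 8'h00};   // idle for the rest of the program
        endcase
    end
endmodule
```

The trailing NOP entries matter in this sketch because the write enables raised by RAM_WR are only cleared when a later instruction falls into the default branch.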

The MCU code is as follows:

`timescale 1ns / 1ps

module mcu(
    input               clk,
    input               rst_n,

    input   [23:0]      ir,
    output  [7:0]       pc,

    output              wr_en,
    output  [7:0]       wr_data,
    output              rd_en,
    input   [7:0]       rd_data,

    output              en,
    output              we,
    output  [7:0]       din,
    input   [7:0]       dout,
    output  [7:0]       addr
    );

    localparam INITIAL  = 8'b1111_0000;
    localparam FIFO_WR  = 8'b0000_0001;
    localparam FIFO_RD  = 8'b0000_0010;
    localparam RAM_WR   = 8'b0000_1001;
    localparam RAM_RD   = 8'b0000_1010;
    localparam JUMP     = 8'b1000_0000;

    reg                 init;
    reg                 run;

    // [23:16]:opcode [15:8]:data [7:0]:addr
    reg     [23:0]      r_instr;
    reg     [7:0]       r_pcntr;

    reg     [7:0]       op_code;
    reg     [7:0]       op_data;
    reg     [7:0]       op_addr;

    reg                 fifo_we;
    reg     [7:0]       fifo_wd;
    reg                 fifo_re;
    reg     [7:0]       fifo_rd;
    reg                 ram_en;
    reg                 ram_we;
    reg     [7:0]       ram_di;
    reg     [7:0]       ram_ad;

    always @(posedge clk) begin
        if(!rst_n) begin
            init <= 1'b1;
            run <= 1'b0;
        end else if(&r_pcntr) begin
            init <= 1'b0;
            run <= 1'b0;
        end else if(init) begin
            init <= 1'b0;
            run <= 1'b1;
        end
    end

    // fetch
    always @(posedge clk) begin
        if(init) begin
            r_instr <= 0;
            r_pcntr <= 0;
        end else if(run) begin
            r_instr <= ir;
            r_pcntr <= ir[23:16] == JUMP ? ir[7:0] : r_pcntr + 1;
        end
    end

    // decode
    always @(posedge clk) begin
        if(init) begin
            op_code <= 0;
            op_data <= 0;
            op_addr <= 0;
        end else if(run) begin
            op_code <= r_instr[23:16];
            op_data <= r_instr[15:8];
            op_addr <= r_instr[7:0];
        end
    end

    // execute
    always @(posedge clk) begin
        if(init) begin
            fifo_we <= 0;
            fifo_wd <= 0;
            fifo_re <= 0;
            fifo_rd <= 0;
            ram_en <= 0;
            ram_we <= 0;
            ram_di <= 0;
            ram_ad <= 0;
        end else if(run) begin
            case (op_code)
                INITIAL:begin
                    fifo_we <= 0;
                    fifo_wd <= 0;
                    fifo_re <= 0;
                    fifo_rd <= 0;
                    ram_en <= 0;
                    ram_we <= 0;
                    ram_di <= 0;
                    ram_ad <= 0;
                end
                
                FIFO_WR:begin
                    fifo_we <= 1'b1;
                    fifo_wd <= op_data;
                end

                FIFO_RD:begin
                    fifo_re <= 1'b1;
                end

                RAM_WR:begin
                    ram_en <= 1'b1;
                    ram_we <= 1'b1;
                    ram_di <= op_data;
                    ram_ad <= op_addr;
                end

                RAM_RD:begin
                    ram_en <= 1'b1;
                    ram_ad <= op_addr;
                end

                default:begin
                    fifo_we <= 1'b0;
                    fifo_re <= 1'b0;
                    ram_en <= 1'b0;
                    ram_we <= 1'b0;
                end
            endcase
        end
    end

    assign pc = r_pcntr;
    assign wr_en = fifo_we;
    assign wr_data = fifo_wd;
    assign rd_en = fifo_re;
    assign en = ram_en;
    assign we = ram_we;
    assign din = ram_di;
    assign addr = ram_ad;

endmodule

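        This is an extremely simple three-stage pipeline (fetch, decode, execute). It can only perform RAM and FIFO reads and writes, which is enough for controlling registers.

        Vivado has a handy feature: it can synthesize initial blocks to initialize registers. We can use this to preload the MCU's instructions by writing a simple ROM and initializing the contents of each address: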

`timescale 1ns / 1ps

module irom(
    input               clk,
    input               en,
    input   [7:0]       addr,
    output  [23:0]      dout
    );

    reg     [23:0]      r_dout;
    (* ROM_STYLE = "distributed" *)
    reg     [23:0]      rom     [0:255];

    assign dout = r_dout;

    always @(posedge clk) begin
        if(en)
            r_dout <= rom[addr];
    end

    localparam INITIAL  = 8'b1111_0000;
    localparam FIFO_WR  = 8'b0000_0001;
    localparam FIFO_RD  = 8'b0000_0010;
    localparam RAM_WR   = 8'b0000_1001;
    localparam RAM_RD   = 8'b0000_1010;
    localparam JUMP     = 8'b1000_0000;

    localparam OPCODE   = 8'b0000_0000;
    localparam OPDATA   = 8'b0000_0000;
    localparam OPADDR   = 8'b0000_0000;
    
    integer i;

    initial begin
        // Fill the whole ROM with no-ops first, then overwrite the program entries
        for(i = 0; i < 256; i = i + 1)
            rom[i] = {OPCODE,OPDATA,OPADDR};

        rom[0]  = {INITIAL,OPDATA,OPADDR};

        // Data bytes: 85 90 4a 5c, then four ff
        rom[2]  = {RAM_WR,8'h85,8'h00};
        rom[3]  = {RAM_WR,8'h90,8'h01};
        rom[4]  = {RAM_WR,8'h4a,8'h02};
        rom[5]  = {RAM_WR,8'h5c,8'h03};
        rom[6]  = {RAM_WR,8'hff,8'h04};
        rom[7]  = {RAM_WR,8'hff,8'h05};
        rom[8]  = {RAM_WR,8'hff,8'h06};
        rom[9]  = {RAM_WR,8'hff,8'h07};

        // Control register writes, separated by no-ops
        rom[11] = {RAM_WR,8'h08,8'h46};
        rom[13] = {RAM_WR,8'b11111110,8'h47};
        rom[15] = {RAM_WR,8'h00,8'h40};
        rom[17] = {RAM_WR,8'h00,8'h41};
        rom[19] = {RAM_WR,8'h80,8'h44};
    end

endmodule

        This process feels a bit like writing software, except in an instruction set of our own.
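
        If the program grows, keeping it in a separate text file and loading it with $readmemh may be more convenient than editing the ROM source. The following is a minimal sketch of that alternative, not the version used in this article: the module name irom_file, the parameter INIT_FILE, and the file name irom_init.mem are all hypothetical. The file is assumed to hold one 24-bit hex word per line in {opcode, data, addr} order.

```verilog
`timescale 1ns / 1ps

// Hypothetical variant of irom that loads its contents from a hex file.
module irom_file #(
    parameter INIT_FILE = "irom_init.mem"   // assumed file name, one hex word per line
)(
    input               clk,
    input               en,
    input   [7:0]       addr,
    output  [23:0]      dout
    );

    reg     [23:0]      r_dout;
    (* ROM_STYLE = "distributed" *)
    reg     [23:0]      rom     [0:255];

    assign dout = r_dout;

    always @(posedge clk) begin
        if(en)
            r_dout <= rom[addr];
    end

    // Example file lines: "F00000" is {INITIAL, 8'h00, 8'h00},
    // "098500" is {RAM_WR, 8'h85, 8'h00}, "000000" is a no-op.
    initial begin
        $readmemh(INIT_FILE, rom);
    end

endmodule
```

        Addresses not covered by the file stay uninitialized in simulation, so the file should list all 256 words, with 000000 for the unused ones.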

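        The expected result is that the master writes the four bytes 85 90 4a 5c to the slave, then writes four ff bytes while simultaneously reading back the four bytes 53 8b 9c ea. The simulation waveforms are shown below:

        First look at the MCU after power-on reset: it fetches instructions starting from address 0, executes them in order, and reads and writes the registers at the corresponding addresses. Once the registers start the SPI transfer, the SPI waveform is as follows:

        The result matches expectations.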
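
        To reproduce the first part of this simulation, the MCU stepping through the ROM, a small testbench along the following lines should be enough. It is only a sketch: the testbench name tb_mcu_irom, the 20 ns clock period, and the run length are my own choices, and it assumes the mcu and irom modules listed above are compiled alongside it. The SPI master and slave are left out, so it only prints the register writes that the MCU issues.

```verilog
`timescale 1ns / 1ps

// Hypothetical testbench: runs the mcu against the irom and logs register writes.
module tb_mcu_irom;

    reg                 clk   = 1'b0;
    reg                 rst_n = 1'b0;

    wire    [23:0]      ir;
    wire    [7:0]       pc;
    wire                en;
    wire                we;
    wire    [7:0]       din;
    wire    [7:0]       addr;

    // 50 MHz clock (assumed; any frequency works for this check)
    always #10 clk = ~clk;

    irom irom_inst(
        .clk(clk),
        .en(1'b1),
        .addr(pc),
        .dout(ir));

    // FIFO ports are unused here; read-data inputs are tied off
    mcu mcu_inst(
        .clk(clk),
        .rst_n(rst_n),
        .ir(ir),
        .pc(pc),
        .wr_en(),
        .wr_data(),
        .rd_en(),
        .rd_data(8'h00),
        .en(en),
        .we(we),
        .din(din),
        .dout(8'h00),
        .addr(addr));

    // Log every cycle in which a register write is asserted
    always @(posedge clk) begin
        if(en && we)
            $display("%0t: write 0x%02h to register 0x%02h", $time, din, addr);
    end

    initial begin
        // Hold reset for a few cycles, release it away from the active clock edge
        repeat(4) @(posedge clk);
        @(negedge clk);
        rst_n = 1'b1;

        // Long enough for the program counter to run through all 256 addresses
        repeat(300) @(posedge clk);
        $finish;
    end

endmodule
```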

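6. Board-Level Verification of the SPI Master

        The last step is on-board verification. This test uses the SPI master to read the ID of a NOR flash, a W25Q64. According to the datasheet, the read-ID instruction is 0x90, followed by a 24-bit address of 0xXXXX00, after which the chip returns the manufacturer ID and the device ID:

        The datasheet gives the manufacturer ID and device ID as 0xEF and 0x16, respectively.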
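
        Before going to the board, the same exchange could be tried in simulation against a small behavioral stand-in for the flash. The sketch below is a simplified, hypothetical model: the module name w25q_id_model and all of its internals are mine, not taken from the datasheet or from the design in this article. It assumes mode 0 or mode 3, handles only the 0x90 instruction, ignores the address value (that is, it behaves as if the address were 0x000000), and returns 0xEF followed by 0x16, MSB first, repeating the pair for as long as clocks continue.

```verilog
`timescale 1ns / 1ps

// Hypothetical, simulation-only stand-in for the flash's read-ID response.
// Assumes SPI mode 0 or 3 and an address of 0x000000.
module w25q_id_model(
    input               sck,
    input               cs_n,
    input               mosi,
    output              miso
    );

    reg     [5:0]       bit_cnt;    // rising edges seen, saturates at 32
    reg     [7:0]       cmd;        // first byte received (the instruction)
    reg     [15:0]      id_shift;   // {manufacturer ID, device ID}
    reg                 miso_r;

    // Sample MOSI on the rising edge; chip select high resets the model
    always @(posedge sck or posedge cs_n) begin
        if(cs_n) begin
            bit_cnt <= 6'd0;
            cmd     <= 8'h00;
        end else begin
            if(bit_cnt < 6'd8)
                cmd <= {cmd[6:0], mosi};
            if(bit_cnt < 6'd32)
                bit_cnt <= bit_cnt + 6'd1;
        end
    end

    // Drive MISO on the falling edge once the instruction and
    // the 24-bit address (32 bits in total) have been received
    always @(negedge sck or posedge cs_n) begin
        if(cs_n) begin
            id_shift <= 16'hEF16;
            miso_r   <= 1'b0;
        end else if(bit_cnt == 6'd32 && cmd == 8'h90) begin
            miso_r   <= id_shift[15];
            id_shift <= {id_shift[14:0], id_shift[15]};
        end
    end

    // Release the line while the chip is not selected
    assign miso = cs_n ? 1'bz : miso_r;

endmodule
```

        Connecting this model to the master's sck, mosi, miso, and chip-select pins in a testbench should let the read-ID program be checked before generating a bitstream. It says nothing about the real device's timing limits.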

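        In irom, change the instructions so that 90 00 00 00 is written, leaving everything else unchanged. Then instantiate irom, mcu, and the SPI master (m_spi_top in the code below) in the top level, and run synthesis and implementation to generate the bitstream.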

`timescale 1ns / 1ps

module top(
    input               sys_clk,
    input               sys_rst_n,

    output              led,

    output              sck,
    output              mosi,
    input               miso,
    output              ss
    );

    wire                clk_50M;

    wire    [23:0]      ir;
    wire    [7:0]       pc;

    wire                en;
    wire                we;
    wire    [7:0]       din;
    wire    [7:0]       dout;
    wire    [7:0]       addr;

    wire    [7:0]       cs_n;

    assign ss = cs_n[0];
    assign led = 1'b1;

    BUFG BUFG_inst(
        .I(sys_clk),
        .O(clk_50M)
    );

    irom irom_inst(
        .clk(clk_50M),
        .en(1'b1),
        .addr(pc),
        .dout(ir));

    mcu mcu_inst(
        .clk(clk_50M),
        .rst_n(sys_rst_n),

        .ir(ir),
        .pc(pc),

        .en(en),
        .we(we),
        .din(din),
        .dout(dout),
        .addr(addr));

    m_spi_top m_spi_top_inst(
        .clk(clk_50M),
        .rst_n(sys_rst_n),

        .en(en),
        .we(we),
        .din(din),
        .dout(dout),
        .addr(addr),

        .sck(sck),
        .mosi(mosi),
        .miso(miso),
        .cs_n(cs_n));

    ila_spi ila_inst(
        .clk(clk_50M),
        .probe0({sck,ss,mosi,miso})
    );

endmodule

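        After downloading the bitstream to the FPGA, the captured waveform is as follows:

        As shown, the ID is read out correctly. Because the operation length is 8, the two ID bytes are read out a second time.

        That finally wraps up the three common low-speed interfaces. Next I plan to cover video interfaces, including VGA timing, BT1120 timing, DVI, HDMI, SDI, MIPI CSI and DSI, and so on. Stay tuned!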
