This article covers the rocket-chip dcache flush and dcache discard features.
For a description of both features, see the document "SiFive E76 Core Complex Manual 21G1.01.00".
CFLUSH.D.L1
- Implemented as state machine in L1 data cache, for cores with data caches.
- Only available in M-mode.
- When rs1 = x0, CFLUSH.D.L1 writes back and invalidates all lines in the L1 data cache.
- When rs1 ≠ x0, CFLUSH.D.L1 writes back and invalidates the L1 data cache line containing the virtual address in integer register rs1.
- If the effective privilege mode does not have write permissions to the address in rs1, then a store access or store page-fault exception is raised.
- If the address in rs1 is in an uncacheable region with write permissions, the instruction has no effect but raises no exceptions.
- Note that if the PMP scheme write-protects only part of a cache line, then using a value for rs1 in the write-protected region will cause an exception, whereas using a value for rs1 in the write-permitted region will write back the entire cache line.
CDISCARD.D.L1
- Implemented as state machine in L1 data cache, for cores with data caches.
- Only available in M-mode.
- Opcode 0xFC200073: with optional rs1 field in bits [19:15].
- When rs1 = x0, CDISCARD.D.L1 invalidates, but does not write back, all lines in the L1 data cache. Dirty data within the cache is lost.
- When rs1 ≠ x0, CDISCARD.D.L1 invalidates, but does not write back, the L1 data cache line containing the virtual address in integer register rs1. Dirty data within the cache line is lost.
- If the effective privilege mode does not have write permissions to the address in rs1, then a store access or store page-fault exception is raised.
- If the address in rs1 is in an uncacheable region with write permissions, the instruction has no effect but raises no exceptions.
- Note that if the PMP scheme write-protects only part of a cache line, then using a value for rs1 in the write-protected region will cause an exception, whereas using a value for rs1 in the write-permitted region will invalidate and discard the entire cache line.
Contents of the L1Dcache.h header file.
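#ifndef L1DCACHE_H
#define L1DCACHE_H
#include <stdint.h>
// Two-level stringification: STR() expands its argument before quoting it
#define STR1(x) #x
#ifndef STR
#define STR(x) STR1(x)
#endif
// The encoding macros below are only stringified into a .word directive
// and evaluated by the assembler; rs1 is a register number, placed in bits [19:15]
#define CFLUSH_D_L1_REG(rs1) \
    0xFC000073 | \
    ((rs1) << (7+5+3))
#define CFLUSH_D_L1_ALL() \
    0xFC000073
// Flush (write back and invalidate) the whole L1 data cache: rs1 = x0
#define FLUSH_D_ALL() \
    do { \
        asm volatile (".word " STR(CFLUSH_D_L1_ALL()) "\n\t" ::: "memory"); \
    } while (0)
// rs1 is the address inside the cache line to flush
// rs1_n is the number of the x register used to pass it
// uintptr_t keeps the full address on both RV32 and RV64
#define CFLUSH_D_L1_INST(rs1, rs1_n) \
    do { \
        register uintptr_t rs1_ asm("x" # rs1_n) = (uintptr_t) (rs1); \
        asm volatile (".word " STR(CFLUSH_D_L1_REG(rs1_n)) "\n\t" :: [_rs1] "r" (rs1_) : "memory"); \
    } while (0)
// Standard macro that passes rs1 via register x13
#define FLUSH_D_REG(rs1) CFLUSH_D_L1_INST(rs1,13)
#define CDISCARD_D_L1_REG(rs1) \
    0xFC200073 | \
    ((rs1) << (7+5+3))
#define CDISCARD_D_L1_ALL() \
    0xFC200073
// Discard (invalidate without write-back) the whole L1 data cache: rs1 = x0
#define DISCARD_D_ALL() \
    do { \
        asm volatile (".word " STR(CDISCARD_D_L1_ALL()) "\n\t" ::: "memory"); \
    } while (0)
// rs1 is the address inside the cache line to discard
// rs1_n is the number of the x register used to pass it
#define CDISCARD_D_L1_INST(rs1, rs1_n) \
    do { \
        register uintptr_t rs1_ asm("x" # rs1_n) = (uintptr_t) (rs1); \
        asm volatile (".word " STR(CDISCARD_D_L1_REG(rs1_n)) "\n\t" :: [_rs1] "r" (rs1_) : "memory"); \
    } while (0)
// Standard macro that passes rs1 via register x13
#define DISCARD_D_REG(rs1) CDISCARD_D_L1_INST(rs1,13)
#endif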
Test code.
#include "encoding.h"
#include "L1Dcache.h"
#define U32 *(volatile unsigned int *)
#define DEBUG_SIG 0x70000000
#define DEBUG_VAL 0x70000004
#define DATA_SIZE 10
// The table holds 100 values; only the first DATA_SIZE entries are used
int input1_data[] =
{
41, 833, 564, 187, 749, 350, 132, 949, 584, 805, 621, 6, 931, 890, 392, 694, 961, 110, 116, 296,
426, 314, 659, 774, 319, 678, 875, 376, 474, 938, 539, 569, 203, 280, 759, 606, 511, 657, 195, 81,
267, 229, 337, 944, 902, 241, 913, 826, 933, 985, 195, 960, 566, 350, 649, 657, 181, 111, 859, 65,
288, 349, 141, 905, 886, 264, 576, 979, 761, 241, 478, 499, 403, 222, 444, 721, 676, 317, 224, 937,
288, 119, 615, 606, 389, 351, 455, 278, 367, 358, 584, 62, 985, 403, 346, 517, 559, 908, 775, 255
};
// The table holds 100 values; only the first DATA_SIZE entries are used
int input2_data[] =
{
454, 335, 1, 989, 365, 572, 64, 153, 216, 140, 210, 572, 339, 593, 898, 228, 12, 883, 750, 646,
500, 436, 701, 812, 981, 150, 696, 564, 272, 258, 647, 509, 88, 703, 669, 375, 551, 936, 592, 569,
952, 800, 584, 643, 368, 489, 328, 313, 592, 388, 543, 649, 979, 997, 814, 79, 208, 998, 629, 847,
704, 997, 253, 715, 430, 415, 538, 700, 4, 494, 100, 864, 693, 416, 296, 285, 620, 78, 351, 540,
646, 169, 527, 289, 796, 801, 720, 758, 745, 92, 989, 271, 853, 788, 531, 222, 461, 241, 358, 332
};
//--------------------------------------------------------------------------
// handle_trap function
void handle_trap()
{
    asm volatile ("nop");
    while(1);
}
//--------------------------------------------------------------------------
// dcache flush & discard function
void dcache( int n, int a[], int b[])
{
    int i;
    // first pass: store a[i] + b[i]; the lines become dirty in the dcache
    for ( i = 0; i < n; i++ )
        U32(0x80001000+4*i) = a[i] + b[i];
    // flush all dcache: write dirty lines back to memory, then invalidate
    FLUSH_D_ALL();
    // second pass: store i + 10086; the new values exist only in the dcache
    for ( i = 0; i < n; i++ )
        U32(0x80001000+4*i) = i + 10086;
    // discard all dcache: invalidate without writing back
    DISCARD_D_ALL();
    // copy out: the reads miss in the dcache and fetch from memory
    for ( i = 0; i < n; i++ )
        U32(0x60001000+4*i) = U32(0x80001000+4*i);
}
//--------------------------------------------------------------------------
// Main
void main()
{
    dcache(DATA_SIZE, input1_data, input2_data);
    // signal the end of the test to the testbench
    U32(DEBUG_SIG) = 0xFF;
}
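Explanation of the code steps:
- The first loop stores a[i] + b[i] to memory 0x80001000+4*i. Each store misses in the dcache, so the line is first read in from memory (the first read) and then updated in the dcache.
- FLUSH_D_ALL() flushes the whole dcache, writing all of its data back to memory.
- The second loop stores i+10086 to memory 0x80001000+4*i. Because the flush also invalidated the lines, they are read in from memory again (the second read) before being updated in the dcache.
- DISCARD_D_ALL() discards all data in the dcache.
- The values at memory 0x80001000+4*i are copied to 0x60001000+4*i. The values at 0x80001000+4*i are read from memory again (the third read) rather than taken from the dcache.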
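I won't post the detailed simulation waveforms here, only one overview of the whole run.
The figure is annotated as follows; the white markers at the top of the waveform are time points.
- Red arrow: instructions are fetched from memory; the code starts at address 0x8000_0000.
- Blue arrow: the first read of memory 0x80001000+4*i.
- Yellow arrow: FLUSH_D_ALL() runs and writes all data in the dcache back to memory.
- Between the yellow and white arrows: the second read of memory 0x80001000+4*i happens here, but I forgot to mark it.
- White arrow: DISCARD_D_ALL() runs and discards all data in the dcache.
- Green arrow: the third read of 0x80001000+4*i from memory, whose values are written to 0x60001000+4*i.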
Notes on the memory values.
- First read of memory 0x80001000+4*i: the initial memory contents.
- Memory 0x80001000+4*i after the flush: a[i] + b[i].
- Second read of memory 0x80001000+4*i: a[i] + b[i].
- Memory 0x80001000+4*i after the discard: a[i] + b[i].
- Third read of memory 0x80001000+4*i: a[i] + b[i].
- Value written to 0x60001000+4*i: a[i] + b[i].
Only flushing and discarding the whole dcache is shown here; flushing or discarding a single dcache line can be tried in the same way, based on the contents of L1Dcache.h.