本文由RT-Thread论坛用户@blta原创发布:https://club.rt-thread.org/ask/article/248051628070d52e.html
在 RISC-V移植那些事中 文章中提到了对 RISC-V架构FPU移植部分的优化,最近找工作,这件事做的断断续续,终于完成了。
开发环境
硬件
这次选用了Nuclei和中国移动芯昇科技合作的CM32M433R-START的RISC-V生态开发板,该开发板今年刚出来,比较新,采用芯来科技N308内核(RV32IMACFP)符合我们的FPU测试需求,99元价格也很便宜,果断入手测试,顺便支持一下!
[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-jl6JrnLU-1655860145499)(https://oss-club.rt-thread.org/uploads/20220620/41ca1bc4c2cd198af3df26cbf4d87a80.png.webp “image-20220611084812507.png”)]
more info , refer to https://www.rvmcu.com/quickstart-show-id-13.html
软件
由于rt-thread和rt-thread studio 目前还不支持该开发板,先使用官方Nuclei Studio测试
CM32M4xxR-Support-Pack-v1.0.2-win32-x32.zip
新建工程
1)基于CM32M433R_START开发板新建工程
[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-kanxKDV4-1655860145499)(https://oss-club.rt-thread.org/uploads/20220620/d6c5eb3e46bb229776aba52218402777.png.webp “image-20220613095756782.png”)]
2)基于RT-Thread新建工程
[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-4DYj5yKa-1655860145500)(https://oss-club.rt-thread.org/uploads/20220620/b574e837f0b8024bf33fcb21a97cbb7a.png.webp “image-20220613100215099.png”)]
3)编译测试
[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-Nd15mGRG-1655860145501)(https://oss-club.rt-thread.org/uploads/20220620/e13e559558691e66d33cf4ba0415a985.png.webp “image-20220613100807716.png”)]
这次使用的CMlink-OpenOCD 速度很给力啊,很快就加载成功!
新建测试线程
由于官方的例程切换时暂未对FPU部分上下文处理,所以一旦有多个牵涉到浮点寄存器操作的线程,就可能导致FPU上下文不一致。
.align 2
.global eclic_msip_handler
eclic_msip_handler:
addi sp, sp, -RT_CONTEXT_SIZE
STORE x1, 1 * REGBYTES(sp) /* RA */
STORE x5, 2 * REGBYTES(sp)
STORE x6, 3 * REGBYTES(sp)
STORE x7, 4 * REGBYTES(sp)
STORE x8, 5 * REGBYTES(sp)
STORE x9, 6 * REGBYTES(sp)
STORE x10, 7 * REGBYTES(sp)
STORE x11, 8 * REGBYTES(sp)
STORE x12, 9 * REGBYTES(sp)
STORE x13, 10 * REGBYTES(sp)
STORE x14, 11 * REGBYTES(sp)
STORE x15, 12 * REGBYTES(sp)
#ifndef __riscv_32e
STORE x16, 13 * REGBYTES(sp)
STORE x17, 14 * REGBYTES(sp)
STORE x18, 15 * REGBYTES(sp)
STORE x19, 16 * REGBYTES(sp)
STORE x20, 17 * REGBYTES(sp)
STORE x21, 18 * REGBYTES(sp)
STORE x22, 19 * REGBYTES(sp)
STORE x23, 20 * REGBYTES(sp)
STORE x24, 21 * REGBYTES(sp)
STORE x25, 22 * REGBYTES(sp)
STORE x26, 23 * REGBYTES(sp)
STORE x27, 24 * REGBYTES(sp)
STORE x28, 25 * REGBYTES(sp)
STORE x29, 26 * REGBYTES(sp)
STORE x30, 27 * REGBYTES(sp)
STORE x31, 28 * REGBYTES(sp)
#endif
/* Push mstatus to stack */
csrr t0, CSR_MSTATUS
STORE t0, (RT_SAVED_REGNUM - 1) * REGBYTES(sp)
...
/* Restore Registers from Stack */
LOAD x1, 1 * REGBYTES(sp) /* RA */
LOAD x5, 2 * REGBYTES(sp)
LOAD x6, 3 * REGBYTES(sp)
LOAD x7, 4 * REGBYTES(sp)
LOAD x8, 5 * REGBYTES(sp)
LOAD x9, 6 * REGBYTES(sp)
LOAD x10, 7 * REGBYTES(sp)
LOAD x11, 8 * REGBYTES(sp)
LOAD x12, 9 * REGBYTES(sp)
LOAD x13, 10 * REGBYTES(sp)
LOAD x14, 11 * REGBYTES(sp)
LOAD x15, 12 * REGBYTES(sp)
#ifndef __riscv_32e
LOAD x16, 13 * REGBYTES(sp)
LOAD x17, 14 * REGBYTES(sp)
LOAD x18, 15 * REGBYTES(sp)
LOAD x19, 16 * REGBYTES(sp)
LOAD x20, 17 * REGBYTES(sp)
LOAD x21, 18 * REGBYTES(sp)
LOAD x22, 19 * REGBYTES(sp)
LOAD x23, 20 * REGBYTES(sp)
LOAD x24, 21 * REGBYTES(sp)
LOAD x25, 22 * REGBYTES(sp)
LOAD x26, 23 * REGBYTES(sp)
LOAD x27, 24 * REGBYTES(sp)
LOAD x28, 25 * REGBYTES(sp)
LOAD x29, 26 * REGBYTES(sp)
LOAD x30, 27 * REGBYTES(sp)
LOAD x31, 28 * REGBYTES(sp)
#endif
addi sp, sp, RT_CONTEXT_SIZE
mret
浮点线程1
thread1里加了一个稍复杂的浮点操作
[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-Rd3qPtDf-1655860145502)(https://oss-club.rt-thread.org/uploads/20220620/724f625649d04e77e92284514d459188.png.webp “image-20220613175241185.png”)]
浮点线程2
thread2简单些
[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-ZAaM7vLl-1655860145502)(https://oss-club.rt-thread.org/uploads/20220620/ad36e2f662fce56f68451497988c474c.png.webp “image-20220613174511111.png”)]
线程优先级
thread1和thread2有相同优先级,时间片轮询执行。
thread1 = rt_thread_create("thread1",thread1_entry, NULL, 1024, 11, 1);
if(thread1 != NULL)
{
rt_thread_startup(thread1);
}
else
{
printf("\r\n create thread1 failed\r\n");
}
thread2 = rt_thread_create("thread2",thread2_entry, NULL, 1024, 11, 1);
if(thread2 != NULL)
{
rt_thread_startup(thread2);
}
else
{
printf("\r\n create thread2 failed\r\n");
}
运行结果
[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-pbVpuaGM-1655860145503)(https://oss-club.rt-thread.org/uploads/20220620/6bf93670b928e9a323af4a8fe780964d.png “image-20220613174640185.png”)]
线程1在第一次切换后,res部分就出现了异常,稍后我们观察一下改进FPU后能否解决该问题
FPU静态保存
参考rtthread ch32v307 BSP, 使用FPU静态保存,即不判断Mstatus.FS域直接全保存FPU部分的寄存器
Stack frame
首先,修改stack_frame, 增加32个float registers f0~f31
struct rt_hw_stack_frame
{
rt_ubase_t epc; /*!< epc - epc - program counter */
rt_ubase_t ra; /*!< x1 - ra - return address for jumps */
rt_ubase_t t0; /*!< x5 - t0 - temporary register 0 */
rt_ubase_t t1; /*!< x6 - t1 - temporary register 1 */
rt_ubase_t t2; /*!< x7 - t2 - temporary register 2 */
rt_ubase_t s0_fp; /*!< x8 - s0/fp - saved register 0 or frame pointer */
rt_ubase_t s1; /*!< x9 - s1 - saved register 1 */
rt_ubase_t a0; /*!< x10 - a0 - return value or function argument 0 */
rt_ubase_t a1; /*!< x11 - a1 - return value or function argument 1 */
rt_ubase_t a2; /*!< x12 - a2 - function argument 2 */
rt_ubase_t a3; /*!< x13 - a3 - function argument 3 */
rt_ubase_t a4; /*!< x14 - a4 - function argument 4 */
rt_ubase_t a5; /*!< x15 - a5 - function argument 5 */
#ifndef __riscv_32e
rt_ubase_t a6; /*!< x16 - a6 - function argument 6 */
rt_ubase_t a7; /*!< x17 - s7 - function argument 7 */
rt_ubase_t s2; /*!< x18 - s2 - saved register 2 */
rt_ubase_t s3; /*!< x19 - s3 - saved register 3 */
rt_ubase_t s4; /*!< x20 - s4 - saved register 4 */
rt_ubase_t s5; /*!< x21 - s5 - saved register 5 */
rt_ubase_t s6; /*!< x22 - s6 - saved register 6 */
rt_ubase_t s7; /*!< x23 - s7 - saved register 7 */
rt_ubase_t s8; /*!< x24 - s8 - saved register 8 */
rt_ubase_t s9; /*!< x25 - s9 - saved register 9 */
rt_ubase_t s10; /*!< x26 - s10 - saved register 10 */
rt_ubase_t s11; /*!< x27 - s11 - saved register 11 */
rt_ubase_t t3; /*!< x28 - t3 - temporary register 3 */
rt_ubase_t t4; /*!< x29 - t4 - temporary register 4 */
rt_ubase_t t5; /*!< x30 - t5 - temporary register 5 */
rt_ubase_t t6; /*!< x31 - t6 - temporary register 6 */
#endif
rt_ubase_t mstatus; /*!< - machine status register */
/* float register */
#ifdef ARCH_RISCV_FPU
rv_floatreg_t f0; /* f0 */
rv_floatreg_t f1; /* f1 */
rv_floatreg_t f2; /* f2 */
rv_floatreg_t f3; /* f3 */
rv_floatreg_t f4; /* f4 */
rv_floatreg_t f5; /* f5 */
rv_floatreg_t f6; /* f6 */
rv_floatreg_t f7; /* f7 */
rv_floatreg_t f8; /* f8 */
rv_floatreg_t f9; /* f9 */
rv_floatreg_t f10; /* f10 */
rv_floatreg_t f11; /* f11 */
rv_floatreg_t f12; /* f12 */
rv_floatreg_t f13; /* f13 */
rv_floatreg_t f14; /* f14 */
rv_floatreg_t f15; /* f15 */
rv_floatreg_t f16; /* f16 */
rv_