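During development we kept running into all kinds of memory corruption and wild-pointer crashes. Ingenic's OS is a modified uC/OS, and unlike Linux it offers few debugging facilities, so every such crash was painful to track down. Once the project quieted down, I studied the problem carefully and finally found a way to deal with it.
First, I constructed a crash scenario in code: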
static void beep_task(void *arg)
{
    unsigned char err;
    unsigned char *sp = NULL;

    while (1) {
        os_SemaphorePend(beep_sem_sync, 0, &err);
        printf("%d\n", beep_cnt++);
        AppIndication(g_decResult.indicate_type);
        *sp = 1; // write through a NULL pointer to trigger the crash
        os_SemaphoreSet(beep_sem_sync, 0, &err);
    }
}
Then boot the device to trigger the crash and capture the serial log:
hid_keyboard_tx Send Failed
hid_keyboard_tx Send Failed
-----------------------------------------------------
8001f0c8: 92a40420
8001f0cc: 8e442bd8
8001f0d0: 00002821
8001f0d4: 02203021
8001f0d8: 0c0140e2
8001f0dc: a0130000
8001f0e0: 08007c28
8001f0e4: 8e442bd8
-----------------------------------------------------
CAUSE=80800c0c --> TLB Store
EPC=8001f0d8
$0 zero 00000000 $1 at 00000000 $2 v0 00000001 $3 v1 02000000
$4 a0 802eebac $5 a1 00000000 $6 a2 8015f3e8 $7 a3 80070000
$8 t0 00000010 $9 t1 b000204c $10 t2 b0002040 $11 t3 b0002048
$12 t4 00010001 $13 t5 b0002034 $14 t6 00000032 $15 t7 00000000
$16 s0 80160000 $17 s1 8015f3e8 $18 s2 80070000 $19 s3 00000001
$20 s4 80060000 $21 s5 802ead24 $22 s6 00000000 $23 s7 00000000
$24 t8 00000000 $25 t9 00000000 $26 k0 8015f3d8 $27 k1 00000000
$28 gp 8007ab70 $29 sp 8015f3d8 $30 fp 00000000 $31 ra 8001f0e0
g_pExcept[0].sp = 8015a2f0 sp = 8015f3d8
Restarting after 4 ms
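OK, note the PC values in the log above: the dumped instruction window starts at 8001f0c8, and EPC is 8001f0d8. With these addresses in hand, we can go and analyze the dump file.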
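The CAUSE and EPC values in the log can also be decoded by hand. Below is a minimal sketch, assuming the standard MIPS32 Cause register layout (BD in bit 31, ExcCode in bits 6..2). The function and macro names are illustrative and are not part of the Ingenic SDK:

```c
#include <stdint.h>
#include <stdio.h>

/* Fields of the MIPS32 CP0 Cause register used here (assumed standard layout). */
#define CAUSE_BD          (1u << 31)            /* exception hit a branch delay slot */
#define CAUSE_EXCCODE(c)  (((c) >> 2) & 0x1fu)  /* exception code, bits 6..2 */

/* Names for a few common ExcCode values; anything else prints as "other". */
const char *exccode_name(uint32_t code)
{
    switch (code) {
    case 0:  return "Interrupt";
    case 1:  return "TLB Modified";
    case 2:  return "TLB Load";
    case 3:  return "TLB Store";
    case 4:  return "Address Error (load/fetch)";
    case 5:  return "Address Error (store)";
    case 8:  return "Syscall";
    case 9:  return "Breakpoint";
    case 10: return "Reserved Instruction";
    default: return "other";
    }
}

/* Address of the instruction that actually faulted.
 * When BD is set, EPC points at the branch and the fault is in its delay slot. */
uint32_t faulting_pc(uint32_t cause, uint32_t epc)
{
    return (cause & CAUSE_BD) ? epc + 4u : epc;
}

/* Hypothetical helper: print a one-line summary of a CAUSE/EPC pair. */
void print_fault(uint32_t cause, uint32_t epc)
{
    uint32_t code = CAUSE_EXCCODE(cause);

    printf("ExcCode=%u (%s), BD=%d, faulting PC=%08x\n",
           (unsigned)code, exccode_name(code),
           (cause & CAUSE_BD) ? 1 : 0,
           (unsigned)faulting_pc(cause, epc));
}
```

For the log above, CAUSE=80800c0c decodes to ExcCode 3 (TLB Store) with BD set, so `faulting_pc(0x80800c0c, 0x8001f0d8)` returns 0x8001f0dc, one instruction past EPC.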
The dump file is located at:
mini_os\x1500-minios\yak\target\minios.dump
Open it and search for address 8001f0c8 (the lines of interest are marked with //):
//8001f060 <beep_task>:
8001f060: 27bdffc8 addiu sp,sp,-56
8001f064: 3c02802f lui v0,0x802f
8001f068: afb5002c sw s5,44(sp)
8001f06c: afb40028 sw s4,40(sp)
8001f070: afb30024 sw s3,36(sp)
8001f074: afb20020 sw s2,32(sp)
8001f078: afb1001c sw s1,28(sp)
8001f07c: afb00018 sw s0,24(sp)
8001f080: afbf0030 sw ra,48(sp)
8001f084: 2455ad24 addiu s5,v0,-21212
8001f088: 3c128007 lui s2,0x8007
8001f08c: 27b10010 addiu s1,sp,16
8001f090: 3c108016 lui s0,0x8016
8001f094: 3c148006 lui s4,0x8006
8001f098: 24130001 li s3,1
8001f09c: 8e442bd8 lw a0,11224(s2)
8001f0a0: 02203021 move a2,s1
8001f0a4: 0c014a3e jal 800528f8 <os_SemaphorePend>
8001f0a8: 00002821 move a1,zero
8001f0ac: 8e029414 lw v0,-27628(s0)
8001f0b0: 26843f08 addiu a0,s4,16136
8001f0b4: 00402821 move a1,v0
8001f0b8: 24420001 addiu v0,v0,1
8001f0bc: 0c007a85 jal 8001ea14 <printf>
8001f0c0: ae029414 sw v0,-27628(s0)
8001f0c4: 0c0095ef jal 800257bc <AppIndication>
//8001f0c8: 92a40420 lbu a0,1056(s5)
8001f0cc: 8e442bd8 lw a0,11224(s2)
8001f0d0: 00002821 move a1,zero
8001f0d4: 02203021 move a2,s1
8001f0d8: 0c0140e2 jal 80050388 <os_SemaphoreSet>
8001f0dc: a0130000 sb s3,0(zero)
8001f0e0: 08007c28 j 8001f0a0 <beep_task+0x40>
8001f0e4: 8e442bd8 lw a0,11224(s2)
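OK, the dump file tells us clearly that the crash happened in the beep_task function, right after the call to AppIndication. More precisely, EPC (8001f0d8) is the jal to os_SemaphoreSet, and because bit 31 (BD) of CAUSE is set, the faulting instruction is the one in its branch delay slot at 8001f0dc: sb s3,0(zero), a byte store to address 0. That is exactly the *sp = 1 in the source, which matches the "TLB Store" cause reported in the log.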
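To double-check a raw instruction word from the log without the disassembly, the load/store fields can be extracted directly. This is a minimal sketch, assuming the standard MIPS I-type encoding (opcode in bits 31..26, base register in 25..21, target register in 20..16, signed 16-bit offset in 15..0). The struct and function names are made up for illustration:

```c
#include <stdint.h>

/* Decoded fields of a MIPS I-type instruction: op rt, imm(rs). */
struct itype {
    unsigned opcode; /* bits 31..26 (0x28 = sb, 0x23 = lw) */
    unsigned rs;     /* bits 25..21, base register number */
    unsigned rt;     /* bits 20..16, source/target register number */
    int16_t  imm;    /* bits 15..0, sign-extended offset */
};

/* Split a 32-bit instruction word into its I-type fields. */
struct itype decode_itype(uint32_t word)
{
    struct itype f;

    f.opcode = (word >> 26) & 0x3fu;
    f.rs     = (word >> 21) & 0x1fu;
    f.rt     = (word >> 16) & 0x1fu;
    f.imm    = (int16_t)(word & 0xffffu);
    return f;
}

/* Effective address of a load/store, given the value held in the base register. */
uint32_t effective_address(struct itype f, uint32_t base_value)
{
    return base_value + (uint32_t)(int32_t)f.imm;
}
```

Applied to the word a0130000 at 8001f0dc, this gives opcode 0x28 (sb), rs = 0 ($zero), rt = 19 (s3) and offset 0, so the store targets address 0. The same decoding on 8e442bd8 with s2 = 80070000 from the register dump gives the address 80072bd8.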