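On a recent project, reboot stress testing would, after running for a while, intermittently hang the device at the boot screen. I captured a log from one of these hangs and traced it to a driver I had written myself, specifically to the handler used as a kernel timer callback.
The code is as follows: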
static struct timer_list set_sp_work_timer;
static int set_sp_gpio_val;

/* Timer callback: runs in softirq (atomic) context, so it must not sleep. */
static void timer_func(unsigned long data)
{
	gpio_notice_pdata *gpio_pdata = (gpio_notice_pdata *)data;

	//printk("%s, set:%d\n", __func__, set_sp_gpio_val);
	/* Original call that triggered the BUG; it can sleep on a mutex: */
	//gpio_direction_output(gpio_pdata->ap_wake_sp_gpio, set_sp_gpio_val);
	gpio_set_value(gpio_pdata->ap_wake_sp_gpio, set_sp_gpio_val);
	set_sp_gpio_val = !set_sp_gpio_val;
	/* Re-arm the timer for the next toggle. */
	mod_timer(&set_sp_work_timer, jiffies + msecs_to_jiffies(SET_SP_WORK_TIMER_PERIOD));
}

/* Timer setup, excerpted from the driver's initialization code. */
set_sp_gpio_val = GPIO_SP_ACTIVE_LEVEL;
init_timer(&set_sp_work_timer);
set_sp_work_timer.expires = jiffies + msecs_to_jiffies(SET_SP_WORK_TIMER_PERIOD);
set_sp_work_timer.data = (unsigned long)gpio_pdata;	/* .data is an unsigned long */
set_sp_work_timer.function = timer_func;
add_timer(&set_sp_work_timer);
The captured log is shown below:
[ 10.644782] BUG: scheduling while atomic: init/1/0x00000102
[ 10.651140] Modules linked in: im30_gpio_control(+) mdb_device ltr553 sw_device sitronix_ts gpio_notice sunxi_schw(O) mali(O) sunxi_tr hdmi disp bcm_btlpm bcmdhd
[ 10.666727] [DISP] disp_ioctl,line:2051:para err in disp_ioctl, cmd = 0xb,screen id = 0
[ 10.676295] smsc95xx smsc75xx cdc_ether mcs7830 qf9700 asix usbnet
[ 10.683553] CPU: 0 PID: 1 Comm: init Tainted: G O 3.10.65 #1
[ 10.690925] Call trace:
[ 10.693640] [<ffffffc00008899c>] dump_backtrace+0x0/0x11c
[ 10.699603] [<ffffffc000088ad8>] show_stack+0x20/0x30
[ 10.705406] [<ffffffc00073277c>] dump_stack+0x1c/0x28
[ 10.711130] [<ffffffc0000cabc0>] __schedule_bug+0x4c/0x68
[ 10.717078] [<ffffffc000736b74>] __schedule+0x98/0x6e0
[ 10.722946] [<ffffffc000737224>] schedule+0x68/0x74
[ 10.728332] [<ffffffc000737604>] schedule_preempt_disabled+0x18/0x2c
[ 10.735533] [<ffffffc000735e44>] __mutex_lock_slowpath+0x198/0x21c
[ 10.742466] [<ffffffc000735efc>] mutex_lock+0x34/0x54
[ 10.748043] [<ffffffc0003498a0>] pinctrl_get_device_gpio_range+0x48/0xc8
[ 10.755663] [<ffffffc000349aec>] pinctrl_gpio_direction+0x40/0xa4
[ 10.762509] [<ffffffc000349ba0>] pinctrl_gpio_direction_output+0x20/0x30
[ 10.770024] [<ffffffc00034f4bc>] sunxi_pinctrl_gpio_direction_output+0x3c/0x50
[ 10.778000] [<ffffffc000353b80>] gpiod_direction_output+0xdc/0x23c
[ 10.785048] [<ffffffc000353d08>] gpio_direction_output+0x28/0x38
[ 10.791778] [<ffffffbffc2e2328>] timer_func+0x2c/0x6c [gpio_notice]
[ 10.798692] [<ffffffc0000a98ac>] call_timer_fn+0x90/0x164
[ 10.804855] [<ffffffc0000a9e18>] run_timer_softirq+0x228/0x260
[ 10.811407] [<ffffffc0000a2110>] __do_softirq+0x164/0x29c
[ 10.817350] [<ffffffc0000a22ec>] do_softirq+0x48/0x5c
[ 10.823108] [<ffffffc0000a254c>] irq_exit+0x78/0xb8
[ 10.828494] [<ffffffc0000848e8>] handle_IRQ+0x8c/0xac
[ 10.834241] [<ffffffc000081410>] gic_handle_irq+0x58/0x88
[ 10.840284] Exception stack(0xffffffc02a87f870 to 0xffffffc02a87f990)
[ 10.847385] f860: 000000c4 00000000 00000053 00000000
[ 10.856656] f880: 2a87f9b0 ffffffc0 00320020 ffffffc0 2a87fb00 ffffffc0 009a7aeb ffffffc0
[ 10.865828] f8a0: 00000000 00000000 2abf7580 ffffffc0 2a448d80 ffffffc0 00000006 00000000
[ 10.874977] f8c0: 00000000 00000000 00000000 00000000 00000001 00000000 00000007 00000000
[ 10.884145] f8e0: 01010101 01010101 00000006 00000000 3d6f6970 2c363533 2c363c20 2d2c312d
[ 10.893355] f900: 3e302c31 3d746572 007ba000 ffffffc0 00000007 00000000 0000000e 00000000
[ 10.902513] f920: 00000005 00000000 000000c4 00000000 00000053 00000000 2abe6480 ffffffc0
[ 10.911686] f940: 2a87fb00 ffffffc0 00000018 00000000 2abe6498 ffffffc0 ffffffff 00000000
[ 10.920835] f960: ffffffff 00000000 fc310000 ffffffbf fc310750 ffffffbf 2a87f9b0 ffffffc0
[ 10.929838] f980: 0034b3d0 ffffffc0 2a87f9b0 ffffffc0
[ 10.935651] [<ffffffc000083dbc>] el1_irq+0x7c/0xe4
[ 10.941030] [<ffffffc00034cb48>] pin_config_set+0x50/0xdc
[ 10.946990] [<ffffffbffc3138f0>] $x+0x8f0/0xb00 [im30_gpio_control]
[ 10.954112] [<ffffffc000394638>] platform_drv_probe+0x24/0x34
[ 10.960564] [<ffffffc000393054>] driver_probe_device+0xd8/0x218
[ 10.967075] [<ffffffc000393258>] __driver_attach+0x70/0xa0
[ 10.973323] [<ffffffc00039128c>] bus_for_each_dev+0x6c/0x94
[ 10.979458] [<ffffffc000392adc>] driver_attach+0x2c/0x3c
[ 10.985512] [<ffffffc000392660>] bus_add_driver+0x100/0x21c
[ 10.991760] [<ffffffc000393908>] driver_register+0xc0/0x140
[ 10.997903] [<ffffffc000394c78>] platform_driver_register+0x68/0x78
[ 11.005020] [<ffffffbffc313b10>] gpios_init+0x10/0x1c [im30_gpio_control]
[ 11.012621] [<ffffffc0000815d4>] do_one_initcall+0x88/0x128
[ 11.018754] [<ffffffc0000f97a8>] load_module+0x1644/0x17fc
[ 11.025011] [<ffffffc0000f9abc>] SyS_finit_module+0x6c/0x84
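A log like the one above generally means that code running in atomic context, such as an interrupt handler or a timer callback, called something that can sleep. A function sleeps when it blocks, and many calls can block: mutex_lock(), kzalloc() with GFP_KERNEL, msleep(), and so on.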
BUG: scheduling while atomic: init/1/0x00000102
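The trailing 0x00000102 on this line is the preempt count of the task that tried to schedule; the message is printed as task name / PID / preempt count. As a rough illustration, the value can be split into its fields. The sketch below is ordinary userspace C, not driver code, and it assumes the field widths used by 3.10-era kernels (8 preempt bits, 8 softirq bits, 10 hardirq bits); the struct and function names are made up for this example.

```c
#include <stdio.h>

/* Field layout assumed from 3.10-era kernels; other versions may differ. */
#define PREEMPT_BITS   8
#define SOFTIRQ_BITS   8
#define HARDIRQ_BITS   10

#define PREEMPT_SHIFT  0
#define SOFTIRQ_SHIFT  (PREEMPT_SHIFT + PREEMPT_BITS)
#define HARDIRQ_SHIFT  (SOFTIRQ_SHIFT + SOFTIRQ_BITS)

/* Hypothetical helper type: one member per preempt-count field. */
struct preempt_fields {
	unsigned int preempt;  /* preempt_disable() nesting depth */
	unsigned int softirq;  /* softirq field (bit 0: serving a softirq) */
	unsigned int hardirq;  /* hardirq nesting depth */
};

/* Split a raw preempt count, as printed in the BUG line, into its fields. */
static struct preempt_fields decode_preempt_count(unsigned int count)
{
	struct preempt_fields f;

	f.preempt = (count >> PREEMPT_SHIFT) & ((1u << PREEMPT_BITS) - 1);
	f.softirq = (count >> SOFTIRQ_SHIFT) & ((1u << SOFTIRQ_BITS) - 1);
	f.hardirq = (count >> HARDIRQ_SHIFT) & ((1u << HARDIRQ_BITS) - 1);
	return f;
}

/* Print the decoded fields in a human-readable form. */
static void print_preempt_count(unsigned int count)
{
	struct preempt_fields f = decode_preempt_count(count);

	printf("0x%08x: preempt=%u softirq=%u hardirq=%u\n",
	       count, f.preempt, f.softirq, f.hardirq);
}
```

Under that assumed layout, decode_preempt_count(0x00000102) gives a softirq field of 1 and a hardirq field of 0, which would mean the CPU was servicing a softirq when schedule() was reached. That agrees with run_timer_softirq appearing in the call trace.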
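The call trace shows that the timer callback in gpio_notice.c (timer_func) called gpio_direction_output(gpio_pdata->ap_wake_sp_gpio, set_sp_gpio_val);
Note: gpio_direction_output() takes a mutex internally (in this trace, inside pinctrl_get_device_gpio_range()) and can therefore sleep, so it must not be called from a timer callback. Use gpio_set_value() instead, with the pin already configured as an output from process context, or replace the timer with delayed work on a workqueue.
The root cause is that a timer callback runs in softirq context, which is not allowed to sleep, whereas a mutex is a sleeping lock, so combining the two fails.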
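For the delayed-work option, a minimal sketch might look like the following. It is kernel-module code intended for the same driver, not a tested patch. SET_SP_WORK_TIMER_PERIOD and GPIO_SP_ACTIVE_LEVEL are the driver's own macros from the listing above, while set_sp_work, ap_wake_sp_gpio, set_sp_work_func, set_sp_work_start and set_sp_work_stop are hypothetical names introduced here for illustration.

```c
#include <linux/workqueue.h>
#include <linux/jiffies.h>
#include <linux/gpio.h>

static struct delayed_work set_sp_work;  /* hypothetical replacement for the timer */
static int set_sp_gpio_val;
static int ap_wake_sp_gpio;              /* hypothetical: GPIO number saved at init */

/* Work handler: runs in process context (a kworker thread), so sleeping is allowed. */
static void set_sp_work_func(struct work_struct *work)
{
	/* May take a mutex and sleep, which is fine in this context. */
	gpio_direction_output(ap_wake_sp_gpio, set_sp_gpio_val);
	set_sp_gpio_val = !set_sp_gpio_val;

	/* Re-queue the work for the next toggle. */
	schedule_delayed_work(&set_sp_work,
			      msecs_to_jiffies(SET_SP_WORK_TIMER_PERIOD));
}

/* Hypothetical helper: call once from the driver's initialization code. */
static void set_sp_work_start(int gpio)
{
	ap_wake_sp_gpio = gpio;
	set_sp_gpio_val = GPIO_SP_ACTIVE_LEVEL;
	INIT_DELAYED_WORK(&set_sp_work, set_sp_work_func);
	schedule_delayed_work(&set_sp_work,
			      msecs_to_jiffies(SET_SP_WORK_TIMER_PERIOD));
}

/* Hypothetical helper: call on removal; waits for a running handler to finish. */
static void set_sp_work_stop(void)
{
	cancel_delayed_work_sync(&set_sp_work);
}
```

The trade-off is timing precision: delayed work depends on when the kworker thread gets scheduled, so the toggle period is less exact than with a timer. If the pin only needs its level changed, keeping the timer and calling gpio_set_value() is the smaller change.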