Server keeps crashing when users back up to a tape drive

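When a tape device is in use, heavy I/O load is common and can leave a task blocked for more than 120 seconds. With hung_task_panic enabled (set to 1), the system reboots automatically if the device times out. The threshold can be changed through the kernel.hung_task_timeout_secs parameter; setting it to 0 disables the check. The log shows that the hung task is the tldd process and includes the full call trace.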

When a user works with a tape device, the server often runs into heavy I/O load.

If hung_task_panic is enabled (hung_task_panic=1) and the device times out, the kernel panics and the server reboots because of the timeout.

For example:

[1817156.683888] INFO: task tldd:53185 blocked for more than 120 seconds.
[1817156.683896]       Not tainted 4.1.12-61.1.18.el7uek.x86_64 #2
[1817156.683899] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1817156.683902] tldd            D ffff88303ac57840     0 53185 441368 0x00000080
[1817156.683910]  ffff881f930d3bd8 0000000000000082 ffff881fef411c00 ffff8800340d9c00
[1817156.683915]  ffff8830278f47f8 ffff881f930d4000 ffff883023315c38 7fffffffffffffff
[1817156.683920]  ffff8800340d9c00 ffff883022d6f140 ffff881f930d3bf8 ffffffff8171d0d7
[1817156.683925] Call Trace:
[1817156.683939]  [<ffffffff8171d0d7>] schedule+0x37/0x90
[1817156.683945]  [<ffffffff8172013c>] schedule_timeout+0x24c/0x2c0  >>> time out
[1817156.683953]  [<ffffffff814b8068>] ? scsi_request_fn+0x48/0x770
[1817156.683958]  [<ffffffff8122f924>] ? mntput+0x24/0x40
[1817156.683964]  [<ffffffff8171dc44>] wait_for_completion+0x134/0x190
[1817156.683971]  [<ffffffff810b49c0>] ? wake_up_state+0x20/0x20
[1817156.683997]  [<ffffffffa03f86bd>] st_do_scsi.constprop.23+0x27d/0x390 [st]
[1817156.684006]  [<ffffffffa03ff775>] do_load_unload+0x161/0x1fc [st]
[1817156.684015]  [<ffffffffa03fbce8>] st_ioctl+0x5d8/0x16f0 [st] >>>>>> tape device
[1817156.684023]  [<ffffffff81222878>] do_vfs_ioctl+0x2f8/0x4f0
[1817156.684032]  [<ffffffff8112ee34>] ? __audit_syscall_entry+0xb4/0x110
[1817156.684038]  [<ffffffff81026a2c>] ? do_audit_syscall_entry+0x6c/0x70
[1817156.684044]  [<ffffffff81222af1>] SyS_ioctl+0x81/0xa0
[1817156.684048]  [<ffffffff81028426>] ? syscall_trace_leave+0xc6/0x150
[1817156.684055]  [<ffffffff817212ae>] system_call_fastpath+0x12/0x71
[1817156.684080] sending NMI to all CPUs
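
To find reports like the one above in a long kernel log, the "blocked for more than" line can be filtered out. The following is a minimal sketch; the function name list_hung_tasks is hypothetical, and it assumes the message format shown in this log.

```shell
# list_hung_tasks: read kernel log text on stdin and print
# "<command> <pid> <seconds>" for every hung-task report found.
# The function name is illustrative; the pattern matches the
# "INFO: task NAME:PID blocked for more than N seconds." line shown above.
list_hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9]*\) blocked for more than \([0-9]*\) seconds.*/\1 \2 \3/p'
}
```

It could be used as `dmesg | list_hung_tasks`, or fed a saved log file on stdin.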

# cat /proc/sys/kernel/hung_task_panic
1
# cat /proc/sys/kernel/hung_task_timeout_secs
120

hung_task_panic was enabled on this server. The khungtaskd kernel thread detects processes that stay in the D (uninterruptible sleep) state longer than the number of seconds set in the kernel.hung_task_timeout_secs sysctl parameter (120 seconds here). With hung_task_panic=1, such a detection triggers a kernel panic, which is what caused the server to reboot.
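
How the two parameters combine can be summarized in a small helper. This is only a sketch: the function name describe_hung_task_policy is hypothetical, and the optional directory argument exists only so the logic can be tried against copies of the files.

```shell
# describe_hung_task_policy: print what the kernel will do when a task
# stays in D state too long, based on the two sysctl files.
# $1 (optional): directory holding the files, default /proc/sys/kernel.
describe_hung_task_policy() {
    dir="${1:-/proc/sys/kernel}"
    panic=$(cat "$dir/hung_task_panic") || return 1
    timeout=$(cat "$dir/hung_task_timeout_secs") || return 1

    if [ "$timeout" -eq 0 ]; then
        # A timeout of 0 turns the hung-task check off entirely.
        echo "hung task detection disabled"
    elif [ "$panic" -ne 0 ]; then
        echo "panic after ${timeout}s in D state"
    else
        echo "warn only after ${timeout}s in D state"
    fi
}
```

With the values shown above (1 and 120), it would report a panic after 120 seconds.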

If you can tolerate long waits on the tape device, you can disable this behavior:

 Change hung_task_panic to 0.
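
One way to apply the change is sketched below. It writes 0 to the running kernel and also records the setting in a sysctl.d file so that it survives a reboot. The function name and the file name 99-hung-task.conf are hypothetical, and the commands must be run as root.

```shell
# disable_hung_task_panic: stop the kernel from panicking on hung tasks.
# $1 (optional): runtime sysctl file, default /proc/sys/kernel/hung_task_panic
# $2 (optional): persistent config file; the default name is an assumption,
#                any *.conf file under /etc/sysctl.d would do.
disable_hung_task_panic() {
    proc_file="${1:-/proc/sys/kernel/hung_task_panic}"
    conf_file="${2:-/etc/sysctl.d/99-hung-task.conf}"

    # Takes effect immediately (same effect as: sysctl -w kernel.hung_task_panic=0).
    echo 0 > "$proc_file" || return 1

    # Applied again at boot when the sysctl.d files are loaded.
    echo "kernel.hung_task_panic = 0" > "$conf_file"
}
```

After this, a task blocked for longer than hung_task_timeout_secs is still reported in the kernel log, but the server is no longer rebooted for it.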

 

 
