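On Linux systems using an Emulex Fibre Channel HBA, the following errors may occur. Environment: Red Hat Enterprise Linux 5.11, multipath, Emulex HBA, EMC VNX5600 storage, HP Gen8 server.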
Apr 24 23:06:30 xxxx kernel: lpfc 0000:07:00.0: 0:(0):0748 abort handler timed out waiting for abort to complete: ret 0x2003, ID 1, LUN 15, snum 0x2d7c314d
Apr 24 23:07:29 xxxx kernel: opcmona[27437] trap invalid opcode rip:4281ea rsp:2ae690100ed0 error:0
Apr 24 23:07:30 xxxx kernel: lpfc 0000:07:00.0: 0:(0):0748 abort handler timed out waiting for abort to complete: ret 0x2003, ID 1, LUN 11, snum 0x2d7c314e
Apr 24 23:08:01 xxxx kernel: INFO: task kmirrord:9204 blocked for more than 120 seconds.
Apr 24 23:08:01 xxxx kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
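The affected environment had run stably for about a year and a half before the errors above first appeared. After that, the problem recurred roughly once every 1.5 months, and several other environments hit the same problem one after another. Each time it occurred, reads and writes blocked, and the application service eventually became unavailable. After several rounds of analysis, the problem was traced to the Emulex HBAs. Once all of them were replaced with QLogic HBAs, the problem no longer occurred.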
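Because the problem recurred across several environments, it can help to scan each host's syslog for the two signatures in the excerpt above: the lpfc 0748 abort timeout and the hung-task warning. The following is a minimal sketch, not the method used in the original analysis. The function name `scan`, the regular expressions, and the use of /var/log/messages as the log path are assumptions made for illustration; the patterns are derived only from the sample lines shown here and may need adjusting for other driver versions.

```python
import re
import sys
from collections import Counter

# Pattern for the lpfc "0748 abort handler timed out" lines in the excerpt.
# Assumption: other lpfc driver versions may format this message differently.
ABORT_RE = re.compile(
    r"kernel: lpfc (?P<pci>[0-9a-fA-F:.]+): .*?0748 abort handler timed out "
    r".*?ID (?P<target>\d+), LUN (?P<lun>\d+)"
)

# Pattern for the kernel hung-task warning in the excerpt.
HUNG_RE = re.compile(
    r"kernel: INFO: task (?P<task>[^:]+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def scan(lines):
    """Count abort timeouts per (PCI address, target ID, LUN) and list hung tasks."""
    aborts = Counter()
    hung = []
    for line in lines:
        m = ABORT_RE.search(line)
        if m:
            key = (m.group("pci"), int(m.group("target")), int(m.group("lun")))
            aborts[key] += 1
            continue
        m = HUNG_RE.search(line)
        if m:
            hung.append((m.group("task"), int(m.group("pid"))))
    return aborts, hung


# Sample lines copied from the log excerpt in this note.
SAMPLE = [
    "Apr 24 23:06:30 xxxx kernel: lpfc 0000:07:00.0: 0:(0):0748 abort handler "
    "timed out waiting for abort to complete: ret 0x2003, ID 1, LUN 15, "
    "snum 0x2d7c314d",
    "Apr 24 23:07:30 xxxx kernel: lpfc 0000:07:00.0: 0:(0):0748 abort handler "
    "timed out waiting for abort to complete: ret 0x2003, ID 1, LUN 11, "
    "snum 0x2d7c314e",
    "Apr 24 23:08:01 xxxx kernel: INFO: task kmirrord:9204 blocked for more "
    "than 120 seconds.",
]

if __name__ == "__main__":
    # Assumption: the path defaults to /var/log/messages when no argument is given.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    with open(path, errors="replace") as fh:
        aborts, hung = scan(fh)
    for (pci, target, lun), count in sorted(aborts.items()):
        print(f"{pci} target {target} LUN {lun}: {count} abort timeout(s)")
    for task, pid in hung:
        print(f"hung task: {task} (pid {pid})")
```

Run against a host's log, the script prints one line per affected HBA port and LUN, which makes it easier to see whether the timeouts cluster on a particular adapter before deciding on a hardware change.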
Related Red Hat Customer Portal solutions:
https://access.redhat.com/solutions/221043
https://access.redhat.com/solutions/2123731