Linux Host "Abort command issued"

I am still investigating this issue. After a lot of hard work, we have narrowed it down.

Thanks to my teammates.


Actions taken:

1)  Changed the storage ports, so that the host reaches the storage through a different storage port. More importantly, this reduces the IOPS pressure.

---- However, the host still reports the errors.

2)  Changed the switch port.  --Failed.

3)  Replaced the host HBA port. --Failed.

4)  Replaced the optical fiber. --Failed.

After all of these actions, the host still reports the errors.


Analysis Process:

HOST:

[sfstorage@*********** ~]$ sudo cat /var/log/messages*|grep 2002
Sep 14 02:51:21 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:2:23): Abort command issued -- 1 50f052 2002.
Sep 14 02:51:21 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:2:15): Abort command issued -- 1 514d9a 2002.
Sep 14 02:59:26 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:2:6): Abort command issued -- 1 56bdb2 2002.
Sep 14 04:13:07 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:2:23): Abort command issued -- 1 927343 2002.
Sep 14 04:37:32 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:2:16): Abort command issued -- 1 ab3ff0 2002.
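
To see which targets and LUNs the aborts land on, lines like the ones above can be tallied with a small script. This is only a sketch: the regular expression is written against the sample lines shown here, and the names `parse_abort` and `count_aborts_by_lun` are made up for illustration. Reading the three numbers in `scsi(3:2:23)` as host, target and LUN, and the trailing fields as a wait flag, a command serial and a return code, is my assumption about the driver's message format.

```python
import re
from collections import Counter

# Written against lines such as:
# "... qla2xxx 0000:42:00.0: scsi(3:2:23): Abort command issued -- 1 50f052 2002."
ABORT_RE = re.compile(
    r"qla2xxx (?P<pci>\S+): scsi\((?P<host>\d+):(?P<target>\d+):(?P<lun>\d+)\): "
    r"Abort command issued -- (?P<wait>\d+) (?P<serial>[0-9a-f]+) (?P<ret>[0-9a-f]+)\."
)


def parse_abort(line):
    """Return the fields of one abort line as a dict, or None if it does not match."""
    match = ABORT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Field meanings are assumed, see the note above the code.
    return {
        "pci": fields["pci"],
        "host": int(fields["host"]),
        "target": int(fields["target"]),
        "lun": int(fields["lun"]),
        "serial": fields["serial"],
        "ret": fields["ret"],
    }


def count_aborts_by_lun(lines):
    """Count abort messages per (host, target, lun) tuple."""
    counts = Counter()
    for line in lines:
        record = parse_abort(line)
        if record is not None:
            counts[(record["host"], record["target"], record["lun"])] += 1
    return counts
```

Feeding it the output of the grep above, for example `count_aborts_by_lun(open("aborts.txt"))` with a hypothetical file name, shows whether the aborts cluster on a few LUNs or are spread evenly.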


With debug mode enabled (this debug output is from another host that shows the same issue):

Sep  3 08:47:58 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:23): Abort command issued -- 1 28649dab 2002.
Sep  3 08:48:38 cnsz02pl0306 kernel: scsi(3): Asynchronous PORT UPDATE ignored 0083/0007/1100.
Sep  3 08:48:38 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:9) Port down status: port-state=0x4
Sep  3 08:48:38 cnsz02pl0306 kernel: scsi(3): Port login retry: 24530002ac008e8d, id = 0x0083 retry cnt=30
Sep  3 08:48:38 cnsz02pl0306 kernel: scsi(3): qla2x00_port_login()
Sep  3 08:48:38 cnsz02pl0306 kernel: scsi(3): Trying Fabric Login w/loop id 0x0083 for port 039a00.
Sep  3 08:48:38 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:9) FCP command status: 0x29-0x0 (0xe0000) portid=039a00 oxid=0xffff ser=0x286b640f cdb=280014 len=0x2000 rsp_info=0x0 resid=0x0 fw_resid=0x0
Sep  3 08:48:38 cnsz02pl0306 kernel: scsi(3): port login OK: logged in ID 0x83
Sep  3 08:48:38 cnsz02pl0306 kernel: scsi(3): qla2x00_port_login - end
Sep  3 09:09:21 cnsz02pl0306 kernel: qla2xxx_eh_abort(3): aborting sp ffff8174518a97c0 from RISC. pid=684212605.
Sep  3 09:09:21 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:17) FCP command status: 0x5-0x0 (0x80000) portid=039a00 oxid=0x6cb ser=0x28c8417d cdb=2a003d len=0x2000 rsp_info=0x0 resid=0x0 fw_resid=0x0
Sep  3 09:09:21 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:17): Abort command issued -- 1 28c8417d 2002.
Sep  3 09:10:02 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:13) Port down status: port-state=0x4
Sep  3 09:10:02 cnsz02pl0306 kernel: scsi(3): Port login retry: 24530002ac008e8d, id = 0x0083 retry cnt=30
Sep  3 09:10:02 cnsz02pl0306 kernel: scsi(3): qla2x00_port_login()
Sep  3 09:10:02 cnsz02pl0306 kernel: scsi(3): Trying Fabric Login w/loop id 0x0083 for port 039a00.
Sep  3 09:10:02 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:13) FCP command status: 0x29-0x0 (0xe0000) portid=039a00 oxid=0x0 ser=0x28ccd092 cdb=28008b len=0x2000 rsp_info=0x0 resid=0x0 fw_resid=0x0
Sep  3 09:10:02 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:7) Port down status: port-state=0x3
Sep  3 09:10:02 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:7) FCP command status: 0x29-0x0 (0xe0000) portid=039a00 oxid=0x0 ser=0x28ccd095 cdb=280002 len=0x4000 rsp_info=0x0 resid=0x0 fw_resid=0x0
Sep  3 09:10:02 cnsz02pl0306 kernel: scsi(3): Asynchronous PORT UPDATE ignored 0083/0007/1100.
Sep  3 09:10:02 cnsz02pl0306 kernel: scsi(3): port login OK: logged in ID 0x83
Sep  3 09:10:02 cnsz02pl0306 kernel: scsi(3): qla2x00_port_login - end
Sep  3 09:18:40 cnsz02pl0306 kernel: qla2xxx_eh_abort(3): aborting sp ffff813352b14dc0 from RISC. pid=687125221.
Sep  3 09:18:40 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:12) FCP command status: 0x5-0x0 (0x80000) portid=039a00 oxid=0x86d ser=0x28f4b2e5 cdb=280017 len=0x2000 rsp_info=0x0 resid=0x0 fw_resid=0x0
Sep  3 09:18:40 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:12): Abort command issued -- 1 28f4b2e5 2002.
Sep  3 09:18:40 cnsz02pl0306 kernel: qla2xxx_eh_abort(3): aborting sp ffff8174594a61c0 from RISC. pid=687222175.
Sep  3 09:18:40 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:21) FCP command status: 0x5-0x0 (0x80000) portid=039a00 oxid=0x51 ser=0x28f62d9f cdb=280065 len=0x2000 rsp_info=0x0 resid=0x0 fw_resid=0x0
Sep  3 09:18:40 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:21): Abort command issued -- 1 28f62d9f 2002.
Sep  3 09:52:47 cnsz02pl0306 kernel: qla2xxx_eh_abort(3): aborting sp ffff8133b0351240 from RISC. pid=700097385.
Sep  3 09:52:47 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:25) FCP command status: 0x5-0x0 (0x80000) portid=039a00 oxid=0x37e ser=0x29baa369 cdb=2a002e len=0x2000 rsp_info=0x0 resid=0x0 fw_resid=0x0
Sep  3 09:52:47 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:25): Abort command issued -- 1 29baa369 2002.
Sep  3 09:53:27 cnsz02pl0306 kernel: scsi(3): Asynchronous PORT UPDATE ignored 0083/0007/1100.
Sep  3 09:53:27 cnsz02pl0306 kernel: qla2xxx 0000:42:00.0: scsi(3:7:32) Port down status: port-state=0x4
Sep  3 09:53:27 cnsz02pl0306 kernel: scsi(3): Port login retry: 24530002ac008e8d, id = 0x0083 retry cnt=30
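
One thing worth measuring in this debug output is how long after an abort the next "Port down status" line appears. The sketch below does only that. The function names are hypothetical, the year is not present in syslog lines so `year=2014` is an assumption, and matching on the two message substrings is tailored to the lines shown here.

```python
import re
from datetime import datetime

# Classic syslog prefix, e.g. "Sep  3 08:47:58 hostname ..."
TS_RE = re.compile(r"^([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d\d:\d\d:\d\d)\s")


def parse_syslog_time(line, year=2014):
    """Parse the leading syslog timestamp; the year is an assumption."""
    match = TS_RE.match(line)
    if match is None:
        return None
    month, day, clock = match.groups()
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def abort_to_port_down_gaps(lines, year=2014):
    """Seconds between the latest abort and the next 'Port down status' line."""
    gaps = []
    last_abort = None
    for line in lines:
        when = parse_syslog_time(line, year)
        if when is None:
            continue
        if "Abort command issued" in line:
            last_abort = when
        elif "Port down status" in line and last_abort is not None:
            gaps.append((when - last_abort).total_seconds())
            last_abort = None  # count only the first port-down after an abort
    return gaps
```

On the excerpt above, the gaps come out at roughly 40 seconds (for example 08:47:58 to 08:48:38). That is a timing pattern to keep in mind, not a proven cause.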


The log shows many port login and logout events, so I wanted to check whether the FC switch also logged these logins.


The FC switch:

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Switch 0; Sun Sep 14 00:24:51 2014 GMT (GMT+0:00)
00:24:51.528276 SCN Port Offline;g=0x53e                    D0,P0  D0,P0  209   NA    
00:24:51.528283 *Removing all nodes from port               D0,P0  D0,P0  209   NA    
00:40:24.178797 SCN LR_PORT(0);g=0x53e                      D0,P0  D0,P0  209   NA    
00:40:24.178810 SCN Port Online; g=0x53e,isolated=0         D0,P0  D0,P1  209   NA    
00:40:24.178860 Port Elp engaged                            D0,P1  D0,P0  209   NA    
00:40:24.178877 *Removing all nodes from port               D0,P0  D0,P0  209   NA    
00:40:24.179245 SCN Port F_PORT                             D0,P1  D0,P0  209   NA    

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Note the timestamps: they do NOT match the host's. Why?
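
Before reading too much into the mismatch, it helps to put both logs on the same clock: the switch header says GMT, while syslog lines on the host carry no time zone or year and are usually written in local time. The sketch below normalizes both. The host's UTC offset is not stated anywhere in this post, so `utc_offset_hours` must be supplied by whoever knows the host; the year 2014 is taken from the switch header and assumed for the host as well. The function names are hypothetical.

```python
from datetime import datetime, timedelta


def host_to_gmt(stamp, utc_offset_hours, year=2014):
    """Convert a syslog stamp like 'Sep 14 02:51:21' to GMT.

    utc_offset_hours is an assumption supplied by the caller;
    the log itself does not say which zone the host uses.
    """
    normalized = " ".join(stamp.split())  # collapse the double space in 'Sep  3'
    local = datetime.strptime(f"{year} {normalized}", "%Y %b %d %H:%M:%S")
    return local - timedelta(hours=utc_offset_hours)


def switch_event_time(date_header, time_of_day):
    """Combine the date from the switch header with an event's time of day.

    date_header: e.g. 'Sun Sep 14 00:24:51 2014' (already GMT per the header)
    time_of_day: e.g. '00:40:24.178797'
    """
    day = datetime.strptime(date_header, "%a %b %d %H:%M:%S %Y").date()
    clock = datetime.strptime(time_of_day, "%H:%M:%S.%f").time()
    return datetime.combine(day, clock)


def gap_seconds(host_gmt, switch_gmt):
    """Absolute distance between a host event and a switch event, in seconds."""
    return abs((host_gmt - switch_gmt).total_seconds())
```

A call such as `gap_seconds(host_to_gmt("Sep 14 02:51:21", offset), switch_event_time("Sun Sep 14 00:24:51 2014", "00:40:24.178797"))` then gives a number that can be compared across events, once `offset` has been confirmed on the host.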

So I looked into how a port logs in to the switch. There are three steps:

Fabric login

Port login

Process login

Fabric login
After the fabric capable Fibre Channel device is attached to a fabric switch, it will carry out a fabric login (FLOGI). Similar to port login, FLOGI is an extended link service command that sets up a session between two participants. With FLOGI a session is created between an N_Port or NL_Port and the switch. An N_Port will send a FLOGI frame that contains its Node Name, its N_Port Name, and service parameters to a well-known address of 0xFFFFFE.
A public loop NL_Port first opens the destination AL_PA 0x00 before issuing the FLOGI request. In both cases the switch accepts the login and returns an accept (ACC) frame to the sender. If some of the service parameters requested by the N_Port or NL_Port are not supported, the switch will set the appropriate bits in the ACC frame to indicate this.
When the N_Port logs in it uses a 24-bit port address of 0x000000. Because of this the fabric is allowed to assign the appropriate port address to that device, based on the Domain-Area-Port address format. The newly assigned address is contained in the ACC response frame.
When the NL_Port logs in a similar process starts, except that the least significant byte is used to assign AL_PA and the upper two bytes constitute a fabric loop identifier. Before an NL_Port logs in it will go through the LIP on the loop, which is started by the FL_Port, and from this process it has already derived an AL_PA. The switch then decides if it will accept this AL_PA for this device or not. If not a new AL_PA is assigned to the NL_Port, which then causes the start of another LIP. This ensures that the switch assigned AL_PA does not conflict with any previously selected AL_PAs on the loop.
After the N_Port or public NL_Port gets its fabric address from FLOGI, it needs to register with the SNS. This is done with port login (PLOGI) at the address 0xFFFFFC. The device may register values for all or just some database objects, but the most useful are its 24-bit port address, 64-bit Port Name (WWPN), 64-bit Node Name (WWNN), class of service parameters, FC-4 protocols supported, and port type, such as N_Port or NL_Port.
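
The Domain-Area-Port format described above can be applied directly to the port ID that appears in the host debug log (`portid=039a00`). The following is a minimal sketch, assuming the usual layout of one byte each for domain, area and port; `split_port_id` is a name I made up, and the two constants simply restate the well-known addresses quoted in the text.

```python
# Well-known addresses quoted in the text above.
FLOGI_ADDRESS = 0xFFFFFE             # fabric login (FLOGI) destination
DIRECTORY_SERVER_ADDRESS = 0xFFFFFC  # name server, reached via PLOGI


def split_port_id(port_id_hex):
    """Split a 24-bit FC port ID (hex string) into domain, area and port bytes."""
    value = int(port_id_hex, 16)
    if not 0 <= value <= 0xFFFFFF:
        raise ValueError(f"not a 24-bit port ID: {port_id_hex!r}")
    return {
        "domain": (value >> 16) & 0xFF,  # most significant byte
        "area": (value >> 8) & 0xFF,     # middle byte
        "port": value & 0xFF,            # least significant byte (AL_PA on a loop)
    }
```

For example, `split_port_id("039a00")` returns domain 3, area 0x9a (154) and port 0, which can help locate the matching switch and port in the fabric.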

Port login
Port login is also known as PLOGI. Port login is used to establish a session between two N_Ports (devices) and is necessary before any upper level commands or operations can be performed. During the port login, two N_Ports (devices) swap service parameters and make themselves known to each other.

Process login
Process login is also known as PRLI. Process login is used to set up the environment between related processes on an originating N_Port and a responding N_Port. A group of related processes is collectively known as an image pair. The processes involved can be system processes, system images, such as mainframe logical partitions, control unit images, and FC-4 processes. Use of process login is optional from the perspective of Fibre Channel FC-2 layer, but may be required by a specific upper-level protocol as in the case of SCSI-FCP mapping.
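
The three steps build on each other, which can be written down as a tiny ordering check. This is just an illustration of the sequence described above, not something the HBA driver or switch exposes; `LOGIN_ORDER` and `first_out_of_order` are hypothetical names, and PRLI is treated as requiring PLOGI because that is what SCSI-FCP needs in practice.

```python
# Order in which the logins are expected for a SCSI-FCP device.
LOGIN_ORDER = ("FLOGI", "PLOGI", "PRLI")


def first_out_of_order(events):
    """Return the index of the first login seen before its prerequisite, else None.

    events is a sequence of strings taken from LOGIN_ORDER; repeated
    logins (for example a second PLOGI after a port bounce) are allowed.
    """
    done = set()
    for index, event in enumerate(events):
        position = LOGIN_ORDER.index(event)  # ValueError for unknown names
        if any(step not in done for step in LOGIN_ORDER[:position]):
            return index
        done.add(event)
    return None
```

A trace of `["FLOGI", "PLOGI", "PRLI"]` passes, while `["FLOGI", "PRLI"]` is flagged at index 1 because no port login preceded the process login.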

FLOGI
          N_Port requests a unique 24-bit address from the Fabric Login Server (accessible via an F_port on a Fabric switch).

PLOGI
          N_Port informs the Fabric Name Server of its personality and capabilities. For example:
          WWNN, WWPN
          Buffer credits for flow control
          clock frequency ('speed capability')
          Upper layer protocol support (eg. SCSI-3, IP)
PRLI
          Upper layer protocol communication. Ever since SCSI was designed and engineered (1970s, or so, previously SASI), SCSI initiators have needed to discover SCSI targets. So, during PRLI, N_Port SCSI initiators discover N_Port SCSI targets (which is an opportunity for the host, maybe a UNIX host, to assign a target ID to the device path).
          Depending on the OS, you may be able to investigate further with commands like:
          egrep -i 'flogi|plogi|prli' /var/adm/messages


Note that only the process login is not logged by the switch. Fabric login and port login are both logged by the switch.

So, as the next action, we need to investigate the storage device in depth and find out why the host's login and logout actions are not recorded in the storage log file.

We also need to upgrade the storage firmware.


Thanks.

