Meaning of hostbyte and driverbyte in Linux kernel I/O error messages

1. Example symptoms

  • 1.hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jul 16 08:06:53 localhost kernel: sd 11:0:0:0: [sdh] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jul 16 08:06:53 localhost kernel: sd 11:0:0:0: [sdh] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
  • 2.hostbyte=DID_OK driverbyte=DRIVER_SENSE
[   37.404796] sd 0:0:0:0: [sda] tag#3 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[   37.404806] blk_update_request: I/O error, dev sda, sector 0
  • 3.hostbyte=DID_ERROR driverbyte=DRIVER_OK
Dec  6 18:12:13 localhost kernel: sd 20:0:0:0: [sdb] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Dec  6 18:12:13 localhost kernel: blk_update_request: I/O error, dev sdb, sector 512
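
When many such lines have to be triaged, it can help to pull the hostbyte/driverbyte pair out of each one automatically. The following is a minimal sketch, assuming the `sd H:C:T:L: [dev] ... hostbyte=X driverbyte=Y` layout seen in the samples above; the pattern and the helper name `parse_result_line` are illustrative and not part of any kernel tool.

```python
import re

# Matches the "hostbyte=... driverbyte=..." pair printed by the sd driver,
# together with the H:C:T:L address and the device name in brackets.
RESULT_RE = re.compile(
    r"sd (?P<hctl>\d+:\d+:\d+:\d+): \[(?P<dev>\w+)\].*?"
    r"hostbyte=(?P<host>\w+) driverbyte=(?P<driver>\w+)"
)


def parse_result_line(line):
    """Return (hctl, device, hostbyte, driverbyte), or None if the line has no result pair."""
    m = RESULT_RE.search(line)
    if m is None:
        return None
    return m.group("hctl"), m.group("dev"), m.group("host"), m.group("driver")


# Sample lines copied from the log excerpts above.
sample_lines = [
    "Jul 16 08:06:53 localhost kernel: sd 11:0:0:0: [sdh] Synchronize Cache(10) failed: "
    "Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK",
    "[   37.404796] sd 0:0:0:0: [sda] tag#3 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE",
    "[   37.404806] blk_update_request: I/O error, dev sda, sector 0",
]

for line in sample_lines:
    print(parse_result_line(line))
```

Lines that carry no result pair, such as the `blk_update_request` line, yield `None` and can simply be skipped.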

2. hostbyte and driverbyte

  • linux/include/scsi/scsi.h
/*
 *  Use these to separate status msg and our bytes
 *
 *  These are set by:
 *
 *      status byte = set from target device
 *      msg_byte    = return status from host adapter itself.
 *      host_byte   = set by low-level driver to indicate status.
 *      driver_byte = set by mid-level.
 */
#define status_byte(result) (((result) >> 1) & 0x7f)
#define msg_byte(result)    (((result) >> 8) & 0xff)
#define host_byte(result)   (((result) >> 16) & 0xff)
#define driver_byte(result) (((result) >> 24) & 0xff)
  • The hostbyte code values and their meanings are as follows:
/*
 * Host byte codes
 */

#define DID_OK          0x00    /* NO error                                */
#define DID_NO_CONNECT  0x01    /* Couldn't connect before timeout period  */
#define DID_BUS_BUSY    0x02    /* BUS stayed busy through time out period */
#define DID_TIME_OUT    0x03    /* TIMED OUT for other reason              */
#define DID_BAD_TARGET  0x04    /* BAD target.                             */
#define DID_ABORT       0x05    /* Told to abort for some other reason     */
#define DID_PARITY      0x06    /* Parity error                            */
#define DID_ERROR       0x07    /* Internal error                          */
#define DID_RESET       0x08    /* Reset by somebody.                      */
#define DID_BAD_INTR    0x09    /* Got an interrupt we weren't expecting.  */
#define DID_PASSTHROUGH 0x0a    /* Force command past mid-layer            */
#define DID_SOFT_ERROR  0x0b    /* The low level driver just wish a retry  */
#define DID_IMM_RETRY   0x0c    /* Retry without decrementing retry count  */
#define DID_REQUEUE     0x0d    /* Requeue command (no immediate retry) also
                                 * without decrementing the retry count    */
#define DID_TRANSPORT_DISRUPTED 0x0e /* Transport error disrupted execution
                                      * and the driver blocked the port to
                                      * recover the link. Transport class will
                                      * retry or fail IO */
#define DID_TRANSPORT_FAILFAST  0x0f /* Transport class fastfailed the io */
#define DID_TARGET_FAILURE 0x10 /* Permanent target failure, do not retry on
                                 * other paths */
#define DID_NEXUS_FAILURE 0x11  /* Permanent nexus failure, retry on other
                                 * paths might yield different results */
#define DID_ALLOC_FAILURE 0x12  /* Space allocation on the device failed */
#define DID_MEDIUM_ERROR  0x13  /* Medium error */
  • hostbyte
    [Figure: hostbyte code table]

  • driverbyte
    [Figure: driverbyte code table]
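
The four macros and the host byte table above can be combined into a small decoder for a raw 32-bit result value. The sketch below is a Python port for illustration only. The host byte names are copied from the table quoted above. The driver byte names are an assumption taken from older kernel versions of scsi.h (newer kernels no longer define them), and the function name `decode_result` is hypothetical.

```python
# Host byte names, copied from the "Host byte codes" table quoted above.
HOST_BYTE_NAMES = {
    0x00: "DID_OK", 0x01: "DID_NO_CONNECT", 0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT", 0x04: "DID_BAD_TARGET", 0x05: "DID_ABORT",
    0x06: "DID_PARITY", 0x07: "DID_ERROR", 0x08: "DID_RESET",
    0x09: "DID_BAD_INTR", 0x0A: "DID_PASSTHROUGH", 0x0B: "DID_SOFT_ERROR",
    0x0C: "DID_IMM_RETRY", 0x0D: "DID_REQUEUE",
    0x0E: "DID_TRANSPORT_DISRUPTED", 0x0F: "DID_TRANSPORT_FAILFAST",
    0x10: "DID_TARGET_FAILURE", 0x11: "DID_NEXUS_FAILURE",
    0x12: "DID_ALLOC_FAILURE", 0x13: "DID_MEDIUM_ERROR",
}

# Assumption: driver byte names as defined in older kernels' scsi.h.
DRIVER_BYTE_NAMES = {
    0x00: "DRIVER_OK", 0x01: "DRIVER_BUSY", 0x02: "DRIVER_SOFT",
    0x03: "DRIVER_MEDIA", 0x04: "DRIVER_ERROR", 0x05: "DRIVER_INVALID",
    0x06: "DRIVER_TIMEOUT", 0x07: "DRIVER_HARD", 0x08: "DRIVER_SENSE",
}


def decode_result(result):
    """Split a SCSI result word the same way the scsi.h macros do."""
    status = (result >> 1) & 0x7F    # status_byte(result)
    msg = (result >> 8) & 0xFF       # msg_byte(result)
    host = (result >> 16) & 0xFF     # host_byte(result)
    driver = (result >> 24) & 0xFF   # driver_byte(result)
    return {
        "status": status,
        "msg": msg,
        "host": host,
        "driver": driver,
        # Unknown codes fall back to their hex value instead of raising.
        "host_name": HOST_BYTE_NAMES.get(host, hex(host)),
        "driver_name": DRIVER_BYTE_NAMES.get(driver, hex(driver)),
    }


# 0x00040000: host byte 0x04, everything else zero.
print(decode_result(0x00040000))
# 0x08000002: driver byte 0x08 with a non-zero low status byte.
print(decode_result(0x08000002))
```

Note that `status_byte` shifts the low byte right by one bit, so a raw low byte of 0x02 decodes to a status value of 1.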

3. Hardware faults on the FC link

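  • The optical module, fiber cable, or HBA is faulty. Common symptoms:
    a. Bit errors on the link: replace the optical modules and fiber cable along the entire link.
    b. The storage array raises an optical-module alarm: replace the optical module.
    c. The host log reports errors. The most common is 0x70000, whose host byte is 0x07 (DID_ERROR), an internal HBA error.
    Example: UP_done:C0P1L2, r=70000 , MPP_SELECTION_TIMEOUT, sk=0, ASC/ASCQ=0/0, SN:92288176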
  • Common HBA error return codes:
    [Figure: table of common HBA error return codes]
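
To check a return code such as the `r=70000` in the example line against the host byte table, the value can be extracted and shifted as the `host_byte` macro does. This is a minimal sketch, assuming the `r=` field is printed in hexadecimal without a `0x` prefix, as the 0x70000 description suggests; the helper name `host_byte_from_mpp_line` is hypothetical.

```python
import re


def host_byte_from_mpp_line(line):
    """Extract the r=<hex> field from a multipath log line and return its host byte."""
    m = re.search(r"\br=([0-9a-fA-F]+)", line)
    if m is None:
        return None
    result = int(m.group(1), 16)   # assumption: the field is hexadecimal
    return (result >> 16) & 0xFF   # same arithmetic as host_byte(result)


line = "UP_done:C0P1L2, r=70000 , MPP_SELECTION_TIMEOUT, sk=0, ASC/ASCQ=0/0, SN:92288176"
print(hex(host_byte_from_mpp_line(line)))
```

For this line the host byte is 0x07, which the host byte codes define as DID_ERROR (internal error).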

4. Source code analysis

The following is an example of hardware-fault entries in the Linux system log (/var/log/messages):

```
Sep 10 11:23:45 server kernel: [123456.789012] mpt3sas0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Sep 10 11:23:45 server kernel: [123456.789012] mpt3sas0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Sep 10 11:24:01 server kernel: [123472.789012] sd 0:0:0:0: [sda] FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep 10 11:24:01 server kernel: [123472.789012] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Sep 10 11:24:01 server kernel: [123472.789012] blk_update_request: I/O error, dev sda, sector 0
Sep 10 11:24:01 server kernel: [123472.789012] Buffer I/O error on dev sda, logical block 0, async page read
Sep 10 11:24:01 server kernel: [123472.789012] ata1: EH complete
Sep 10 11:24:01 server kernel: [123472.789012] sd 0:0:0:0: [sda] FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep 10 11:24:01 server kernel: [123472.789012] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Sep 10 11:24:01 server kernel: [123472.789012] blk_update_request: I/O error, dev sda, sector 0
Sep 10 11:24:01 server kernel: [123472.789012] Buffer I/O error on dev sda, logical block 0, async page read
Sep 10 11:24:01 server kernel: [123472.789012] ata1: EH complete
```

This log records a hardware fault: the Read(10) commands to disk sda failed with hostbyte=DID_BAD_TARGET, so the hardware device itself may have failed. Further inspection and troubleshooting are needed.
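
The `log_info` value on the mpt3sas lines can be split into the same fields that the driver prints next to it. The sketch below assumes the layout implied by the log line itself (top 4 bits bus type, next 4 bits originator, next 8 bits code, low 16 bits sub-code) and an originator mapping of 0 = IOP, 1 = PL, 2 = IR. Both are assumptions to verify against the driver source for the kernel in use, and `decode_log_info` is a hypothetical helper name.

```python
# Assumed originator encoding; verify against the mpt3sas source in use.
ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}


def decode_log_info(log_info):
    """Split an mpt3sas log_info word into its assumed bit fields."""
    return {
        "bus_type": (log_info >> 28) & 0xF,
        "originator": ORIGINATORS.get((log_info >> 24) & 0xF, "unknown"),
        "code": (log_info >> 16) & 0xFF,
        "sub_code": log_info & 0xFFFF,
    }


fields = decode_log_info(0x31120303)
print(fields["originator"], hex(fields["code"]), hex(fields["sub_code"]))
```

For 0x31120303 this yields originator PL, code 0x12, and sub-code 0x303, which agrees with the fields printed on the same log line.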