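I have swapped in four different 10GbE cards, replaced the 10GbE modules and cables, tried optical transceivers with fiber, and reinstalled the OS dozens of times. The server still detects the card only intermittently: sometimes it shows up, sometimes it does not.
The firewall is off, the NetworkManager service is disabled, all the InfiniBand packages are installed, and the rdma service is running.
lspci always shows the card: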
07:00.0 Ethernet controller: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev b0)
ifconfig -a sometimes lists the 10GbE card, but after some reboots it is missing:
[root@fss1 ~]# service network restart
Shutting down interface em1: [ OK ]
Shutting down interface em2: [ OK ]
Shutting down loopback interface: [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface em1: [ OK ]
Bringing up interface em2: [ OK ]
Bringing up interface p3p1:Device p3p1 does not seem to be present, delaying initialization.[FAILED]
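For anyone trying to reproduce this, a minimal sketch of how one could check after each reboot whether the kernel actually created the interface, rather than waiting for the network script to fail. The function name iface_present and the SYSFS_NET override are hypothetical, introduced only for illustration; p3p1 is the interface name from the output above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the kernel has registered the named
# network interface. Linux exposes each registered interface as an entry
# under /sys/class/net; SYSFS_NET is an assumed override used for testing.
iface_present() {
    [ -e "${SYSFS_NET:-/sys/class/net}/$1" ]
}

# Usage on the affected host:
#   iface_present p3p1 && echo "p3p1 present" || echo "p3p1 missing"
```

If the entry is missing, the driver never registered a netdev, which matches the "does not seem to be present" message from the network script.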
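When the card is detected normally, none of the errors below appear. When the 10GbE card is not detected, dmesg shows the following: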
[root@fss1 ~]# dmesg |grep mlx4
mlx4_core: Mellanox ConnectX core driver v1.1 (Dec, 2011)
mlx4_core: Initializing 0000:07:00.0
mlx4_core 0000:07:00.0: PCI INT A -> GSI 40 (level, low) -> IRQ 40
mlx4_core 0000:07:00.0: setting latency timer to 64
mlx4_core 0000:07:00.0: vpd r/w failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update.
mlx4_core 0000:07:00.0: vpd r/w failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update.
mlx4_core 0000:07:00.0: vpd r/w failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update.
mlx4_core 0000:07:00.0: vpd r/w failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update.
mlx4_core 0000:07:00.0: QUERY_FW command failed, aborting.
mlx4_core 0000:07:00.0: PCI INT A disabled
mlx4_core: probe of 0000:07:00.0 failed with error -110
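On Linux, errno 110 is ETIMEDOUT, so the final line suggests the driver timed out waiting for the card, most likely on the QUERY_FW command logged just before it. Below is a minimal sketch for pulling failed probes out of the kernel log so good and bad boots can be compared. The function name mlx4_probe_status is hypothetical, and the pattern is based only on the message format in the output above.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and prints
# "<pci-address> <errno>" for every mlx4_core probe failure it finds.
# Lines that do not match the failure message are ignored.
mlx4_probe_status() {
    sed -n 's/.*mlx4_core: probe of \([0-9a-fA-F:.]*\) failed with error \(-[0-9]*\).*/\1 \2/p'
}

# Usage on the affected host:
#   dmesg | mlx4_probe_status
```

Empty output means no failed mlx4_core probe was logged during that boot.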
Has anyone run into this before? I downloaded and installed the latest driver from the Mellanox website, and it made no difference.