Linux NIC named p4p: rx errors on the 10 GbE NIC of a Dell R730 after installing Gp

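Hardware configuration and operating system

CPU: Intel Xeon E5-2640 v3, 2.6 GHz, 8 cores, 2 sockets

Memory: 8 GB DDR4-2133 RDIMM, 32 modules, 256 GB total

Disk 1: 1.2 TB 10K RPM SAS, used as data disks, 24 drives

Disk 2: 600 GB 10K RPM SAS, used as system disks, 2 drives

RAID controller: 2 GB cache

NIC: 2 x 10GE (SFP+), OEM

Operating system: SUSE 11 SP4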

Linux hebda_data_33 3.0.101-77-default #1 SMP Tue Jun 14 20:33:58 UTC 2016 (a082ea6) x86_64 x86_64 x86_64 GNU/Linux

Uplink switch: Huawei 12812

NIC information:

ethtool -i p4p2

driver: bnx2x

version: 1.710.51-0

firmware-version: FFV08.07.25 bc 7.13.54

bus-info: 0000:83:00.1

supports-statistics: yes

supports-test: yes

supports-eeprom-access: yes

supports-register-dump: yes

hebda_data_33:~ # ethtool -i em1

driver: bnx2x

version: 1.710.51-0

firmware-version: FFV08.07.25 bc 7.13.54

bus-info: 0000:01:00.0

supports-statistics: yes

supports-test: yes

supports-eeprom-access: yes

supports-register-dump: yes

hebda_data_33:~ # lspci -s 0000:83:00.1 -vvv

83:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)

Subsystem: Broadcom Corporation Device 1006

Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+

Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-

Latency: 0

Interrupt: pin B routed to IRQ 60

Region 0: Memory at c8000000 (64-bit, prefetchable) [size=8M]

Region 2: Memory at c8800000 (64-bit, prefetchable) [size=8M]

Region 4: Memory at ca000000 (64-bit, prefetchable) [size=64K]

Expansion ROM at ca500000 [disabled] [size=512K]

Capabilities: [48] Power Management version 3

Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)

Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-

Capabilities: [50] Vital Product Data

Not readable

Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+

Address: 0000000000000000 Data: 0000

Capabilities: [a0] MSI-X: Enable+ Count=32 Masked-

Vector table: BAR=4 offset=00000000

PBA: BAR=4 offset=00001000

Capabilities: [ac] Express (v2) Endpoint, MSI 00

DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us

ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-

DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+

RlxdOrd+ ExtTag+ PhantFunc- AuxPwr+ NoSnoop+

MaxPayload 256 bytes, MaxReadReq 4096 bytes

DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-

LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <1us, L1 <2us

ClockPM+ Surprise- LLActRep- BwNot-

LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+

ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-

LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

DevCap2: Completion Timeout: Range ABCD, TimeoutDis+

DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-

LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB

Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-

Compliance De-emphasis: -6dB

LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-

EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-

Capabilities: [100 v1] Advanced Error Reporting

UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-

UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-

UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-

CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+

CEMsk: RxErr- BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+

AERCap: First Error Pointer: 00, GenCap+ CGenEn+ ChkCap+ ChkEn+

Capabilities: [13c v1] Device Serial Number f4-e9-d4-ff-fe-9d-ba-10

Capabilities: [150 v1] Power Budgeting <?>

Capabilities: [160 v1] Virtual Channel

Caps: LPEVC=0 RefClk=100ns PATEntryBits=1

Arb: Fixed- WRR32- WRR64- WRR128-

Ctrl: ArbSelect=Fixed

Status: InProgress-

VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-

Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-

Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff

Status: NegoPending- InProgress-

Capabilities: [1b8 v1] Alternative Routing-ID Interpretation (ARI)

ARICap: MFVC- ACS-, Next Function: 0

ARICtl: MFVC- ACS-, Function Group: 0

Capabilities: [220 v1] #15

Kernel driver in use: bnx2x

Kernel modules: bnx2x

hebda_data_33:~ # lspci -s 0000:01:00.0 -vvv

01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57800 1/10 Gigabit Ethernet (rev 10)

Subsystem: Dell BCM57800 10-Gigabit Ethernet

Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+

Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-

Latency: 0

Interrupt: pin A routed to IRQ 40

Region 0: Memory at 95000000 (64-bit, prefetchable) [size=8M]

Region 2: Memory at 95800000 (64-bit, prefetchable) [size=8M]

Region 4: Memory at 96030000 (64-bit, prefetchable) [size=64K]

Expansion ROM at 96080000 [disabled] [size=512K]

Capabilities: [48] Power Management version 3

Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)

Status: D0 NoSoftRst+ PME-Enable- DSel=8 DScale=1 PME-

Capabilities: [50] Vital Product Data

Not readable

Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+

Address: 0000000000000000 Data: 0000

Capabilities: [a0] MSI-X: Enable+ Count=32 Masked-

Vector table: BAR=4 offset=00000000

PBA: BAR=4 offset=00001000

Capabilities: [ac] Express (v2) Endpoint, MSI 00

DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us

ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-

DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+

RlxdOrd+ ExtTag+ PhantFunc- AuxPwr+ NoSnoop+

MaxPayload 256 bytes, MaxReadReq 4096 bytes

DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-

LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <1us, L1 <2us

ClockPM+ Surprise- LLActRep- BwNot-

LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+

ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-

LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

DevCap2: Completion Timeout: Range ABCD, TimeoutDis+

DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-

LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB

Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-

Compliance De-emphasis: -6dB

LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-

EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-

Capabilities: [100 v1] Advanced Error Reporting

UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-

UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-

UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-

CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+

CEMsk: RxErr- BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+

AERCap: First Error Pointer: 00, GenCap+ CGenEn+ ChkCap+ ChkEn+

Capabilities: [13c v1] Device Serial Number 18-66-da-ff-fe-65-77-0b

Capabilities: [150 v1] Power Budgeting <?>

Capabilities: [160 v1] Virtual Channel

Caps: LPEVC=0 RefClk=100ns PATEntryBits=1

Arb: Fixed- WRR32- WRR64- WRR128-

Ctrl: ArbSelect=Fixed

Status: InProgress-

VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-

Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-

Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff

Status: NegoPending- InProgress-

Capabilities: [1b8 v1] Alternative Routing-ID Interpretation (ARI)

ARICap: MFVC- ACS-, Next Function: 1

ARICtl: MFVC- ACS-, Function Group: 0

Capabilities: [220 v1] #15

Capabilities: [300 v1] #19

Kernel driver in use: bnx2x

Kernel modules: bnx2x

hebda_data_33:~ # ethtool -S p4p2|grep dis

[0]: rx_discards: 79516

[0]: rx_phy_ip_err_discards: 0

[0]: rx_skb_alloc_discard: 28517

[1]: rx_discards: 88484

[1]: rx_phy_ip_err_discards: 0

[1]: rx_skb_alloc_discard: 27102

[2]: rx_discards: 13667973

[2]: rx_phy_ip_err_discards: 0

[2]: rx_skb_alloc_discard: 35207

[3]: rx_discards: 33056205

[3]: rx_phy_ip_err_discards: 0

[3]: rx_skb_alloc_discard: 33533

[4]: rx_discards: 13263091

[4]: rx_phy_ip_err_discards: 0

[4]: rx_skb_alloc_discard: 34748

[5]: rx_discards: 7583294

[5]: rx_phy_ip_err_discards: 0

[5]: rx_skb_alloc_discard: 32756

[6]: rx_discards: 3703892

[6]: rx_phy_ip_err_discards: 0

[6]: rx_skb_alloc_discard: 28380

[7]: rx_discards: 31746726

[7]: rx_phy_ip_err_discards: 0

[7]: rx_skb_alloc_discard: 32609

rx_discards: 103189181

rx_mf_tag_discard: 0

rx_brb_discard: 90068

rx_phy_ip_err_discards: 0

rx_skb_alloc_discard: 252852

No other errors are reported.
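To check whether the device-wide counters are simply the sum of the per-queue counters, and to see how unevenly the discards are spread across queues, the `ethtool -S` output can be summed with a small awk filter. This is a minimal sketch; `sum_queue_counter` is a hypothetical helper name, not part of ethtool, and it assumes the `[N]: name: value` layout shown above.

```shell
#!/bin/sh
# sum_queue_counter NAME (hypothetical helper, not an ethtool feature):
# reads `ethtool -S <iface>` output on stdin and prints
# "<sum of per-queue values> <device-wide value>" for counter NAME.
sum_queue_counter() {
    awk -v name="$1" '
        {
            line = $0
            gsub(/^[ \t]+/, "", line)           # drop leading indentation
            queue = 0
            if (line ~ /^\[[0-9]+\]:/) {        # per-queue line: "[3]: name: value"
                queue = 1
                sub(/^\[[0-9]+\]:[ \t]*/, "", line)
            }
            n = split(line, f, /:[ \t]*/)       # f[1] = counter name, f[2] = value
            if (n >= 2 && f[1] == name) {
                if (queue) sum += f[2]; else total = f[2]
            }
        }
        END { printf "%d %d\n", sum, total }
    '
}

# Usage (needs a real interface, so it is left commented out):
#   ethtool -S p4p2 | sum_queue_counter rx_discards
```

Applied to the values posted above, the eight per-queue `rx_discards` add up to the device total of 103189181, and the per-queue `rx_skb_alloc_discard` values add up to 252852. Queues 0 and 1 each show fewer than 100,000 discards while queues 3 and 7 each show more than 30 million, which suggests the receive load is not spread evenly across queues.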

hebda_data_23:~ # for i in `seq 1 10`; do ifconfig p4p2 | grep RX | grep overruns; sleep 1; done

RX packets:253639505018 errors:305619311 dropped:0 overruns:305375168 frame:244143

RX packets:253639552428 errors:305619311 dropped:0 overruns:305375168 frame:244143

RX packets:253639566818 errors:305619311 dropped:0 overruns:305375168 frame:244143

RX packets:253639585722 errors:305619311 dropped:0 overruns:305375168 frame:244143

RX packets:253639597202 errors:305619311 dropped:0 overruns:305375168 frame:244143

RX packets:253639610209 errors:305619311 dropped:0 overruns:305375168 frame:244143

RX packets:253639622800 errors:305619311 dropped:0 overruns:305375168 frame:244143

RX packets:253639642350 errors:305620450 dropped:0 overruns:305376307 frame:244143

RX packets:253639675509 errors:305620450 dropped:0 overruns:305376307 frame:244143

RX packets:253639723772 errors:305620471 dropped:0 overruns:305376328 frame:244143

hebda_data_23:~ # for i in `seq 1 10`; do ifconfig p4p2 | grep RX | grep overruns; sleep 1; done

RX packets:253639788669 errors:305620773 dropped:0 overruns:305376630 frame:244143

RX packets:253639812355 errors:305621201 dropped:0 overruns:305377058 frame:244143

RX packets:253639834600 errors:305621201 dropped:0 overruns:305377058 frame:244143

RX packets:253639892990 errors:305621455 dropped:0 overruns:305377312 frame:244143

RX packets:253639913026 errors:305621455 dropped:0 overruns:305377312 frame:244143

RX packets:253639919136 errors:305621455 dropped:0 overruns:305377312 frame:244143

RX packets:253639935095 errors:305622380 dropped:0 overruns:305378237 frame:244143

RX packets:253639954560 errors:305623012 dropped:0 overruns:305378869 frame:244143

RX packets:253639961150 errors:305623012 dropped:0 overruns:305378869 frame:244143

RX packets:253639971680 errors:305623012 dropped:0 overruns:305378869 frame:244143
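The absolute counters are hard to read at a glance, so it can help to print the increase between consecutive samples instead. The filter below is a sketch that assumes the `ifconfig` line format shown above (`packets:` and `overruns:` fields); `overrun_deltas` is a hypothetical helper name.

```shell
#!/bin/sh
# overrun_deltas (hypothetical helper): reads lines such as
#   RX packets:253639505018 errors:305619311 dropped:0 overruns:305375168 frame:244143
# on stdin and prints, for every sample after the first, how much the
# packet and overrun counters grew since the previous sample.
overrun_deltas() {
    awk '
        {
            pk = ""; ov = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^packets:/)  pk = substr($i, 9)    # text after "packets:"
                if ($i ~ /^overruns:/) ov = substr($i, 10)   # text after "overruns:"
            }
            if (pk == "" || ov == "") next                   # not an RX counter line
            if (seen) printf "packets +%d overruns +%d\n", pk - prev_pk, ov - prev_ov
            prev_pk = pk; prev_ov = ov; seen = 1
        }
    '
}

# Usage (left commented out because it needs the real interface):
#   for i in $(seq 1 10); do ifconfig p4p2 | grep overruns; sleep 1; done | overrun_deltas
```

In the samples above, `errors` always equals `overruns` plus `frame` (305375168 + 244143 = 305619311), and `frame` stays at 244143 throughout, so every new error in these runs is an overrun. Across the first loop the overruns grow by 1160 while the packet counter grows by 218754, roughly 0.5%.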

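Workload configuration

Gp DB 4.3

Problem description

NIC utilization after installing the application is shown in the figure below:

However, at peak times Nagios shows every node in the cluster reporting the error below. Similar errors also appeared in bare runs without the application, but I did not get a chance to capture packets on the NIC: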

Interface 11

Active checks of the service have been disabled - only passive checks are being accepted Perform Extra Service Actions

CRITICAL 09-20-2016 10:47:51 0d 0h 11m 46s 1/1 CRIT - [p4p2] (up) MAC: f4:e9:d4:9d:cb:92, 10.00 Gbit/s, in: 262.67 MB/s, in-errors: 0.16%(!!) >= 0.1, out: 237.76 MB/s

The commands the check actually runs are:

echo '<<>>'

sed 1,2d /proc/net/dev

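Overall, the error rate sits between 0.1% and 0.6% and only very rarely reaches 1%. Traffic at those times ranged from roughly 20 MB/s to 200 MB/s.

First question: is this actually a problem? My feeling is that it is, which is why I have spent time on it. What do the experts here think?

Second question: how can it be fixed? I have a few ideas and would welcome criticism.

From what I have read online, I suspect the problem lies in the rx errors. Since most of them are overruns, could this be an interrupt-handling problem rather than a ring buffer problem?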
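The percentage can be reproduced by hand from two snapshots of `/proc/net/dev`. The sketch below assumes the rate is errors divided by received packets plus errors over the sampling interval; the exact formula used by the monitoring check is not shown in this post, so treat that as an assumption. `rx_error_pct` is a hypothetical helper name.

```shell
#!/bin/sh
# rx_error_pct IFACE (hypothetical helper): stdin holds two /proc/net/dev
# snapshots, one after the other. Prints the receive error percentage for
# IFACE between the first and the last snapshot, assuming
#   pct = 100 * d_errs / (d_packets + d_errs)
rx_error_pct() {
    awk -v ifc="$1" '
        {
            sub(/:/, " ")                  # accept "p4p2:123" and "p4p2: 123"
            if ($1 != ifc) next
            n++
            pk[n] = $3; er[n] = $4         # receive packets, receive errs
        }
        END {
            if (n < 2) exit 1              # need at least two samples
            dp = pk[n] - pk[1]; de = er[n] - er[1]
            if (dp + de <= 0) { print "0.00"; exit 0 }
            printf "%.2f\n", 100 * de / (dp + de)
        }
    '
}

# Usage (left commented out because it samples the live system):
#   { cat /proc/net/dev; sleep 10; cat /proc/net/dev; } | rx_error_pct p4p2
```

A result at or above 0.10 would correspond to the `>= 0.1` threshold visible in the Nagios output above.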
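One way to start separating the two hypotheses is to compare the current RX ring size with the hardware maximum, and to look at how the NIC's interrupts are spread over CPUs. The sketch below only parses `ethtool -g` output; `rx_ring_usage` is a hypothetical helper name, and whether enlarging the ring or rebalancing interrupts actually removes the overruns on this system has not been established here.

```shell
#!/bin/sh
# rx_ring_usage (hypothetical helper): reads `ethtool -g <iface>` output on
# stdin and prints the current and maximum RX ring sizes.
rx_ring_usage() {
    awk '
        /^Pre-set maximums/          { section = "max" }
        /^Current hardware settings/ { section = "cur" }
        $1 == "RX:" {                      # skips "RX Mini:" and "RX Jumbo:"
            if (section == "max") max = $2
            if (section == "cur") cur = $2
        }
        END { printf "rx ring: current=%d max=%d\n", cur, max }
    '
}

# Usage (left commented out because these need the real interface and root):
#   ethtool -g p4p2 | rx_ring_usage
#   ethtool -G p4p2 rx <size>        # raise the RX ring, up to the reported maximum
#   grep p4p2 /proc/interrupts       # per-CPU interrupt counts for the NIC queues
```

If the ring is already at its maximum and the interrupt counts in `/proc/interrupts` are concentrated on a few CPUs, that would point toward interrupt distribution rather than ring size. If the ring is well below its maximum, raising it is the cheaper experiment to try first.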
