Linux memory module failure: linux – How can I find the faulty memory module from an MCE message?

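I am trying to understand an MCE message so that I can work out which memory module in a server is bad. The following message appeared in /var/log/kern.log on one of my servers, which froze twice today.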

Apr 13 22:39:22 mbox kernel: [36247975.116860] sbridge: HANDLING MCE MEMORY ERROR
Apr 13 22:39:22 mbox kernel: [36247975.116867] CPU 0: Machine Check Exception: 0 Bank 5: 8c00004000010090
Apr 13 22:39:22 mbox kernel: [36247975.116869] TSC 0 ADDR 4a0d75900 MISC 21405cdc86 PROCESSOR 0:206d7 TIME 1428957562 SOCKET 0 APIC 0
Apr 13 22:39:22 mbox kernel: [36247975.951013] EDAC MC0: 1 CE memory read error
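For reference, the status word in the second log line (8c00004000010090) can be unpacked by hand. Below is a minimal Python sketch, assuming the architectural IA32_MCi_STATUS bit layout from the Intel SDM and its compound error code for memory controller errors (000F 0000 1MMM CCCC). The function name and dictionary keys are invented for illustration. Note that "Bank 5" in the kernel message is a CPU machine-check bank index, which is not the same thing as a DIMM bank.

```python
# Sketch only: field positions follow the architectural IA32_MCi_STATUS layout.
# Model-specific bits (e.g. MSCOD meaning) are NOT interpreted here.

MEM_REQUEST_TYPES = {
    0b000: "generic undefined request",
    0b001: "memory read",
    0b010: "memory write",
    0b011: "address/command",
    0b100: "memory scrubbing",
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit MCi_STATUS value into its architectural fields."""
    mcacod = status & 0xFFFF
    fields = {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),
        "uncorrected": bool((status >> 61) & 1),
        "enabled": bool((status >> 60) & 1),
        "misc_valid": bool((status >> 59) & 1),
        "addr_valid": bool((status >> 58) & 1),
        "context_corrupt": bool((status >> 57) & 1),
        # Bits 52:38; assumes the CPU supports corrected error count reporting.
        "corrected_count": (status >> 38) & 0x7FFF,
        "mscod": (status >> 16) & 0xFFFF,
        "mcacod": mcacod,
    }
    # Memory controller error pattern: 000F 0000 1MMM CCCC (F is ignored).
    if mcacod & 0xEF80 == 0x0080:
        mmm = (mcacod >> 4) & 0x7
        cccc = mcacod & 0xF
        fields["error_class"] = "memory controller"
        fields["request"] = MEM_REQUEST_TYPES.get(mmm, "reserved")
        # 1111 means the channel is not specified.
        fields["channel"] = None if cccc == 0xF else cccc
    return fields


if __name__ == "__main__":
    for key, value in decode_mci_status(0x8C00004000010090).items():
        print(f"{key}: {value}")
```

Under these assumptions the value decodes as a valid, corrected (not uncorrected) memory read error with a count of 1 and a valid address, which agrees with the "1 CE memory read error" line from EDAC. The channel field in this encoding is relative to the reporting unit, so it does not by itself name a DIMM slot.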

I suspect a bad memory module. The server has 2x Xeon E5-2650 with 8x 8 GB memory modules (each CPU has 8 memory slots).

Here is the lshw output for the memory modules:

  *-memory:0
       description: System Memory
       physical id: 2d
       slot: System board or motherboard
     *-bank:0
          description: DIMM DDR3 1333 MHz (0,8 ns)
          product: 9965516-197.A
          vendor: Kingston
          physical id: 0
          serial: B83AE5C2
          slot: P1_DIMMA1
          size: 8GiB
          width: 64 bits
          clock: 1333MHz (0.8ns)
     *-bank:1
          description: DIMM Synchronous [empty]
          product: Dimm1_PartNum
          vendor: Dimm1_Manufacturer
          physical id: 1
          serial: Dimm1_SerNum
          slot: P1_DIMMA2
          width: 64 bits
     *-bank:2
          description: DIMM DDR3 1333 MHz (0,8 ns)
          product: 9965516-048.A
          vendor: Kingston
          physical id: 2
          serial: EC309238
          slot: P1_DIMMB1
          size: 8GiB
          width: 64 bits
          clock: 1333MHz (0.8ns)
     *-bank:3
          description: DIMM Synchronous [empty]
          product: Dimm4_PartNum
          vendor: Dimm4_Manufacturer
          physical id: 3
          serial: Dimm4_SerNum
          slot: P1_DIMMB2
          width: 64 bits
     *-bank:4
          description: DIMM DDR3 1333 MHz (0,8 ns)
          product: 9965516-048.A
          vendor: Kingston
          physical id: 4
          serial: E9305438
          slot: P1_DIMMC1
          size: 8GiB
          width: 64 bits
          clock: 1333MHz (0.8ns)
     *-bank:5
          description: DIMM Synchronous [empty]
          product: Dimm7_PartNum
          vendor: Dimm7_Manufacturer
          physical id: 5
          serial: Dimm7_SerNum
          slot: P1_DIMMC2
          width: 64 bits
     *-bank:6
          description: DIMM DDR3 1333 MHz (0,8 ns)
          product: 9965516-048.A
          vendor: Kingston
          physical id: 6
          serial: E7305738
          slot: P1_DIMMD1
          size: 8GiB
          width: 64 bits
          clock: 1333MHz (0.8ns)
     *-bank:7
          description: DIMM Synchronous [empty]
          product: Dimm10_PartNum
          vendor: Dimm10_Manufacturer
          physical id: 7
          serial: Dimm10_SerNum
          slot: P1_DIMMD2
          width: 64 bits
  *-memory:1
       description: System Memory
       physical id: 3f
       slot: System board or motherboard
     *-bank:0
          description: DIMM DDR3 1333 MHz (0,8 ns)
          product: 9965516-197.A
          vendor: Kingston
          physical id: 0
          serial: B63A08C3
          slot: P2_DIMME1
          size: 8GiB
          width: 64 bits
          clock: 1333MHz (0.8ns)
     *-bank:1
          description: DIMM Synchronous [empty]
          product: Dimm1_PartNum
          vendor: Dimm1_Manufacturer
          physical id: 1
          serial: Dimm1_SerNum
          slot: P2_DIMME2
          width: 64 bits
     *-bank:2
          description: DIMM DDR3 1333 MHz (0,8 ns)
          product: 9965516-048.A
          vendor: Kingston
          physical id: 2
          serial: EA309638
          slot: P2_DIMMF1
          size: 8GiB
          width: 64 bits
          clock: 1333MHz (0.8ns)
     *-bank:3
          description: DIMM Synchronous [empty]
          product: Dimm4_PartNum
          vendor: Dimm4_Manufacturer
          physical id: 3
          serial: Dimm4_SerNum
          slot: P2_DIMMF2
          width: 64 bits
     *-bank:4
          description: DIMM DDR3 1333 MHz (0,8 ns)
          product: 9965516-048.A
          vendor: Kingston
          physical id: 4
          serial: E7305938
          slot: P2_DIMMG1
          size: 8GiB
          width: 64 bits
          clock: 1333MHz (0.8ns)
     *-bank:5
          description: DIMM Synchronous [empty]
          product: Dimm7_PartNum
          vendor: Dimm7_Manufacturer
          physical id: 5
          serial: Dimm7_SerNum
          slot: P2_DIMMG2
          width: 64 bits
     *-bank:6
          description: DIMM DDR3 1333 MHz (0,8 ns)
          product: 9965516-048.A
          vendor: Kingston
          physical id: 6
          serial: E7305B38
          slot: P2_DIMMH1
          size: 8GiB
          width: 64 bits
          clock: 1333MHz (0.8ns)
     *-bank:7
          description: DIMM Synchronous [empty]
          product: Dimm10_PartNum
          vendor: Dimm10_Manufacturer
          physical id: 7
          serial: Dimm10_SerNum
          slot: P2_DIMMH2
          width: 64 bits
  *-memory:2 UNCLAIMED
       physical id: 7
  *-memory:3 UNCLAIMED
       physical id: 9

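As you can see, there is no memory module in bank #5. So my questions are: do you agree that this message is about a memory failure? And if so, how can I find the module that needs to be replaced?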
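One way to narrow this down, since the EDAC driver is already reporting the error, is to read the per-DIMM error counters that EDAC exposes in sysfs. The sketch below assumes the kernel provides the per-DIMM layout under /sys/devices/system/edac/mc/mc*/dimm*/ with the files dimm_ce_count, dimm_ue_count, dimm_label and dimm_location; older kernels may only expose the csrow-based layout, which this sketch does not handle. The function names are invented for illustration.

```python
from pathlib import Path

# Default EDAC location; pass another root to the functions for testing.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def _read(path: Path):
    """Return the stripped file content, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def dimm_error_counts(root=EDAC_ROOT):
    """List every DIMM that EDAC knows about with its CE/UE counters."""
    rows = []
    for dimm in sorted(Path(root).glob("mc*/dimm*")):
        ce = _read(dimm / "dimm_ce_count")
        if ce is None:
            continue  # not a per-DIMM directory with counters
        rows.append({
            "mc": dimm.parent.name,
            "dimm": dimm.name,
            "label": _read(dimm / "dimm_label") or "",
            "location": _read(dimm / "dimm_location") or "",
            "ce_count": int(ce),
            "ue_count": int(_read(dimm / "dimm_ue_count") or 0),
        })
    return rows


def suspect_dimms(root=EDAC_ROOT):
    """Return only the DIMMs with at least one logged error, worst first."""
    bad = [r for r in dimm_error_counts(root) if r["ce_count"] or r["ue_count"]]
    return sorted(bad, key=lambda r: (r["ue_count"], r["ce_count"]), reverse=True)


if __name__ == "__main__":
    for row in suspect_dimms():
        print(row)
```

The location string (channel and slot within the memory controller) still has to be matched against the board's silkscreen names such as P1_DIMMA1. The label file is often empty unless labels have been registered for the specific motherboard, so treat an empty label as "unknown" rather than as proof that the slot is fine.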
