MRC Series (2): How the Host Handles Errors in the MRC

1. Memory Discovery

1.1 SPD read protocol errors

SPD read protocol errors are generally not correctable. An SPD read error (NACK) will appear as DIMM not present. The MRC will attempt to recover from an SMBus hang condition and retry the transaction up to three times before giving up.

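In short: these errors are not correctable. After an error the MRC retries up to three times, and if the read still fails, the slot is treated as having no DIMM installed.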
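The retry behavior can be sketched roughly as follows. This is only an illustration, not MRC code: `SmbusNack`, `read_once`, and `recover_bus` are hypothetical names, and the sketch assumes three attempts in total (the text could also be read as one initial attempt plus three retries).

```python
from typing import Callable, Optional

MAX_SPD_ATTEMPTS = 3  # assumption: "up to three times" means three attempts in total


class SmbusNack(Exception):
    """Hypothetical exception: the SPD device did not acknowledge the read."""


def read_spd_byte(
    read_once: Callable[[], int],
    recover_bus: Callable[[], None],
    max_attempts: int = MAX_SPD_ATTEMPTS,
) -> Optional[int]:
    """Return the byte read, or None if the slot should be treated as empty."""
    for _ in range(max_attempts):
        try:
            return read_once()
        except SmbusNack:
            # Try to clear a possible SMBus hang before the next attempt.
            recover_bus()
    # All attempts failed: the caller reports "DIMM not present".
    return None
```

Returning `None` rather than raising mirrors the point made above: from the host's perspective a persistent NACK is indistinguishable from an empty slot.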

1.2 SPD data errors

SPD data errors are not correctable. The MRC will qualify the SPD data that is needed for subsequent initialization steps. The MRC will not verify the SPD CRC value by default, but an MRC input parameter can be provided to verify the CRC.

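SPD data errors are not correctable. By default the MRC does not verify the SPD CRC, although an MRC input parameter can turn CRC verification on. When data is read from the SPD over I2C or I3C, there is no way to know from the transfer itself whether the bytes are correct, so the MRC matches the fields it needs against a table of supported values; anything that does not match is treated as an error.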
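As a rough sketch of this qualification step, assuming a small table of accepted values and an optional CRC check: `SUPPORTED_VALUES`, `qualify_spd`, and `verify_crc` are hypothetical names, and the CRC layout used here (CRC-16 with polynomial 0x1021 over bytes 0 to 125, stored low byte first at bytes 126 and 127) is an assumption modeled on the DDR4 base block, so check the JEDEC SPD specification for the actual device.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC-16 with polynomial 0x1021, initial value 0, no bit reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


# Hypothetical table: SPD byte offset -> values this sketch accepts.
# Offset 2 is the DRAM device type byte (0x0C = DDR4, 0x12 = DDR5).
SUPPORTED_VALUES = {2: {0x0C, 0x12}}


def qualify_spd(spd: bytes, verify_crc: bool = False, crc_span: int = 126) -> bool:
    """Return True if the SPD image passes qualification."""
    # Field check: every field we depend on must hold a supported value.
    for offset, accepted in SUPPORTED_VALUES.items():
        if offset >= len(spd) or spd[offset] not in accepted:
            return False
    # Optional CRC check, off by default as described in the text.
    if verify_crc:
        if len(spd) < crc_span + 2:
            return False
        stored = spd[crc_span] | (spd[crc_span + 1] << 8)
        if crc16_xmodem(spd[:crc_span]) != stored:
            return False
    return True
```

With `verify_crc=False` a corrupted byte goes unnoticed unless it falls in a field that the table checks, which is why the field check matters even when the bus transfer itself appeared to succeed.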

1.3 DIMM population errors

DIMM population errors are generally not correctable when the POR configuration table is being enforced. If the POR table is not enforced, some combinations of memory population (such as fewer ranks in slot 0 compared to slot 1) can be tolerated.

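Examples of population rules: SODIMMs should not be mixed with RDIMMs or UDIMMs, while mixing RDIMMs and LRDIMMs is allowed. When the POR (Plan of Record) table is enforced, any configuration that violates it is treated as an error.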

2. Memory Training

In the memory training phase, the MRC uses the memory controller to test the DDR bus and adjust timing/Vref for optimal margins. Errors found during memory training are assumed to affect the interconnect of DDR signals on the bus (Command, Address, and Data). An error on one rank can potentially affect the same signals connected to another rank on the same channel due to non-target termination requirements.

In other words, an error on one rank may also affect another rank that is connected to the same signals.

2.1 Correctable data errors

Correctable data errors on a given rank can be tolerated as long as they do not conflict with correctable errors on another rank on the channel. A correctable data error consists of Dq/Dqs stuck-at 0/1 faults on a single x4 nibble in independent channel mode, or a byte in lockstep mode. A primary Dq or Dqs failure on a x8 device will affect the whole byte. An MRC input parameter is provided to allow tolerating correctable data errors during memory training, or to treat the errors as uncorrectable.

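As long as they do not conflict with correctable errors on another rank of the same channel, correctable data errors on a given rank can be tolerated. An MRC input parameter selects whether correctable data errors are tolerated during memory training or treated as uncorrectable.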

2.2 Uncorrectable data errors

Uncorrectable data errors during memory training will result in the DDR channel being disabled. An uncorrectable data error occurs when one or more ranks have Dq/Dqs errors spanning multiple nibbles in independent channel mode, or multiple bytes in lockstep mode.

When an uncorrectable data error occurs, the host disables that DDR channel.
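The distinction between sections 2.1 and 2.2 can be illustrated with a small classifier. This is a sketch under stated assumptions, not the MRC algorithm: failing lanes are given as DQ bit indices per rank, a "conflict" between ranks is modeled as their failures landing in different nibbles (or bytes in lockstep mode), and `classify_channel` and `tolerate_correctable` are hypothetical names.

```python
from enum import Enum
from typing import Dict, Set


class Verdict(Enum):
    PASS = "pass"
    CORRECTABLE = "correctable"
    UNCORRECTABLE = "uncorrectable"  # the channel would be disabled


def classify_channel(
    failing_dq_by_rank: Dict[int, Set[int]],
    lockstep: bool = False,
    tolerate_correctable: bool = True,
) -> Verdict:
    """Classify training failures on one channel from per-rank failing DQ lanes."""
    # Correction granularity: a x4 nibble in independent mode, a byte in lockstep.
    group_bits = 8 if lockstep else 4
    # Collect every nibble/byte group touched by a failure on any rank.
    groups = {
        dq // group_bits
        for lanes in failing_dq_by_rank.values()
        for dq in lanes
    }
    if not groups:
        return Verdict.PASS
    # Failures confined to one group across all ranks are treated as correctable,
    # unless the input parameter asks for them to be treated as uncorrectable.
    if len(groups) == 1 and tolerate_correctable:
        return Verdict.CORRECTABLE
    return Verdict.UNCORRECTABLE
```

For example, failures on DQ3 of rank 0 and DQ4 of rank 1 fall in two different nibbles and come out as `UNCORRECTABLE` in independent mode, but both sit in byte 0 and come out as `CORRECTABLE` when `lockstep=True`.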

2.3 Uncorrectable command/address errors

Uncorrectable command/address errors during memory training will result in the DDR channel being disabled. An uncorrectable command/address error occurs when the DIMM unexpectedly asserts the ALERT signal due to a Parity error.

When an uncorrectable C/A error occurs, the host disables that DDR channel. This kind of error is usually caused by a parity error.

3. Memory Test

3.1 Transient data errors

Transient data errors during memory test will be tolerated and the given rank will be available for the system memory map. The system relies on runtime RAS features for handling transient ECC errors on the rank.

RAS: Reliability, Availability, Serviceability

3.2 Persistent correctable data errors

Persistent correctable data errors during memory test can be tolerated and the given rank can be available for the system memory map. The MRC provides an input parameter to tolerate correctable data errors during memory test or to treat those errors as uncorrectable.

How this error is handled once it is detected depends on the input parameter, which can be given a default value.

3.3 Persistent uncorrectable data errors

Persistent uncorrectable data errors during memory test will result in the rank being excluded from the system memory map. The rank will still participate in termination on the channel, but it will be excluded from all DRAM read/write commands including patrol scrub. If all ranks on the channel are mapped out, then the channel will be disabled.

When this error occurs, the affected rank is mapped out. The channel itself is disabled only if all of its ranks have been mapped out.
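Putting sections 3.1 to 3.3 together, the memory test outcome for one channel might be sketched like this. `RankResult`, `build_memory_map`, and `tolerate_correctable` are hypothetical names used only for illustration; the real MRC data structures are not described in the text.

```python
from enum import Enum
from typing import Dict, List, Tuple


class RankResult(Enum):
    CLEAN = 0
    TRANSIENT = 1                  # tolerated; runtime RAS features handle it
    PERSISTENT_CORRECTABLE = 2     # tolerated or not, per input parameter
    PERSISTENT_UNCORRECTABLE = 3   # rank is always mapped out


def build_memory_map(
    results: Dict[int, RankResult],
    tolerate_correctable: bool = True,
) -> Tuple[List[int], bool]:
    """Return (ranks kept in the memory map, whether the channel stays enabled)."""
    mapped_in: List[int] = []
    for rank, result in sorted(results.items()):
        if result is RankResult.PERSISTENT_UNCORRECTABLE:
            continue  # excluded from the system memory map
        if result is RankResult.PERSISTENT_CORRECTABLE and not tolerate_correctable:
            continue  # treated as uncorrectable by configuration
        mapped_in.append(rank)
    # The channel is disabled only when every rank has been mapped out.
    channel_enabled = bool(mapped_in)
    return mapped_in, channel_enabled
```

Note that a mapped-out rank still participates in termination on the channel, which this sketch does not model; it only tracks which ranks receive DRAM read/write commands.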
