Correctable memory errors on a system memory DIMM
Environment
- FAS and AFF systems
- ONTAP 9
Issue
- Correctable ECC (CECC) errors are reported 10 times within 1 hour.
- The SNMP trap tool shows the following errors:
[productTrapData.0 = cecc_log.summary:Total of 1 new correctable ECC errors just reported. You might want to check system memory. 5 correctable ECC errors reported since booting. ; productSerialNum.0 = [productSerialNum],[DC=XXXXX-OS]]
[productTrapData.0 = cecc_log.entry:1: ECC error at DIMM-2: CE-02-1921-xxx,ADDR [address],(Node(0), Memory controller(0), CH(0), DIMM(1), Rank(0), Bank Group(0), Bank(0x2), Row(0xfd2c), Col(0x150),Correctable Machine Check Error at CPUxx. BDWL_HA0 Error:
- EMS shows the same error messages:
[?] Thu Jun 17 03:40:18 JST [hostname: cecc_logger: cecc_log.entry:notice]: 1: ECC error at DIMM-2: CE-02-1921-xxx,ADDR 0x1029c86a80,(Node(0), Memory controller(0), CH(0), DIMM(1), Rank(0), Bank Group(0), Bank(0x2), Row(0xfd2c), Col(0x150), Correctable Machine Check Error at CPU15. BDWL_HA0 Error: STATUS<0x8c00004000010090>(Val,MiscV,AddrV,CorrSts(0),CorrCnt(0x1),ExtErr(0x1),ErrCode(Channel 0, Read)ErrCode(0x90))MISC<0x0000000150149486>(HaDbBank(0),PE(0),ReqOpcode(0xa),RNID(0),RTID(0xa),HTID(0x4a))
[?] Thu Jun 17 03:40:18 JST [hostname: cecc_logger: cecc_log.summary:notice]: Total of 1 new correctable ECC errors just reported. You might want to check system memory. 1 correctable ECC errors reported since booting.
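The "10 times within 1 hour" condition can be checked against saved EMS output with a small script. The following is a minimal sketch, not a NetApp tool. The function name `dimms_over_threshold` is hypothetical, and it makes several assumptions: the log lines follow the `cecc_log.entry` format shown above, they are in chronological order, reports are counted per DIMM label, and the caller supplies the year because the EMS timestamp does not include one.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Captures the timestamp (without weekday and time zone) and the DIMM label
# from a cecc_log.entry EMS line in the format shown in this article.
ENTRY_RE = re.compile(
    r"\w{3} (\w{3} +\d+ \d\d:\d\d:\d\d) \w+ \[.*?cecc_log\.entry:\w+\]: "
    r"\d+: ECC error at (DIMM-\d+)"
)


def dimms_over_threshold(lines, year, threshold=10, window=timedelta(hours=1)):
    """Return the DIMM labels that have at least `threshold` entries
    inside any span shorter than `window`.

    Assumption: `lines` are in chronological order and all fall in `year`.
    """
    recent = defaultdict(deque)  # DIMM label -> timestamps still inside the window
    flagged = set()
    for line in lines:
        match = ENTRY_RE.search(line)
        if not match:
            continue  # not a cecc_log.entry line
        stamp, dimm = match.groups()
        when = datetime.strptime(f"{stamp} {year}", "%b %d %H:%M:%S %Y")
        queue = recent[dimm]
        queue.append(when)
        # Drop timestamps that are a full window or more older than this one.
        while when - queue[0] >= window:
            queue.popleft()
        if len(queue) >= threshold:
            flagged.add(dimm)
    return flagged
```

Passing the lines of a saved EMS log, for example `dimms_over_threshold(open("ems.log"), year=2021)` with a hypothetical file name, returns the set of DIMM labels that reached the threshold. A log containing only the single entry shown above would return an empty set.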