"ADR timeout" event is reported after an L2 watchdog reset
Environment
- ONTAP 9
- FAS8200
- AFF A300
Issue
A node went down because of an L2 watchdog timeout that followed an ADR (Asynchronous DRAM Refresh).
Example from the events all log:
Record 392: Sat Feb 27 03:41:59 2021 [IPMI.critical]: ADR event ID: 0 at time: 1614397319
Record 393: Sat Feb 27 03:41:59 2021 [IPMI.critical]: ADR Battery destage cycles: 19
Record 394: Sat Feb 27 03:42:01 2021 [IPMI Event.critical]: NMI
Record 395: Sat Feb 27 03:42:01 2021 [IPMI.notice]: bd00 | 02 | EVT: 6fc824ff | System_Watchdog | Assertion Event, "Timer interrupt"
Record 396: Sat Feb 27 03:42:02 2021 [IPMI Event.critical]: L2 watchdog timeout hard reset
Record 397: Sat Feb 27 03:42:02 2021 [Trap Event.critical]: hwassist l2_watchdog_reset (29)
Record 398: Sat Feb 27 03:42:02 2021 [Trap Event.critical]: SNMP l2_watchdog_reset (29)
Record 399: Sat Feb 27 03:42:02 2021 [SysFW.notice]: ADR Timeout
Record 400: Sat Feb 27 03:42:02 2021 [SysFW.notice]: ADR Timeout
Record 401: Sat Feb 27 03:42:07 2021 [IPMI.notice]: be00 | 02 | EVT: 6fc104ff | System_Watchdog | Assertion Event, "Hard reset"
Record 402: Sat Feb 27 03:42:12 2021 [SP.critical]: Filer Reboots
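
As a rough aid for spotting this signature in a saved copy of the log, the sketch below parses records in the format shown above and checks whether an "ADR Timeout" record appears after an "L2 watchdog timeout" record. The function names parse_record and adr_timeout_after_l2_reset are hypothetical helpers written for this illustration and are not part of ONTAP or the SP firmware. The record layout is assumed from the sample alone, and timestamps are assumed to use English day and month names.

```python
import re
from datetime import datetime

# Layout assumed from the sample: "Record N: <timestamp> [<source>.<severity>]: <message>"
RECORD_RE = re.compile(
    r"^Record (?P<num>\d+): "
    r"(?P<ts>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) "
    r"\[(?P<source>[^\]]+)\]: (?P<message>.*)$"
)


def parse_record(line):
    """Parse one log line into a dict, or return None if it does not match."""
    m = RECORD_RE.match(line.strip())
    if m is None:
        return None
    # "IPMI Event.critical" -> source "IPMI Event", severity "critical"
    source, _, severity = m.group("source").rpartition(".")
    return {
        "num": int(m.group("num")),
        "time": datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S %Y"),
        "source": source,
        "severity": severity,
        "message": m.group("message"),
    }


def adr_timeout_after_l2_reset(lines):
    """Return True if an 'ADR Timeout' record follows an L2 watchdog timeout record."""
    seen_reset = False
    for line in lines:
        rec = parse_record(line)
        if rec is None:
            continue  # skip lines that are not log records
        if "L2 watchdog timeout" in rec["message"]:
            seen_reset = True
        elif seen_reset and rec["message"].strip() == "ADR Timeout":
            return True
    return False
```

Passing the lines of the example above (Record 392 through Record 402) to adr_timeout_after_l2_reset would be expected to match, since Record 396 precedes Records 399 and 400. A match only indicates that the log resembles this example; it is not a diagnosis.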