CHW-3241: l2_watchdog_reset occurs on a FAS2720 running the latest SP version 11.11
Issue
- The node reboots because of a watchdog reset:
[Node-02: cf_hwassist: cf.hwassist.takeoverTrapRecv:notice]: hw_assist: Received takeover hw_assist alert from partner(Node-n1), system_down because l2_watchdog_reset.
- The SP log shows the following events:
[IPMI.notice]: 0029 | 02 | EVT: 6fc124ff | System_Watchdog | Assertion Event, "Hard reset"
[IPMI Event.critical]: L2 watchdog timeout hard reset
[IPMI Event.critical]: System reset
[IPMI Event.critical]: L2 watchdog action completed
[Trap Event.critical]: hwassist l2_watchdog_reset (29)
[IPMI.notice]: 002a | 02 | EVT: 0301ffff | SysReset | Assertion Event, "State Asserted"
[IPMI.notice]: L2 to L1 is 1(s) 1189(us)
[IPMI.notice]: 002b | 02 | EVT: 0301ffff | CPU_Cat_Error | Assertion Event, "State Asserted"
[BMC.notice]: Delaying L2_WDOG ASUP email for 120 seconds
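
To check a saved SP log for the same symptom, a short script can search for the signature strings quoted above. This is a minimal sketch, not a NetApp tool: the function names and the choice of which strings count as a match are assumptions made for illustration, and the log is assumed to be available as plain text.

```python
import re

# Signature strings copied from the SP log lines quoted in this article.
# Which of them are required for a positive match is an assumption.
SIGNATURES = {
    "l2_timeout": re.compile(r"L2 watchdog timeout hard reset"),
    "hwassist_trap": re.compile(r"hwassist l2_watchdog_reset"),
    "cpu_cat_error": re.compile(r"CPU_Cat_Error"),
}


def find_watchdog_signatures(log_text):
    """Return {signature_name: [matching lines]} for a block of SP log text."""
    hits = {name: [] for name in SIGNATURES}
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits[name].append(line.strip())
    return hits


def looks_like_l2_watchdog_reset(log_text):
    """True when both the L2 timeout event and the hwassist trap appear."""
    hits = find_watchdog_signatures(log_text)
    return bool(hits["l2_timeout"]) and bool(hits["hwassist_trap"])
```

Calling `looks_like_l2_watchdog_reset(text)` on the contents of a saved log returns `True` only when both the L2 timeout event and the hwassist trap are present. `find_watchdog_signatures(text)` returns the matching lines so they can be compared with the ones listed here.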