System reboots because the SP heartbeat stopped on AFF A250
Applies to
- BMC 15.10
- AFF C250
Issue
The node reboots because the SP heartbeat (HBT) stopped.
Sat Aug 19 03:46:24 -0400 [cluster-01: spmgrd: sp.heartbeat.stopped:debug]: Have not received a IPMI heartbeat from the Service Processor (SP) in last 600 seconds.
Sat Aug 19 03:46:24 -0400 [cluster-01: spmgrd: callhome.sp.hbt.missed:debug]: Call home for SP HBT MISSED
Sat Aug 19 03:56:44 -0400 [cluster-01: spmgrd: callhome.sp.hbt.stopped:debug]: Call home for SP HBT STOPPED
Sat Aug 19 03:59:08 -0400 [cluster-01: env_mgr: sp.ipmi.lost.shutdown:EMERGENCY]: SP heartbeat stopped and cannot be recovered. To prevent hardware damage and data loss, the system will shut down in 10 minutes.
Sat Aug 19 04:09:08 -0400 [cluster-01: env_mgr: monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (System reboot to recover the BMC)

The partner node then takes over after detecting that the node is rebooting:
Sat Aug 19 04:09:33 -0400 [cluster-02: cf_main: cf.fsm.takeover.on.reboot:debug]: Failover monitor: One node initiated automatic takeover after detecting that its partner node is rebooting.
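
To check the timing of this sequence in a saved EMS log, the entries can be parsed and the gaps between events measured. The following is a minimal sketch in Python, not a NetApp tool. The regular expression is inferred only from the line layout shown above, and `parse_ems_line`, `seconds_between`, and `assumed_year` are hypothetical names. The log lines carry no year, so the caller has to supply one; 2023 is used here only because August 19 falls on a Saturday in that year.

```python
import re
from datetime import datetime

# Layout inferred from the excerpt above (an assumption, not an official grammar):
# "<timestamp> [<node>: <process>: <event>:<severity>]: <message>"
EMS_LINE = re.compile(
    r"(?P<ts>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} [+-]\d{4}) "
    r"\[(?P<node>[^:\]]+): (?P<proc>[^:\]]+): (?P<event>[\w.]+):(?P<sev>\w+)\]: "
    r"(?P<msg>.*)"
)

def parse_ems_line(line, assumed_year):
    """Parse one EMS line into a dict, or return None if it does not match."""
    m = EMS_LINE.match(line.strip())
    if m is None:
        return None
    # The year is not present in the log, so it is appended from the caller's assumption.
    when = datetime.strptime(f"{m['ts']} {assumed_year}", "%a %b %d %H:%M:%S %z %Y")
    return {"time": when, "node": m["node"], "process": m["proc"],
            "event": m["event"], "severity": m["sev"], "message": m["msg"]}

def seconds_between(events, first, second):
    """Seconds from the first occurrence of event `first` to that of `second`."""
    t1 = next(e["time"] for e in events if e["event"] == first)
    t2 = next(e["time"] for e in events if e["event"] == second)
    return int((t2 - t1).total_seconds())

# Three entries copied from the excerpt above.
SAMPLE = [
    "Sat Aug 19 03:59:08 -0400 [cluster-01: env_mgr: sp.ipmi.lost.shutdown:EMERGENCY]: "
    "SP heartbeat stopped and cannot be recovered. To prevent hardware damage and data "
    "loss, the system will shut down in 10 minutes.",
    "Sat Aug 19 04:09:08 -0400 [cluster-01: env_mgr: monitor.shutdown.emergency:EMERGENCY]: "
    "Emergency shutdown: Environmental Reason Shutdown (System reboot to recover the BMC)",
    "Sat Aug 19 04:09:33 -0400 [cluster-02: cf_main: cf.fsm.takeover.on.reboot:debug]: "
    "Failover monitor: One node initiated automatic takeover after detecting that its "
    "partner node is rebooting.",
]

events = [e for e in (parse_ems_line(l, assumed_year=2023) for l in SAMPLE) if e]
warning_to_shutdown = seconds_between(
    events, "sp.ipmi.lost.shutdown", "monitor.shutdown.emergency")
shutdown_to_takeover = seconds_between(
    events, "monitor.shutdown.emergency", "cf.fsm.takeover.on.reboot")
print(warning_to_shutdown, shutdown_to_takeover)  # prints "600 25"
```

For this excerpt, the 600-second gap matches the 10-minute warning in the `sp.ipmi.lost.shutdown` message, and the partner's takeover is logged 25 seconds after the emergency shutdown.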