BIOS POST Failure(s) detected: SP IPMI failure
Applies to
- AFF and FAS platforms
Issue
- One node in the chassis locks the I2C bus.
- A takeover is initiated because no heartbeat was detected from the partner node.
Example:
[node_name : cf_main: cf.fsm.takeover.noHeartbeat:alert]: Failover monitor: Takeover initiated after no heartbeat was detected from the partner node.
[node_name : monitor: monitor.globalStatus.critical:EMERGENCY]: This node has taken over .
[env_mgr: monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (System reboot to recover the SP)
- Sensors can no longer be read.
- The halted node fails to boot.
Example:
Failed to recover SP
IPMI:Get controller FRU inventory:failed
IPMI:Get midplane FRU 0 inventory:failed
Configuring Devices ...
IPMI PCI Slot Control failed.
...
BIOS POST Failure(s) detected: SP IPMI failure. Abort AUTOBOOT
or
Failed to recover SP
IPMI:Get controller FRU inventory:failed
IPMI:Get midplane FRU 0 inventory:failed
Configuring Devices ...
IPMI PCI Slot Control failed.
CPU = 2 Processor(s) Detected.
Intel(R) Xeon(R) CPU D-1587 @ 1.70GHz (CPU 0)
CPUID: 0x00050664. Cores per Processor = 16
CPU1 (CPU 1)
CPUID: 0x00000000. Cores per Processor = 0
131072 MB System RAM Installed.
SATA (AHCI) Device: ATP SATA III mSATA AF120GSMHI-NT2
Boot Loader version 8.1.0
Copyright (C) 2000-2003 Broadcom Corporation.
Portions Copyright (C) 2002-2023 NetApp, Inc. All Rights Reserved.
BIOS POST Failure(s) detected: BMC IPMI failure. Abort AUTOBOOT
- Multiple sensors are unreadable because the I2C bus is locked.
Example:
[node_name : env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Temperature is Unreadable
[node_name : env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Current is Unreadable
[node_name : env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Fan1 Speed is Unreadable
[node_name : env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Fan2 Speed is Unreadable
[node_name : env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Over Temp is Unreadable
[node_name : env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Over Volt is Unreadable
[node_name : env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Over Curr is Unreadable
[node_name : env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 InPower Monitor is Unreadable
[node_name : cphmd: hm.alert.raised:alert]: Alert Id = CriticalFanFruFaultAlert , Alerting Resource = xxxx raised by monitor chassis
[node_name : env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 2 is degraded: PSU2 Over Curr is Unreadable
[node_name : env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 2 is degraded: PSU2 Crest Factor is Unreadable
[node_name : env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 2 is degraded: PSU2 InPower Monitor is Unreadable
[node_name : env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 2 is degraded: PSU2 Over Temp is Unreadable
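The symptoms listed in this article can be checked against saved log and console text with a short script. The following is a minimal sketch and not a NetApp tool. It assumes the EMS log and the console output are available as plain-text lines. The function name `summarize` and the shape of the returned dictionary are hypothetical. The event names and message strings are taken only from the examples in this article. A match shows that the text contains the same signatures. It does not by itself confirm that the I2C bus is locked.

```python
import re
from collections import Counter

# EMS event names copied from the examples in this article.
EMS_EVENTS = (
    "cf.fsm.takeover.noHeartbeat",
    "monitor.shutdown.emergency",
    "monitor.chassisPowerSupply.degraded",
)

# Matches the sensor part of a line such as "... PSU1 Fan1 Speed is Unreadable".
UNREADABLE_RE = re.compile(r"(PSU\d+) (.+?) is Unreadable")

# Matches the console boot failure for either the SP or the BMC variant.
POST_FAILURE_RE = re.compile(
    r"BIOS POST Failure\(s\) detected: (SP|BMC) IPMI failure"
)


def summarize(lines):
    """Count the signatures from this article in an iterable of text lines.

    Hypothetical helper: the name and return format are illustrative only.
    """
    events = Counter()
    unreadable = {}      # PSU name -> list of unreadable sensor names
    post_failures = []   # "SP" or "BMC" for each boot failure line found

    for line in lines:
        for name in EMS_EVENTS:
            # EMS lines carry the event name followed by ":severity".
            if name + ":" in line:
                events[name] += 1

        match = UNREADABLE_RE.search(line)
        if match:
            unreadable.setdefault(match.group(1), []).append(match.group(2))

        match = POST_FAILURE_RE.search(line)
        if match:
            post_failures.append(match.group(1))

    return {
        "events": dict(events),
        "unreadable": unreadable,
        "post_failures": post_failures,
    }


if __name__ == "__main__":
    # Sample lines copied from the examples in this article.
    sample = [
        "[node_name : cf_main: cf.fsm.takeover.noHeartbeat:alert]: Failover monitor: "
        "Takeover initiated after no heartbeat was detected from the partner node.",
        "[node_name : env_mgr: monitor.chassisPowerSupply.degraded:notice]: "
        "Chassis power supply 1 is degraded: PSU1 Fan1 Speed is Unreadable",
        "BIOS POST Failure(s) detected: SP IPMI failure. Abort AUTOBOOT",
    ]
    print(summarize(sample))
```

Seeing all three kinds of signature together (a no-heartbeat takeover, many unreadable PSU sensors, and the IPMI boot failure) matches the pattern described in this article. A single unreadable sensor on its own may have a different cause.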