System cannot boot with "Resetting SP from primary FW" or "SP IPMI failure"
Applies to
- ONTAP 9
- FAS2720
- FAS2750
- AFF C190
- AFF A220
- AFF A300 / FAS8200
- FAS2650
- FAS2620
- Service Processor (SP)
Issue
- The node halts, and EMS logs indicate that the SP heartbeat (HBT) was missed or stopped:
[nodename: monitor: monitor.globalStatus.critical:EMERGENCY]: Multiple fans has failed: Sysfan1 F1, Sysfan1 F2, Sysfan2 F1, Sysfan2 F2. Power Supply Status Critical: PSU1.
[nodename: monitor: monitor.globalStatus.critical:EMERGENCY]: Power Supply Status Critical: PSU1.
[nodename: cphmd: hm.alert.cleared:notice]: Alert Id = CriticalFruMultiFaultAlert , Alerting Resource = XXXXXXXXXXXX cleared by monitor chassis
[Nodename: spsm_listener: callhome.sp.hbt.missed:notice]: Call home for SP HBT MISSED
[Nodename: spsm_listener: callhome.sp.hbt.stopped:alert]: Call home for SP HBT STOPPED
- Alternatively, the Service Processor (SP) reports SP load is high errors in the console log, and the node halts:
[SP.notice]: SP load is high: 3.12 2.59 2.02
[SP.notice]: SP load is high: 3.54 2.90 2.21
[IPMI.notice]: e601 | 02 | EVT: 0301ffff | Attn_Sensor1 | Assertion Event, "State Asserted"
[SP.emergency]: SP reset initiated by storage controller
[IPMI.notice]: e701 | c0 | OEM: ffff70005000 | ManufId: 150300 | SP Reset Externally
[IPMI.notice]: e801 | c0 | OEM: fcff70000000 | ManufId: 150300 | POS Register: Unexpected Reset
- The node cannot be booted from the LOADER prompt. Error:
LOADER-A> boot_ontap
Loading X86_64/freebsd...
Loading X86_64/freebsd...
Starting program at ...
NetApp Data ONTAP 9.3P4
***************************************
This platform is not supported in this release.
The system will now halt
***************************************
BIOS Version: 11.1
Portions Copyright (C) 2014-2017 NetApp, Inc. All Rights Reserved.
Initializing System Memory ...
Loading Device Drivers ...
Waiting for SP ...
SP failure. Resetting SP from primary FW. This can take a few minutes
or
Failed to recover SP
IPMI:Get controller FRU inventory:failed
IPMI:Get midplane FRU 0 inventory:failed
Configuring Devices ...
IPMI PCI Slot Control failed.
CPU = 1 Processor(s) Detected.
Intel(R) Xeon(R) CPU D-1587 @ 1.70GHz (CPU 0)
CPUID: 0x00050664. Cores per Processor = 16
131072 MB System RAM Installed.
SATA (AHCI) Device: SV9MST6D120GLM41NP
Boot Loader version 6.0.10
Copyright (C) 2000-2003 Broadcom Corporation.
Portions Copyright (C) 2002-2020 NetApp, Inc. All Rights Reserved.
BIOS POST Failure(s) detected: SP IPMI failure. Abort AUTOBOOT
- After the motherboard is replaced, the halted controller still fails to boot with the same error.
- SP events all messages:
Record 231: Sun Aug 1 00:25:04 2021 [SysFW.notice]: Failed to recover SP
Record 232: Sun Aug 1 00:25:04 2021 [SysFW.critical]: IPMI:Get controller FRU inventory:failed
Record 233: Sun Aug 1 00:25:04 2021 [SysFW.notice]: IPMI:Get midplane FRU 0 inventory:failed
Record 234: Thu Jan 1 00:05:00 1970 [Trap Event.critical]: hwassist post_error (26)
- events all messages in the SP log of the partner node:
Sat Oct 15 13:05:38 2016 [Agent.notice]: Local Serial Exchange Error Internal MLER[4] asserted
Mon Oct 17 08:38:52 2016 [Agent.notice]: Local Invalid Serial Exchange Bus Internal MLER[5] asserted
Thu Jan 01 00:00:36 1970 [Agent.notice]: Midplane I2C Local Buffers Not Ready Internal MLER[6] de-asserted
Mon Oct 17 08:52:11 2016 [Agent.notice]: Midplane Local Grant Timeout Internal MLER[2] asserted
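
When triaging a saved console or SP log against the symptoms listed above, a small script can help confirm which of these messages are present. The following is a minimal sketch in Python, not a NetApp tool: the function names, the labels, and the idea of matching on plain substrings are assumptions made for illustration, and the signature strings are copied from the log excerpts in this article.

```python
import re

# Signature substrings copied from the log excerpts in this article.
# The dictionary keys are hypothetical labels chosen for this sketch.
SIGNATURES = {
    "sp_hbt_missed": "callhome.sp.hbt.missed",
    "sp_hbt_stopped": "callhome.sp.hbt.stopped",
    "sp_load_high": "SP load is high",
    "sp_reset_from_primary_fw": "Resetting SP from primary FW",
    "sp_recover_failed": "Failed to recover SP",
    "sp_ipmi_failure": "SP IPMI failure",
}

# Matches the three figures that follow "SP load is high:" in the console log.
LOAD_RE = re.compile(r"SP load is high:\s*([\d.]+)\s+([\d.]+)\s+([\d.]+)")


def scan_log(text):
    """Return {label: [matching lines]} for every signature found in text."""
    hits = {}
    for line in text.splitlines():
        for label, needle in SIGNATURES.items():
            if needle in line:
                hits.setdefault(label, []).append(line.strip())
    return hits


def sp_load_samples(text):
    """Return the three load figures from each 'SP load is high' line."""
    return [tuple(float(x) for x in m.groups()) for m in LOAD_RE.finditer(text)]


if __name__ == "__main__":
    # Sample lines taken from the excerpts above.
    sample = "\n".join([
        "[SP.notice]: SP load is high: 3.12 2.59 2.02",
        "SP failure. Resetting SP from primary FW. This can take a few minutes",
        "BIOS POST Failure(s) detected: SP IPMI failure. Abort AUTOBOOT",
    ])
    for label, lines in scan_log(sample).items():
        print(label, len(lines))
    print(sp_load_samples(sample))
```

Substring matching is used here because the messages appear with different prefixes (EMS, console, SP events all records), so anchoring on the message text alone keeps the sketch independent of the log source.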