Environmental Reason Shutdown (Multiple fans failed)
Applies to
- A400
- ONTAP 9.13.1P6
- BMC firmware 13.11P1
Issue
An Environmental Reason Shutdown occurred during the reboot that followed initial setup.
BIOS Version: 16.9
PEI start.
CPU PEI initialization.
Wait BMC 30 seconds.
Wait BMC self-test result.
BMC self-test: Softfail.
UPI initialization.
CPU initialization.
Running full memory initialization.
CPU reset.
BIOS Version: 16.9
PEI start.
CPU PEI initialization.
Wait BMC self-test result.
BMC self-test: Softfail.
UPI initialization.
CPU initialization.
Running full memory initialization.
SPI FLASH: Primary BIOS
PEI end.
DXE start.
ERROR: Class:0; Subclass:20000; Operation: 1005
ERROR: Class:0; Subclass:20000; Operation: 1001
USB initialization.
PCI host bridge initialization.
CSM initialization.
PCI Bus initialization start.
BDS start.
Console output devices connect.
Version 2.20.1276. Copyright (C) 2023 American Megatrends, Inc.
BIOS Date:
[System Information]
CPU = 2 Processors Detected
CPU 0 : Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz Core : 10
CPU 1 : Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz Core : 10
Memory Size : 147456 MB
Ready to boot.
Boot Loader version 8.1.0
Copyright (C) 2000-2003 Broadcom Corporation.
Portions Copyright (C) 2002-2023 NetApp, Inc. All Rights Reserved.
ACPI RSDP Found at 0x6cc2c000
Starting AUTOBOOT press Ctrl-C to abort...
Loading X86_64/freebsd/image2/kernel:0x200000/1328352 0x345000/9752196 0xc91e90/1401092 0xe00000/248896 0xe3d000/4344480 0x1261aa0/5891424 Entry at 0xffffffff80345000
Loading X86_64/freebsd/image2/platform.ko:0x1800000/1347584 0x1949000/409600 0x19ad000/0 0x19ad000/16 0x19ad040/492352 0x1a25380/80 0x1a253d0/88 0x1a25428/2288 0x1a25d18/4696 0x1a26f70/112 0x1a26fe0/216 0x1a270b8/88 0x1a27110/88 0x1a27168/88 0x1a271c0/88 0x1a27218/88 0x1a27270/88 0x1a272c8/88 0x1a27320/4888 0x1a28638/3160 0x1a29290/616 0x1bfffc0/56352 0x1c0dbe0/3336 0x1c0e8f0/132588 0x1c2eee0/448 0x1c2f0a0/169 0x1c9f468/1410792 0x1df7b50/310224 0x1e43720/152736 0x1e68bc0/9480 0x1e6b0c8/1848 0x1e6b800/6864 0x1e6d2d0/14664 0x1e70c18/39024 0x1e7a488/14088 0x1e7db90/264 0x1e7dc98/25872 0x1e841a8/264 0x1e842b0/264 0x1e843b8/264 0x1e844c0/264 0x1e845c8/264 0x1e846d0/264 0x1e847d8/264 0x1e848e0/648 0x1e84b68/336 0x1e84cb8/240 0x1c2f149/0 0x1c2f150/243480 0x1c9f1fa/621 0x1c6a868/215442
Starting program at 0xffffffff80345000
---<<BOOT>>---
NetApp Data ONTAP 9.13.1P6
random: registering fast source Intel Secure Key RNG
IPMI device unit 0 rev. 1, firmware rev. 13.04, version 2.0, device support mask 0xbf
IPMI device unit 1 rev. 1, firmware rev. 13.04, version 2.0, device support mask 0xbf
Read HAOSC config (0x1111) from bootarg
Copyright (C) 1992-2023 NetApp.
All rights reserved.
*******************************
* *
* Press Ctrl-C for Boot Menu. *
* *
*******************************
netapp_begin started…
Apr 24 15:04:14 [node-02:monitor.chassisFan.removed:ALERT]: Chassis fan Fan4 is removed
Apr 24 15:04:47 [node-02:monitor.chassisFanFail.xMinShutdown:EMERGENCY]: Multiple Chassis Fan failure: System will shut down in 2 minutes.
Apr 24 15:05:12 [node-02:callhome.c.fan.fru.rm:error]: Call home for CHASSIS FAN FRU REMOVED: Fan4
Apr 24 15:05:42 [node-02:callhome.c.fan.fru.shut:error]: Call home for MULTIPLE CHASSIS FAN FAILED: System will shut down in 2 minutes
Apr 24 15:05:52 [node-02:wafl.analytics.enterOverload:notice]: The analytics subsystem is not keeping up with the amount of work being generated by the system.
Apr 24 15:05:56 [node-02:fmmb.disk.notAccsble:notice]: All Local mailbox disks are inaccessible.
Apr 24 15:05:56 [node-02:cf.fm.notkoverClusterDisable:error]: Failover monitor: takeover disabled (restart)
Apr 24 15:05:56 [node-02:wafl.transition.cp.completed:notice]: Transition CP with reason flush_b4_mounted, 00000000 for replaying=0,0 unmounting=0,0 total=2,1 volumes with a total of total=105 incoming=3 dirty buffers took 29ms with longest CP phases being CP_P2_FLUSH=17, CP_P2V_INO=4, CP_P3A_VOLINFO=1 on aggregate aggr0.
Apr 24 15:05:56 [node-02:cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of node-01 disabled (Controller Failover takeover disabled).
Apr 24 15:05:57 [node-02:kern.syslog.msg:notice]: domain xing mode: off, domain xing interrupt: false
Apr 24 15:05:57 [node-02:monitor.fan.failed:ALERT]: Multiple fans has failed: SysFan4 F1, SysFan4 F2.
Apr 24 15:05:57 [node-02:clam.invalid.config:error]: Local node (name=unknown, id=0) is in an invalid configuration for providing CLAM functionality. CLAM cannot determine the identity of the HA partner.
Apr 24 15:06:00 [node-02:monitor.globalStatus.critical:EMERGENCY]: Controller failover of node-01 is not possible: Controller Failover takeover disabled. Multiple fans has failed: SysFan4 F1, SysFan4 F2.
Apr 24 15:07:12 [node-02:monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (Multiple fans failed)
Thu Apr 24 15:07:14 UTC 2025
login:
Terminated
.
Uptime: 4m4s
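
When triaging a console capture like the one above, it can help to pull out only the fan-related EMS events and the final emergency shutdown and lay them out as a timeline. The following is a minimal sketch, not a NetApp tool: the function names `parse_ems` and `fan_timeline`, the regular expression, and the `year` parameter are assumptions made for illustration (the console lines carry no year, so 2025 is taken from the date stamp printed near the end of the log). The sample lines are copied from the log above.

```python
import re
from datetime import datetime

# Hypothetical pattern for console EMS lines of the form:
#   Apr 24 15:04:14 [node-02:monitor.chassisFan.removed:ALERT]: Chassis fan Fan4 is removed
EMS_LINE = re.compile(
    r"^(?P<ts>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<node>[^:\]]+):(?P<event>[^:\]]+):(?P<severity>[^\]]+)\]: "
    r"(?P<message>.*)$"
)


def parse_ems(lines, year=2025):
    """Parse EMS-style console lines into dicts; lines that do not match are skipped."""
    events = []
    for line in lines:
        m = EMS_LINE.match(line.strip())
        if not m:
            continue
        # The console timestamp has no year, so one is supplied by the caller.
        when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append({
            "time": when,
            "node": m["node"],
            "event": m["event"],
            "severity": m["severity"],
            "message": m["message"],
        })
    return events


def fan_timeline(events):
    """Keep events whose name mentions a fan, plus the emergency shutdown event."""
    return [
        e for e in events
        if "fan" in e["event"].lower() or e["event"] == "monitor.shutdown.emergency"
    ]


SAMPLE = [
    "Apr 24 15:04:14 [node-02:monitor.chassisFan.removed:ALERT]: Chassis fan Fan4 is removed",
    "Apr 24 15:04:47 [node-02:monitor.chassisFanFail.xMinShutdown:EMERGENCY]: Multiple Chassis Fan failure: System will shut down in 2 minutes.",
    "Apr 24 15:05:56 [node-02:fmmb.disk.notAccsble:notice]: All Local mailbox disks are inaccessible.",
    "Apr 24 15:05:57 [node-02:monitor.fan.failed:ALERT]: Multiple fans has failed: SysFan4 F1, SysFan4 F2.",
    "Apr 24 15:07:12 [node-02:monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (Multiple fans failed)",
    "login:",
]

timeline = fan_timeline(parse_ems(SAMPLE))
for e in timeline:
    print(e["time"].strftime("%H:%M:%S"), e["severity"], e["event"])

# Seconds between the first fan alert and the emergency shutdown.
elapsed = (timeline[-1]["time"] - timeline[0]["time"]).total_seconds()
print("seconds from first fan alert to shutdown:", elapsed)
```

With the sample lines, the mailbox-disk notice and the `login:` prompt are dropped, leaving the Fan4 removal, the shutdown warning, the multiple-fan failure alert, and the emergency shutdown in order.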