
Automatic controller takeover completed due to a power problem

Category: ontap-9
Specialty: hw

Environment

  • ONTAP 9
  • AFF systems
  • FAS systems

Issue

  • An automatic takeover occurred on the partner node. The following entries appear in EMS-LOG-FILE:
callhome.sfo.takeover:alert]: Call home for CONTROLLER TAKEOVER COMPLETE AUTOMATIC
cf_takeover: callhome.reboot.takeover:notice]: Call home for PARTNER REBOOT (CONTROLLER TAKEOVER)
cf_takeover: cf.fm.takeoverComplete:notice]: Failover monitor: takeover completed

splog_main: mgr.boot.reason_abnormal:EMERGENCY]: System rebooted due to a power glitch.
splog_main: callhome.reboot.glitch:notice]: Call home for REBOOT (power glitch)
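
The EMS entries above pair takeover events with a power-glitch reboot. A minimal Python sketch of how such lines could be scanned is shown here. The function names and the event sets are hypothetical helpers built only from the event names in the excerpt; they are not a NetApp tool or API, and the sets are not exhaustive.

```python
import re

# Matches the "event.name:severity]" portion of an EMS line.
EMS_EVENT = re.compile(r"(?P<event>[A-Za-z0-9_.]+):(?P<severity>[A-Za-z]+)\]")

# Event names copied from the excerpt above; illustrative, not exhaustive.
TAKEOVER_EVENTS = {
    "callhome.sfo.takeover",
    "callhome.reboot.takeover",
    "cf.fm.takeoverComplete",
}
POWER_EVENTS = {"callhome.reboot.glitch"}


def classify_ems_lines(lines):
    """Return the takeover and power-related event names found, in order."""
    found = {"takeover": [], "power": []}
    for line in lines:
        match = EMS_EVENT.search(line)
        if match is None:
            continue
        event = match.group("event")
        if event in TAKEOVER_EVENTS:
            found["takeover"].append(event)
        # Assumption: a message that mentions "power glitch" is power-related.
        elif event in POWER_EVENTS or "power glitch" in line.lower():
            found["power"].append(event)
    return found


def takeover_with_power_cause(lines):
    """True when both a takeover event and a power event are present."""
    found = classify_ems_lines(lines)
    return bool(found["takeover"]) and bool(found["power"])
```

Calling `takeover_with_power_cause()` on the lines of an EMS log gives a quick first indication that a takeover coincided with a power event. It does not replace reading the log in context.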

  • EMS-LOG-FILE also shows power faults on the shelves:

cf_hwassist: cf.hwassist.takeoverTrapRecv:notice]: hw_assist: Received takeover hw_assist alert from partner(cluster1-01), system_down because power_loss.
dsa_worker5: ses.status.psWarning:error]: DS224-12 (S/N SHF#############) shelf 0 on channel 7a power warning for Power supply 2: warning status; DC undervoltage. This module is on the rear of the shelf at the bottom right.
dsa_worker4: ses.status.psWarning:error]: DS224-12 (S/N SHF#############) shelf 10 on channel 9d power warning for Power supply 2: warning status; DC undervoltage. This module is on the rear of the shelf at the bottom right.

Sat Jan 11 00:30:40 +0100 [snes1p208_01: dsa_worker0: callhome.shlf.power.intr:error]: Call home for SHELF POWER INTERRUPTED
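
To see which shelves and power supplies reported warnings, the `ses.status.psWarning` messages can be broken into fields. The sketch below assumes the message layout is exactly as in the excerpt above; the regular expression and the function names are illustrative and may need adjusting for other ONTAP releases or shelf models.

```python
import re
from collections import Counter

# Layout assumed from the ses.status.psWarning lines in the excerpt above.
PS_WARNING = re.compile(
    r"ses\.status\.psWarning:\w+\]: (?P<model>\S+) \(S/N (?P<serial>[^)]+)\) "
    r"shelf (?P<shelf>\d+) on channel (?P<channel>\w+) power warning for "
    r"Power supply (?P<psu>\d+): (?P<status>[^;]+); (?P<detail>[^.]+)\."
)


def parse_ps_warning(line):
    """Return the warning fields as a dict, or None if the line does not match."""
    match = PS_WARNING.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["shelf"] = int(fields["shelf"])
    fields["psu"] = int(fields["psu"])
    return fields


def warnings_per_psu(lines):
    """Count warnings per power supply number across all shelves."""
    parsed = (parse_ps_warning(line) for line in lines)
    return Counter(p["psu"] for p in parsed if p is not None)
```

A count concentrated on one power supply position across several shelves may be worth checking against the rack power feeds, although the article itself does not state that conclusion.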

  • ASUP HA Group Notifications also show the following:

HA Group Notification (CHASSIS POWER SUPPLY DEGRADED: PSU3) ERROR

HA Group Notification (CHASSIS POWER DEGRADED: Power Supply Status Critical: PSU3.) ERROR

HA Group Notification (SHELF POWER INTERRUPTED) ERROR

HA Group Notification (SHELF_FAULT) ERROR

  • Checking SP-LATEST-SYSTEM-EVENT-LOG shows the following:

Record 589: Fri Aug 06 19:01:37.000000 2021 [SP.emergency]: System input power lost
Record 590: Thu Jan 01 00:00:49.400961 1970 [IPMI.notice]: 7204 | c0 | OEM: ffff7000ff00 | ManufId: 150300 | SP Power Reset
Record 591: Thu Jan 01 00:00:49.450536 1970 [IPMI.notice]: 7304 | c0 | OEM: fcff70560000 | ManufId: 150300 | POS Register: Power on Reset(Normal Power Cycle)

Record 407: Fri Feb 28 03:34:17.489482 2020 [Agent.notice]: 127.880: 3 : AC Power Loss Signal PSU1 de-asserted
Record 408: Fri Feb 28 03:34:17.489664 2020 [Agent.notice]: 128.100: 4 : AC Power Loss Signal PSU2 de-asserted
Record 409: Fri Feb 28 03:34:17.557708 2020 [Agent.notice]: 196.145: 4 : AC Power Loss Signal PSU2 asserted
Record 410: Fri Feb 28 03:34:17.570049 2020 [Agent.notice]: 208.526: 3 : AC Power Loss Signal PSU1 asserted
Record 411: Fri Feb 28 03:34:17.635848 2020 [Agent.notice]: 274.301: 14 : Attention LED (at Midplane) asserted
Record 412: Fri Feb 28 03:34:23.431854 2020 [Agent.notice]: 070.290: 14 : Attention LED (at Midplane) de-asserted
Record 413: Fri Feb 28 03:34:27.516634 2020 [SP.warning]: AC_OK Low Detected
Record 419: Fri Feb 28 03:39:47.942198 2020 [SP.critical]: Filer Reboots
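
The `Record N:` entries above can be split into fields to track the last reported state of each `AC Power Loss Signal` and to spot entries stamped with the year 1970. This is a rough sketch with hypothetical helper names. Reading a 1970 timestamp as "the SP clock had not yet been set after a power reset" is an assumption, not something the article states.

```python
import re

# Layout assumed from the "Record N:" lines in the excerpt above.
RECORD = re.compile(
    r"Record (?P<num>\d+): (?P<timestamp>.+? (?P<year>\d{4})) "
    r"\[(?P<source>\w+)\.(?P<severity>\w+)\]: (?P<message>.*)"
)
AC_LOSS = re.compile(
    r"AC Power Loss Signal (?P<psu>PSU\d+) (?P<state>asserted|de-asserted)$"
)


def parse_records(lines):
    """Return one dict per line that matches the record layout."""
    records = []
    for line in lines:
        match = RECORD.search(line)
        if match is not None:
            rec = match.groupdict()
            rec["num"] = int(rec["num"])
            rec["year"] = int(rec["year"])
            rec["message"] = rec["message"].strip()
            records.append(rec)
    return records


def final_psu_states(records):
    """Map each PSU to the last AC Power Loss Signal state seen for it."""
    states = {}
    for rec in records:
        match = AC_LOSS.search(rec["message"])
        if match is not None:
            states[match.group("psu")] = match.group("state")
    return states


def epoch_records(records):
    """Record numbers stamped 1970 (assumed: SP clock not yet set)."""
    return [rec["num"] for rec in records if rec["year"] == 1970]
```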

  • If both nodes lose power at the same time, no takeover occurs. The nodes reboot directly instead.

[BMC.notice]: Eventd: Got an AC_OK Failed Interrupt ...
[IPMI.notice]: 01d8 | 02 | EVT: 0300ffff | Power_Good | Assertion Event, "State Deasserted"
[IPMI.notice]: 01d9 | 02 | EVT: 0300ffff | Power_Proc_OK | Assertion Event, "State Deasserted"
[IPMI.notice]: 01da | 02 | EVT: 6f01ffff | PSU1_Present | Assertion Event, "Absent"
[IPMI.notice]: 01db | 02 | EVT: 6f01ffff | PSU2_Present | Assertion Event, "Absent"
[IPMI.notice]: 01dc | 02 | EVT: 0301ffff | AC_Power_Fail | Assertion Event, "State Asserted"
[IPMI.notice]: 01dd | 02 | EVT: 0301ffff | LAN_MGMT_0_Rst | Assertion Event, "State Asserted"
[IPMI.notice]: 01de | 02 | EVT: 0900ffff | Wrench_Port_Up | Assertion Event, "Device Disabled"
[IPMI.notice]: 01df | 02 | EVT: 0300ffff | AC_Power_Fail | Assertion Event, "State Deasserted"
[IPMI.notice]: 01e0 | 02 | EVT: 0300ffff | LAN_MGMT_0_Rst | Assertion Event, "State Deasserted"
[BMC.warning]: AC_OK Low Detected
[IPMI.notice]: 01e1 | 02 | EVT: 015000ad | P3V3 | Assertion Event, "Lower Non-critical going low " | Reading: 0.000 | Threshold: 3.027
[IPMI.notice]: 01e2 | 02 | EVT: 015200a5 | P3V3 | Assertion Event, "Lower Critical going low " | Reading: 0.000 | Threshold: 2.887
[IPMI.notice]: 01e3 | 02 | EVT: 6fc21fff | System_FW_Status | Assertion Event, "System Firmware restarting"
[IPMI.notice]: 01e4 | 02 | EVT: 015003af | P12V_STBY | Assertion Event, "Lower Non-critical going low " | Reading: 0.186 | Threshold: 10.850
[IPMI.notice]: 01e5 | 02 | EVT: 015203aa | P12V_STBY | Assertion Event, "Lower Critical going low " | Reading: 0.186 | Threshold: 10.540
[IPMI.notice]: 01e6 | 02 | EVT: 6fc201ff | System_FW_Status | Assertion Event, "Memory initialization in progress"
[SysFW.notice]: Destage is started
[IPMI.notice]: 01e7 | 02 | EVT: 6fc203ff | System_FW_Status | Assertion Event, "Memory Initialization done"
[IPMI.notice]: 01e8 | 02 | EVT: 6fc21fff | System_FW_Status | Assertion Event, "System Firmware restarting"
[IPMI.notice]: 01e9 | 02 | EVT: 6fc201ff | System_FW_Status | Assertion Event, "Memory initialization in progress"
[SysFW.notice]: Time completing destage: 16 seconds
[IPMI.notice]: 01ea | 02 | EVT: 6fc220ff | System_FW_Status | Assertion Event, "Bootloader is running"
[IPMI.notice]: 01ea | c0 | OEM: ffff7000ff00 | ManufId: 150300 | BMC Power Reset
[IPMI.notice]: 01eb | c0 | OEM: fcff70560000 | ManufId: 150300 | POS Register: Power on Reset(Normal Power Cycle)
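
In the BMC excerpt above, the voltage sensors `P3V3` and `P12V_STBY` cross their lower thresholds as input power disappears. The sketch below extracts the `Reading` and `Threshold` values from such lines. It assumes the pipe-separated layout shown above, and the helper names are hypothetical.

```python
import re

# Layout assumed from the threshold events in the excerpt above.
THRESHOLD_EVENT = re.compile(
    r"\| (?P<sensor>\w+) \| Assertion Event, \"(?P<state>[^\"]*)\" "
    r"\| Reading: (?P<reading>[\d.]+) \| Threshold: (?P<threshold>[\d.]+)"
)


def parse_threshold_events(lines):
    """Return (sensor, state, reading, threshold) for each threshold event."""
    events = []
    for line in lines:
        match = THRESHOLD_EVENT.search(line)
        if match is not None:
            events.append((
                match.group("sensor"),
                match.group("state").strip(),
                float(match.group("reading")),
                float(match.group("threshold")),
            ))
    return events


def rails_below_critical(lines):
    """Sensors that logged a 'Lower Critical' event with reading below threshold."""
    return {
        sensor
        for sensor, state, reading, threshold in parse_threshold_events(lines)
        if state.startswith("Lower Critical") and reading < threshold
    }
```

Lines without a `Reading`/`Threshold` pair, such as the `System_FW_Status` events, are skipped.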

Alternatively, SP-LATEST-SYSTEM-EVENT-LOG may show:
 2cc | 11/26/2025 | 09:43:49 | Power Unit #0x60 | Power off/down | Asserted
 2cd | 11/26/2025 | 09:44:07 | OEM record c0 | 000000 | 000105000000
 2ce | 01/01/2000 | 00:00:20 | System Event #0xff | Timestamp Clock Sync | Asserted
 2cf | 11/26/2025 | 09:46:18 | System Event #0xff | Timestamp Clock Sync | Asserted
 2d0 | 11/26/2025 | 09:46:18 | Battery #0x4a | State Deasserted
 2d1 | 11/26/2025 | 09:46:18 | Battery #0x4b | State Asserted
 2d2 | 11/26/2025 | 09:46:18 | Battery #0x4c | State Asserted
 2d3 | 11/26/2025 | 09:46:18 | Battery #0x4d | State Deasserted
 2d4 | 11/26/2025 | 09:46:18 | Other FRU #0x50 | 
 2d5 | 11/26/2025 | 09:46:18 | Other FRU #0x50 | 
 2d6 | 11/26/2025 | 09:46:18 | Other FRU #0x50 | 
 2d7 | 11/26/2025 | 09:46:18 | Other FRU #0x50 | 
 2d8 | 11/26/2025 | 09:46:25 | Power Supply #0x20 | Presence detected | Asserted
 2d9 | 11/26/2025 | 09:46:25 | Power Supply #0x25 | Presence detected | Asserted
 2da | 11/26/2025 | 09:46:25 | Power Supply #0x72 | Presence detected | Asserted
 2db | 11/26/2025 | 09:46:25 | Power Supply #0x73 | Presence detected | Asserted
 2dc | 11/26/2025 | 09:46:26 | OEM record c0 | 000000 | 000105000000
 2dd | 11/26/2025 | 09:46:34 | Battery #0x4f | State Deasserted
 2de | 11/26/2025 | 09:46:35 | OEM record df | FPGA pull BMC whole reset
 2df | 11/26/2025 | 09:46:35 | OEM record df | Pilot FPGA AC cycle
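
The pipe-separated listing above uses a different layout from the `Record N:` format, so it needs its own parser. The following sketch assumes the column order is record ID (hex), date, time, sensor, then zero or more detail fields, as in the excerpt. The function names are hypothetical.

```python
def parse_sel_line(line):
    """Split one pipe-separated SEL line into a dict, or None if too short."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) < 4:
        return None
    try:
        record_id = int(fields[0], 16)  # record IDs are hexadecimal
    except ValueError:
        return None
    return {
        "id": record_id,
        "date": fields[1],
        "time": fields[2],
        "sensor": fields[3],
        # Empty trailing columns (e.g. "Other FRU #0x50 | ") are dropped.
        "details": [field for field in fields[4:] if field],
    }


def power_off_events(lines):
    """Records where a 'Power off/down' event was asserted."""
    events = []
    for line in lines:
        rec = parse_sel_line(line)
        if rec is None:
            continue
        if "Power off/down" in rec["details"] and "Asserted" in rec["details"]:
            events.append(rec)
    return events
```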

