Both nodes in an HA pair reboot due to power loss

Category:
aff-series
Specialty:
hw
Environment

  • FAS systems
  • AFF systems

Issue

  • Both nodes in the HA pair reboot at the same time.
  • EMS logs report DC undervoltage and AC failure on both PSUs. Example (repeated on both nodes at the same time):

[node_name: dsa_worker3: ses.status.psWarning:error]: DS224-12 (S/N 012345678910) shelf 0 on channel 0b power warning for Power supply 1: warning status; DC undervoltage. This module is on the rear of the shelf at the bottom left.
[node_name: dsa_worker4: ses.status.psError:alert]: DS224-12 (S/N 012345678910) shelf 0 on channel 0b power error for Power supply 1: critical status; AC Fail. This module is on the rear of the shelf at the bottom left.
[node_name: dsa_worker4: callhome.shlf.power.intr:error]: Call home for SHELF POWER INTERRUPTED
[node_name: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0b. Check fans, power supplies, disks, and temperature sensors.
[node_name: power_low_monitor: monitor.chassisPower.degraded:alert]: Chassis power is degraded: Power Supply Status Critical: PSU1.
[node_name: power_low_monitor: callhome.chassis.power:error]: Call home for CHASSIS POWER DEGRADED: Power Supply Status Critical: PSU1.
[node_name: monitor: monitor.globalStatus.critical:EMERGENCY]: Power Supply Status Critical: PSU1. Disk shelf fault.
[node_name: dsa_worker2: ses.status.psInfo:info]: DS224-12 (S/N 9872957495809) shelf 0 on channel 0b power supply information for Power supply 1: normal status.
[node_name: dsa_worker0: ses.status.psWarning:error]: DS224-12 (S/N 012345678910) shelf 0 on channel 0b power warning for Power supply 2: warning status; DC undervoltage. This module is on the rear of the shelf at the bottom right.
[node_name: dsa_worker2: callhome.shlf.ps.fault:error]: Call home for SHELF POWER SUPPLY WARNING

  • BMC/SP events report a loss of input power (repeated on both nodes at the same time):

Record 2435: Mon Dec 05 22:33:43.000000 2022 [BMC.emergency]: System input power lost
Record 2436: Sun Jan 01 00:00:22.310000 2017 [IPMI.notice]: 05f2 | c0 | OEM: ffff7000ff00 | ManufId: 150300 | BMC Power Reset
Record 2437: Sun Jan 01 00:00:22.330000 2017 [IPMI.notice]: 05f3 | c0 | OEM: fcff70560000 | ManufId: 150300 | POS Register: Power on Reset(Normal Power Cycle)

or

Record 1596: Sat Sep 11 08:03:16 2021 [SP.emergency]: System input power lost
Record 1597: Thu Jan  1 00:00:32 1970 [IPMI.notice]: ce01 | c0 | OEM: ffff7000ff00 | ManufId: 150300 | SP Power Reset
Record 1598: Thu Jan  1 00:00:32 1970 [IPMI.notice]: cf01 | c0 | OEM: fcff70560000 | ManufId: 150300 | POS Register: Power on Reset(Normal Power Cycle)

  • BMC/SP system logs report power problems (repeated on both nodes at the same time):

BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: Power Action:needed(0) action(0); Alert Action: needed(1) action(17)
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: EventFilter: event on sensor(#0x32 dir:3) match (15) ALERT
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: Power Action:needed(0) action(0); Alert Action: needed(1) action(17)
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: EventFilter: event on sensor(#0x34 dir:3) match (15) ALERT
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: Power Action:needed(0) action(0); Alert Action: needed(1) action(17)
BMC hsam[1426]: FRU /chassis-1 LED on
BMC hsam[1426]: FRU /chassis-1/controller-b/cna-3 LED on
BMC hsam[1426]: HSAM OS(bmc):cmd(set) FLD(cna-4):fault(Overcurrent Protection Fault)
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: EventFilter: event on sensor(#0x5b dir:3) match (15) ALERT
BMC hsam[1426]: FRU /chassis-1 LED on
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: Power Action:needed(0) action(0); Alert Action: needed(1) action(17)
BMC hsam[1426]: FRU /chassis-1/controller-b/cna-4 LED on
BMC hsam[1426]: HSAM OS(bmc):cmd(set) FLD(cna-1):fault(Overcurrent Protection Fault)
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: EventFilter: event on sensor(#0x5d dir:3) match (15) ALERT
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: Power Action:needed(0) action(0); Alert Action: needed(1) action(17)
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: EventFilter: event on sensor(#0x5e dir:3) match (15) ALERT
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: Power Action:needed(0) action(0); Alert Action: needed(1) action(17)

  • The issue persists after the PSUs and/or controllers are reseated or replaced.
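
The "at the same time on both nodes" symptom can be checked by comparing the "System input power lost" records from each node's BMC/SP event log. The following Python sketch is illustrative only: it assumes the record layout matches the samples in this article, that the event-log text of each node has already been saved as a string, and that month names are in English. The function names (`power_loss_times`, `simultaneous_losses`) and the 60-second tolerance are hypothetical choices, not part of any NetApp tool.

```python
import re
from datetime import datetime, timedelta

# Pattern inferred from the sample records in this article, for example:
#   Record 2435: Mon Dec 05 22:33:43.000000 2022 [BMC.emergency]: System input power lost
#   Record 1596: Sat Sep 11 08:03:16 2021 [SP.emergency]: System input power lost
POWER_LOST = re.compile(
    r"^Record\s+\d+:\s+\w{3}\s+(?P<mon>\w{3})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})(?:\.\d+)?\s+(?P<year>\d{4})\s+"
    r"\[(?:BMC|SP)\.emergency\]:\s+System input power lost\s*$"
)


def power_loss_times(sel_text):
    """Return the timestamps of all 'System input power lost' records."""
    times = []
    for line in sel_text.splitlines():
        match = POWER_LOST.match(line.strip())
        if match:
            stamp = "{mon} {day} {year} {time}".format(**match.groupdict())
            # Fractional seconds are dropped; %b assumes English month names.
            times.append(datetime.strptime(stamp, "%b %d %Y %H:%M:%S"))
    return times


def simultaneous_losses(node_a_text, node_b_text, tolerance_seconds=60):
    """Pair up power-loss events from two nodes that fall within the tolerance.

    The tolerance is an assumption that allows for clock differences
    between the two service processors.
    """
    tolerance = timedelta(seconds=tolerance_seconds)
    times_b = power_loss_times(node_b_text)
    return [
        (a, b)
        for a in power_loss_times(node_a_text)
        for b in times_b
        if abs(a - b) <= tolerance
    ]
```

Only the power-loss record itself is compared. In the samples above, the records logged immediately afterwards carry 2017 or 1970 dates, so their timestamps are ignored here. A non-empty result from `simultaneous_losses` is consistent with both nodes losing input power together, but it does not by itself identify the cause.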

 
