AFF A700 and FAS9000: environmental shutdown caused by fan failure
Applies to
- ONTAP 9
- AFF A700/FAS9000
- SP firmware version 4.7 or later
Issue
- The controller powers off after a "Multiple fans failed" event, and the partner node takes over.
- An AutoSupport message is generated:
HA Group Notification (MULTIPLE FAN FAILED: System will shut down in 2 minutes) ERROR
- The EMS/event log shows one or more of the following fan errors:
::> event log show -event *fan*
[cluster-01: env_mgr: monitor.chassisFan.stop:error]: Chassis fan contains at least one stopped fan: FanB1 F1 at failed
[cluster-01: env_mgr: monitor.chassisFan.stop:error]: Chassis fan contains at least one stopped fan: FanB1 F2 at failed
[cluster-01: env_mgr: monitor.chassisFan.stop:error]: Chassis fan contains at least one stopped fan: FanB1 F3 at failed
[cluster-01: env_mgr: monitor.chassisFan.stop:error]: Chassis fan contains at least one stopped fan: FanB1 F4 at failed
[cluster-01: env_mgr: monitor.chassisFanFail.xMinShutdown:EMERGENCY]: Multiple Chassis Fan failure: System will shut down in 2 minutes.
[cluster-01: monitor: monitor.globalStatus.critical:EMERGENCY]: Multiple fans has failed: FanB1 F1, FanB1 F2, FanB1 F3, FanB1 F4.
[cluster-01: env_mgr: callhome.c.fan.fru.shut:error]: Call home for MULTIPLE FAN FAILED: System will shut down in 2 minutes
[cluster-01: env_mgr: monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (Multiple fans failed)
- The SP syslog (included in the sp status -d output) contains the following:
[477 WARNING][Porting/platform/PDKFan.c:1397]FanDaemon: Failure to read FanModuleData from fan module 1
[477 WARNING][Porting/platform/PDKFan.c:1416]FanDaemon: Fan module 1 marked bad after 3 consecutive failures
[477 WARNING][Porting/platform/PDKFan.c:1397]FanDaemon: Failure to read FanModuleData from fan module 1
[477 WARNING][Porting/platform/PDKFan.c:1416]FanDaemon: Fan module 1 marked bad after 3 consecutive failures
- The output of system node environment sensors show indicates that FRU FanB1 (or another affected fan) is in a failed/MULTIFAULT state.
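
When checking whether a saved event log matches this issue, it can help to pull the affected fan names out of the EMS lines quoted above. The following is a minimal sketch, not a NetApp tool: the function name summarize_fan_events is hypothetical, and the regular expression assumes the message text has exactly the form shown in the excerpts above (fan name between "stopped fan: " and " at failed").

```python
import re

# Matches the monitor.chassisFan.stop event text quoted in this article.
# Assumption: the fan name sits between "stopped fan: " and " at failed".
STOPPED_FAN = re.compile(
    r"monitor\.chassisFan\.stop:\w+\]: .*?stopped fan: (?P<fan>.+?) at failed"
)

# Event name of the two-minute shutdown warning shown in the excerpt.
SHUTDOWN_EVENT = "monitor.chassisFanFail.xMinShutdown"


def summarize_fan_events(lines):
    """Return (sorted stopped-fan names, True if a shutdown warning was seen)."""
    fans = set()
    shutdown_seen = False
    for line in lines:
        match = STOPPED_FAN.search(line)
        if match:
            fans.add(match.group("fan"))
        if SHUTDOWN_EVENT in line:
            shutdown_seen = True
    return sorted(fans), shutdown_seen


if __name__ == "__main__":
    # Two lines copied from the event log excerpt in this article.
    sample = [
        "[cluster-01: env_mgr: monitor.chassisFan.stop:error]: Chassis fan "
        "contains at least one stopped fan: FanB1 F1 at failed",
        "[cluster-01: env_mgr: monitor.chassisFanFail.xMinShutdown:EMERGENCY]: "
        "Multiple Chassis Fan failure: System will shut down in 2 minutes.",
    ]
    fans, shutdown = summarize_fan_events(sample)
    print("Stopped fans:", fans)
    print("Shutdown warning seen:", shutdown)
```

A log in which several fans from the same module appear together with the shutdown warning is consistent with the symptoms listed here. The SP syslog and sensor output should still be checked as described above.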