System shuts down when multiple fans fail
Applies to
- AFF and FAS
- ONTAP 9
- Data ONTAP 8
Issue
Fan failure messages are reported and the system shuts down.
- EMS log:
Jul 16 23:22:58 [node-1:callhome.shlf.fan:EMERGENCY]: Call home for SHELF COOLING UNIT FAILED
Jul 16 23:23:07 [node-1:monitor.fan.critical:EMERGENCY]: 2 fans have failed. Replace them to avoid overheating. If not corrected, system will shutdown in 2 minutes.
Jul 16 23:23:37 [node-1:callhome.fans.failed:EMERGENCY]: Call home for MULTIPLE FAN FAILURE
Jul 16 23:25:27 [node-1:monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (Multiple fans failed)
- AC_Power_Fail Asserted and Deasserted errors are reported repeatedly in the Service Processor (SP) events all log:
Record 2043: Sat Jul 17 08:03:44 2021 [IPMI Event.critical]: IPMI SEL log limit exceeded
Record 2044: Sat Jul 17 08:03:44 2021 [IPMI.notice]: fc07 | 02 | EVT: 0300ffff | AC_Power_Fail | Assertion Event, "State Deasserted"
Record 2045: Sat Jul 17 08:03:48 2021 [SP.warning]: Agent IIRQ false alarm
Record 2046: Sat Jul 17 08:04:06 2021 [IPMI Event.critical]: IPMI SEL log limit exceeded
Record 2047: Sat Jul 17 08:04:06 2021 [IPMI.notice]: fd07 | 02 | EVT: 0301ffff | AC_Power_Fail | Assertion Event, "State Asserted"
Record 2048: Sat Jul 17 08:04:08 2021 [IPMI Event.critical]: IPMI SEL log limit exceeded
Record 2049: Sat Jul 17 08:04:08 2021 [IPMI.notice]: fe07 | 02 | EVT: 0300ffff | AC_Power_Fail | Assertion Event, "State Deasserted"
Record 2050: Sat Jul 17 08:04:09 2021 [IPMI Event.critical]: IPMI SEL log limit exceeded
Record 2051: Sat Jul 17 08:04:09 2021 [IPMI.notice]: ff07 | 02 | EVT: 0301ffff | AC_Power_Fail | Assertion Event, "State Asserted"
- CPU-related information cannot be read from the SP system sensors output:
Sensor Name | Current | Unit | Status | LCR | LNC | UNC | UCR
-----------------+------------+------------+------------+-----------+-----------+-----------+-----------
CPU0_Temp_Margin | na | degrees C | na | na | na | -5.000 | 0.000
In_Flow_Temp | 24.000 | degrees C | ok | 0.000 | 10.000 | 70.000 | 75.000
Out_Flow_Temp | 23.000 | degrees C | ok | 0.000 | 10.000 | 82.000 | 87.000
Memory_Hot | 0x0 | discrete | Deasserted | na | na | na | na
CPU_Hot | 0x0 | discrete | Deasserted | na | na | na | na
CPU_Cat_Error | 0x0 | discrete | Deasserted | na | na | na | na
CPU_Therm_Trip | 0x0 | discrete | Deasserted | na | na | na | na
CPU_VCC | na | Volts | na | 0.708 | 0.747 | 1.348 | 1.426
CPU_1.05V | na | Volts | na | 0.892 | 0.941 | 1.154 | 1.203
CPU_VTT | na | Volts | na | 0.931 | 0.989 | 1.213 | 1.261
LM56_Temp | na | degrees C | na | 0.000 | 10.000 | 79.000 | 84.000
CPU_1.5V | na | Volts | na | 1.271 | 1.348 | 1.649 | 1.727
1G_1.0V | na | Volts | na | 0.854 | 0.902 | 1.096 | 1.154
FC_0.9V | na | Volts | na | 0.776 | 0.815 | 0.989 | 1.038
FC_1.0V | na | Volts | na | 0.854 | 0.902 | 1.096 | 1.154
USB_5.0V | na | Volts | na | 4.253 | 4.495 | 5.492 | 5.759
PCH_3.3V | na | Volts | na | 2.798 | 2.973 | 3.625 | 3.800
SASS_1.0V | na | Volts | na | 0.854 | 0.902 | 1.096 | 1.145
SASS_1.2V | na | Volts | na | 1.018 | 1.077 | 1.319 | 1.377
IB_1.2V | na | Volts | na | 1.018 | 1.077 | 1.319 | 1.37
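
As a rough triage aid, the two SP symptoms above can be checked with a short script. The following Python sketch is an illustration only, not a NetApp tool: the function names count_sensor_states and unreadable_sensors are hypothetical, and the parsing rules are inferred solely from the sample lines shown in this article, so other SP firmware versions may format the output differently.

```python
import re
from collections import Counter

# Assumption: IPMI notice records look like the "events all" excerpt above, i.e.
# ... [IPMI.notice]: <id> | <nn> | EVT: <code> | <sensor> | Assertion Event, "State <state>"
RECORD_RE = re.compile(
    r'\[IPMI\.notice\]:.*\|\s*(?P<sensor>\S+)\s*\|\s*Assertion Event,'
    r'\s*"State (?P<state>Asserted|Deasserted)"'
)


def count_sensor_states(lines, sensor="AC_Power_Fail"):
    """Count Asserted/Deasserted events for one sensor in SP event log lines."""
    counts = Counter()
    for line in lines:
        match = RECORD_RE.search(line)
        if match and match.group("sensor") == sensor:
            counts[match.group("state")] += 1
    return counts


def unreadable_sensors(table_text):
    """Return sensor names whose Current and Status columns are both 'na'.

    Assumption: columns are separated by '|' as in the "system sensors"
    excerpt above; the header row and the dashed separator row are skipped.
    """
    names = []
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) < 4 or cells[0] == "Sensor Name":
            continue  # separator row, blank line, or header row
        name, current, _unit, status = cells[:4]
        if current == "na" and status == "na":
            names.append(name)
    return names
```

Applied to the excerpts in this article, the first helper would count two Asserted and two Deasserted AC_Power_Fail events within about 25 seconds, and the second would list the sensors reported as na, such as CPU0_Temp_Margin and CPU_VCC.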