CPU0_Error "State Asserted" and "State Deasserted" events are reported repeatedly
Environment
- FAS8200
- AFF A300
Issue
- CPU0_Error "State Asserted" and "State Deasserted" events are reported repeatedly in the output of the Service Processor (SP) events all command. The Attention LED may be lit on the affected node
Record 1532: Fri Mar 26 03:45:32 2021 [IPMI.notice]: a403 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"
 Record 1533: Fri Mar 26 03:45:39 2021 [IPMI.notice]: a503 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"
 Record 1534: Fri Mar 26 03:47:32 2021 [IPMI.notice]: a603 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"
 Record 1535: Fri Mar 26 03:47:39 2021 [IPMI.notice]: a703 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"
 Record 1536: Fri Mar 26 03:47:46 2021 [IPMI.notice]: a803 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"
 Record 1537: Fri Mar 26 03:47:53 2021 [IPMI.notice]: a903 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"
 Record 1538: Fri Mar 26 03:48:07 2021 [IPMI.notice]: aa03 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"
 Record 1539: Fri Mar 26 03:48:14 2021 [IPMI.notice]: ab03 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"
 Record 1540: Fri Mar 26 03:48:28 2021 [IPMI.notice]: ac03 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"
 Record 1541: Fri Mar 26 03:48:42 2021 [IPMI.notice]: ad03 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"
- The issue persists after an SP reboot
Record 1695: Thu Jan  1 00:00:41 1970 [IPMI.notice]: bf03 | c0 | OEM: ffff70005000 | ManufId: 150300 | SP Reset Externally
 Record 1696: Thu Jan  1 00:00:41 1970 [IPMI.notice]: c003 | c0 | OEM: fcff70000000 | ManufId: 150300 | POS Register: Unexpected Reset
 Record 1697: Thu Jan  1 00:00:50 1970 [IPMI.notice]: c103 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"
 Record 1698: Thu Jan  1 00:00:55 1970 [IPMI.notice]: c203 | 02 | EVT: 0300ffff | Fan_Override | Assertion Event, "State Deasserted"
- In the system node environment sensors show -state !normal output, the CPU0 Error sensor is in the fault state
Cluster::> sensors show -state !normal
   (system node environment sensors show)
 Node Sensor                 State Value/Units Crit-Low Warn-Low Warn-Hi Crit-Hi
 ---- --------------------- ------ ----------- -------- -------- ------- -------
 node-1
    CPU0 Error       fault
                     ERROR
- In the SP system sensors output, CPU0_Error is in the Asserted state
Sensor Name    | Current   | Unit     | Status    | LCR     | LNC     | UNC     | UCR
 -----------------+------------+------------+------------+-----------+-----------+-----------+-----------
 CPU0_Temp_Margin | -60.000   | degrees C  | ok      | na     | na     | -10.000   | 0.000    
 In_Flow_Temp    | 21.000    | degrees C  | ok      | 0.000    | 5.000    | 50.000   | 55.000   
 Out_Flow_Temp   | 34.000    | degrees C  | ok      | 0.000    | 5.000    | 65.000   | 75.000   
 PCI_Slot_Temp   | 30.000    | degrees C  | ok      | 0.000    | 5.000    | 60.000   | 70.000   
 Smart_Bat_Temp   | 28.000    | degrees C  | ok      | 0.000    | 5.000    | 60.000   | 70.000   
 CPU0_Error     | 0x0     | discrete   | Asserted  | na     | na     | na     | na         
- SP firmware is up to date
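To confirm that a saved SP event log shows the same repeating pattern, the records can be counted with a short script. The following is a minimal sketch, not a NetApp tool: the record layout is inferred only from the excerpts above, and the names RECORD_RE, summarize, and SAMPLE are hypothetical.

```python
import re
from collections import Counter

# Field layout inferred from the log excerpts in this article (an assumption);
# other SP firmware versions may format records differently.
RECORD_RE = re.compile(
    r'Record\s+(?P<id>\d+):.*\|\s*(?P<sensor>\w+)\s*\|\s*'
    r'Assertion Event,\s*"(?P<state>State (?:Asserted|Deasserted))"'
)

def summarize(lines, sensor="CPU0_Error"):
    """Return (count per state, number of state changes) for one sensor."""
    states = []
    for line in lines:
        match = RECORD_RE.search(line)
        if match and match.group("sensor") == sensor:
            states.append(match.group("state"))
    # A state change is any pair of consecutive records whose states differ.
    transitions = sum(1 for a, b in zip(states, states[1:]) if a != b)
    return Counter(states), transitions

# Sample records copied from the excerpts above.
SAMPLE = [
    'Record 1532: Fri Mar 26 03:45:32 2021 [IPMI.notice]: a403 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"',
    'Record 1533: Fri Mar 26 03:45:39 2021 [IPMI.notice]: a503 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"',
    'Record 1534: Fri Mar 26 03:47:32 2021 [IPMI.notice]: a603 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"',
    'Record 1698: Thu Jan  1 00:00:55 1970 [IPMI.notice]: c203 | 02 | EVT: 0300ffff | Fan_Override | Assertion Event, "State Deasserted"',
]

if __name__ == "__main__":
    counts, transitions = summarize(SAMPLE)
    print(dict(counts), "state changes:", transitions)
```

A high number of state changes within a few minutes, with roughly equal counts for both states, matches the symptom described in this article. Records for other sensors, such as Fan_Override, are ignored.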