Rebooting SP due to loss of ACP comms
Environment
- ONTAP 9
- Service Processor (SP)
Issue
- The SP reboots after the ACP alerts have cleared. Example from the EMS log:
[node_name-01: dsa_worker1: ses.status.ACPError:alert]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor error for SAS shelf ACP processor 2: critical status ; Alternate Control Path hardware failed This module is on the rear of the shelf at the top right, on shelf module B.
[node_name-01: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0a. Check fans, power supplies, disks, and temperature sensors.
[node_name-01: monitor: monitor.globalStatus.critical:EMERGENCY]: Disk shelf fault.
[node_name-01: dsa_worker2: ses.status.ACPError:alert]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor error for SAS shelf ACP processor 1: critical status ; Alternate Control Path hardware failed This module is on the rear of the shelf at the top left, on shelf module A.
[node_name-01: dsa_worker2: ses.status.ACPInfo:info]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 2: normal status.
[node_name-01: splog_main: splog.running.normally:info]: Process splogd is operating normally.
[node_name-01: dsa_worker1: ses.status.ACPInfo:info]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 1: normal status.
[node_name-01: statd: monitor.shelf.fault.ok:notice]: Fault previously reported on disk storage shelf attached to channel 0a has been corrected.
[node_name-01: monitor: monitor.globalStatus.ok:notice]: The system's global status is normal.
[node_name-02: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0a. Check fans, power supplies, disks, and temperature sensors.
[node_name-02: monitor: monitor.globalStatus.critical:EMERGENCY]: Disk shelf fault.
[node_name-02: dsa_worker1: ses.status.ACPError:alert]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor error for SAS shelf ACP processor 1: critical status ; Alternate Control Path hardware failed This module is on the rear of the shelf at the top left, on shelf module A.
[node_name-02: splog_main: splog.running.normally:info]: Process splogd is operating normally.
[node_name-02: dsa_worker3: ses.status.ACPInfo:info]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 2: normal status.
[node_name-02: dsa_worker2: ses.status.ACPInfo:info]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 1: normal status.
[node_name-02: statd: monitor.shelf.fault.ok:notice]: Fault previously reported on disk storage shelf attached to channel 0a has been corrected.
[node_name-02: monitor: monitor.globalStatus.ok:notice]: The system's global status is normal.
- The SP reboots automatically and logs an event message such as the following:
Record 833: Tue Oct 13 18:20:19 2020 [SP.critical]: Rebooting SP due to loss of ACP comms
- The ACP status is normal and ACP is operating correctly.
- Transmit frame counts and bytes per second through the e0M management port:
-- interface e0M (30 days, 20 hours, 46 minutes, 42 seconds) --
RECEIVE
…
TRANSMIT
>>>Total frames: 2992m | Frames/second: 1122 | Total bytes: 4523g
Bytes/second: 1696k | Total errors: 0 | Errors/minute: 0
Total discards: 0 | Queue overflow: 0 | Multi/broadcast: 90594
…
-- interface e0M (30 days, 20 hours, 44 minutes, 31 seconds) --
RECEIVE
…
TRANSMIT
>>>Total frames: 216m | Frames/second: 81 | Total bytes: 322g
Bytes/second: 120k | Total errors: 0 | Errors/minute: 0
Total discards: 0 | Queue overflow: 0 | Multi/broadcast: 90526
…
- The node management LIF and the intercluster LIF share the same broadcast domain.
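
The TRANSMIT counters in the two e0M listings above can be cross-checked against the interval shown in each header. The following is a minimal Python sketch, not a NetApp tool: the helper names `parse_count`, `parse_uptime`, and `average_rate` are hypothetical, and the suffixes `k`, `m`, and `g` are assumed to be decimal multipliers (10^3, 10^6, 10^9).

```python
import re

# Assumption: k/m/g in the counter output are decimal multipliers.
SUFFIX = {"": 1, "k": 10**3, "m": 10**6, "g": 10**9}

# Seconds per unit used in the interface header line.
UNIT_SECONDS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}


def parse_count(text):
    """Convert a counter such as '2992m' or '90594' to an integer."""
    match = re.fullmatch(r"(\d+)([kmg]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized counter: {text!r}")
    return int(match.group(1)) * SUFFIX[match.group(2)]


def parse_uptime(header):
    """Return the interval in seconds from an interface header line."""
    total = 0
    for value, unit in re.findall(r"(\d+) (day|hour|minute|second)s?", header):
        total += int(value) * UNIT_SECONDS[unit]
    return total


def average_rate(total_text, header):
    """Average per-second rate implied by a total counter and the interval."""
    return parse_count(total_text) / parse_uptime(header)


# Headers and totals copied from the two listings above.
first = "-- interface e0M (30 days, 20 hours, 46 minutes, 42 seconds) --"
second = "-- interface e0M (30 days, 20 hours, 44 minutes, 31 seconds) --"

print(round(average_rate("2992m", first)))   # 1122, matches Frames/second
print(round(average_rate("216m", second)))   # 81, matches Frames/second
```

Under these assumptions the reported Frames/second values agree with the totals divided by the interval, and the first listing transmits roughly 14 times as many frames per second over e0M as the second.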