AWS or GCP CVO rebooted due to multiple disks missing

Applies to

Cloud Volumes ONTAP (CVO)
Amazon Web Services (AWS)
Google Cloud Platform (GCP)

Issue

An AWS / GCP CVO node rebooted, and the surviving HA partner sent the AutoSupport message HA Group Notification (MULTIPLE DISKS MISSING) ERROR.
The EMS log on the surviving node shows that it lost access to the mirrored Pool1 disks attached to the failed node.

Mon Jun 03 16:23:02 +0000 [CVO-01: monitor: monitor.globalStatus.critical:EMERGENCY]: This node has taken over CVO-02. One or more mirrored aggregates are degraded.

Mon Jun 03 16:22:35 +0000 [CVO-01: dmgr_thread: raid.disk.missing:info]: Disk /aggr1/plex1/rg0/0d.10 S/N [00000000V9NeubcHXfRG] UID [00000000V9NeubcHXfRG] is missing from the system
Mon Jun 03 16:22:35 +0000 [CVO-01: config_thread: raid.config.filesystem.disk.missing:info]: File system Disk /aggr1/plex1/rg0/0d.10 S/N [00000000V9NeubcHXfRG] UID [00000000V9NeubcHXfRG] is missing.

Note: The errors above are logged for every disk owned by the affected node, CVO-02.

The storage failover show output reports Previous giveback failed in module: raid, as follows:
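Because the same event is logged once per disk, it can help to condense an EMS excerpt into a list of the affected disks. The following is a minimal sketch, assuming the line layout seen in the excerpts above ("[node: process: event:severity]: message"). The names `EMS_EVENT`, `DISK`, `missing_disks`, and `sample` are hypothetical helpers for illustration and are not part of ONTAP.

```python
import re

# Layout assumed from the EMS excerpts in this article:
# "<timestamp> [<node>: <process>: <event>:<severity>]: <message>"
EMS_EVENT = re.compile(
    r"\[(?P<node>[^:\]]+): (?P<process>[^:\]]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<message>.*)"
)
# Disk path and serial number as they appear in the raid.disk.missing message.
DISK = re.compile(r"Disk (?P<path>/\S+) S/N \[(?P<serial>[^\]]+)\]")


def missing_disks(lines):
    """Return sorted (disk path, serial number) pairs from raid.disk.missing events."""
    found = set()
    for line in lines:
        event = EMS_EVENT.search(line)
        if event is None or event.group("event") != "raid.disk.missing":
            continue
        disk = DISK.search(event.group("message"))
        if disk:
            found.add((disk.group("path"), disk.group("serial")))
    return sorted(found)


# Sample lines copied from the EMS excerpt in this article.
sample = [
    "Mon Jun 03 16:23:02 +0000 [CVO-01: monitor: monitor.globalStatus.critical:EMERGENCY]: "
    "This node has taken over CVO-02. One or more mirrored aggregates are degraded.",
    "Mon Jun 03 16:22:35 +0000 [CVO-01: dmgr_thread: raid.disk.missing:info]: "
    "Disk /aggr1/plex1/rg0/0d.10 S/N [00000000V9NeubcHXfRG] UID [00000000V9NeubcHXfRG] "
    "is missing from the system",
    "Mon Jun 03 16:22:35 +0000 [CVO-01: config_thread: raid.config.filesystem.disk.missing:info]: "
    "File system Disk /aggr1/plex1/rg0/0d.10 S/N [00000000V9NeubcHXfRG] "
    "UID [00000000V9NeubcHXfRG] is missing.",
]

for path, serial in missing_disks(sample):
    print(path, serial)
```

The disk path encodes the aggregate and plex (here aggr1 and plex1), so the resulting list also shows which plex lost its disks.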

::> storage failover show
                              Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
CVO-01         CVO-02         false    Previous giveback failed in module: raid
CVO-02         CVO-01         -        Waiting for giveback
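When this output is captured as text (for example from a saved CLI session), the failed giveback can be detected programmatically. The sketch below assumes the input contains only the header, the dashed ruler, and one row per node on a single line, as in the output above. It does not handle wrapped state descriptions or trailing summary lines. The names `parse_failover_show`, `failed_giveback_modules`, and `SAMPLE` are hypothetical.

```python
import re

# State text reported for the surviving node in this article.
FAILED_GIVEBACK = re.compile(r"Previous giveback failed in module: (?P<module>\S+)")


def parse_failover_show(text):
    """Parse captured 'storage failover show' text into one dict per node row."""
    lines = text.splitlines()
    # Data rows start after the dashed ruler under the column headers.
    ruler = next(i for i, line in enumerate(lines) if line.lstrip().startswith("---"))
    rows = []
    for line in lines[ruler + 1:]:
        # The first three columns hold no spaces; the rest is the state description.
        fields = line.strip().split(None, 3)
        if len(fields) < 4:
            continue
        node, partner, possible, state = fields
        rows.append(
            {"node": node, "partner": partner, "takeover_possible": possible, "state": state}
        )
    return rows


def failed_giveback_modules(rows):
    """Map each node reporting a failed giveback to the module named in its state."""
    result = {}
    for row in rows:
        match = FAILED_GIVEBACK.search(row["state"])
        if match:
            result[row["node"]] = match.group("module")
    return result


# Sample text reproducing the output shown in this article.
SAMPLE = """\
                              Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
CVO-01         CVO-02         false    Previous giveback failed in module: raid
CVO-02         CVO-01         -        Waiting for giveback
"""

rows = parse_failover_show(SAMPLE)
print(failed_giveback_modules(rows))
```

Splitting on whitespace with a limit of three splits, rather than slicing by column offsets, keeps the sketch tolerant of state descriptions that run past the ruler width.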

EMS log (the following errors may repeat until the RAID resync completes):

Sat Jul 19 04:15:20 +0000 [CVO-01: cf_main: gb.cfo.abort.raid.fm:error]: Aggregate local:aggr8 is being resynced; canceling giveback.
Sat Jul 19 04:15:20 +0000 [CVO-01: cf_main: cf.rsrc.givebackVeto:alert]: Failover monitor: raid: giveback canceled due to active state.
Sat Jul 19 04:15:20 +0000 [CVO-01: cf_main: cf.fsm.autoGivebackVetoed:error]: Failover monitor: Automatic giveback has been deferred due to long running operations
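Since these vetoes can recur until the resync finishes, a per-aggregate tally of the gb.cfo.abort.raid.fm events shows which aggregates are still holding up giveback. This is a minimal sketch based only on the message wording in the excerpt above. `RESYNC_VETO`, `resync_veto_counts`, and `sample` are hypothetical names.

```python
import re
from collections import Counter

# Matches the giveback-abort event and captures the aggregate being resynced.
RESYNC_VETO = re.compile(
    r"gb\.cfo\.abort\.raid\.fm:\w+\]: Aggregate (?P<aggr>\S+) is being resynced"
)


def resync_veto_counts(lines):
    """Count giveback aborts per aggregate that is still resyncing."""
    counts = Counter()
    for line in lines:
        # finditer also copes with several events fused onto one line.
        for match in RESYNC_VETO.finditer(line):
            counts[match.group("aggr")] += 1
    return counts


# Sample lines copied from the EMS excerpt in this article.
sample = [
    "Sat Jul 19 04:15:20 +0000 [CVO-01: cf_main: gb.cfo.abort.raid.fm:error]: "
    "Aggregate local:aggr8 is being resynced; canceling giveback.",
    "Sat Jul 19 04:15:20 +0000 [CVO-01: cf_main: cf.rsrc.givebackVeto:alert]: "
    "Failover monitor: raid: giveback canceled due to active state.",
    "Sat Jul 19 04:15:20 +0000 [CVO-01: cf_main: cf.fsm.autoGivebackVetoed:error]: "
    "Failover monitor: Automatic giveback has been deferred due to long running operations",
]

for aggr, count in resync_veto_counts(sample).items():
    print(aggr, count)
```

An aggregate that stops appearing in newer log windows has most likely finished resyncing, although that inference is an assumption of this sketch and should be confirmed against the aggregate status.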

Shortly after this event, the following AutoSupport alerts may be generated as a residual symptom of the missing disks:

HA Group Notification (SYNCMIRROR PLEX FAILED) ALERT

NODEOQ: HA Group Notification from CVO-02 (NODE(S) OUT OF CLUSTER QUORUM) EMERGENCY

After the reboot, the node re-establishes connectivity to the presented AWS / GCP disks, and the giveback completes successfully.