CVO node reboots and HA takeover is impossible due to a network issue with Azure VNet encryption
Environment
- Cloud Volumes ONTAP (CVO) in Microsoft Azure
- ONTAP 9
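Issue
- In a Cloud Volumes ONTAP HA cluster in Azure, one node rebooted unexpectedly and the partner node could not perform a takeover, resulting in a complete outage. Both nodes later returned to a healthy state, but the following log messages were triggered during the initial incident: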
[cluster-02:cf_main:callhome.partner.down:EMERGENCY]: Callhome for PARTNER DOWN, TAKEOVER IMPOSSIBLE
[cluster-02:cf_main:cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of cluster-01 disabled (unsynchronized log).
[cluster-02:cf_main:cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of cluster-01 disabled (HA interconnect error. Verify that the partner node is running and that the HA interconnect cabling is correct, if applicable. For further assistance, contact technical support).
[cluster-01:nlbd:vsa.azure.nlb.probeInactive:alert]: Failed to receive Load Balancer probe (now inactive) for 2 ports (port range: 63001 to 63010), within 15 seconds.
[cluster-01:mgwd:dns.server.timed.out:error]: DNS server 10.0.0.25 did not respond to vserver=snm_name08 within timeout interval.
[cluster-01:vifmgr:vifmgr.cluscheck.droppedall:alert]: Total packet loss when pinging from cluster lif cluster-01_clus_1 (node cluster-01) to cluster lif cluster-02_clus_2 (node cluster-02).
[cluster-01:raid.vol.reparity.issue:notice]: Aggregate aggr1_1801 has invalid NVRAM contents.
[cluster-01:nv.data.loss.possible:notice]: An unexpected shutdown occurred while in high write speed mode, which possibly caused a loss of data.
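When reviewing an incident like this, it can help to split the log into individual events and group them by node and severity. The following is a minimal Python sketch, not a NetApp tool. It assumes each event starts with a bracketed header of the form `[node:process:event.name:severity]:`, which is inferred only from the messages above, and it treats the process field as optional because one of the messages omits it. The names `EmsEvent` and `parse_ems` are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class EmsEvent(NamedTuple):
    node: str
    process: Optional[str]  # None when the header has no process field
    event: str
    severity: str           # lower-cased for easier comparison
    message: str


# Assumed header shape: [node:process:event.name:severity]:
# The process field is optional, as in [cluster-01:raid.vol.reparity.issue:notice]:
HEADER = re.compile(
    r"\[([^\[\]:\s]+):(?:([^\[\]:\s]+):)?([^\[\]:\s]+):([A-Za-z]+)\]:\s*"
)


def parse_ems(text: str) -> List[EmsEvent]:
    """Split a run of log text into events, even if several share one line."""
    matches = list(HEADER.finditer(text))
    events = []
    for i, m in enumerate(matches):
        # The message runs from the end of this header to the next header.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        node, process, event, severity = m.groups()
        message = text[m.end():end].strip()
        events.append(EmsEvent(node, process, event, severity.lower(), message))
    return events
```

For example, `[e.event for e in parse_ems(log_text) if e.node == "cluster-01"]` would list the event names reported by the rebooted node, where `log_text` is the log text shown above.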