"Disk redundancy failed" caused by a transceiver issue

Visibility: Internal
Category: metrocluster
Specialty: MetroCluster

Applies to

  • MetroCluster IP
  • Cisco back-end switches

Issue

  1. Error messages:

Tue Sep 03 04:51:31 +0200 [ClusterA-02: wafl_exempt09: mirror.stream.qp.error:debug]: params: {'mirror': 'DR PARTNER', 'qp_name': 'WAFL', 'error': 'NVMM_ERR_MIRROR_POLL_TIMEOUT'}
Tue Sep 03 04:51:31 +0200 [ClusterA-02: wafl_exempt09: nvmm.mirror.aborting:debug]: mirror of sysid 2, partner_type DR PARTNER and mirror state NVMM_MIRROR_ONLINE is aborted because of reason NVMM_ERR_MIRROR_POLL_TIMEOUT.
Tue Sep 03 04:51:31 +0200 [ClusterA-02: nvmm_error: mirror.stream.qp.error:debug]: params: {'mirror': 'DR PARTNER', 'qp_name': 'WAFL', 'error': 'NVMM_ERR_MIRROR_COMPLETION'}
Tue Sep 03 04:51:31 +0200 [ClusterA-02: nvmm_error: ems.engine.suppressed:debug]: Event 'rdma.rlib.event.error' suppressed 11 times in last 263 seconds.
Tue Sep 03 04:51:31 +0200 [ClusterA-02: nvmm_error: rdma.rlib.event.error:debug]: QP wafl event error: client disconnect.
Tue Sep 03 04:51:31 +0200 [ClusterA-02: nvmm_error: nvmm.mirror.offlined:debug]: params: {'mirror': 'DR_PARTNER'}
Tue Sep 03 04:51:31 +0200 [ClusterA-02: DR_heartbeat_thread: cf.ic.xferTimedOut:error]: HA interconnect: MCC_DRSOM transfer timed out.

A subsequent retry then succeeds, for example:

Tue Sep 03 04:51:32 +0200 [ClusterA-02: iw_cm_wq: rdma.rlib.connected:debug]: wafl:DR:A QP is now connected.
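When many such sequences appear in a log, it can help to measure how long each interruption lasted. The following is a minimal Python sketch, not a NetApp tool. It assumes the EMS line layout seen in the excerpts above, that all lines fall on the same day, and that `rdma.rlib.connected` can be treated as the recovery marker that follows `nvmm.mirror.offlined`. The names `EMS_LINE` and `mirror_outages` are hypothetical.

```python
import re

# Assumed layout: "<day> <mon> <dd> <hh:mm:ss> <tz> [node: source: event:severity]: message"
EMS_LINE = re.compile(
    r"^\w{3} \w{3} \d{2} (?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2}) [+-]\d{4} "
    r"\[(?P<node>[^:\]]+): (?P<source>[^:\]]+): (?P<event>[\w.]+):(?P<severity>\w+)\]: "
    r"(?P<message>.*)$"
)


def mirror_outages(lines):
    """Return (node, seconds) for each offlined -> connected transition.

    Assumes all lines belong to the same day (no midnight rollover).
    """
    offline_since = {}  # node -> second of day when the mirror went offline
    outages = []
    for line in lines:
        match = EMS_LINE.match(line.strip())
        if not match:
            continue
        second_of_day = (
            int(match.group("h")) * 3600
            + int(match.group("m")) * 60
            + int(match.group("s"))
        )
        node = match.group("node")
        event = match.group("event")
        if event == "nvmm.mirror.offlined":
            # Keep the first offline time if the event repeats.
            offline_since.setdefault(node, second_of_day)
        elif event == "rdma.rlib.connected" and node in offline_since:
            outages.append((node, second_of_day - offline_since.pop(node)))
    return outages


sample = [
    "Tue Sep 03 04:51:31 +0200 [ClusterA-02: nvmm_error: nvmm.mirror.offlined:debug]: params: {'mirror': 'DR_PARTNER'}",
    "Tue Sep 03 04:51:32 +0200 [ClusterA-02: iw_cm_wq: rdma.rlib.connected:debug]: wafl:DR:A QP is now connected.",
]
print(mirror_outages(sample))  # [('ClusterA-02', 1)]
```

Frequent short outages reported this way would be consistent with the intermittent pattern shown in the excerpts, but the sketch itself says nothing about the cause.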

  2. Error messages, interleaved with successful retries, against many different disks (all of them remote disks):

Tue Sep 03 04:51:34 +0200 [ClusterA-02: doneq0: scsi.mcc.adt.ioTransportError:error]: mcc_adt[2] - Transport error during execution of command: HA status 0x13: CAM transport status 0x1b: cdb 0x28:356b73b3:000d.
Tue Sep 03 04:51:34 +0200 [ClusterA-02: doneq0: scsi.mcc.adt.ioTransportError:error]: mcc_adt[2] - Transport error during execution of command: HA status 0x13: CAM transport status 0x1b : cdb 0x28:356b6555:000d....
Tue Sep 03 04:51:34 +0200 [ClusterA-02: scsi_cmdblk_strthr_admin: scsi.cmd.abortedByHost:error]: Disk device 0m.i1.2L17: Command aborted by host adapter: HA status 0x13: cdb 0x28:356b73b3:000d.
Tue Sep 03 04:51:34 +0200 [ClusterA-02: scsi_cmdblk_strthr_admin: scsi.cmd.abortedByHost:error]: Disk device 0m.i1.2L17: Command aborted by host adapter: HA status 0x13: cdb 0x28:356b6555:000d.
Tue Sep 03 04:51:34 +0200 [ClusterA-02: scsi_cmdblk_strthr_admin: scsi.cmd.retrySuccess:debug]: Disk device 0v.i1.0L17: request successful after retry #1/#0: cdb 0x28:356b73b3:000d (1967)
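To check which aborted commands were later retried successfully, the events can be matched on their CDB value. In the excerpt above the abort and the successful retry are logged against different device names (`0m.i1.2L17` and `0v.i1.0L17`), so the device name is not used as the key. The following is a minimal Python sketch under that assumption; `EMS_HEADER`, `CDB` and `correlate_retries` are hypothetical names, not part of any NetApp tooling.

```python
import re

# Bracketed EMS header: [node: source: event.name:severity]
EMS_HEADER = re.compile(
    r"\[(?P<node>[^:\]]+): (?P<source>[^:\]]+): (?P<event>[\w.]+):(?P<severity>\w+)\]"
)
# CDB token as it appears in the messages, e.g. "cdb 0x28:356b73b3:000d"
CDB = re.compile(r"cdb (0x[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+)")


def correlate_retries(lines):
    """Map each aborted CDB to True if a later retrySuccess was logged for it."""
    outcome = {}
    for line in lines:
        header = EMS_HEADER.search(line)
        cdb = CDB.search(line)
        if not (header and cdb):
            continue
        event = header.group("event")
        key = cdb.group(1)
        if event == "scsi.cmd.abortedByHost":
            # Record the abort; leave an earlier result untouched.
            outcome.setdefault(key, False)
        elif event == "scsi.cmd.retrySuccess" and key in outcome:
            outcome[key] = True
    return outcome


scsi_sample = [
    "Tue Sep 03 04:51:34 +0200 [ClusterA-02: scsi_cmdblk_strthr_admin: scsi.cmd.abortedByHost:error]: Disk device 0m.i1.2L17: Command aborted by host adapter: HA status 0x13: cdb 0x28:356b73b3:000d.",
    "Tue Sep 03 04:51:34 +0200 [ClusterA-02: scsi_cmdblk_strthr_admin: scsi.cmd.abortedByHost:error]: Disk device 0m.i1.2L17: Command aborted by host adapter: HA status 0x13: cdb 0x28:356b6555:000d.",
    "Tue Sep 03 04:51:34 +0200 [ClusterA-02: scsi_cmdblk_strthr_admin: scsi.cmd.retrySuccess:debug]: Disk device 0v.i1.0L17: request successful after retry #1/#0: cdb 0x28:356b73b3:000d (1967)",
]
print(correlate_retries(scsi_sample))
# {'0x28:356b73b3:000d': True, '0x28:356b6555:000d': False}
```

A `False` entry only means that no matching retry appears in the lines supplied; in this excerpt the log is truncated, so the second command may well have succeeded later.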


