Remote drives fail after DWDM maintenance
Environment
- MetroCluster IP
- ONTAP 9
Issue
- Errors are logged in the EMS log
- Remote drives are reported as failed
- AutoSupport messages report multiple missing disks because a SyncMirror plex has failed
- EMS log:
Mon Nov 08 12:32:07 +0100 [ClusterA-02: kernel: iscsi.session.stateChanged:notice]: iSCSI session state is changed to Reconnecting for the target iqn.2016-07.com.netapp: (type: dr_auxiliary, address: 0.0.0.0:65200). Reason: no ping reply after 5 seconds.
Mon Nov 08 12:32:07 +0100 [ClusterA-02: intr: ctl.session.stateChanged:notice]: iSCSI CAM target layer's session state is changed to terminated for the initiator iqn.1994-09.org.freebsd: (address: 0.0.0.0). Reason: no ping reply after 5 seconds.
Mon Nov 08 12:32:07 +0100 [ClusterA-02: kernel: iscsi.session.stateChanged:notice]: iSCSI session state is changed to Reconnecting for the target iqn.2016-06.com.netapp: (type: dr_auxiliary, address: 0.0.0.0:65200). Reason: no ping reply after 5 seconds.
Mon Nov 08 12:32:07 +0100 [ClusterA-02: doneq0: scsi.mcc.adt.ioTransportError:error]: mcc_adt[1] - Transport error during execution of command: HA status 0x13: CAM transport status 0x1b : cdb 0x88:00000001bf178980:00000008.
Mon Nov 08 12:32:07 +0100 [ClusterA-02: doneq0: scsi.mcc.adt.ioTransportError:error]: mcc_adt[1] - Transport error during execution of command: HA status 0x13: CAM transport status 0x1b : cdb 0x88:00000001bf178190:00000008.
Mon Nov 08 12:32:07 +0100 [ClusterA-02: scsi_cmdblk_strthr_admin: scsi.cmd.abortedByHost:error]: Disk device 0v.i2.1L21: Command aborted by host adapter: HA status 0x13: cdb 0x88:00000001bf178980:00000008.
- sysconfig -r output:
Plex /ClusterA-02/plex1 (offline, failed, inactive, pool1)
  RAID group /ClusterA-02/plex1/rg0 (partial, block checksums)

    RAID Disk Device        HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)     Phys (MB/blks)
    --------- ------        ------------- ---- ---- ---- ----- --------------     --------------
    dparity   0m.i2.3L13P1  0m     20  12         1 SSD    N/A 1799343/3685054464 1799351/3685070848 (fast zeroed)
    parity    0m.i1.0L14P1  0m     20  13         1 SSD    N/A 1799343/3685054464 1799351/3685070848 (fast zeroed)
    data      FAILED                                       N/A 1799343/ -
    data      FAILED                                       N/A 1799343/ -
    data      FAILED                                       N/A 1799343/ -
    data      FAILED                                       N/A 1799343/ -
    data      FAILED                                       N/A 1799343/ -

  Raid group is missing 5 disks.
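
When many EMS lines like the ones above have been collected, it can help to tally them by event name and severity to see how the iSCSI session drops and the transport errors cluster in time. The following is a minimal Python sketch, not a NetApp tool. It assumes the bracketed header layout seen in the excerpt above (`[node: process: event.name:severity]: message`). The names `parse_ems_line` and `count_events` are hypothetical helpers introduced only for illustration.

```python
import re
from collections import Counter

# Assumed layout, taken from the EMS excerpt above:
#   <timestamp> [<node>: <process>: <event.name>:<severity>]: <message>
EMS_HEADER = re.compile(
    r"\[(?P<node>[^:\]]+): (?P<process>[^:\]]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<message>.*)"
)


def parse_ems_line(line):
    """Return the header fields of one EMS line as a dict, or None if no match."""
    match = EMS_HEADER.search(line)
    return match.groupdict() if match else None


def count_events(lines):
    """Count occurrences of each (event name, severity) pair in the given lines."""
    counts = Counter()
    for line in lines:
        record = parse_ems_line(line)
        if record is not None:
            counts[(record["event"], record["severity"])] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python ems_tally.py < ems_excerpt.txt
    for (event, severity), n in count_events(sys.stdin).most_common():
        print(f"{n:5d}  {severity:8s}  {event}")
```

Lines that do not carry the bracketed header, such as the `sysconfig -r` output, are skipped rather than raising an error.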