
Multi-disk failure due to loss of backend FlexArray disks

Category: ontap-9
Specialty: HW

Environment

  • ONTAP 9
  • FlexArray

Issue

  • A single node reboots because of a multi-disk failure:

Thu May 15 05:04:39 -0400 [Node-01: cf_main: cf.fsm.takeover.mdp:alert]: Failover monitor: takeover attempted after multi-disk failure on partner

  • The problem is isolated to a single storage port.
  • EMS messages show that IO to disks on that storage port was aborted and then retried successfully through the partner switch:

Thu May 15 00:23:37 -0400 [Node-02: slifc_timeout_1: fci.device.quiesce:debug]: Adapter 2c encountered a command timeout on Disk device Switch-1:21.126 (0x010b1500) LUN 2 cdb 0x2a:0d3619d3:019b retry: 0 Quiescing the device.
Thu May 15 00:23:40 -0400 [Node-02: slifc_timeout_1: fci.device.timeout:debug]: HBA 2c encountered a device timeout on Disk device Switch-1:21.126 (0x010b1500) LUN 2 cdb 0x2a:0d3619d3:019b retry: 0
Thu May 15 00:23:46 -0400 [Node-02: slifc_intrd: scsi.cmd.abortedByHost:error]: Disk device Switch-1:21.126L42: Command aborted by host adapter: HA status 0x4: cdb 0x2a:0d3619d3:019b. 
Thu May 15 00:23:46 -0400 [Node-02: slifc_intrd: scsi.cmd.retrySuccess:debug]: Disk device Switch-2:21.126L42: request successful after retry #1/#0: cdb 0x2a:0d3619d3:019b (24266).

  • In some cases the IO fails instead of being aborted, and the disk is marked as not responding:

Thu May 15 05:04:39 -0400 [Node-02: slifc_intrd: scsi.cmd.pastTimeToLive:error]: Disk device Switch-1:21.126L42: request failed after try #1: cdb 0x8a:00000001cfccd24a:00000249. 
Thu May 15 05:04:39 -0400 [Node-02: config_thread: raid.config.filesystem.disk.not.responding:notice]: File system Disk /aggr1/plex0/rg0/Switch-1:21.126L42 Shelf - Bay - [HITACHI  OPEN-V 8301] S/N [XXXXXXXXXXXX] UID [xx...xx] is not responding.
Thu May 15 05:04:39 -0400 [Node-02: config_thread: cf.multidisk.fatalProblem:error]: Node encountered a multidisk error or other fatal error while waiting to be taken over. aggr aggr1: raid volfsm, fatal disk error in RAID group with no parity disk..  Raid type - raid0 Group name plex0/rg0 state NORMAL. 1 disk failed in the group. Disk Switch-1:21.126L19 Shelf - Bay - [HITACHI  OPEN-V 8301] S/N [XXXXXXXXXXXX] UID [xx..xx] error: disk operation timed out..

  • After the reboot, all disks are visible and the aggregates are healthy.
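
To check whether the errors are confined to one path, the EMS excerpts above can be grouped by switch and event name. The following is a minimal sketch, assuming the EMS log has been exported as plain text in the same layout as the excerpts in this article. The function name summarize_by_path, both regular expressions, and the SAMPLE_LINES list are illustrative assumptions and are not part of ONTAP or any NetApp tool. The sample lines are copied from the messages shown above.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpts in this article:
# "<timestamp> [<node>: <process>: <event>:<severity>]: <message>"
EMS_LINE = re.compile(
    r"^(?P<timestamp>.+?) \[(?P<node>[^:\]]+): (?P<process>[^:\]]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<message>.*)$"
)

# Array LUN path as printed in the messages, for example "Switch-1:21.126".
DEVICE = re.compile(r"(?P<switch>[\w-]+):(?P<port>\d+\.\d+)")


def summarize_by_path(lines):
    """Count EMS events per (switch, event name) for lines that name a device."""
    counts = Counter()
    for line in lines:
        parsed = EMS_LINE.match(line.strip())
        if not parsed:
            continue  # not in the assumed EMS layout
        device = DEVICE.search(parsed.group("message"))
        if device:
            counts[(device.group("switch"), parsed.group("event"))] += 1
    return counts


# Sample input copied from the EMS messages quoted in this article.
SAMPLE_LINES = [
    "Thu May 15 00:23:37 -0400 [Node-02: slifc_timeout_1: fci.device.quiesce:debug]: "
    "Adapter 2c encountered a command timeout on Disk device Switch-1:21.126 "
    "(0x010b1500) LUN 2 cdb 0x2a:0d3619d3:019b retry: 0 Quiescing the device.",
    "Thu May 15 00:23:46 -0400 [Node-02: slifc_intrd: scsi.cmd.abortedByHost:error]: "
    "Disk device Switch-1:21.126L42: Command aborted by host adapter: "
    "HA status 0x4: cdb 0x2a:0d3619d3:019b.",
    "Thu May 15 00:23:46 -0400 [Node-02: slifc_intrd: scsi.cmd.retrySuccess:debug]: "
    "Disk device Switch-2:21.126L42: request successful after retry #1/#0: "
    "cdb 0x2a:0d3619d3:019b (24266).",
    "Thu May 15 05:04:39 -0400 [Node-02: slifc_intrd: scsi.cmd.pastTimeToLive:error]: "
    "Disk device Switch-1:21.126L42: request failed after try #1: "
    "cdb 0x8a:00000001cfccd24a:00000249.",
]

if __name__ == "__main__":
    for (switch, event), count in sorted(summarize_by_path(SAMPLE_LINES).items()):
        print(f"{switch:10s} {event:28s} {count}")
```

With the sample lines, the timeout, abort, and failure events all fall under Switch-1, while the only Switch-2 entry is the successful retry. That matches the single-port pattern described above. On a real log, the regular expressions may need adjusting if the device naming differs.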

