Takeover caused by multiple disk failures

Category:
metrocluster
Specialty:
7dot

Applies to

  • Data ONTAP 8.2.5P5 7-Mode
  • FAS6250
  • 2-node fabric-attached MetroCluster

Issue

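The messages log shows multiple disk-related errors:
  • "Invalid checksum entry ... during write operation" on multiple disks
  • Multiple disks orphaned because they are "not in consistent label set (CLS)"
  • Multiple disks orphaned because they are "more recent ... than the calculated plex consistent label set"
  • An AutoSupport message is triggered for "SYNCMIRROR PLEX FAILED"
  • "diskown.ownerReservationMismatch" errors are reported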
 
Example:
Sat May 15 04:50:41 UTC [Node01:raid.tetris.cksum.embed:CRITICAL]: Invalid checksum entry on Disk /aggr_Node01_data/plex1/rg1/Site01-sw1:2.126L36 Shelf 31 Bay 9 [NETAPP   X422_SLTNG600A10 NA02] S/N [SerialNumber], block #60799576, during write operation.  
Sat May 15 04:51:16 UTC [Node01:raid.assim.cls.notInCls:error]: Orphaning disk Site02-sw1:2.126L14 in plex aggr_Node01_data/1, because not in consistent label set (CLS). 
Sat May 15 04:51:16 UTC [Node01:raid.assim.cls.moreRecent:error]: Orphaning disk Site01-sw2:2.126L14 in plex aggr_Node01_data/0, because it is more recent (146175/1789746823, 146175/1789746823) than the calculated plex consistent label set (146174/1789745659).
Sat May 15 04:51:16 UTC [Node01:raid.assim.rg.missingChild:error]: Aggregate aggr_Node01_data, rgobj_verify: RAID object 0 has only 18 valid children, expected 22.  
Sat May 15 04:51:16 UTC [Node01:raid.assim.plex.missingChild:error]: Aggregate aggr_Node01_data, plexobj_verify: Plex 1 only has 1 working RAID groups (2 total) and is being taken offline  
Sat May 15 04:51:16 UTC [Node01:callhome.syncm.plex:CRITICAL]: Call home for SYNCMIRROR PLEX FAILED 
Sat May 15 04:51:17 UTC [Node01:raid.config.check.failedPlex:error]: Plex /aggr_Node01_data/plex1 has failed.  
Sat May 15 04:51:17 UTC [Node01:monitor.diskLabelCheckFailed:warning]: Periodic check of RAID Disk /aggr_Node01_data/plex1/rg0/Site01-sw1:2.126L54 Shelf 32 Bay 1 [NETAPP   X422_SLTNG600A10 NA02] S/N [SerialNumber] has failed. The system will correct the problem.  
Sat May 15 04:51:17 UTC [Node01:monitor.diskLabelCheckFailed:warning]: Periodic check of RAID Disk Site01-sw1:2.126L14 Shelf 30 Bay 13 [NETAPP   X422_SCOMP600A10 NA03] S/N [SerialNumber] has failed. The system will correct the problem.  
Sat May 15 04:51:17 UTC [Node01:raid.config.check.failedPlex:error]: Plex /aggr_Node01_data/plex1 has failed.  
Sat May 15 04:51:39 UTC [Node01:diskown.ownerReservationMismatch:warning]: disk Site01-sw2:2.126L12 (S/N SerialNumber) is supposed to be owned by this node but has a persistent reservation placed by node ?? (ID 28600)
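
When a log contains many of these messages, it can help to group them by event name to see which errors dominate. The following is a minimal Python sketch, not a NetApp tool: the regular expression and the helper names `parse_ems_line` and `count_events` are assumptions based only on the line layout visible in the example above (`<timestamp> [<node>:<event>:<severity>]: <message>`).

```python
import re
from collections import Counter

# Assumed layout, inferred from the example lines above:
#   <timestamp> [<node>:<event.name>:<severity>]: <message>
EMS_LINE = re.compile(
    r"^(?P<timestamp>.+?) "
    r"\[(?P<node>[^:\]]+):(?P<event>[^:\]]+):(?P<severity>[^\]]+)\]: "
    r"(?P<message>.*)$"
)


def parse_ems_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = EMS_LINE.match(line.strip())
    return match.groupdict() if match else None


def count_events(lines):
    """Count how often each event name appears in the given lines."""
    counts = Counter()
    for line in lines:
        fields = parse_ems_line(line)
        if fields:
            counts[fields["event"]] += 1
    return counts


# Three lines copied from the example above.
sample = [
    "Sat May 15 04:51:16 UTC [Node01:raid.assim.cls.notInCls:error]: "
    "Orphaning disk Site02-sw1:2.126L14 in plex aggr_Node01_data/1, "
    "because not in consistent label set (CLS).",
    "Sat May 15 04:51:17 UTC [Node01:raid.config.check.failedPlex:error]: "
    "Plex /aggr_Node01_data/plex1 has failed.",
    "Sat May 15 04:51:17 UTC [Node01:raid.config.check.failedPlex:error]: "
    "Plex /aggr_Node01_data/plex1 has failed.",
]

for event, count in count_events(sample).most_common():
    print(f"{count:3d}  {event}")
```

Lines that do not follow the assumed layout, such as console output, are skipped rather than raising an error.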
 
Shortly after the first of these errors occurs, the node is taken over by its partner because the node has entered a degraded state.
 
Example:
 A disk reservation was detected on disk Site01-sw1:2.126L8 at DDMMMYYYY 04:53:51
Ordinarily, this will only occur if the partner node has taken over.
This node will be shutdown.
HALT: HA partner has taken over disk reservations
Uptime: ddhhmmss
System rebooting...
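
To put a number on "shortly after", the two timestamps in the examples can be compared. This is a small Python sketch under one stated assumption: both events fall on the same calendar day, because the console output masks the date as DDMMMYYYY. The helper name `seconds_between` is hypothetical.

```python
from datetime import datetime

# Assumption: both events occur on the same day; the console output masks
# the date, so only the time of day is compared.
TIME_FORMAT = "%H:%M:%S"


def seconds_between(first_time, second_time):
    """Return the seconds elapsed from first_time to second_time on one day."""
    start = datetime.strptime(first_time, TIME_FORMAT)
    end = datetime.strptime(second_time, TIME_FORMAT)
    return int((end - start).total_seconds())


first_error = "04:50:41"           # raid.tetris.cksum.embed in the log example
reservation_detected = "04:53:51"  # "A disk reservation was detected" on the console

gap = seconds_between(first_error, reservation_detected)
print(f"Disk reservation detected {gap} seconds after the first error")
```

Under that assumption, the gap between the first checksum error and the detected disk reservation is a little over three minutes.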
 

 

