
Cluster network degraded due to a high number of CRC errors on cluster ports

Applies to

  • ONTAP 9
  • FAS / AFF systems
  • CN1610 cluster switch
  • BES-53248 cluster switch

Issue

  • The cluster network is in a degraded state because of CRC errors, and the event log records the following errors:

[Node-01: intr: netif.linkErrors:error]: Excessive link errors on network interface e0b. Might indicate a bad cable, switch port, or NIC, or that a cable connector is not fully inserted in a socket. On a 10/100 port, might indicate a duplex mismatch.
[Node-01: vifmgr: vifmgr.cluscheck.hwerrors:alert]: Port e0a on node Node-01 is reporting a high number (at least 1 per 1000 packets) of observed hardware errors (CRC, length, alignment, dropped).
[Node-01: vifmgr: callhome.clus.net.degraded:alert]: Call home for CLUSTER NETWORK DEGRADED: CRC Errors Detected - High CRC errors detected on port e0a node Node-01

  • When link flapping is detected on a cluster port, the event log records the following alerts:

[Node-01: vifmgr: vifmgr.port.monitor.failed:error]: The "link_flapping" health check for port e0a (node Node-01) has failed. The port is operating in a degraded state.
[Node-01: vifmgr: callhome.clus.net.degraded:alert]: Call home for CLUSTER NETWORK DEGRADED: Frequent Link Flapping - Cluster port e0a on node Node-01 has experienced multiple link down notifications.
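When many of these messages arrive at once, it can help to reduce them to a list of affected node/port pairs per event. The following is a minimal sketch, assuming the messages have been saved as plain text in the bracketed `[node: source: event:severity]: message` form shown above. The function names `parse_ems_line` and `ports_by_event` are hypothetical helpers for illustration and are not part of ONTAP.

```python
import re

# Bracketed header as it appears in the messages above:
# [node: source: event.name:severity]: message text
EMS_LINE = re.compile(
    r"^\[(?P<node>[^:]+): (?P<source>[^:]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<message>.*)$"
)

# Assumption: port names look like e0a or e0b and follow the word
# "port" or "interface" in the message text.
PORT = re.compile(r"\b(?:port|interface) (e\d+[a-z])\b", re.IGNORECASE)


def parse_ems_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = EMS_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    port = PORT.search(record["message"])
    record["port"] = port.group(1) if port else None
    return record


def ports_by_event(lines):
    """Group affected (node, port) pairs by event name."""
    summary = {}
    for line in lines:
        record = parse_ems_line(line)
        if record and record["port"]:
            pair = (record["node"], record["port"])
            summary.setdefault(record["event"], set()).add(pair)
    return summary
```

Passing the five messages above to `ports_by_event` yields one entry per event name, which makes it easier to see whether the CRC and link-flapping alerts point at the same ports.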

  • The cluster ports on all nodes report a high number of CRC errors:

::> system node run -node <node-name> -command ifstat <port-name>

-- interface  e0a  (4 days, 14 hours, 42 minutes, 47 seconds) --

RECEIVE
 Total frames:    86771k | Frames/second:     218 | Total bytes:       289g
 Bytes/second:      727k | Total errors:    65389 | Errors/minute:       10
 Total discards:       0 | Discards/minute:     0 | Multi/broadcast:   121k
 Non-primary u/c:      0 | CRC errors:      22207 | Runt frames:          0
 Fragment:             0 | Long frames:         0 | Jabber:           41971
 Length errors:     1211 | No buffer:           0 | Xon:                  0
 Xoff:                 0 | Pause:               0 | Jumbo:           31475k
 Noproto:              0 | Error symbol:     243k | Illegal symbol:    217k
 Bus overruns:         0 | Queue drops:         0 | LRO segments:    62544k
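The hardware-error alert above mentions a rate of at least 1 error per 1000 packets, so a quick way to compare ports is to turn the `ifstat` counters into an errors-per-1000-frames figure. The sketch below parses the `label: value | label: value` layout shown above. It assumes that the `k`, `m`, and `g` suffixes are decimal multipliers, which the output itself does not confirm, and `parse_ifstat` and `errors_per_thousand` are hypothetical helper names.

```python
import re

# Assumption: suffixes are decimal multipliers (k = 1,000 and so on).
SUFFIX = {"k": 10**3, "m": 10**6, "g": 10**9}

# One cell of the output, for example "Total frames:   86771k".
CELL = re.compile(r"\s*(.+?):\s*(\d+)([kmg]?)\s*$")


def parse_ifstat(text):
    """Return {counter name: integer value} for the counters found in text."""
    counters = {}
    for line in text.splitlines():
        for cell in line.split("|"):
            match = CELL.match(cell)
            if match:
                name, digits, suffix = match.groups()
                counters[name] = int(digits) * SUFFIX.get(suffix, 1)
    return counters


def errors_per_thousand(counters):
    """Total errors per 1000 received frames, or 0.0 if no frames were seen."""
    frames = counters.get("Total frames", 0)
    if frames == 0:
        return 0.0
    return 1000 * counters.get("Total errors", 0) / frames
```

The counters are cumulative over the period shown in the interface header, so the result is only a rough indicator and need not match the interval that the health check uses. It is most useful for comparing ports against each other.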

  • The switch side also shows a large number of Rx errors, Tx errors, and port flaps:

#show interface counters

Port      InOctets         InUcastPkts      InMcastPkts      InBcastPkts      InDropPkts       Rx Error
--------- ---------------- ---------------- ---------------- ---------------- ---------------- ----------------
0/1         63884683472614      34223820975           116925            80962                5            35838
0/2        265584648397991      43844458781           116922            81071                1          1961079

Port      OutOctets        OutUcastPkts     OutMcastPkts     OutBcastPkts     OutDropPkts      Tx Error
--------- ---------------- ---------------- ---------------- ---------------- ---------------- ----------------
0/1        265607061634499      43843431844          1638223           565759          1952351          1952351
0/2         63884090686727      34225894361          1638180           565624            35018            35015
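On a switch with many ports, the error columns are easier to review once the two counter tables are merged per port. The following sketch assumes the output has been captured as text in the layout shown above, with column names separated by at least two spaces and a dashed separator line. `parse_counter_tables` and `ports_with_errors` are hypothetical helper names, not switch commands.

```python
import re


def parse_counter_tables(text):
    """Merge the Rx and Tx tables into {port: {column name: value}}."""
    rows = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, the command prompt, and dashed separator lines.
        if not line or line.startswith("#") or set(line) <= {"-", " "}:
            continue
        if line.startswith("Port"):
            # "Rx Error" and "Tx Error" contain a single space, so split
            # the header only on runs of two or more spaces.
            header = re.split(r"\s{2,}", line)
            continue
        if header is None:
            continue
        fields = line.split()
        values = dict(zip(header[1:], (int(f) for f in fields[1:])))
        rows.setdefault(fields[0], {}).update(values)
    return rows


def ports_with_errors(rows, columns=("Rx Error", "Tx Error")):
    """Return only the ports whose error columns are non-zero."""
    result = {}
    for port, counters in rows.items():
        errors = {c: counters[c] for c in columns if counters.get(c, 0) > 0}
        if errors:
            result[port] = errors
    return result
```

Calling `ports_with_errors(parse_counter_tables(output))` returns a dictionary keyed by switch port, which can then be matched against the node ports that report CRC errors.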

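  • Replacing the SFP on the node side does not stop the errors.
  • According to the output of network device-discovery show, all nodes and ports that report errors on the storage side are connected to the same switch.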


NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.