
Brocade switch port flapping due to a faulty switch-side SFP

Category:
fabric-interconnect-and-management-switches
Specialty:
san

Environment

  • All Brocade switch hardware platforms
  • All Brocade Fabric Operating System (FOS) firmware levels
  • Back-end MCC switches

Issue

  • switchshow reports the switch port status as "Online":

/fabos/bin/switchshow :
Index Slot Port Address Media  Speed     State   Proto
============================================================
75   9   11   0f4b40   id   N32   Online    FC  F-Port  10:00:94:40:c9:cf:4a:b1

  • porterrshow reports C3 discards, C3 timeout TX errors, link failures, loss of sync, and uncorrectable (uncor) errors:

porterrshow 9/11
       frames        enc    crc    crc    too    too    bad    enc   disc   link   loss   loss   frjt   fbsy     c3timeout    pcs  uncor
        tx     rx     in    err  g_eof   shrt   long    eof    out     c3   fail   sync    sig                   tx     rx    err    err
75: 204.0m 293.1m      0      0      0      0      0      0      0    540    447    284      1      0      0   5400      0      0     96
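When many ports need checking, the non-zero error counters can be pulled out of a porterrshow data row with a short script. The following is a minimal sketch, not a NetApp or Brocade tool. The column names in `COLUMNS` are labels made up here from the two-row header above, and the decimal meaning of the `k`/`m`/`g` suffixes is an assumption.

```python
# Hypothetical column labels derived from the two-row porterrshow header.
COLUMNS = [
    "frames_tx", "frames_rx", "enc_in", "crc_err", "crc_g_eof",
    "too_shrt", "too_long", "bad_eof", "enc_out", "disc_c3",
    "link_fail", "loss_sync", "loss_sig", "frjt", "fbsy",
    "c3timeout_tx", "c3timeout_rx", "pcs_err", "uncor_err",
]

# Assumption: suffixes are decimal multipliers (k = 10^3, m = 10^6, g = 10^9).
SUFFIXES = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}


def parse_count(token):
    """Convert a counter such as '204.0m' or '540' to an integer."""
    suffix = token[-1].lower()
    if suffix in SUFFIXES:
        return int(round(float(token[:-1]) * SUFFIXES[suffix]))
    return int(token)


def parse_porterrshow_row(line):
    """Return (port_index, {column: value}) for one porterrshow data row."""
    index, _, rest = line.partition(":")
    values = [parse_count(tok) for tok in rest.split()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} counters, got {len(values)}")
    return int(index), dict(zip(COLUMNS, values))


def nonzero_errors(counters):
    """Return the error counters above zero, ignoring the frame totals."""
    return {name: value for name, value in counters.items()
            if not name.startswith("frames_") and value > 0}


row = "75: 204.0m 293.1m 0 0 0 0 0 0 0 540 447 284 1 0 0 5400 0 0 96"
port, counters = parse_porterrshow_row(row)
print(port, nonzero_errors(counters))
```

For the row above, this leaves only disc c3, link fail, loss sync, loss sig, c3timeout tx, and uncor err, which are the counters named in the symptom.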

  • switchshow output shows the port in the "In_Sync" state:

Index Slot Port Address Media Speed State Proto 
============================================================ 
74 9 10 674a40 id N16 In_Sync FC

=============
Port  4:
=============
Length 62.5u:   0   (units 10 meters)
Length Cu:   0   (units 1 meter)
Vendor Name: BROCADE
Vendor OUI:  00:05:1e
Vendor PN:   57-0000088-01
Vendor Rev:  A
Wavelength:  850  (units nm)
Options:    003a Loss_of_Sig,Tx_Fault,Tx_Disable
BR Max:    0
BR Min:    0
Serial No:   HAF618230000T4B
RX Power:   -3.0   dBm (501.1uW)
TX Power:   -7.1   dBm (195.8 uW)

 

  • Link resets (LR_OUT) are reported in the fabriclog output, along with port offline and online events:

Switch 0; Fri Nov 11 11:23:12 2022 IST (GMT+5:30)
22:52:28.290358 SCN LR_PORT(0);g=0x19ee LR_OUT        A2,P0  A2,P0  75   NA   
23:08:39.902472 SCN LR_PORT(0);g=0x19ee LR_OUT        A2,P0  A2,P0  75   NA   
23:17:38.738930 SCN LR_PORT(0);g=0x19ee LR_OUT        A2,P0  A2,P0  75   NA   
23:22:26.529633 SCN LR_PORT(0);g=0x19ee LR_OUT        A2,P0  A2,P0  75   NA   
23:24:29.226184 SCN LR_PORT(0);g=0x19ee LR_OUT        A2,P0  A2,P0  75   NA   
23:25:11.419546 SCN LR_PORT(0);g=0x19ee LR_OUT        A2,P0  A2,P0  75   NA   
23:25:53.721693 SCN LR_PORT(0);g=0x19ee LR_OUT        A2,P0  A2,P0  75   NA

15:43:33.967361 SCN Port Offline;rsn=0x2,g=0x2e50       A2,P0  A2,P0  78   NA   
15:43:33.967370 *Removing all nodes from port         A2,P0  A2,P0  78   NA   
15:43:34.615134 SCN LR_PORT(0);g=0x2e50            A2,P0  A2,P0  78   NA   
15:43:34.694264 SCN Port Online; g=0x2e50,isolated=0     A2,P0  A2,P1  78   NA   
15:43:34.695063 Port Elp engaged               A2,P1  A2,P0  78   NA   
15:43:34.695079 *Removing all nodes from port         A2,P0  A2,P0  78   NA   
15:43:34.695304 SCN Port F_PORT                A2,P1  A2,P0  78   NA   
15:51:04.900869 SCN Port Offline;rsn=0x4,g=0x2e52       A2,P0  A2,P0  78   NA   
15:51:04.900878 *Removing all nodes from port         A2,P0  A2,P0  78   NA   
15:51:04.913521 SCN LR_PORT(0);g=0x2e52            A2,P0  A2,P0  78   NA   
15:51:04.986758 SCN Port Online; g=0x2e52,isolated=0     A2,P0  A2,P1  78   NA   
15:51:04.986848 Port Elp engaged               A2,P1  A2,P0  78   NA   
15:51:04.986862 *Removing all nodes from port         A2,P0  A2,P0  78   NA   
15:51:04.987210 SCN Port F_PORT                A2,P1  A2,P0  78   NA  
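To gauge how often a port flaps, the fabriclog entries can be tallied per port and each offline event paired with the next online event. The sketch below assumes the column layout seen in the sample lines (time, event text, two state columns, port number, trailing `NA`). The function names are illustrative only. Because the entries carry no date, pairs that span midnight are not handled.

```python
from collections import Counter
from datetime import datetime

SAMPLE = """\
22:52:28.290358 SCN LR_PORT(0);g=0x19ee LR_OUT        A2,P0  A2,P0  75   NA
23:08:39.902472 SCN LR_PORT(0);g=0x19ee LR_OUT        A2,P0  A2,P0  75   NA
15:43:33.967361 SCN Port Offline;rsn=0x2,g=0x2e50       A2,P0  A2,P0  78   NA
15:43:34.694264 SCN Port Online; g=0x2e50,isolated=0     A2,P0  A2,P1  78   NA
15:51:04.900869 SCN Port Offline;rsn=0x4,g=0x2e52       A2,P0  A2,P0  78   NA
15:51:04.986758 SCN Port Online; g=0x2e52,isolated=0     A2,P0  A2,P1  78   NA
"""


def parse_fabriclog_line(line):
    """Split one fabriclog line into (time, event text, port number)."""
    tokens = line.split()
    # Assumed layout: time, event..., state, state, port, "NA".
    when = datetime.strptime(tokens[0], "%H:%M:%S.%f")
    port = int(tokens[-2])
    event = " ".join(tokens[1:-4])
    return when, event, port


def classify(event):
    """Map the event text to a short label, or None if not of interest."""
    if "LR_OUT" in event:
        return "LR_OUT"
    if "Port Offline" in event:
        return "offline"
    if "Port Online" in event:
        return "online"
    return None


def summarize(text):
    """Count events per (port, kind) and list offline-to-online durations."""
    counts = Counter()
    outages = []          # (port, seconds offline)
    went_offline = {}     # port -> time of the last offline event
    for line in text.splitlines():
        if not line.strip():
            continue
        when, event, port = parse_fabriclog_line(line)
        kind = classify(event)
        if kind is None:
            continue
        counts[(port, kind)] += 1
        if kind == "offline":
            went_offline[port] = when
        elif kind == "online" and port in went_offline:
            delta = when - went_offline.pop(port)
            outages.append((port, delta.total_seconds()))
    return counts, outages


counts, outages = summarize(SAMPLE)
for (port, kind), n in sorted(counts.items()):
    print(f"port {port}: {kind} x{n}")
for port, seconds in outages:
    print(f"port {port}: offline for {seconds:.3f} s")
```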

Slot  9/Port 11:
=============
RX Power:   -2.4   dBm (573.4uW)
TX Power:   -1.0   dBm (795.1 uW)
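The SFP output reports optical power both in dBm and in microwatts. The two are related by dBm = 10 · log10(P / 1 mW). The sketch below checks that relationship against the four readings shown in this article; the helper names are illustrative. It makes no statement about acceptable power thresholds, which depend on the SFP model.

```python
import math


def uw_to_dbm(power_uw):
    """Convert optical power in microwatts to dBm (decibels relative to 1 mW)."""
    return 10 * math.log10(power_uw / 1000.0)


def dbm_to_uw(power_dbm):
    """Convert optical power in dBm back to microwatts."""
    return 1000.0 * 10 ** (power_dbm / 10)


# Microwatt readings copied from the SFP output in this article.
readings = {
    "Port 4 RX": 501.1,
    "Port 4 TX": 195.8,
    "Slot 9/Port 11 RX": 573.4,
    "Slot 9/Port 11 TX": 795.1,
}

for name, microwatts in readings.items():
    print(f"{name}: {microwatts} uW = {uw_to_dbm(microwatts):.1f} dBm")
```

Rounded to one decimal place, the results match the dBm values printed next to each microwatt reading (-3.0, -7.1, -2.4, and -1.0 dBm).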

  • The errdump log shows MAPS alerts for the rules defALL_TARGET_PORTSSTATE_CHG_3, defRD_1stDATA_TIME_11000, and defRD_STATUS_TIME_12000, indicating that the port state is changing more than three times per minute:

2023/01/28-16:03:41, [MAPS-1003], 187624, SLOT 1 | FID 128, WARNING, Switch_Name, slot9 port14, F-Port 9/14, Condition=ALL_TARGET_PORTS(STATE_CHG/min>3), Current Value:[STATE_CHG, 4], RuleName=defALL_TARGET_PORTSSTATE_CHG_3, Dashboard Category=Port Health.
2023/01/28-16:04:05, [MAPS-1003], 187625, SLOT 1 | FID 128, WARNING, Switch_Name, slot9 port14, F-Port 9/14, Condition=ALL_OTHER_F_PORTS(STATE_CHG/min>5), Current Value:[STATE_CHG, 6], RuleName=defALL_OTHER_F_PORTSSTATE_CHG_5, Dashboard Category=Port Healt
2023/01/28-16:04:53, [MAPS-1003], 187626, SLOT 1 | FID 128, WARNING, Switch_Name, slot9 port14, F-Port 9/14, Condition=ALL_TARGET_PORTS(STATE_CHG/min>3), Current Value:[STATE_CHG, 4], RuleName=defALL_TARGET_PORTSSTATE_CHG_3, Dashboard Category=Port Health.
2023/04/24-19:37:30 (IST), [MAPS-1003], 193109, SLOT 2 | FID 128, WARNING, Switch_Name, Flow (SID=0x665641,DID=0x663440,Host Port=10/6), Condition=sys_flow_monitor_scsi(RD_1stDATA_TIME/10SEC>11000), Current Value:[RD_1stDATA_TIME, 12952 Microseconds], RuleName=defRD_1stDATA_TIME_11000, Dashboard Category=IO Latency, Quiet Time=10 min.

2023/04/24-19:37:30 (IST), [MAPS-1003], 193110, SLOT 2 | FID 128, WARNING, Switch_Name, Flow (SID=0x665641,DID=0x663440,Host Port=10/6), Condition=sys_flow_monitor_scsi(RD_STATUS_TIME/10SEC>12000), Current Value:[RD_STATUS_TIME, 12953 Microseconds], RuleName=defRD_STATUS_TIME_12000, Dashboard Category=IO Latency, Quiet Time=10 min.
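On a busy switch the errdump log can be long, so it may help to extract the rule name and current value from each MAPS-1003 entry and count how often each rule fired. The following is a minimal sketch using a regular expression inferred from the sample lines above. It is not an official parser, and other FOS releases may format these messages differently.

```python
import re
from collections import Counter

# Pattern inferred from the sample MAPS-1003 lines; field order is an assumption.
MAPS_RE = re.compile(
    r"\[MAPS-1003\].*?"
    r"Condition=(?P<condition>[^,]+\)), "
    r"Current Value:\[(?P<metric>\w+), (?P<value>\d+)[^\]]*\], "
    r"RuleName=(?P<rule>\w+)"
)

SAMPLE = """\
2023/01/28-16:03:41, [MAPS-1003], 187624, SLOT 1 | FID 128, WARNING, Switch_Name, slot9 port14, F-Port 9/14, Condition=ALL_TARGET_PORTS(STATE_CHG/min>3), Current Value:[STATE_CHG, 4], RuleName=defALL_TARGET_PORTSSTATE_CHG_3, Dashboard Category=Port Health.
2023/01/28-16:04:53, [MAPS-1003], 187626, SLOT 1 | FID 128, WARNING, Switch_Name, slot9 port14, F-Port 9/14, Condition=ALL_TARGET_PORTS(STATE_CHG/min>3), Current Value:[STATE_CHG, 4], RuleName=defALL_TARGET_PORTSSTATE_CHG_3, Dashboard Category=Port Health.
2023/04/24-19:37:30 (IST), [MAPS-1003], 193110, SLOT 2 | FID 128, WARNING, Switch_Name, Flow (SID=0x665641,DID=0x663440,Host Port=10/6), Condition=sys_flow_monitor_scsi(RD_STATUS_TIME/10SEC>12000), Current Value:[RD_STATUS_TIME, 12953 Microseconds], RuleName=defRD_STATUS_TIME_12000, Dashboard Category=IO Latency, Quiet Time=10 min.
"""


def parse_maps_alerts(text):
    """Return a list of dicts with condition, metric, value, and rule."""
    alerts = []
    for line in text.splitlines():
        match = MAPS_RE.search(line)
        if match:
            alert = match.groupdict()
            alert["value"] = int(alert["value"])
            alerts.append(alert)
    return alerts


alerts = parse_maps_alerts(SAMPLE)
rule_counts = Counter(alert["rule"] for alert in alerts)
for rule, n in rule_counts.most_common():
    print(f"{rule}: {n} alert(s)")
```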

  • Frame timeout events are reported in the switch errdump log, identifying the port that received the frame (rx) and the port that could not transmit it (tx), along with link reset events:

2024/02/25-02:26:06 (GMT), [AN-1014], 588, FID 128, INFO, switch, Frame timeout detected, tx port 4 rx port 20, sid 500xx, did 704zz, timestamp 2024-02-25 02:26:06 .
2024/02/25-02:26:07 (GMT), [AN-1014], 608, FID 128, INFO, switch, Frame timeout detected, tx port 4 rx port 0, sid 700xx, did 704zz, timestamp 2024-02-25 02:26:07 .

2024/02/25-02:26:08 (GMT), [C4-1014], 628, CHASSIS | PORT 0/4, WARNING, switch,  Link Reset on Port S0,P4(22) vc_no=0 crd(s)lost=6 auto trigger.
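One way to narrow down the affected port is to count which tx port appears in the frame timeout entries and check whether the same port also logged a link reset. The sketch below does this for the sample lines above. The regular expressions are inferred from those lines only and are an assumption about the message format, not a documented grammar.

```python
import re
from collections import Counter

# Patterns inferred from the sample AN-1014 and C4-1014 lines.
TIMEOUT_RE = re.compile(
    r"\[AN-1014\].*Frame timeout detected, tx port (\d+) rx port (\d+)"
)
LINK_RESET_RE = re.compile(
    r"\[C4-1014\].*Link Reset on Port S(\d+),P(\d+)\(\d+\).*crd\(s\)lost=(\d+)"
)

SAMPLE = """\
2024/02/25-02:26:06 (GMT), [AN-1014], 588, FID 128, INFO, switch, Frame timeout detected, tx port 4 rx port 20, sid 500xx, did 704zz, timestamp 2024-02-25 02:26:06 .
2024/02/25-02:26:07 (GMT), [AN-1014], 608, FID 128, INFO, switch, Frame timeout detected, tx port 4 rx port 0, sid 700xx, did 704zz, timestamp 2024-02-25 02:26:07 .
2024/02/25-02:26:08 (GMT), [C4-1014], 628, CHASSIS | PORT 0/4, WARNING, switch,  Link Reset on Port S0,P4(22) vc_no=0 crd(s)lost=6 auto trigger.
"""


def analyze(text):
    """Return (timeouts per tx port, list of (slot, port, credits lost))."""
    tx_timeouts = Counter()
    link_resets = []
    for line in text.splitlines():
        timeout = TIMEOUT_RE.search(line)
        if timeout:
            tx_timeouts[int(timeout.group(1))] += 1
            continue
        reset = LINK_RESET_RE.search(line)
        if reset:
            slot, port, lost = (int(g) for g in reset.groups())
            link_resets.append((slot, port, lost))
    return tx_timeouts, link_resets


tx_timeouts, link_resets = analyze(SAMPLE)
reset_ports = {port for _slot, port, _lost in link_resets}
# Ports that both failed to transmit frames and went through a link reset.
suspects = sorted(set(tx_timeouts) & reset_ports)
print("tx timeouts per port:", dict(tx_timeouts))
print("link resets:", link_resets)
print("ports with both symptoms:", suspects)
```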

 

 
