
LIFs unexpectedly down on RoCE ports

Category:
ontap-9
Specialty:
CORE

Environment

  • ONTAP 9.13.1 and later
  • NFS over RDMA / RoCE
  • Mellanox / NVIDIA CX5 / CX6 / CX6-LX 10/25GbE or 40/100GbE NICs

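Issue

  • When a single RoCE-capable port already has more than 127 NFS data LIFs configured:
    • LIF failover or migration can leave the LIF operationally down without reporting an error
    • LIF creation succeeds, but the LIF is operationally down and an error is logged in vifmgr.log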
clustershell::> network interface create -vserver vs0 -lif vs0_test -service-policy default-data-files -address 10.75.140.127 -netmask 255.255.255.0 -home-node node-02 -home-port e4a

Info: LIF "vs0_test" on Vserver "vs0" was created successfully but could not be successfully configured on either its home port or any of its failover targets. The LIF's operational status will be reported as "down" until one or more failover targets becomes available. Use the "network interface show -vserver vs0 -lif vs0_test -failover" command to review the LIF's current failover configuration.
  • Errors in vifmgr.log

Example:

(03/26/2024 16:41:03): > [Net::LifStackAdapter::installLif] vserverId=3, lifId=1278, address=10.95.86.122, portName=e3a, lifProtocols=0x1
(03/26/2024 16:41:03): > [SkStackMgr::addLif] PARAM lifId 1278, portName e3a, address 10.95.86.122, ipspaceId 4294967295, vserverId 3, lifUuid 98cd9a48-ea28-11ee-ad09-d039eaa9ecf3, isMccRequest false, lifProtocols 0x001, serviceMask 0x000000013D000804, homeNode perfqa-vino-03
(03/26/2024 16:41:03): > [NbladeWriter::addLif] PARAM: lifId: 1278, address 10.95.86.122, netmask 255.255.255.0, ipspaceId: 4294967295, vserverId: 3, portName: e3a, isMccRequest: false, protocolMask: 00000001, serviceMask: 0x000000013D000804, homeNode^I: perfqa-vino-03(ccdcca33-ea25-11ee-ad09-d039eaa9ecf3)
(03/26/2024 16:41:03): > [NbladeWriter::nitroPcpRpcCall] procNum=3, isIdemp=false
(03/26/2024 16:41:03): > [DelayTracker::add_sample] ENTRY: object=nblade, delay_ms=53
(03/26/2024 16:41:03): < [DelayTracker::add_sample] EXIT: object=nblade, state=NORMAL
(03/26/2024 16:41:03): < [NbladeWriter::nitroPcpRpcCall] elapsed time: 0s)
(03/26/2024 16:41:03): [NbladeWriter::ScopedNitroRequest::sendRequest] RPC for procedure 3 completed, but returned error: NbladeWriter Error type unknown: 12046
(03/26/2024 16:41:03): < [NbladeWriter::ScopedNitroRequest::sendRequest]
(03/26/2024 16:41:03): < [NbladeWriter::addLif] retval: NbladeWriter Error type unknown: 12046
(03/26/2024 16:41:03): [SkStackMgr::addLif] Unexpected error adding the LIF to the stack: NbladeWriter Error type unknown: 12046
(03/26/2024 16:41:03): < [SkStackMgr::addLif] complete, returning Unexpected error "NbladeWriter Error type unknown: 12046" encountered as a result of adding the LIF.
(03/26/2024 16:41:03): [Net::LifStackAdapter::installLif] Failed to add the requested LIF: Unexpected error "NbladeWriter Error type unknown: 12046" encountered as a result of adding the LIF.
(03/26/2024 16:41:03): [Net::AbortableHandle::commit] Caught an unexpected exception: Unexpected error "NbladeWriter Error type unknown: 12046" encountered as a result of adding the LIF.
(03/26/2024 16:41:03): ERR{ commit() at src/framework/objects/base/AbortableHandle.cc:65 }
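When vifmgr.log is large, it can help to pull out just the LIF installs that failed. The following is a minimal Python sketch, not a NetApp tool. It assumes every log entry begins with a "(MM/DD/YYYY HH:MM:SS):" timestamp and that the installLif request and failure messages have the wording shown in the example above. The function names split_entries and failed_lif_installs are hypothetical.

```python
import re

# Assumption: each entry starts with a "(MM/DD/YYYY HH:MM:SS):" timestamp,
# as in the excerpt above. Several entries may share one physical line.
ENTRY_START = re.compile(r"\(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\):")

# Matches the installLif request entry and captures its parameters.
INSTALL_LIF = re.compile(
    r"\[Net::LifStackAdapter::installLif\] vserverId=(\d+), lifId=(\d+), "
    r"address=([^,\s]+), portName=(\w+)"
)

# Wording of the failure entry as it appears in the excerpt above.
INSTALL_FAILED = "[Net::LifStackAdapter::installLif] Failed to add the requested LIF"


def split_entries(text):
    """Split raw log text into individual entries, one per timestamp."""
    starts = [m.start() for m in ENTRY_START.finditer(text)]
    bounds = starts + [len(text)]
    return [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def failed_lif_installs(text):
    """Return (lif_id, port, address) for installLif requests that failed."""
    failures = []
    pending = None  # most recent installLif request seen, if any
    for entry in split_entries(text):
        match = INSTALL_LIF.search(entry)
        if match:
            _vserver_id, lif_id, address, port = match.groups()
            pending = (int(lif_id), port, address)
        elif INSTALL_FAILED in entry and pending is not None:
            failures.append(pending)
            pending = None
    return failures
```

Passing the text of vifmgr.log to failed_lif_installs returns one tuple per failed install, which can then be compared against the ports that are suspected of being over the LIF limit.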
  • The port is on a NIC with RoCE offload capability (such as Mellanox / NVIDIA CX5/CX6/CX6-LX)

Example:

::> network port show -node node-02 -fields rdma-protocols
node     port rdma-protocols
-------- ---- --------------
node-02  e0M  -
node-02  e1a  roce
node-02  e1b  roce
node-02  e3a  roce
node-02  e3b  roce
node-02  e3c  roce
node-02  e3d  roce
7 entries were displayed.
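To narrow a long port listing down to the RoCE-capable ports, the command output can be filtered with a short script. This is a minimal Python sketch. It assumes the output of network port show -fields rdma-protocols has been captured as text with one row per line and three whitespace-separated columns (node, port, rdma-protocols), as in the example above. The function name roce_ports is hypothetical.

```python
def roce_ports(output):
    """Return (node, port) pairs whose rdma-protocols column lists 'roce'."""
    ports = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly three columns; the prompt line and the
        # "N entries were displayed." footer do not.
        if len(fields) != 3:
            continue
        if fields == ["node", "port", "rdma-protocols"]:
            continue  # header row
        if all(set(field) == {"-"} for field in fields):
            continue  # dashed separator row
        node, port, protocols = fields
        if "roce" in protocols.lower().split(","):
            ports.append((node, port))
    return ports
```

For the example output above, the result contains the six e1x and e3x ports and omits e0M, whose rdma-protocols value is "-".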
  • RDMA is enabled on the NFS server (the default in ONTAP 9.10.1 and later)

Note: To verify, run vserver nfs show -fields rdma
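To check whether a port has crossed the LIF count described in this article, the LIFs can be counted per current node and port. This is a minimal Python sketch. It assumes the output of network interface show -fields curr-node,curr-port has been captured as text with four whitespace-separated columns (vserver, lif, curr-node, curr-port), and that the listing has already been limited to the NFS data LIFs of interest. The threshold of 127 is the figure quoted in this article. The function names are hypothetical.

```python
from collections import Counter

LIF_THRESHOLD = 127  # figure quoted in the Issue section of this article


def lifs_per_port(output):
    """Count LIFs per (curr-node, curr-port) from captured CLI output."""
    counts = Counter()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # prompt line, blank line, or wrapped text
        if fields == ["vserver", "lif", "curr-node", "curr-port"]:
            continue  # header row
        if all(set(field) == {"-"} for field in fields):
            continue  # dashed separator row
        if fields[-1] == "displayed.":
            continue  # "N entries were displayed." footer
        _vserver, _lif, node, port = fields
        counts[(node, port)] += 1
    return counts


def ports_over_threshold(output, threshold=LIF_THRESHOLD):
    """Return {(node, port): count} for ports hosting more than threshold LIFs."""
    return {
        key: count
        for key, count in lifs_per_port(output).items()
        if count > threshold
    }
```

Ports returned by ports_over_threshold that are also reported as roce by network port show match the conditions listed in this section.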

 

 


NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.