
Node went offline due to an NVRAM card failure with "hung_task" and "hung_task_timeout_secs"

Category:
element-software 2009558150
Specialty:
solidfire

Environment

SolidFire AFA: SF19210

Issue

  • The node goes offline, and shortly before that the following entry appears in sf-master.info:

2023-04-29T18:44:47.632229Z SFALPSF08 master-1[26751]: [APP-5] [Leader] 28567 CMIscsiConnectMo serviceshared/LeaderCoordinator.cpp:618:OnClusterMasterConnectCallback|Full vote, based on connection states shouldVote=1 stateVote=1 sequenceNumber=143 nodesWithWorkingEAContainers={57,72,86,126,154,155,185,199}
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@

  • dmesg -T shows a "hung_task_timeout_secs" message for nvme0n1:
crash> dmesg -T
[Sat Apr 29 18:49:04 UTC 2023] INFO: task jbd2/nvme0n1-8:26613 blocked for more than 120 seconds.
[Sat Apr 29 18:49:04 UTC 2023] Tainted: G O 4.19.37-solidfire8 #1
[Sat Apr 29 18:49:04 UTC 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  • Multiple core dumps were generated after the crash:

  -rw-rw-rw- 1 dexterap engr 76763717096 Apr 29 12:20 dump.202304291845
  -rw-rw-rw- 1 dexterap engr   776107259 Apr 29 12:32 dump.202304291928

  • The core files show multiple kernel panics involving the NVRAM card "nvme0n1":
      KERNEL: /sf_debug/12.3.2.3/lib64/modules/4.19.37-solidfire8/vmlinux-ember-x86_64-4.19.37-solidfire8
    DUMPFILE: dump.202304291845 [PARTIAL DUMP]
        CPUS: 56
        DATE: Sat Apr 29 18:45:09 UTC 2023
      UPTIME: 380 days, 21:16:56
LOAD AVERAGE: 3.68, 3.95, 4.22
       TASKS: 3273
    NODENAME: QALPOGSF08
     RELEASE: 4.19.37-solidfire8
     VERSION: #1 SMP Mon Aug 17 14:34:57 UTC 2020
     MACHINE: x86_64 (2600 Mhz)
      MEMORY: 383.9 GB
       PANIC: "Kernel panic - not syncing: hung_task: blocked tasks"
         PID: 299
     COMMAND: "khungtaskd"
        TASK: ffff8f9c77b71d80 [THREAD_INFO: ffff8f9c77b71d80]
         CPU: 22
       STATE: TASK_RUNNING (PANIC)

[32908851.679379] INFO: task jbd2/nvme0n1-8:26613 blocked for more than 120 seconds.
[32908852.259911] Kernel panic - not syncing: hung_task: blocked tasks
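
When checking saved dmesg output for the symptoms listed above, a small script can help pick out the hung-task warning and the subsequent panic. The following is only a minimal sketch in Python, not a NetApp tool: the function names find_hung_tasks and hung_task_panicked are hypothetical, and the patterns are taken from the message text quoted in this article, so other kernel versions may word these messages differently.

```python
import re

# Pattern based on the hung-task warning quoted in this article.
# It works for both `dmesg` (seconds since boot) and `dmesg -T` output,
# because it ignores whatever timestamp prefix precedes "INFO:".
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<task>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

# Panic string as it appears in the crash output quoted in this article.
PANIC_TEXT = "Kernel panic - not syncing: hung_task: blocked tasks"


def find_hung_tasks(dmesg_text):
    """Return a list of (task_name, pid, seconds) for each hung-task warning."""
    hits = []
    for line in dmesg_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            hits.append(
                (match.group("task"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits


def hung_task_panicked(dmesg_text):
    """Return True if the hung-task panic message is present."""
    return PANIC_TEXT in dmesg_text


# Sample input: the two kernel messages quoted in this article.
sample = """\
[32908851.679379] INFO: task jbd2/nvme0n1-8:26613 blocked for more than 120 seconds.
[32908852.259911] Kernel panic - not syncing: hung_task: blocked tasks
"""

for task, pid, secs in find_hung_tasks(sample):
    print(f"task={task} pid={pid} blocked_for>{secs}s")
print("panic:", hung_task_panicked(sample))
```

A task name that contains the device name, such as jbd2/nvme0n1-8 here, points at the journal thread for a filesystem on that device, which is what ties the hang to nvme0n1 in this case.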

 

 
