
StorageGRID storage node stalls at the base OS upgrade because of a hardware failure

Environment

All StorageGRID appliances

Issue

  • During a StorageGRID upgrade, a storage node stalls at the Upgrading BaseOS step.
  • Connect to the node over SSH and confirm that it is in BaseOS.
    • A green root@SG prompt means the node is in BaseOS.
  • /var/log/syslog reports the following:
    •  dockerd[3299]: Error starting daemon: Devices cgroup isn't mounted
    • kernel: [  482.217615] CPU: 15 PID: 0 Comm: swapper/15 Kdump: loaded Tainted: G       OE    4.19.0-18-amd64 #1 Debian 4.19.208-1+ntapB
      kernel: [  482.217616] Hardware name: Default string Default string/Default string, BIOS 0.12.0 01/16/2017
      kernel: [  482.217616] Call Trace:
      kernel: [  482.217619]  <IRQ>
      kernel: [  482.217627]  dump_stack+0x66/0x81
      kernel: [  482.217630]  nmi_cpu_backtrace.cold.4+0x13/0x50
      kernel: [  482.217635]  ? lapic_can_unplug_cpu+0x80/0x80
      kernel: [  482.217640]  nmi_trigger_cpumask_backtrace+0xf9/0x100
      kernel: [  482.217645]  __handle_sysrq.cold.9+0x45/0xf2
      kernel: [  482.217650]  fpgaIsr.cold.5+0xbb/0x168 [fpga_pci]
      kernel: [  482.217655]  __handle_irq_event_percpu+0x46/0x190
      kernel: [  482.217657]  handle_irq_event_percpu+0x30/0x80
      kernel: [  482.217659]  handle_irq_event+0x3c/0x60
      kernel: [  482.217661]  handle_edge_irq+0x97/0x1e0
      kernel: [  482.217665]  handle_irq+0x1f/0x30
      kernel: [  482.217667]  do_IRQ+0x49/0xe0
      kernel: [  482.217671]  common_interrupt+0xf/0xf
      kernel: [  482.217672]  </IRQ>
      kernel: [  482.217677] RIP: 0010:cpuidle_enter_state+0xb9/0x320
      kernel: [  482.217678] Code: e8 5c 5e b2 ff 80 7c 24 0b 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 3b 02 00 00 31 ff e8 6e ea b7 ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff ff f3 01 00 00 48 2b 1c 24 ba ff ff ff 7f 48 39 c3
      kernel: [  482.217679] RSP: 0018:ffffaafc4635be90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffde
      kernel: [  482.217681] RAX: ffff94e6fede2140 RBX: 0000007045edefd5 RCX: 000000000000001f
      kernel: [  482.217682] RDX: 0000007045edefd5 RSI: 0000000040000431 RDI: 0000000000000000
      kernel: [  482.217683] RBP: ffff94e6fedea628 R08: 0000000000000004 R09: 0000000000021a00
      kernel: [  482.217684] R10: 00000cba5b622362 R11: ffff94e6fede1128 R12: 0000000000000004
      kernel: [  482.217684] R13: ffffffff95eb7238 R14: 0000000000000004 R15: 0000000000000000
      kernel: [  482.217691]  do_idle+0x228/0x270
      kernel: [  482.217694]  cpu_startup_entry+0x6f/0x80
      kernel: [  482.217696]  start_secondary+0x1a4/0x200
      kernel: [  482.217699]  secondary_startup_64+0xa4/0xb0

  • /var/log/kern.log reports the following:

localhost kernel: [21180973.168607] blk_update_request: I/O error, dev sda, sector 31254528 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
localhost kernel: [21180973.196905] blk_update_request: I/O error, dev sda, sector 31254400 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0

  • The logs under base-os-logs/run/mount-tmp/pge-actv-root/var/log report the following during reboot:

StorageGRID-PGE root: [2025-02-05 03:17:39+00:00 SGA] mount: /efi: wrong fs type, bad option, bad superblock on /dev/sda4, missing codepage or helper program, or other error.
StorageGRID-PGE root: [2025-02-05 03:18:02+00:00 SGA] mount: /efi: wrong fs type, bad option, bad superblock on /dev/sda4, missing codepage or helper program, or other error.

  • /var/log/messages reports the following:

    • kernel: [21954396.100230] sd 0:0:0:0: [sda] tag#17 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
      kernel: [21954396.109232] sd 0:0:0:0: [sda] tag#17 CDB: Read(10) 28 00 01 dc e7 80 00 00 08 00
      kernel: [21954396.123110] sd 0:0:0:0: [sda] tag#18 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
      kernel: [21954396.132108] sd 0:0:0:0: [sda] tag#18 CDB: Read(10) 28 00 01 dc e7 80 00 00 08 00
      kernel: [21954396.153728] sd 0:0:0:0: [sda] tag#5 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
      kernel: [21954396.162641] sd 0:0:0:0: [sda] tag#5 CDB: Read(10) 28 00 00 ee 78 02 00 00 02 00
      root: [2023-11-25 16:18:25+00:00 PIU] mount: /mnt/pge-inac-part: can't read superblock on /dev/sda2.
      root: [2023-11-25 16:18:25+00:00 PIU] mount /dev/sda2 /mnt/pge-inac-part failed; trying again
      root: [2023-11-25 16:18:26+00:00 PIU] [root@SG:/] >>> mount /dev/sda2 /mnt/pge-inac-part
      root: [2023-11-25 16:18:26+00:00 PIU] mount: /mnt/pge-inac-part: can't read superblock on /dev/sda2.
      root: [2023-11-25 16:18:26+00:00 PIU] mount /dev/sda2 /mnt/pge-inac-part failed; trying again
      root: [2023-11-25 16:18:27+00:00 PIU] [root@SG:/] >>> mount /dev/sda2 /mnt/pge-inac-part
      root: [2023-11-25 16:18:27+00:00 PIU] mount: /mnt/pge-inac-part: can't read superblock on /dev/sda2.
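
The CDB lines in the /var/log/messages excerpt can be decoded to see which blocks the failed reads targeted. The sketch below assumes the standard 10-byte SCSI READ(10) layout (opcode 0x28, a big-endian logical block address in bytes 2 to 5, a big-endian transfer length in bytes 7 and 8). The function name decode_read10 is illustrative and is not part of any StorageGRID tooling.

```python
from typing import Tuple


def decode_read10(cdb_hex: str) -> Tuple[int, int]:
    """Return (lba, block_count) for a SCSI READ(10) CDB given as hex bytes.

    Assumes the standard READ(10) layout; other opcodes are rejected.
    """
    cdb = bytes.fromhex(cdb_hex)  # whitespace between byte pairs is ignored
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")    # bytes 2-5: logical block address
    count = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, count


# CDB strings copied from the log excerpt above.
for cdb in ("28 00 01 dc e7 80 00 00 08 00", "28 00 00 ee 78 02 00 00 02 00"):
    lba, count = decode_read10(cdb)
    print(f"{cdb} -> LBA {lba}, {count} blocks")
```

The first CDB decodes to LBA 31254400 with a length of 8 blocks, and the second to LBA 15628290 with a length of 2 blocks. Assuming 512-byte logical blocks, where the SCSI LBA and the kernel sector number coincide, the first value is the same as sector 31254400 in the kern.log excerpt.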

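When comparing a node's logs against the symptoms listed in this article, a small script can count how often each signature appears. This is a minimal sketch, not a NetApp tool: the regular expressions are taken from the log excerpts quoted here, while the labels, the SIGNATURES table, and the scan function are hypothetical names chosen for illustration.

```python
import re

# Patterns copied from the log excerpts in this article; labels are illustrative.
SIGNATURES = {
    "docker cgroup": re.compile(r"Devices cgroup isn't mounted"),
    "block I/O error": re.compile(r"blk_update_request: I/O error, dev \w+, sector \d+"),
    "SCSI bad target": re.compile(r"hostbyte=DID_BAD_TARGET"),
    "superblock unreadable": re.compile(r"can't read superblock on /dev/\w+"),
    "mount failure": re.compile(r"wrong fs type, bad option, bad superblock on /dev/\w+"),
}


def scan(lines):
    """Count how many lines match each known signature."""
    counts = {}
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] = counts.get(label, 0) + 1
    return counts


# Sample lines taken from the excerpts above (one per signature).
sample = [
    "dockerd[3299]: Error starting daemon: Devices cgroup isn't mounted",
    "localhost kernel: [21180973.168607] blk_update_request: I/O error, dev sda, "
    "sector 31254528 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0",
    "kernel: [21954396.100230] sd 0:0:0:0: [sda] tag#17 FAILED Result: "
    "hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK",
    "root: [2023-11-25 16:18:25+00:00 PIU] mount: /mnt/pge-inac-part: "
    "can't read superblock on /dev/sda2.",
    "StorageGRID-PGE root: [2025-02-05 03:17:39+00:00 SGA] mount: /efi: wrong fs type, "
    "bad option, bad superblock on /dev/sda4, missing codepage or helper program, "
    "or other error.",
]
print(scan(sample))
```

To check real files, the same function can be fed an open log file, for example scan(open("/var/log/kern.log", errors="replace")). A match only shows that the log text resembles the excerpts in this article; it does not by itself confirm a hardware fault.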
 

