H410S node goes offline due to a machine check exception
Environment
- H410S
- H300S
Issue
An H300S or H410S node went offline due to a machine check exception.
Messages in the kernel ring buffer:
[110011069.865671] [UFW BLOCK] IN=Bond1G OUT= MAC=01:00:5e:00:00:01:c0:c5:20:57:a7:e8:08:00 SRC=0.0.0.0 DST=224.0.0.1 LEN=32 TOS=0x00 PREC=0xC0 TTL=1 ID=0 DF PROTO=2
[110011099.825417] [UFW BLOCK] IN=Bond1G OUT= MAC=01:00:5e:00:00:01:c0:c5:20:57:8d:e8:08:00 SRC=0.0.0.0 DST=224.0.0.1 LEN=32 TOS=0x00 PREC=0xC0 TTL=1 ID=0 DF PROTO=2
[110011108.333942] Disabling lock debugging due to kernel taint
[110011108.334012] mce: [Hardware Error]: CPU 8: Machine Check Exception: 5 Bank 12: fe01ac48001000c3
[110011108.342961] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8576b355>
[110011108.342971] {intel_idle+0x95/0x110}
[110011108.353756] mce: [Hardware Error]: TSC 334c26320e569ea ADDR 1f38384000 MISC 900331674fd908c
[110011108.362531] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1693647563 SOCKET 1 APIC 10 microcode b000021
[110011108.372251] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[110011108.381288] mce: [Hardware Error]: Machine check: Processor context corrupt
[110011108.388589] Kernel panic - not syncing: Fatal machine check
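
The status word on the "Machine Check Exception" line can be decoded by hand to cross-check the kernel's summary. The following is a minimal sketch, not a replacement for `mcelog --ascii`: it assumes the value after "Machine Check Exception:" is IA32_MCG_STATUS in hexadecimal and the value after "Bank N:" is the 64-bit IA32_MCi_STATUS register, and it decodes only the architectural flag bits defined in the Intel SDM. The function name `decode_mce_line` and the dictionary keys are hypothetical, chosen for illustration.

```python
import re

# Matches the first line the kernel prints for a machine check, e.g.
# "CPU 8: Machine Check Exception: 5 Bank 12: fe01ac48001000c3".
MCE_LINE = re.compile(
    r"CPU (\d+): Machine Check(?: Exception)?: ([0-9a-fA-F]+) "
    r"Bank (\d+): ([0-9a-fA-F]{16})"
)

# Architectural flag bits (Intel SDM, Machine-Check Architecture).
MCG_STATUS_BITS = {"RIPV": 0, "EIPV": 1, "MCIP": 2}
MCI_STATUS_BITS = {
    "VAL": 63,    # register contains a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # uncorrected error
    "EN": 60,     # error reporting was enabled
    "MISCV": 59,  # MISC register is valid
    "ADDRV": 58,  # ADDR register is valid
    "PCC": 57,    # processor context corrupt
}


def decode_mce_line(line):
    """Decode one 'Machine Check Exception' log line (hypothetical helper).

    Returns None if the line does not match the assumed format.
    """
    match = MCE_LINE.search(line)
    if match is None:
        return None
    mcg_status = int(match.group(2), 16)
    status = int(match.group(4), 16)
    return {
        "cpu": int(match.group(1)),
        "bank": int(match.group(3)),
        "mcg_flags": [n for n, b in MCG_STATUS_BITS.items() if (mcg_status >> b) & 1],
        "status_flags": [n for n, b in MCI_STATUS_BITS.items() if (status >> b) & 1],
        "mca_error_code": status & 0xFFFF,              # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


if __name__ == "__main__":
    sample = ("[110011108.334012] mce: [Hardware Error]: CPU 8: "
              "Machine Check Exception: 5 Bank 12: fe01ac48001000c3")
    print(decode_mce_line(sample))
```

For the log above, the PCC flag is set, which is consistent with the kernel's "Processor context corrupt" message, and EIPV is clear, which is consistent with "RIP !INEXACT!". The meaning of the model-specific code depends on the CPU model and is not interpreted here.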