StorageGRID appliance reboots unexpectedly after a kernel panic caused by a fatal hardware error
Environment
NetApp StorageGRID appliance
Issue
StorageGRID reports the alert:
Unexpected node reboot
Downloading the log bundle and viewing it shows the following errors:
base-os-logs/run/mount-tmp/pge-actv-root/var/log/storagegrid_crash_dmesg.DATE.log.gz
[2926808.069037] Disabling lock debugging due to kernel taint
[2926808.075153] mce: [Hardware Error]: CPU 39: Machine Check Exception: 5 Bank 1: fb80000000100134
[2926808.084333] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffffb77339e5> {intel_idle+0x85/0x130}
[2926808.093433] mce: [Hardware Error]: TSC 18f0cc63c58320 MISC 86
[2926808.099101] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[2926808.099102] {1}[Hardware Error]: event severity: fatal
[2926808.099103] {1}[Hardware Error]: Error 0, type: fatal
[2926808.099104] {1}[Hardware Error]: section_type: general processor error
[2926808.099104] {1}[Hardware Error]: processor_type: 0, IA32/X64
[2926808.099105] {1}[Hardware Error]: processor_isa: 2, X64
[2926808.099106] {1}[Hardware Error]: error_type: 0x01
[2926808.099106] {1}[Hardware Error]: cache error
[2926808.099107] {1}[Hardware Error]: operation: 0, unknown or generic
[2926808.099107] {1}[Hardware Error]: version_info: 0x0000000000050654
[2926808.099108] {1}[Hardware Error]: processor_id: 0x0000000000000078
[2926808.099108] Kernel panic - not syncing: Fatal hardware error!
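To check an extracted crash log for the same signature, the following is a minimal Python sketch, not a NetApp tool. It assumes the file is a gzip-compressed text file with one dmesg entry per line, as in the excerpt above. The function names `read_crash_dmesg` and `summarize_crash_dmesg` and the summary keys are hypothetical, chosen only for illustration.

```python
import gzip
import re

# dmesg entry prefix such as "[2926808.099102] "; group 2 is the message text.
ENTRY = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

# Patterns taken from the wording of the log excerpt in this article.
FIELDS = {
    "mce_cpu": re.compile(r"CPU (\d+): Machine Check Exception"),
    "severity": re.compile(r"event severity: (\w+)"),
    "section_type": re.compile(r"section_type: (.+)$"),
    "panic": re.compile(r"Kernel panic - not syncing: (.+)$"),
}


def read_crash_dmesg(path):
    """Return the lines of a gzip-compressed crash dmesg log (assumed format)."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return handle.read().splitlines()


def summarize_crash_dmesg(lines):
    """Count [Hardware Error] entries and keep the first match of each field."""
    summary = {"hardware_error_lines": 0}
    summary.update({key: None for key in FIELDS})
    for raw in lines:
        entry = ENTRY.match(raw.strip())
        if entry is None:
            continue  # skip lines that do not look like dmesg entries
        message = entry.group(2)
        if "[Hardware Error]" in message:
            summary["hardware_error_lines"] += 1
        for key, pattern in FIELDS.items():
            found = pattern.search(message)
            if found and summary[key] is None:
                summary[key] = found.group(1)  # values are kept as strings
    return summary
```

Calling `summarize_crash_dmesg(read_crash_dmesg(path))` on a log like the one above would be expected to report a severity of `fatal` together with the panic message, which matches the signature described in this article. Any other output only means that these particular patterns did not match, not that the hardware is healthy.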