AFF-A400 boot loop
Environment
- AFF-A400
- Networking adapter card: NIC, 100GbE, PCIe Gen3 x16, Smart IO, P/N X1151, riser adapter
Issue
- No BMC access.
- The issue persists after the controller is replaced.
- Example console output from the boot loop:
...
 Wait BMC self-test result.
 BMC self-test: OK.
 UPI initialization.
 CPU initialization.
 Running full memory initialization.
 DIMM F0: NVDIMM with valid data and restage progressing.....
 DIMM F0: NVDIMM RESTAGE Success+++++++++
 SPI FLASH: Backup BIOS
 PEI end.
 DXE start.
 USB initialization.
 PCI host bridge initialization.
 CSM initialization.
 BIOS Version: 16.1
 PEI start.
 CPU PEI initialization.
 Wait BMC 30 seconds.
 Wait BMC self-test result.
 BMC self-test: OK.
 UPI initialization.
 CPU initialization.
 Running full memory initialization.
 CPU reset.
 BIOS Version: 16.1
 PEI start.
 CPU PEI initialization.
 Wait BMC self-test result.
 BMC self-test: OK.
 UPI initialization.
 CPU initialization.
 Running full memory initialization.
 DIMM F0: NVDIMM with valid data and restage progressing.....
 DIMM F0: NVDIMM RESTAGE Success+++++++++
 SPI FLASH: Primary BIOS
 PEI end.
 DXE start.
 USB initialization.
 PCI host bridge initialization.
 CSM initialization.
 ...
- The BMC is accessible, but the system does not power on.
- The cause was a faulty riser card.
- The following output appears when the BIOS fails to boot properly:
BMC >
 BMC > system power status
 Host power is off
 BMC > system power status on
 [876 : 942 INFO]CHASSIS_POWER_UP from channel 1
[876 : 959 INFO]CHASSIS_CTRL_ACTION: action: 1; SysRestartCause = 1
 [876 : 959 INFO]POWER ON CHASSIS
[000101000646][IPMIMain][INFO]set bios to primary
 BMC > system power onstatus
 Host power is off
 BMC > [000101000703][876:959:IPMIMain][ERROR][PDKHW.c:PDK_PowerOnChassis:321]Timeout to wait POWER_ON: retry=31
 [000101000703][BIOS]PDK_PowerOnChassis change to ENABLE_WATCHDOG
 [000101000703][IPMIMain][INFO]InitSysResetTick
BMC > [000101000722][BIOS]BIOSMonitorTask change from ENABLE_WATCHDOG to MONITOR
 [000101000722][BIOS]Set WDT for 100 sec
 [000101000722(946685242.816776216)][WDT]Set WDT req : Use=0x5 Action=0x3 Pre-timeout=0x0 Flag=0x20 Timeout=0x3e8
 [000101000722(946685242.887375412)][WDT]Set WDT done: Use=0x5 Action=0x3 Pre-timeout=0x0 Flag=0x5 Timeout=0x3e8
 [000101000722(946685242.888665920)][WDT]InitCountDown=1000
system power status
 Host power is off
 BMC > [000101000733][BIOS]Monitor: (count = 910, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=90
 [000101000743][BIOS]Monitor: (count = 810, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=80
 [000101000754][BIOS]Monitor: (count = 710, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=70
 [000101000804][BIOS]Monitor: (count = 610, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=60
 [000101000815][BIOS]Monitor: (count = 510, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=50
 [000101000826][BIOS]Monitor: (count = 410, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=40
 [000101000836][BIOS]Monitor: (count = 310, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=30
 [000101000847][BIOS]Monitor: (count = 210, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=20
 [000101000858][BIOS]Monitor: (count = 110, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=10
 [000101000909][BIOS]Monitor: (count = 10, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=0
 [000101000909][BIOS]BIOSMonitorTask change from MONITOR to CHANGE_BIOS because time out;
 [000101000910][BIOS]BIOSMonitorTask change from CHANGE_BIOS to NONE_TIMER
 [000101000910][BIOS]BIOSMonitorTask: Add SEL for OEM Timeout
 [000101000910][BIOS]BIOSMonitorTask: ready to set up backup BIOS and power cycle
 [876 : 950 INFO]POWER CYCLE CHASSIS
[876 : 950 INFO]Lock WDT
 [000101000910][IPMIMain][INFO][easywdt_lock] Lock WDT
 [000101000912][IPMIMain][INFO]set bios to backup
 [000101000912][BIOS]PDK_PowerCycleChassis change to ENABLE_WATCHDOG
 [000101000912][BIOS]BIOSMonitorTask: lock and stop watchdog
 [000101000933][BIOS]BIOSMonitorTask change from ENABLE_WATCHDOG to MONITOR
 [000101000933][BIOS]Set WDT for 100 sec
 [000101000933(946685373.18518160)][WDT]Set WDT req : Use=0x5 Action=0x3 Pre-timeout=0x0 Flag=0x20 Timeout=0x3e8
 [000101000933(946685373.87177168)][WDT]Set WDT done: Use=0x5 Action=0x3 Pre-timeout=0x0 Flag=0x5 Timeout=0x3e8
 [000101000933][IPMIMain][INFO][easywdt_unlock] Unlock WDT
 [000101000933(946685373.92590288)][WDT]InitCountDown=1000
[000101000944][BIOS]Monitor: (count = 910, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=90
 [000101000955][BIOS]Monitor: (count = 810, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=80
 [000101001006][BIOS]Monitor: (count = 710, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=70
 [000101001016][BIOS]Monitor: (count = 610, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=60
BMC > system power status
 Host power is off
 BMC > [000101001027][BIOS]Monitor: (count = 510, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=50
 [000101001038][BIOS]Monitor: (count = 410, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=40
 [000101001048][BIOS]Monitor: (count = 310, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=30
 [000101001059][BIOS]Monitor: (count = 210, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=20
 [000101001110][BIOS]Monitor: (count = 110, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=10
 [000101001121][BIOS]Monitor: (count = 10, TmrUse = 0x45, ExpirationFlag = 0x4), t_bios_load_time=0
 [000101001121][BIOS]BIOSMonitorTask change from MONITOR to CHANGE_BIOS because time out;
 [000101001122][BIOS]BIOSMonitorTask change from CHANGE_BIOS to NONE_TIMER
 [000101001122][BIOS]BIOSMonitorTask: Add SEL for OEM Timeout
 [000101001122][BIOS]BIOSMonitorTask: already using backup BIOS
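The BMC output above follows a recognizable pattern: the BMC selects the primary BIOS, arms a watchdog, counts down while waiting for the BIOS to load, then switches to the backup BIOS and repeats. The watchdog value `Timeout=0x3e8` is 1000 counts, and IPMI watchdog counts are 100 ms each, which matches the `Set WDT for 100 sec` message. The sketch below is one possible way to summarize a saved capture of this output. The function name `summarize_bmc_log` and the regular expressions are illustrative assumptions derived only from the lines shown above; they are not part of any NetApp tool, and other firmware versions may word these messages differently.

```python
import re

# Patterns are based only on the log lines shown in this article
# (assumption: other BMC firmware versions may use different wording).
WDT_REQ = re.compile(r"\[WDT\]Set WDT req\s*:.*?Timeout=(0x[0-9a-fA-F]+)")
BIOS_SELECT = re.compile(r"set bios to (primary|backup)")
LOAD_TIMEOUT = re.compile(r"from MONITOR to CHANGE_BIOS because time out")
BACKUP_IN_USE = re.compile(r"already using backup BIOS")


def summarize_bmc_log(text):
    """Summarize BIOS watchdog activity in a captured BMC console log."""
    summary = {
        "bios_selected": [],        # order in which BIOS images were selected
        "wdt_timeouts_sec": [],     # requested watchdog timeouts, in seconds
        "bios_load_timeouts": 0,    # times the BIOS failed to load in time
        "backup_also_failed": False,
    }
    for line in text.splitlines():
        match = BIOS_SELECT.search(line)
        if match:
            summary["bios_selected"].append(match.group(1))
        match = WDT_REQ.search(line)
        if match:
            # IPMI watchdog counts are 100 ms each: 0x3e8 = 1000 counts = 100 s.
            summary["wdt_timeouts_sec"].append(int(match.group(1), 16) / 10)
        if LOAD_TIMEOUT.search(line):
            summary["bios_load_timeouts"] += 1
        if BACKUP_IN_USE.search(line):
            summary["backup_also_failed"] = True
    return summary


# A few lines copied from the capture above, used as sample input.
SAMPLE = """\
[000101000646][IPMIMain][INFO]set bios to primary
[000101000722(946685242.816776216)][WDT]Set WDT req : Use=0x5 Action=0x3 Pre-timeout=0x0 Flag=0x20 Timeout=0x3e8
[000101000909][BIOS]BIOSMonitorTask change from MONITOR to CHANGE_BIOS because time out;
[000101000912][IPMIMain][INFO]set bios to backup
[000101001121][BIOS]BIOSMonitorTask change from MONITOR to CHANGE_BIOS because time out;
[000101001121][BIOS]BIOSMonitorTask: already using backup BIOS
"""

if __name__ == "__main__":
    print(summarize_bmc_log(SAMPLE))
```

A summary that shows both `primary` and `backup` selected, two load timeouts, and `backup_also_failed` set means that neither BIOS image completed loading. In this article that pattern accompanied a faulty riser card, although the log pattern alone does not identify which component is at fault.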