Multiple disks fail when the HBA is faulty
Environment
FAS2650
Issue
- A node went down, and its partner reported that it had performed a takeover:
CLTFLT:HA Group Notification from Node-01 (CONTROLLER TAKEOVER COMPLETE AUTOMATIC) ALERT
- Multiple disks are reported as failed:
cluster::> storage disk show -broken
Original Owner:
Checksum Compatibility: block
                                    Drawer                      Usable Physical
Disk     Outage Reason HA Shelf Bay /Slot  Chan Pool Type RPM   Size   Size
-------- ------------- -- ----- --- ------ ---- ---- ---- ----- ------ --------
4.2.4    failed        0a 2     4   -/-    B    NONE SAS  10000 -      1.64TB
4.2.12   failed        0a 2     12  -/-    B    NONE SAS  10000 -      1.64TB
4.2.16   failed        0a 2     16  -/-    B    NONE SAS  10000 -      1.64TB
4.3.4    failed        0b 3     4   -/-    A    NONE SAS  10000 -      1.64TB
4.3.6    failed        0b 3     6   -/-    A    NONE SAS  10000 -      1.64TB
4.3.20   failed        0b 3     20  -/-    A    NONE SAS  10000 -      1.64TB
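The failed disks above are spread across two shelves and both HA ports (0a and 0b), which is easier to see when the rows are tallied. The following is a minimal Python sketch, not a NetApp tool: the function name `failures_by_port_and_shelf` is hypothetical, and the column order (Disk, Outage Reason, HA, Shelf, ...) is an assumption taken from the output shown here.

```python
from collections import Counter

# Rows copied from the `storage disk show -broken` output in this article.
BROKEN_ROWS = """\
4.2.4 failed 0a 2 4 -/- B NONE SAS 10000 - 1.64TB
4.2.12 failed 0a 2 12 -/- B NONE SAS 10000 - 1.64TB
4.2.16 failed 0a 2 16 -/- B NONE SAS 10000 - 1.64TB
4.3.4 failed 0b 3 4 -/- A NONE SAS 10000 - 1.64TB
4.3.6 failed 0b 3 6 -/- A NONE SAS 10000 - 1.64TB
4.3.20 failed 0b 3 20 -/- A NONE SAS 10000 - 1.64TB
"""


def failures_by_port_and_shelf(text):
    """Count failed disks per (HA port, shelf) pair (hypothetical helper)."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Assumed column order: Disk, Outage Reason, HA, Shelf, Bay, ...
        if len(fields) >= 4 and fields[1] == "failed":
            counts[(fields[2], fields[3])] += 1
    return counts


counts = failures_by_port_and_shelf(BROKEN_ROWS)
for (port, shelf), n in sorted(counts.items()):
    print(f"port {port}, shelf {shelf}: {n} failed disk(s)")
```

Because the parser only splits on whitespace, it works on the output whether or not the columns are aligned.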
cluster::> node run -node Node-02 sysconfig -r
Aggregate aggr2_cluster_02_SAS (failed, mixed_raid_type, partial, hybrid) (block checksums)
Plex /aggr2_cluster_02_SAS/plex0 (offline, failed, inactive)
RAID group /aggr2_cluster_02_SAS/plex0/rg0 (partial, block checksums, raid_dp)
RAID Disk Device  HA  SHELF BAY CHAN Pool Type RPM   Used (MB/blks)     Phys (MB/blks)
--------- ------- ------------- ---- ---- ---- ----- ------------------ ------------------
dparity   0a.02.0 0a  2     0   SA:A 0    SAS  10000 1713523/3509295616 1716957/3516328368
parity    0b.03.0 0b  3     0   SA:B 0    SAS  10000 1713523/3509295616 1716957/3516328368
data      FAILED  N/A                                1713523/ -
data      FAILED  N/A                                1713523/ -
data      FAILED  N/A                                1713523/ -
data      FAILED  N/A                                1713523/ -
data      FAILED  N/A                                1713523/ -
data      FAILED  N/A                                1713523/ -
data      FAILED  N/A                                1713523/ -
data      FAILED  N/A                                1713523/ -
Raid group is missing 8 disks.
- The aggregate is shown as "failed" / "offline"
- SCSI command Check Condition errors appear in the EMS log before the disks fail:
[Node-02: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device 0b.03.10: Check Condition: CDB 0x2a:a7fc9600:0200: Sense Data SCSI:hardware error - (0x4 - 0x3 0x0 0x82)(994).
[Node-02: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device 0a.02.6: Check Condition: CDB 0x2a:a7fc9600:0200: Sense Data SCSI:hardware error - (0x4 - 0x3 0x0 0x8)(1480).
[Node-02: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device 0a.02.8: Check Condition: CDB 0x2a:a7fc9600:0200: Sense Data SCSI:hardware error - (0x4 - 0x3 0x0 0x8)(1512).
[Node-02: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device 0a.02.10: Check Condition: CDB 0x2a:a7fc9600:0200: Sense Data SCSI:hardware error - (0x4 - 0x3 0x0 0x8)(1525).
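To see how these events are distributed, the EMS lines can be grouped by the port prefix of each device name. The sketch below is only an illustration in Python: the regular expression is written against the four sample lines in this article, the helper name `group_by_port` is hypothetical, and reading the device name as `<port>.<shelf>.<bay>` is an assumption. Other ONTAP releases may format these events differently.

```python
import re
from collections import defaultdict

# EMS lines copied from this article.
EMS_LINES = [
    "[Node-02: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device 0b.03.10: Check Condition: CDB 0x2a:a7fc9600:0200: Sense Data SCSI:hardware error - (0x4 - 0x3 0x0 0x82)(994).",
    "[Node-02: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device 0a.02.6: Check Condition: CDB 0x2a:a7fc9600:0200: Sense Data SCSI:hardware error - (0x4 - 0x3 0x0 0x8)(1480).",
    "[Node-02: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device 0a.02.8: Check Condition: CDB 0x2a:a7fc9600:0200: Sense Data SCSI:hardware error - (0x4 - 0x3 0x0 0x8)(1512).",
    "[Node-02: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device 0a.02.10: Check Condition: CDB 0x2a:a7fc9600:0200: Sense Data SCSI:hardware error - (0x4 - 0x3 0x0 0x8)(1525).",
]

# Pattern derived from the sample lines only (an assumption, not a spec).
PATTERN = re.compile(
    r"Disk device (?P<device>\S+): Check Condition: "
    r".*Sense Data SCSI:(?P<sense>[a-z ]+) - "
)


def group_by_port(lines):
    """Return ({port: [devices]}, {sense strings}) for matching lines."""
    by_port = defaultdict(list)
    senses = set()
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue
        device = match.group("device")
        # Assumed device format: <port>.<shelf>.<bay>, e.g. 0a.02.6
        by_port[device.split(".")[0]].append(device)
        senses.add(match.group("sense"))
    return dict(by_port), senses


by_port, senses = group_by_port(EMS_LINES)
for port in sorted(by_port):
    print(port, by_port[port])
print("sense categories:", sorted(senses))
```

In this sample, every event carries the same "hardware error" sense category and the affected devices sit on both ports 0a and 0b.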