VMware Atomic Test and Set ハートビートは、ストレージへの接続に影響を及ぼします
環境
- Data ONTAP
- SANプロトコル(iSCSIおよびFC)
- VMware vSphere 5.5U2以降
- Atomic Test & Set(ATS)ハートビート
問題
- ESXiホスト で LUNが繰り返しリセットされます。 たとえば、vmkernel.logからは、次のようなイベントが表示されます。
2022-04-04T12:33:17.949Z: [scsiCorrelator] 5693120512199us: [vob.scsi.scsipath.por] Power-on Reset occurred on naa.600a0980xxxxxxxxxxxxxxxxxxxxxxx
2022-04-04T12:41:45.715Z: [scsiCorrelator] 5693628237688us: [vob.scsi.scsipath.por] Power-on Reset occurred on naa.600a0980xxxxxxxxxxxxxxxxxxxxxxx
2022-04-04T12:41:48.217Z: [scsiCorrelator] 5693630739987us: [vob.scsi.scsipath.por] Power-on Reset occurred on naa.600a0980xxxxxxxxxxxxxxxxxxxxxxx
2022-04-04T12:41:59.394Z: [scsiCorrelator] 5693641916248us: [vob.scsi.scsipath.por] Power-on Reset occurred on naa.600a0980xxxxxxxxxxxxxxxxxxxxxxx
- LUNのリセットに加えて、データストアハートビートタイムアウトの証拠もあります。
2022-04-04T08:03:16.087Z: [vmfsCorrelator] 5677371531974us: [esx.problem.vmfs.heartbeat.timedout] xxxxxxxx-xxxxxxxx-xxx-xxxxxxxxxxx lun_vmfs_1
2022-04-04T08:03:28.087Z: [vmfsCorrelator] 5677383532045us: [esx.problem.vmfs.heartbeat.timedout] xxxxxxxx-xxxxxxxx-xxx-xxxxxxxxxxx lun_vmfs_2
2022-04-04T08:03:28.087Z: [vmfsCorrelator] 5677383532279us: [esx.problem.vmfs.heartbeat.timedout] xxxxxxxx-xxxxxxxx-xxx-xxxxxxxxxxx lun_vmfs_3
2022-04-04T08:03:28.087Z: [vmfsCorrelator] 5677383532445us: [esx.problem.vmfs.heartbeat.timedout] xxxxxxxx-xxxxxxxx-xxx-xxxxxxxxxxx lun_vmfs_4
2022-04-04T12:33:17.354Z: [vmfsCorrelator] 5693572799758us: [esx.problem.vmfs.heartbeat.timedout] xxxxxxxx-xxxxxxxx-xxx-xxxxxxxxxxx lun_vmfs_5
2022-04-04T12:43:14.401Z: [vmfsCorrelator] 5694169846835us: [esx.problem.vmfs.heartbeat.timedout] xxxxxxxx-xxxxxxxx-xxx-xxxxxxxxxxx lun_vmfs_6
- ホストによって発行されたSCSIコマンドを確認すると、ATS(Atomic Test and Set) コマンドとSCSI Reservationコマンドの両方が発行されていることがわかります。
SCSIリザーブ
2022-04-04T08:03:31.105Z cpu1:2097761)NMP: nmp_PathDetermineFailure:3560: SCSI cmd RESERVE failed on path vmhba64:C2:T1:L3, reservation state on device naa.600a0980383136446f3f4e4c5a484355 is unknown.
2022-04-04T08:03:31.118Z cpu1:2097761)NMP: nmp_PathDetermineFailure:3560: SCSI cmd RESERVE failed on path vmhba64:C2:T1:L3, reservation state on device naa.600a0980383136446f3f4e4c5a484355 is unknown.
2022-04-04T08:03:31.118Z cpu19:2097779)NMP: nmp_PathDetermineFailure:3560: SCSI cmd RESERVE failed on path vmhba64:C2:T1:L2, reservation state on device naa.600a0980383136446f3f4e4c5a484356 is unknown.
2022-04-04T08:03:31.131Z cpu1:2097761)NMP: nmp_PathDetermineFailure:3560: SCSI cmd RESERVE failed on path vmhba64:C2:T1:L3, reservation state on device naa.600a0980383136446f3f4e4c5a484355 is unknown.
ATSの比較と書き込み
2022-04-04T09:35:44.604Z cpu16:2098262)NMP: nmp_ThrottleLogForDevice:3872: Cmd 0x89 (0x45baf02353c0, 2099582) to dev "naa.600a0980xxxxxxxxxxxxxxxxxxxxxxx" on path "vmhba64:C3:T1:L2" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x1d 0x0. Act:NONE
2022-04-04T12:30:30.386Z cpu27:2098262)NMP: nmp_ThrottleLogForDevice:3872: Cmd 0x89 (0x459adcd68040, 2099582) to dev "naa.600a0980383136446f3f4e4c5a484356" on path "vmhba64:C3:T1:L2" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x1d 0x0. Act:NONE
2022-04-04T12:33:16.355Z cpu1:2097617)NMP: nmp_ThrottleLogForDevice:3872: Cmd 0x89 (0x45bac31031c0, 2097193) to dev "naa.600a09803831364468244e4c5a47494e" on path "vmhba64:C1:T0:L0" Failed: H:0x5 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0. Act:EVAL
2022-04-04T12:43:13.402Z cpu2:2097617)NMP: nmp_ThrottleLogForDevice:3872: Cmd 0x89 (0x45baf1751140, 2097193) to dev "naa.600a09803831364468244e4c5a47494e" on path "vmhba64:C0:T0:L0" Failed: H:0x5 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0. Act:EVAL