AIQ Unified Manager "Cluster Monitoring Failed" alert becomes obsolete after 15 minutes
Environment
- Active IQ Unified Manager (UM) 9.6 and later
- All OS platforms
Issue
"Cluster monitoring failed" alerts are received at random times for random clusters.
- ocumserver.log:
2024-10-15 22:11:27,725 WARN [oncommand] [reconciliation-0] [c.n.d.c.ClusterStatusListener] Acquisition Failed for cluster : xx.xx.xx.xx
message : storage-shelf-list-info; errno: 14007, reason: Node is not healthy, storage-adapter-get-adapter-info; errno: 14007, reason: Node is not healthy
- au.log:
2024-10-15 22:11:03,943 INFO [foundation-poll-0] c.n.u.RestUtil (RestUtil.java:186) - Requesting Data from ONTAP: https://xx.xx.xxx.xx:443/api/network...red,monitoring
...
2024-10-15 22:11:03,960 INFO [foundation-poll-3] c.o.s.a.d.n.b.z.c.n.MetroclusterNodeMirroringDetailsBuilder (MetroclusterNodeMirroringDetailsBuilder.java:38) - Disabling SSL Certificate checking for Cluster Communication
2024-10-15 22:11:03,960 INFO [foundation-poll-3] c.n.u.RestUtil (RestUtil.java:186) - Requesting Data from ONTAP: https://xx.xx.xxx.xx:443/api/cluster...ror,interfaces
2024-10-15 22:11:04,137 INFO [foundation-poll-2] c.o.s.a.d.n.b.z.c.n.FcpBuilder (FcpBuilder.java:100) - Disabling SSL Certificate checking for Cluster Communication
- mgwd log:
Sat Oct 19 2024 02:14:20 +09:00 [kern_mgwd:info:2755] 0x82cd12700: 0: ERR: TABLES::disk: _getDisks: Node nodename not healthy, Not gathering disk attributes from this node.
Sat Oct 19 2024 02:14:20 +09:00 [kern_mgwd:info:2755] 0x82cd12700: 0: ERR: TABLES::disk: _getDisks: Node nodename not healthy, Not gathering disk attributes from this node.
Sat Oct 19 2024 03:10:24 +09:00 [kern_mgwd:info:2755] 0x824077800: 8503f4000005d7bb: ERR: TABLES::disk: _getDisks: Node nodename not healthy, Not gathering disk attributes from this node.
Sat Oct 19 2024 03:10:24 +09:00 [kern_mgwd:info:2755] 0x824077800: 8503f4000005d7bb: ERR: TABLES::disk: _getDisks: Node nodename not healthy, Not gathering disk attributes from this node.
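
To check how often this signature occurs, the ocumserver.log entries shown above can be summarized with a short script. The following is a minimal sketch, assuming the log lines follow exactly the layout of the excerpt in this article (an "Acquisition Failed for cluster" line followed by a "message :" line). The names HEADER_RE, ENTRY_RE, parse_failures and scan are hypothetical helpers introduced for illustration and are not part of Unified Manager.

```python
import re

# Matches the WARN line from the ocumserver.log excerpt and captures
# the timestamp (without milliseconds) and the cluster address.
HEADER_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+\s+WARN\s+.*"
    r"Acquisition Failed for cluster : (\S+)"
)

# Matches each "<api-name>; errno: <code>, reason: <text>" entry.
# The lookahead stops the reason at the next entry or at end of line.
ENTRY_RE = re.compile(
    r"([\w-]+); errno: (\d+), reason: ([^,;]+?)(?=, [\w-]+; errno:|$)"
)


def parse_failures(message_line):
    """Return (api, errno, reason) tuples from a 'message : ...' line."""
    body = message_line.split("message :", 1)[-1].strip()
    return [
        (api, int(code), reason.strip())
        for api, code, reason in ENTRY_RE.findall(body)
    ]


def scan(lines):
    """Pair each 'Acquisition Failed' line with the entries on the next 'message :' line."""
    results = []
    current = None
    for line in lines:
        header = HEADER_RE.match(line)
        if header:
            current = {
                "time": header.group(1),
                "cluster": header.group(2),
                "failures": [],
            }
            results.append(current)
        elif current is not None and line.lstrip().startswith("message :"):
            current["failures"] = parse_failures(line)
            current = None
    return results
```

Calling scan() on the lines of a log file (for example, scan(open(path, encoding="utf-8")), where path is wherever the log was saved) returns one record per failed acquisition. Filtering the records for errno 14007 then shows which clusters reported "Node is not healthy" and at what times.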