All StorageGRID nodes appear blue/unknown after a decommission task is started
Applies to
- StorageGRID 11.6.0.10 and earlier
- StorageGRID 11.7.0.3 and earlier
- A grid task (such as a decommission) has been started
Issue
- Under [Support] > [Grid topology], all nodes appear blue/unknown, or no nodes are displayed
- /var/local/log/nms.log shows an invalid Atom label error, raised in /usr/local/lib/site_ruby/bycast/storage-grid/atom-container.rb while a bundle is being processed:
MI: |2023-07-18T13:46:59.082| NOTICE [DataConnectionManager] BundleProtocol.java:288: Processed bundle GTSB version 1 namespace BNDL instance 0
NMS: |2023-07-18T13:46:59.096| ERROR invalid Atom label "S>oK" (ArgumentError)
NMS: |2023-07-18T13:46:59.096| ERROR /usr/local/lib/site_ruby/bycast/storage-grid/atom-container.rb:44:in `label='
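To check a copy of nms.log for this signature, a short script can filter the ERROR lines and pull out the rejected label. The following is a minimal sketch, not a NetApp tool: the function name `find_invalid_atom_labels` is hypothetical, and the line layout (`<component>: |<timestamp>| <LEVEL> <message>`) is inferred only from the excerpt above.

```python
import re

# Layout inferred from the excerpt: "<component>: |<timestamp>| <LEVEL> <message>"
LOG_LINE = re.compile(
    r'^(?P<component>\w+): \|(?P<timestamp>[^|]+)\| (?P<level>[A-Z]+) (?P<message>.*)$'
)
ATOM_LABEL = re.compile(r'invalid Atom label "(?P<label>[^"]*)"')


def find_invalid_atom_labels(lines):
    """Return (timestamp, label) for each ERROR line reporting an invalid Atom label."""
    hits = []
    for line in lines:
        parsed = LOG_LINE.match(line.strip())
        if not parsed or parsed["level"] != "ERROR":
            continue
        label = ATOM_LABEL.search(parsed["message"])
        if label:
            hits.append((parsed["timestamp"], label["label"]))
    return hits


# Sample lines copied from the excerpt above.
sample = [
    "MI: |2023-07-18T13:46:59.082| NOTICE [DataConnectionManager] "
    "BundleProtocol.java:288: Processed bundle GTSB version 1 namespace BNDL instance 0",
    'NMS: |2023-07-18T13:46:59.096| ERROR invalid Atom label "S>oK" (ArgumentError)',
    "NMS: |2023-07-18T13:46:59.096| ERROR "
    "/usr/local/lib/site_ruby/bycast/storage-grid/atom-container.rb:44:in `label='",
]

print(find_invalid_atom_labels(sample))
```

To scan a real file, pass an open file handle instead of `sample`, for example `find_invalid_atom_labels(open("nms.log", errors="replace"))`.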
- /var/local/log/nms.log shows that the Java MI thread has lost its connection:
MI: |2023-07-21T13:10:25.725| ERROR [DATA_STREAM_25] AddNodeProtocol.java:226: Connection lost.
MI: |2023-07-21T13:10:25.761| NOTICE [CONTROL_STREAM] ControlConnection.java:191: Restarting control connection...
- The mgmt-api service cannot be restarted
- /var/local/log/bycast-err.log shows mgmt-api errors:
NMS: |2023-07-25T05:39:37.383| ERROR Exception in thread created by /usr/local/lib/site_ruby/mgmt-api/alertmanager/rules/prometheus-alert-rules-updater.rb:25:in `new'
NMS: |2023-07-25T05:39:37.383| ERROR Directory not empty @ dir_s_rmdir - /var/local/mgmt-api/prometheus-rules (Errno::ENOTEMPTY)
NMS: |2023-07-25T05:39:37.383| ERROR /usr/lib/ruby/2.5.0/fileutils.rb:1337:in `rmdir'
NMS: |2023-07-25T05:39:37.383| ERROR /usr/lib/ruby/2.5.0/fileutils.rb:1337:in `block in remove_dir1'
NMS: |2023-07-25T05:39:37.383| ERROR /usr/lib/ruby/2.5.0/fileutils.rb:1348:in `platform_support'
NMS: |2023-07-25T05:39:37.383| ERROR /usr/lib/ruby/2.5.0/fileutils.rb:1336:in `remove_dir1'
NMS: |2023-07-25T05:39:37.383| ERROR /usr/lib/ruby/2.5.0/fileutils.rb:1329:in `remove'
NMS: |2023-07-25T05:39:37.384| ERROR /usr/lib/ruby/2.5.0/fileutils.rb:691:in `block in remove_entry'
NMS: |2023-07-25T05:39:37.384| ERROR /usr/lib/ruby/2.5.0/fileutils.rb:1386:in `ensure in postorder_traverse'
NMS: |2023-07-25T05:39:37.384| ERROR /usr/lib/ruby/2.5.0/fileutils.rb:1386:in `postorder_traverse'
NMS: |2023-07-25T05:39:37.384| ERROR /usr/lib/ruby/2.5.0/fileutils.rb:689:in `remove_entry'
NMS: |2023-07-25T05:39:37.384| ERROR /usr/lib/ruby/2.5.0/fileutils.rb:717:in `remove_dir'
NMS: |2023-07-25T05:39:37.384| ERROR /usr/local/lib/site_ruby/mgmt-api/alertmanager/rules/prometheus-alert-rules-updater.rb:123:in `stage_rules!'
NMS: |2023-07-25T05:39:37.384| ERROR /usr/local/lib/site_ruby/mgmt-api/alertmanager/rules/prometheus-alert-rules-updater.rb:28:in `block (2 levels) in update_alert_rules!'
NMS: |2023-07-25T05:39:37.384| ERROR /usr/local/lib/site_ruby/mgmt-api/alertmanager/rules/prometheus-alert-rules-updater.rb:26:in `synchronize'
NMS: |2023-07-25T05:39:37.384| ERROR /usr/local/lib/site_ruby/mgmt-api/alertmanager/rules/prometheus-alert-rules-updater.rb:26:in `block in update_alert_rules!'
NMS: |2023-07-25T05:39:37.384| ERROR /usr/local/lib/site_ruby/mgmt-api/tools/api-thread.rb:21:in `block in initialize'
- /var/local/log/nms.log shows java.net.ConnectException: Connection refused (Connection refused):
MI: |2023-07-25T05:51:36.885| NOTICE [DATA_STREAM_36] NMSClustersUtils.java:255: Failed to call /localhost/alert-notification-sender-update
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at java.net.Socket.connect(Socket.java:556)
at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1223)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1337)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1312)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1521)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1495)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at com.bycast.config.NMSClustersUtils.notifyMgmtApiOfAlertSenderChange(NMSClustersUtils.java:235)
at com.bycast.config.NMSClustersUtils.setSendingClusterId(NMSClustersUtils.java:211)
at com.bycast.clusters.ClustersUtils.getSendingClusterId(ClustersUtils.java:224)
at com.bycast.clusters.ClustersUtils.getEmailNotificationSendingClusterId(ClustersUtils.java:165)
at com.bycast.transactions.protocols.AttributeNotifyProtocol.saveAttributeData(AttributeNotifyProtocol.java:184)
at com.bycast.transactions.protocols.AttributeNotifyProtocol.processAttrNotify(AttributeNotifyProtocol.java:150)
at com.bycast.transactions.protocols.AddNodeProtocol.startProcessing(AddNodeProtocol.java:192)
at com.bycast.transactions.connectionagent.DataConnection.dataProcessing(DataConnection.java:140)
at com.bycast.transactions.connectionagent.DataConnection.run(DataConnection.java:55)
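In the trace above, a "Failed to call" line is followed immediately by the Java exception that caused it. As a rough sketch, the script below pairs each failed endpoint with the exception on the next line, which can help when the log contains many such entries. The function name `failed_calls` is hypothetical, and the pairing rule (exception on the very next line) is an assumption based only on this excerpt.

```python
import re

FAILED_CALL = re.compile(r'Failed to call (?P<endpoint>\S+)')
# A Java exception header such as "java.net.ConnectException: Connection refused"
EXCEPTION = re.compile(r'^(?P<exc>[\w.]+(?:Exception|Error)): (?P<detail>.*)$')


def failed_calls(lines):
    """Return (endpoint, exception, detail) for each failed call followed by an exception."""
    results = []
    pending = None  # endpoint from the previous line, if it was a "Failed to call" line
    for raw in lines:
        line = raw.strip()
        call = FAILED_CALL.search(line)
        if call:
            pending = call["endpoint"]
            continue
        exc = EXCEPTION.match(line)
        if pending and exc:
            results.append((pending, exc["exc"], exc["detail"]))
        pending = None  # only the line directly after the failed call is considered
    return results


# Sample lines copied from the excerpt above.
sample = [
    "MI: |2023-07-25T05:51:36.885| NOTICE [DATA_STREAM_36] NMSClustersUtils.java:255: "
    "Failed to call /localhost/alert-notification-sender-update",
    "java.net.ConnectException: Connection refused (Connection refused)",
    "at java.net.PlainSocketImpl.socketConnect(Native Method)",
]

print(failed_calls(sample))
```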