StorageGRID upgrade error: one or more nodes fail to start the upgrade service
Environment
StorageGRID 11.3 and later
Issue
- The StorageGRID upgrade page displays "One or more nodes failed to start the upgrade service: node_name".
- The issue persists after the primary Admin Node is rebooted.
- gdu-server.log shows the following output:
I, [2020-11-24T06:39:27.039158 #3456] INFO -- gdu-server: Executing command `pgrep --full --list-full 'upgrade-service.rb($| -)' 2>&1` on node_name
I, [2020-11-24T06:39:27.127902 #3456] INFO -- gdu-server: Upgrade service is not running:
I, [2020-11-24T06:39:27.128137 #3456] INFO -- gdu-server: Copying upgrade service package /var/local/install/upgrade-service.tgz to HJSD-SG
I, [2020-11-24T06:39:27.128307 #3456] INFO -- gdu-server: Executing command `/sbin/ifconfig` on localhost
I, [2020-11-24T06:39:27.134291 #3456] INFO -- gdu-server: Executing command `hostname` on localhost
I, [2020-11-24T06:39:27.137485 #3456] INFO -- gdu-server: Copying file upgrade-service.tgz to host HJSD-SG
I, [2020-11-24T06:39:27.137724 #3456] INFO -- gdu-server: Executing command `test -e "/var/local/install/upgrade-service.tgz"` on localhost
I, [2020-11-24T06:39:27.256155 #3456] INFO -- gdu-server: Executing command `stat -Lc "%s""/var/local/install/upgrade-service.tgz" 2>/dev/null` on HJSD-SG
I, [2020-11-24T06:39:27.338835 #3456] INFO -- gdu-server: Executing command `stat -Lc "%s""/var/local/install/upgrade-service.tgz" 2>/dev/null` on localhost
I, [2020-11-24T06:39:27.347464 #3456] INFO -- gdu-server: Executing command `scp -o NumberOfPasswordPrompts=0 "/var/local/install/upgrade-service.tgz" root@node_name:"/var/local/install/upgrade-service.tgz" 2>&1` on localhost
E, [2020-11-24T06:39:27.385099 #3456] ERROR -- gdu-server: Unable to start upgrade service on node_name.
E, [2020-11-24T06:39:27.385298 #3456] ERROR -- gdu-server: Failed to execute scp command with '{:file=>"\"/var/local/install/upgrade-service.tgz\"", :host=>"localhost"}, {:host=>"node_name", :file=>"\"/var/local/install/upgrade-service.tgz\""}' on localhost - 1,
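
The log excerpt above is mostly INFO entries, and the two ERROR entries at the end are the ones that identify the failure. As an illustration only, the following Python sketch pulls the ERROR entries out of a log in this layout. It assumes every entry follows the single-line pattern seen in the excerpt (severity letter, bracketed timestamp and process ID, severity name, program name, message). The function name `find_errors` is hypothetical and is not part of StorageGRID.

```python
import re

# Assumed entry layout, taken from the excerpt above:
#   E, [2020-11-24T06:39:27.385099 #3456] ERROR -- gdu-server: message
LOG_LINE = re.compile(
    r"^(?P<code>[A-Z]), \[(?P<timestamp>\S+) #(?P<pid>\d+)\]\s+"
    r"(?P<severity>[A-Z]+) -- (?P<progname>[^:]+): (?P<message>.*)$"
)


def find_errors(lines):
    """Return (timestamp, message) for every ERROR entry in the given lines."""
    errors = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        # Lines that do not fit the assumed layout are skipped, not reported.
        if match and match.group("severity") == "ERROR":
            errors.append((match.group("timestamp"), match.group("message")))
    return errors
```

To use it, open gdu-server.log from wherever it is stored on your system and pass the file object to `find_errors`. Each returned tuple holds the timestamp and message of one ERROR entry, which can then be compared with the entries shown above.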