Slice service restarts because a snapshot is older than its snapshot retention period
Applies to
- NetApp SolidFire storage nodes
- NetApp H-Series storage nodes
- NetApp Element software v12.3.x and earlier
Issue
- The following sliceServiceUnhealthy warning is detected on a cluster that uses standard or remote replication snapshots and has a schedule that deletes snapshots periodically.
Example:
25 2019-06-17T18:04:33.957Z Warning service 3 37 Yes 2019-06-17T18:10:38.400Z sliceServiceUnhealthy SolidFire Application cannot communicate with a metadata service.
23 2019-06-14T17:04:54.761Z Warning service 3 37 Yes 2019-06-14T17:09:38.927Z sliceServiceUnhealthy SolidFire Application cannot communicate with a metadata service.
20 2019-06-13T20:04:28.626Z Warning service 3 37 Yes 2019-06-13T20:08:47.734Z sliceServiceUnhealthy SolidFire Application cannot communicate with a metadata service.
- Active IQ events show that when snapshots are deleted periodically, the slice service restarts and occasionally generates a core dump.
Example:
11806 2019-06-17T18:09:47.613Z serviceEvent Restarted SliceService: previous run killed with signal 6 (SIGABRT) core dump coreFileCount=1 servicecorelimit=7 49 3 37 { "replay":192.1844909617118 }
11801 2019-06-17T18:04:27.232Z sliceEvent Deleted snapshot due to reaching expiration date {"snapshotID": 302, "expirationDate": "2019-06-17T18:00:00Z" }
11800 2019-06-17T18:04:27.226Z sliceEvent Deleted snapshot due to reaching expiration date {"snapshotID": 301, "expirationDate": "2019-06-17T18:00:00Z" }
11799 2019-06-17T18:04:27.220Z sliceEvent Deleted snapshot due to reaching expiration date {"snapshotID": 300, "expirationDate": "2019-06-17T18:00:00Z" }
11798 2019-06-17T18:04:27.214Z sliceEvent Deleted snapshot due to reaching expiration date {"snapshotID": 299, "expirationDate": "2019-06-17T18:00:00Z" }
8371 2019-06-14T17:09:10.752Z serviceEvent Restarted SliceService: previous run killed with signal 6(SIGABRT) core dump coreFileCount=1 servicecorelimit=7 49 3 37 { "replay": 176.2541414554888 }
8370 2019-06-14T17:04:22.484Z sliceEvent Deleted snapshot due to reaching expiration date {"snapshotID": 199, "expirationDate": "2019-06-14T17:00:02Z" }
8369 2019-06-14T17:04:22.479Z sliceEvent Deleted snapshot due to reaching expiration date {"snapshotID": 198, "expirationDate": "2019-06-14T17:00:01Z" }
8368 2019-06-14T17:04:21.904Z sliceEvent Deleted snapshot due to reaching expiration date {"snapshotID": 197, "expirationDate": "2019-06-14T17:00:02Z" }
7417 2019-06-13T20:08:29.169Z serviceEvent Restarted SliceService: previous run killed with signal 6(SIGABRT) core dump coreFileCount=1 servicecorelimit=7 49 3 37 { "replay": 114.2269542371808 }
7414 2019-06-13T20:04:21.117Z sliceEvent Deleted snapshot due to reaching expiration date {"snapshotID": 177, "expirationDate": "2019-06-13T20:00:01Z" }
7413 2019-06-13T20:04:21.111Z sliceEvent Deleted snapshot due to reaching expiration date {"snapshotID": 176, "expirationDate": "2019-06-13T20:00:01Z" }
- This issue occurs on both the source and target sides of local or remote replication clusters.
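
The pattern in the event excerpt above, a batch of expiration-driven snapshot deletions followed a few minutes later by a SliceService restart, can be checked against an exported event list. The following is a minimal Python sketch, not a NetApp tool. It assumes each event line starts with the event ID, a UTC timestamp, and the event type, as in the excerpt. The function names `parse_events` and `restarts_after_expiry` and the 10-minute correlation window are illustrative assumptions, not values from Element or Active IQ.

```python
import re
from datetime import datetime, timedelta

# Sample lines copied from the Active IQ event excerpt in this article.
EVENTS = """\
11806 2019-06-17T18:09:47.613Z serviceEvent Restarted SliceService: previous run killed with signal 6 (SIGABRT) core dump coreFileCount=1 servicecorelimit=7 49 3 37 { "replay":192.1844909617118 }
11801 2019-06-17T18:04:27.232Z sliceEvent Deleted snapshot due to reaching expiration date {"snapshotID": 302, "expirationDate": "2019-06-17T18:00:00Z" }
11800 2019-06-17T18:04:27.226Z sliceEvent Deleted snapshot due to reaching expiration date {"snapshotID": 301, "expirationDate": "2019-06-17T18:00:00Z" }
8371 2019-06-14T17:09:10.752Z serviceEvent Restarted SliceService: previous run killed with signal 6(SIGABRT) core dump coreFileCount=1 servicecorelimit=7 49 3 37 { "replay": 176.2541414554888 }
8370 2019-06-14T17:04:22.484Z sliceEvent Deleted snapshot due to reaching expiration date {"snapshotID": 199, "expirationDate": "2019-06-14T17:00:02Z" }
"""

# Assumed layout: <event ID> <timestamp ending in Z> <event type> <message...>
LINE_RE = re.compile(r"^(\d+)\s+(\S+Z)\s+(\w+)\s+(.*)$")


def parse_events(text):
    """Return (event_id, timestamp, event_type, message) tuples, oldest first."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank or unrecognized lines
        event_id, stamp, kind, message = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
        events.append((int(event_id), when, kind, message))
    return sorted(events, key=lambda event: event[1])


def restarts_after_expiry(events, window=timedelta(minutes=10)):
    """List (restart_time, deletion_count) for each SliceService restart
    preceded by expiration deletions within `window` (an assumed value)."""
    findings = []
    for _, when, kind, message in events:
        if kind != "serviceEvent" or "Restarted SliceService" not in message:
            continue
        deletions = [
            event for event in events
            if event[2] == "sliceEvent"
            and "Deleted snapshot due to reaching expiration date" in event[3]
            and timedelta(0) <= when - event[1] <= window
        ]
        if deletions:
            findings.append((when, len(deletions)))
    return findings


if __name__ == "__main__":
    for restart_time, count in restarts_after_expiry(parse_events(EVENTS)):
        print(f"{restart_time.isoformat()} restart follows {count} expired-snapshot deletion(s)")
```

A match only shows that a restart followed expiration deletions in time. It does not confirm the cause, so treat the output as a prompt for further investigation.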