Active IQ Unified Manager database corruption due to insufficient disk space

Category:: active-iq-unified-manager

Specialty:: OM

Environment

Active IQ Unified Manager (AIQUM) 9.6+
RHEL or CentOS

Issue

The AIQUM GUI cannot display past or current performance data for clusters.
The MySQL error log reports the following messages:

[ERROR] [MY-000035] [Server] Disk is full writing './unified-manager.004948' (OS errno 28 - No space left on device). Waiting for someone to free space... Retry in 60 secs. Message reprinted in 600 secs.

2023-07-25T06:46:41.283486Z 1271 [ERROR] [MY-010907] [Server] Error writing file 'unified-manager' (errno: 28 - No space left on device)

2023-07-25T06:46:41.284609Z 1271 [ERROR] [MY-011072] [Server] Binary logging not possible. Message: An error occurred during flush stage of the commit. 'binlog_error_action' is set to 'ABORT_SERVER'. Server is being stopped..

2023-07-25T06:46:41Z UTC - mysqld got signal 6 ;

Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.

BuildID[sha1]=64a1d52e8c241c89abf59dc7d461f945ce41974c

Thread pointer: 0x7f66d61e2000

Attempting backtrace. You can use the following information to find out

where mysqld died. If you see no messages after this, something went

terribly wrong...

[Warning] [MY-012351] [InnoDB] Tablespace 1097, name 'netapp_performance/sample_qos_volume_workload_null#p#p0', file './netapp_performance/sample_qos_volume_workload_null#p#p0.ibd' is missing!

[Warning] [MY-012351] [InnoDB] Tablespace 1098, name 'netapp_performance/summary_qos_volume_workload_null', file './netapp_performance/summary_qos_volume_workload_null.ibd' is missing!

[Warning] [MY-012351] [InnoDB] Tablespace 3032, name 'netapp_performance/sample_fcpport#p#p13', file './netapp_performance/sample_fcpport#p#p13.ibd' is missing!

[ERROR] [MY-012592] [InnoDB] Operating system error number 2 in a file operation.

[ERROR] [MY-012593] [InnoDB] The error means the system cannot find the path specified.

[ERROR] [MY-012216] [InnoDB] Cannot open datafile for read-only: './netapp_performance/sample_cluster#p#p14.ibd' OS error: 71
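To check whether a given MySQL error log shows the same signature, the messages above can be matched mechanically. The following is a minimal sketch, not a NetApp tool: the function name `summarize_mysql_errors` is hypothetical, and it assumes that the `#p#` separator in a tablespace name marks a partition, as in the names quoted above.

```python
import re

# Patterns derived from the messages quoted in this article.
DISK_FULL = re.compile(r"errno:? 28 - No space left on device")
MISSING_TABLESPACE = re.compile(
    r"\[MY-012351\] \[InnoDB\] Tablespace \d+, name '([^']+)'"
)


def summarize_mysql_errors(lines):
    """Return (disk-full message count, sorted unique tables with a missing tablespace)."""
    disk_full = 0
    tables = set()
    for line in lines:
        if DISK_FULL.search(line):
            disk_full += 1
        match = MISSING_TABLESPACE.search(line)
        if match:
            # Assumption: drop the partition suffix (e.g. "#p#p13") to get the table name.
            tables.add(match.group(1).split("#p#")[0])
    return disk_full, sorted(tables)


# Small demonstration using two of the lines quoted above.
sample = [
    "[ERROR] [MY-010907] [Server] Error writing file 'unified-manager' "
    "(errno: 28 - No space left on device)",
    "[Warning] [MY-012351] [InnoDB] Tablespace 3032, name "
    "'netapp_performance/sample_fcpport#p#p13', file "
    "'./netapp_performance/sample_fcpport#p#p13.ibd' is missing!",
]
print(summarize_mysql_errors(sample))
```

Because the function accepts any iterable of lines, an open file handle for the error log can be passed directly. The log location varies by installation and is not assumed here.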

The ocumserver.log and ServerMega.log files report that many performance tables are missing because the AIQUM database is corrupted:

ERROR [oncommand] [opmTaskExecutor-1] [c.n.i.s.a.dao.NodeInventoryDao] fetchNodeInventoryStats for Nodes query:  SELECT configSet.objid,  configSet.elementName,  configSet.elementResourceKey,  configSet.numberOfHours,  COALESCE(SUM(avgLatency*totalOps)/SUM(totalOps),SUM(avgLatency*totalOps)) latency, AVG(totalOps) ops,  AVG(opmSysThroughput) throughput,  AVG(opmSysReadThroughput) readThroughput,  AVG(opmSysUtilization) AS nodeUtilization,  configSet.freeCapacity,  configSet.totalCapacity,  configSet.clusterId,  configSet.clusterName,  configSet.clusterResourceKey,  configSet.thresholdPolicyId,  configSet.thresholdPolicyName,  configSet.clusterFqdn,  configSet.cacheReadThroughput,  configSet.usedHeadroom,  configSet.availableOps,  0 AS eventSeverity  FROM  ((SELECT n.objId AS objid,  n.name elementName,  n.resourceKey elementResourceKey,  72 numberOfHours,  (n.aggregateBytesTotal-n.aggregateBytesUsed) freeCapacity,  n.aggregateBytesTotal totalCapacity, c.objId clusterId,  c.name clusteRName,  c.resourceKey clusterResourceKey,  GROUP_CONCAT(DISTINCT tp.id ORDER BY tp.id) thresholdPolicyId, GROUP_CONCAT(DISTINCT tp.name ORDER BY tpm.policyId) thresholdPolicyName,  cluster.fqdn AS clusterFqdn  , AVG(ext.cacheReadThroughput) AS cacheReadThroughput  , AVG(cHRoomUsedPercent) usedHeadroom , AVG(availableOps) availableOps FROM netapp_model_view.node n  JOIN netapp_model_view.cluster c on c.objId=n.clusterId  LEFT JOIN ocum.cluster cluster ON (c.objid = cluster.id)  LEFT JOIN opm.threshold_policy_mapping tpm ON    (tpm.objectId=n.objId AND tpm.endTime is null)  LEFT JOIN opm.threshold_policy tp ON    (tp.id=tpm.policyId AND tp.elementType=9)  LEFT JOIN netapp_performance.summary_extcacheobj ext ON (n.objId = ext.objId AND ext.fromtime >= 1690050600000 AND ext.fromtime < 1690309864758)  LEFT JOIN netapp_performance.summary_opm_headroom_cpu sohc ON (n.objId = sohc.objId AND sohc.fromtime > 1690050600000 AND sohc.fromtime < 1690309864758) GROUP BY n.objId) configSet  LEFT JOIN 
netapp_performance.summary_node sn  ON (configSet.objId = sn.objId  AND fromtime >= 1690050600000  AND fromtime < 1690309864758) ) GROUP BY configSet.objId failed.

org.springframework.dao.TransientDataAccessResourceException: StatementCallback; SQL [ SELECT configSet.objid,  configSet.elementName,  configSet.elementResourceKey,  configSet.numberOfHours,  COALESCE(SUM(avgLatency*totalOps)/SUM(totalOps),SUM(avgLatency*totalOps)) latency, AVG(totalOps) ops,  AVG(opmSysThroughput) throughput,  AVG(opmSysReadThroughput) readThroughput,  AVG(opmSysUtilization) AS nodeUtilization,  configSet.freeCapacity,  configSet.totalCapacity,  configSet.clusterId,  configSet.clusterName,  configSet.clusterResourceKey,  configSet.thresholdPolicyId,  configSet.thresholdPolicyName,  configSet.clusterFqdn,  configSet.cacheReadThroughput,  configSet.usedHeadroom,  configSet.availableOps,  0 AS eventSeverity  FROM  ((SELECT n.objId AS objid,  n.name elementName,  n.resourceKey elementResourceKey,  72 numberOfHours,  (n.aggregateBytesTotal-n.aggregateBytesUsed) freeCapacity,  n.aggregateBytesTotal totalCapacity, c.objId clusterId,  c.name clusteRName,  c.resourceKey clusterResourceKey,  GROUP_CONCAT(DISTINCT tp.id ORDER BY tp.id) thresholdPolicyId, GROUP_CONCAT(DISTINCT tp.name ORDER BY tpm.policyId) thresholdPolicyName,  cluster.fqdn AS clusterFqdn  , AVG(ext.cacheReadThroughput) AS cacheReadThroughput  , AVG(cHRoomUsedPercent) usedHeadroom , AVG(availableOps) availableOps FROM netapp_model_view.node n  JOIN netapp_model_view.cluster c on c.objId=n.clusterId  LEFT JOIN ocum.cluster cluster ON (c.objid = cluster.id)  LEFT JOIN opm.threshold_policy_mapping tpm ON    (tpm.objectId=n.objId AND tpm.endTime is null)  LEFT JOIN opm.threshold_policy tp ON    (tp.id=tpm.policyId AND tp.elementType=9)  LEFT JOIN netapp_performance.summary_extcacheobj ext ON (n.objId = ext.objId AND ext.fromtime >= 1690050600000 AND ext.fromtime < 1690309864758)  LEFT JOIN netapp_performance.summary_opm_headroom_cpu sohc ON (n.objId = sohc.objId AND sohc.fromtime > 1690050600000 AND sohc.fromtime < 1690309864758) GROUP BY n.objId) configSet  LEFT JOIN 
netapp_performance.summary_node sn  ON (configSet.objId = sn.objId  AND fromtime >= 1690050600000  AND fromtime < 1690309864758) ) GROUP BY configSet.objId]; (conn=868) Tablespace is missing for table `netapp_performance`.`summary_extcacheobj`.; nested exception is java.sql.SQLTransientConnectionException: (conn=868) Tablespace is missing for table `netapp_performance`.`summary_extcacheobj`.

ERROR [default task-59] c.n.o.p.f.p.t.SampleTable (SampleTable.java:373) - Failed to execute prepared statement for com.netapp.oci.platform.framework.performance.tables.PartitionedSampleTable@6fcaf188: java.sql.SQLSyntaxErrorException: (conn=71) Table 'netapp_performance.sample_qos_service_center_27699' doesn't exist
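The application log entries above name the affected tables in two different forms: "Tablespace is missing for table" and "Table ... doesn't exist". As a rough sketch for gauging the extent of the damage, both forms can be collected into one list. The function name `affected_tables` is hypothetical, and the patterns are inferred only from the messages quoted in this article.

```python
import re

# "Tablespace is missing for table `schema`.`table`"
TABLESPACE_MISSING = re.compile(
    r"Tablespace is missing for table `([^`]+)`\.`([^`]+)`"
)
# "Table 'schema.table' doesn't exist"
TABLE_ABSENT = re.compile(r"Table '([^'.]+)\.([^']+)' doesn't exist")


def affected_tables(lines):
    """Map each 'schema.table' named in the log lines to the first reason reported."""
    found = {}
    for line in lines:
        # finditer is used because one line can repeat the message
        # (for example in a nested exception).
        for m in TABLESPACE_MISSING.finditer(line):
            found.setdefault(f"{m.group(1)}.{m.group(2)}", "tablespace missing")
        for m in TABLE_ABSENT.finditer(line):
            found.setdefault(f"{m.group(1)}.{m.group(2)}", "table does not exist")
    return found


# Small demonstration using shortened versions of the lines quoted above.
sample = [
    "(conn=868) Tablespace is missing for table "
    "`netapp_performance`.`summary_extcacheobj`.",
    "java.sql.SQLSyntaxErrorException: (conn=71) Table "
    "'netapp_performance.sample_qos_service_center_27699' doesn't exist",
]
for table, reason in affected_tables(sample).items():
    print(table, "-", reason)
```

Returning a dictionary rather than a list keeps each table to a single entry even when the same error is logged on every polling cycle.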