Standard Dataset
DC Health - Node performance data from a real datacenter of the Instituto Metropole Digital/UFRN
- Submitted by: Walter Neto
- Last updated: Wed, 08/03/2022 - 09:35
- DOI: 10.21227/e870-4e87
Abstract
This work aims to identify anomalous patterns that could be associated with performance degradation and failures in datacenter nodes, such as virtual machines or virtual machine clusters. Detecting anomalies early enables remediation measures, such as virtual machine migration and resource reallocation, before losses occur. One way to detect anomalous patterns in datacenter nodes is to use monitoring data from the nodes, such as CPU and memory utilization. A common challenge in anomaly detection, including online anomaly detection, is the unavailability of labeled data. To help address this challenge, this dataset contains real, unlabeled node performance data collected from the Instituto Metrópole Digital datacenter. The files comprise one dataset group for each computing cluster (clusters 1 to 6), one dataset for a controller cluster, and one for the storage node. The data were collected from hosts over a one-year period (2021 to 2022).
1) Read the dataset description file;
2) Load the data;
3) Select the relevant variables.
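Steps 2 and 3 above could be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the files are CSV with a header row, which this page does not confirm, and the file name and column names in the usage comment are hypothetical placeholders. Check the read-me for the actual format and variable names.

```python
import csv


def to_number(text):
    """Convert a cell to float; return None for empty or non-numeric cells."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def load_rows(path):
    """Step 2: read a CSV file into a list of dicts, one per data row.

    Assumes the first line of the file is a header (an assumption, not
    something stated in the dataset description on this page).
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def select_columns(rows, columns):
    """Step 3: keep only the named columns and convert their values to numbers."""
    return [{name: to_number(row.get(name)) for name in columns} for row in rows]


# Hypothetical usage; "cluster1_node.csv", "cpu_usage" and "memory_usage"
# are placeholder names, not taken from the dataset:
# rows = load_rows("cluster1_node.csv")
# metrics = select_columns(rows, ["cpu_usage", "memory_usage"])
```

Missing or non-numeric cells become `None` rather than raising an error, so gaps in the monitoring data stay visible to whatever analysis follows.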
Documentation
Attachment | Size |
---|---|
Read-me - dataset documentation for DC Health - Node performance data from a real datacenter.pdf | 118.55 KB |