DC Health - Node performance data from a real datacenter of the Instituto Metrópole Digital/UFRN

Citation Author(s):
Lopes Neto
Instituto Federal de Educação, Ciência e Tecnologia do Rio Grande do Norte
de Morais Barroca Filho
Instituto Metrópole Digital/Universidade Federal do Rio Grande do Norte
Suassuna Nunes
Instituto Metrópole Digital/Universidade Federal do Rio Grande do Norte
Submitted by:
Walter Neto
Last updated:
Wed, 08/03/2022 - 09:35


This work aims to identify anomalous patterns that could be associated with performance degradation and failures in datacenter nodes, such as virtual machines (VMs) or VM clusters. Early detection of anomalies enables early remediation, such as VM migration and resource reallocation, before losses occur. One way to detect anomalous patterns in datacenter nodes is to use monitoring data from the nodes, such as CPU and memory utilization. However, a common challenge in anomaly detection, including online anomaly detection, is the lack of available data, and labeled data in particular. To help address this challenge, this dataset contains unlabeled real node performance data collected from the Instituto Metrópole Digital datacenter. The files comprise a dataset group for each computing cluster (clusters 1 to 6), a dataset for a controller cluster, and a dataset for the storage node. The data were collected from the hosts over one year (2021 to 2022).
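As a rough illustration of unsupervised detection on a single utilization series, the sketch below flags points using a rolling z-score. This is a generic baseline chosen for illustration, not the method used by the dataset authors; the function name `rolling_zscore_flags`, the window size, the threshold, and the sample CPU values are all hypothetical.

```python
from statistics import mean, stdev

def rolling_zscore_flags(values, window=5, threshold=3.0):
    """Flag each point whose z-score, computed against the preceding
    `window` values, exceeds `threshold` (both values are assumptions)."""
    flags = []
    for i, value in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < window:
            flags.append(False)  # not enough history yet
            continue
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any deviation at all is treated as anomalous
            flags.append(value != mu)
        else:
            flags.append(abs(value - mu) / sigma > threshold)
    return flags

# Hypothetical CPU utilization samples (percent), not taken from the dataset
cpu = [20.0, 21.0, 19.5, 20.5, 20.0, 20.2, 95.0, 20.1]
flags = rolling_zscore_flags(cpu)
```

Because the data are unlabeled, a threshold like this cannot be validated directly against ground truth; it would need tuning per metric and per node.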



1) Read the dataset description file;
2) Load the data;
3) Select the relevant variables.
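The steps above can be sketched in Python as follows. This is a minimal sketch that assumes the files are CSV with a header row; the column names (`timestamp`, `host`, `cpu_util`, `mem_util`) and the sample rows are hypothetical placeholders, and the real names should be taken from the dataset description file.

```python
import csv
import io

def load_rows(source):
    """Read CSV text from an open file-like object into a list of dicts."""
    return list(csv.DictReader(source))

def select_columns(rows, columns):
    """Keep only the named columns, converting values to float where possible."""
    def to_number(value):
        try:
            return float(value)
        except (TypeError, ValueError):
            return value  # leave non-numeric fields (e.g. timestamps) as text
    return [{name: to_number(row[name]) for name in columns} for row in rows]

# Hypothetical sample standing in for one dataset file (assumed layout)
SAMPLE = """timestamp,host,cpu_util,mem_util
2021-06-01 00:00:00,node-a,12.5,40.1
2021-06-01 00:05:00,node-a,13.0,40.3
"""

rows = load_rows(io.StringIO(SAMPLE))
selected = select_columns(rows, ["timestamp", "cpu_util", "mem_util"])
```

For a real file, `io.StringIO(SAMPLE)` would be replaced by `open(path, newline="")`, with `path` pointing at one of the downloaded files.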

