CSV

We introduce a benchmark of distributed algorithm execution over big data. The datasets consist of metrics on the computational impact (resource usage) of eleven well-known machine learning techniques on a real computational cluster, expressed through system-agnostic resource indicators: CPU consumption, memory usage, operating system process load, network traffic, and I/O operations. The metrics were collected every five seconds for each algorithm at five different data volume scales, totaling 275 distinct datasets.
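The description does not spell out how the 275 datasets break down. One reading that matches the total, offered here only as an assumption, is one dataset per combination of algorithm, data volume scale, and indicator (11 x 5 x 5). The sketch below enumerates that grid; the algorithm and scale labels are hypothetical placeholders, and only the five indicator names come from the description.

```python
from itertools import product

# Hypothetical placeholder labels: the description does not name the
# eleven algorithms or the five data volume scales.
algorithms = [f"algorithm_{i:02d}" for i in range(1, 12)]
scales = [f"scale_{i}" for i in range(1, 6)]

# Indicator names taken from the dataset description.
indicators = [
    "cpu_consumption",
    "memory_usage",
    "os_process_load",
    "network_traffic",
    "io_operations",
]

# Assumption: one dataset per (algorithm, scale, indicator) combination.
combinations = list(product(algorithms, scales, indicators))
print(len(combinations))
```

If the actual layout differs (for example, one file per algorithm and scale holding all indicators), the grid above would need to be adjusted accordingly.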

MATLAB Simulink was used to develop an emulator for the Viessmann Vitorond 200 Gas Fired Boiler VD2 Series 380. A series of faults was modeled, along with normal data across the expected range of operation, to create a labelled dataset of approximately 27,500 cases for training and testing boiler fault classification models.

Empirical line methods (ELM) are frequently used to correct images from aerial remote sensing. Remote sensing of aquatic environments captures only a small amount of energy because the water absorbs much of it, so the signal returned by water is small compared with that of other land surface targets.

This dataset presents some resources and results of a new approach to calibrating empirical lines that combines reference calibration panels with water samples. We optimize the method with Python algorithms until it reaches the best result.
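As a rough illustration of the general idea behind an empirical line, the sketch below fits a straight line between sensor digital numbers and known reflectances of calibration targets, then applies it to a new pixel. It is a minimal ordinary least-squares sketch, not the optimized method the dataset describes; the function names and all numeric values are hypothetical.

```python
def fit_empirical_line(digital_numbers, reflectances):
    """Fit reflectance = gain * DN + offset by ordinary least squares."""
    n = len(digital_numbers)
    if n < 2 or n != len(reflectances):
        raise ValueError("need at least two paired calibration points")
    mean_dn = sum(digital_numbers) / n
    mean_refl = sum(reflectances) / n
    # Sum of squared deviations of the digital numbers.
    sxx = sum((dn - mean_dn) ** 2 for dn in digital_numbers)
    if sxx == 0:
        raise ValueError("calibration points must have distinct digital numbers")
    # Sum of cross deviations between digital numbers and reflectances.
    sxy = sum(
        (dn - mean_dn) * (refl - mean_refl)
        for dn, refl in zip(digital_numbers, reflectances)
    )
    gain = sxy / sxx
    offset = mean_refl - gain * mean_dn
    return gain, offset


def apply_empirical_line(digital_number, gain, offset):
    """Convert a raw digital number to estimated reflectance."""
    return gain * digital_number + offset


# Hypothetical calibration points (values invented for illustration only).
panel_dn = [50, 120, 200]
panel_reflectance = [0.05, 0.225, 0.425]

gain, offset = fit_empirical_line(panel_dn, panel_reflectance)
print(gain, offset)
print(apply_empirical_line(40, gain, offset))
```

Because water targets sit at the low end of the signal range, adding low-reflectance calibration points would constrain the line where it matters most for aquatic scenes, which is presumably the motivation for including water samples.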

This dataset contains heavy-machinery data from the Brazilian industrial sector. The data was collected in a poultry feed factory located in the state of Minas Gerais, Brazil. Its process can be summarized as producing feed pellets for poultry from corn or soybeans with added nutrients. The factory runs at full scale throughout the year, so its usage patterns are well behaved at any time. It operates Monday through Friday (and occasionally on Saturdays, when production is below the monthly target) on a daily three-shift schedule from 10:00 PM to 05:00 PM.
