Machine Learning

This data repository contains test data and corresponding test code for evaluating the performance of a machine learning model. The dataset includes 950 labeled samples across 7 different classes. The test code provides implementations of several common evaluation metrics, including accuracy, precision, recall, and F1-score. This resource is intended to facilitate the benchmarking and comparison of different machine learning algorithms on a standardized task.
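The four metrics named above can be sketched as follows. This is a minimal illustration, not the repository's own test code: the function name `classification_metrics` is hypothetical, and macro averaging (an unweighted mean over classes) is an assumption, since the description does not say how per-class scores are combined.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1-score.

    Hypothetical helper; macro averaging is an assumption, not taken
    from the repository's test code.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = set(y_true) | set(y_pred)
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    def safe_div(num, den):
        # Define the ratio as 0.0 when the denominator is zero.
        return num / den if den else 0.0

    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        precision = safe_div(tp[label], tp[label] + fp[label])
        recall = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(safe_div(2 * precision * recall, precision + recall))

    n_classes = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1_scores) / n_classes,
    }
```

With seven classes, macro averaging weights every class equally regardless of how many of the 950 samples it holds. If the classes are imbalanced, a support-weighted average may be the more appropriate summary.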


A PCB 130D20 microphone is mounted on a stand facing the tool tip to capture the emitted sound. The microphone is connected to the computer through a specially designed signal conditioner. GoldWave software is used to record the captured sound with the sampling frequency set to 44100 Hz. A series of machining experiments was conducted on a turning machine (XLTURN from MTAB) with a carbide insert and an aluminium workpiece of 38 mm diameter. Throughout the experiments, a constant feed rate of 0.5 mm/rotation was maintained.


This dataset is composed of 2000 time series (1000 read and 1000 write) derived from the much larger cloud storage workload released to the research community by the Alibaba Group. The original dataset can be downloaded from here: (

The original dataset, collected over 31 days, contains read/write data for 1000 storage volumes. The schema of each file, in terms of file names and columns per file, is as follows:

Dataset of images of dragon fruit plants, collected from different media and taken in a dragon fruit field in Rio Branco, Brazil. It contains 600 images in total: 300 photos of diseased plants (with fish eye, among other conditions) and 300 photos of healthy plants. Many of the photos were captured with a simple smartphone camera.



Current neural network solutions for channel estimation are frequently evaluated by training and testing on a single example channel or on similar channels. However, data-driven algorithms often degrade significantly on channels they were not trained on, because they cannot extrapolate from their training knowledge. Online training can fine-tune the offline-trained neural networks to compensate for this degradation, but its feasibility is challenged by the tremendous computational resources required.


Asthma is a common respiratory disease that affects people in many countries. It causes attacks that harm patients and can be fatal. These attacks are related to many risk factors, including biosignals and environmental conditions. Here, we provide a dataset (584 entries) of asthma biosignals and environmental conditions. The dataset was collected from 21 participants with different levels of asthma severity. It was collected in the Makkah region of Saudi Arabia (the cities of Makkah and Jeddah) over three months, from 24 March to 30 June 2021.


This dataset includes electronics repair guides from the MyFixit dataset. From the MyFixit dataset, 50 repair manuals were randomly selected from each of the Mac, PC, Phone, and Electronics categories. These 200 repair manuals, with 4754 distinct steps, were selected to be loaded into the Assembly Guidance Ontology. Each step was evaluated by the authors, and suitable parameters were assigned for the Digital Assembly Guidance System.
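The per-category random selection described above (50 manuals from each of four categories, 200 in total) could be reproduced along the following lines. This is only a sketch under stated assumptions: the function name `sample_per_category`, the `"category"` record key, and the fixed seed are all hypothetical, and the authors' actual selection procedure is not documented here.

```python
import random


def sample_per_category(manuals, categories, k=50, seed=0):
    """Randomly pick k manuals from each category without replacement.

    `manuals` is assumed to be a list of dicts carrying a "category" key;
    that key name is an assumption, not the confirmed MyFixit schema.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    selected = []
    for category in categories:
        pool = [m for m in manuals if m["category"] == category]
        if len(pool) < k:
            raise ValueError(
                f"category {category!r} has only {len(pool)} manuals, need {k}"
            )
        selected.extend(rng.sample(pool, k))
    return selected
```

Calling it with the categories `["Mac", "PC", "Phone", "Electronics"]` and `k=50` yields 200 manuals, matching the count given in the description.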


The JKU-ITS AVDM contains data from 17 participants performing different tasks with various levels of distraction.
The data collection was carried out in accordance with the relevant guidelines and regulations, and informed consent was obtained from all participants.
The dataset was collected using the JKU-ITS research vehicle with automated capabilities under different illumination and weather conditions along a secure test route within the