Machine Learning
Training supervised deep learning models to despeckle SAR images requires a labeled dataset of image pairs, which also makes it possible to assess the quality of the filtering process. Each pair consists of a noisy image and its ground truth. The noisy images contain the speckle generated during the backscatter of the microwave signal, while the ground truth is generated through multitemporal fusion operations. In this paper, two fusion operations are used: mean and median.
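As an illustration of the two fusion operations named above, the following is a minimal sketch that reduces a stack of acquisitions of the same scene to a single reference image with a pixel-wise mean or median. It assumes the images are already co-registered and stored as a `(T, H, W)` array; the function name `fuse_multitemporal` and the synthetic speckle model are hypothetical and are not taken from the dataset's own processing chain.

```python
import numpy as np


def fuse_multitemporal(stack, method="mean"):
    """Fuse a (T, H, W) stack of co-registered images along the time axis.

    Hypothetical helper: the dataset's actual fusion pipeline is not
    described in the abstract.
    """
    stack = np.asarray(stack, dtype=np.float64)
    if stack.ndim != 3:
        raise ValueError("expected a stack with shape (T, H, W)")
    if method == "mean":
        return stack.mean(axis=0)
    if method == "median":
        return np.median(stack, axis=0)
    raise ValueError(f"unknown fusion method: {method!r}")


# Synthetic example (assumption): a constant scene corrupted by
# multiplicative, unit-mean exponential noise in each acquisition.
rng = np.random.default_rng(0)
clean = np.full((4, 4), 10.0)
stack = clean * rng.exponential(1.0, size=(50, 4, 4))

mean_reference = fuse_multitemporal(stack, "mean")
median_reference = fuse_multitemporal(stack, "median")
```

Each noisy acquisition in `stack` could then be paired with `mean_reference` or `median_reference` to form a training pair. The median is less sensitive to outlier acquisitions, while the mean preserves the average intensity of the stack.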
In gait recognition, the scarcity of non-simulated, real-world data significantly hampers the performance and applicability of recognition systems. To address this limitation, we present GaitMotion, a comprehensive gait recognition dataset collected using the built-in sensors of Android smartphones in an uncontrolled, real-world environment. The dataset captures the walking activity of 24 subjects (14 females and 10 males) above 18 years old and weighing at least 50 kg.
The dataset is a self-constructed wafer surface defect dataset, with each image captured in real time. The wafer images have been extracted and segmented so that each image shows a single die. The dataset primarily includes images of defect-free dies, along with four types of defective dies: particle, scratch, stain, and liquid residual. A total of 500 images are included, and the defects in the images have been annotated using the Make Sense online annotation tool.
This dataset was created by recoloring three existing datasets: NeRF Synthetic, LLFF, and Mip 360. The recoloring provides ground truth for validating recoloring applications. NeRF Synthetic was recolored using Blender, while LLFF and Mip 360 were processed in Photoshop. For each scene in the datasets, 11 images were recolored, ensuring consistency across the datasets.
Intrusion detection in Unmanned Aerial Vehicle (UAV) networks is crucial for maintaining the security and integrity of autonomous operations. However, the effectiveness of intrusion detection systems (IDS) is often compromised by the scarcity and imbalance of available datasets, which limits the ability to train accurate and reliable machine learning models. To address these challenges, we present the "CTGAN-Enhanced Dataset for UAV Network Intrusion Detection", a meticulously curated and augmented dataset designed to improve the performance of IDS in UAV environments.
Well logs are interpreted and processed to estimate in-situ reservoir properties (petrophysical, geomechanical, and geochemical), which are essential for reservoir modeling, reserve estimation, and production forecasting. The modeling is often based on multi-mineral physics or empirical formulae. When a sufficient amount of training data is available, machine learning provides an alternative approach to estimating those reservoir properties from well log data, usually with shorter turnaround time and less human involvement.
This repository contains the datasets produced using different data generation strategies to train data-driven models (e.g., decision trees, gradient tree boosting, and deep neural networks) and to evaluate their performance. The data generation strategies are described, and the results presented, in the conference paper "Training Data Generation Strategies for Data-driven Security Assessment of Low Voltage Smart Grids," J. Cuenca, E. Aldea, E. Le Guern-Dall'o, R. Féraud, G. Camilleri, and A. Blavette, IEEE ISGT EU 2024, Dubrovnik, Croatia, Oct. 2024.
Human pose estimation has applications in numerous fields, including action recognition, human-robot interaction, motion capture, augmented reality, sports analytics, and healthcare. Many datasets and deep learning models are available for human pose estimation within the visible domain. However, challenges such as poor lighting and privacy issues persist. These challenges can be addressed using thermal cameras; nonetheless, only a few annotated thermal human pose datasets are available for training deep learning-based human pose estimation models.
The dataset contains ground-based observations of crop growth stages in Canada's prairie provinces (Manitoba, Saskatchewan, and Alberta) from 2019 to 2020. Crop growth stages were visually observed from the side of the fields on a weekly cycle until the fields were harvested. Growth stages were recorded on the BBCH (Biologische Bundesanstalt, Bundessortenamt und CHemische Industrie) scale.
Concept-1K is a novel dataset designed to facilitate research on incremental learning in large language models. It comprises 1,023 concepts represented as knowledge triplets, focusing on recently emerged topics to minimize data leakage. By providing a fine-grained approach to evaluating model performance, Concept-1K enhances the understanding of how these models learn and retain new information.