data augmentation
This dataset comprises 2 million synthetic samples generated using the Variational Autoencoder-Generative Adversarial Network (VAE-GAN) technique. The dataset is designed to facilitate cardiovascular disease prediction through various demographic, physical, and health-related attributes. It contains essential physiological and behavioral indicators that contribute to cardiovascular health.
Dataset Description The dataset consists of the following features:
This dataset on Indian banknotes was collected to aid research in fields like financial technology, security, and machine learning. The collection contains notes from both older and more recent generations of Indian currency, including ₹1, ₹2, ₹5, ₹10, ₹20, ₹50, ₹100, ₹200, and ₹500. Each note has been carefully scanned and sorted. Important details have been recorded, including the note's design, serial number, and security features such as small printed text, security threads, and watermarks. We systematically gathered, checked, and labeled every note in order to create this dataset.
The "CloudPatch-7 Hyperspectral Dataset" comprises a manually curated collection of hyperspectral images, focused on pixel classification of atmospheric cloud classes. This labeled dataset features 380 patches, each a 50x50 pixel grid, derived from 28 larger, unlabeled parent images approximately 4402-by-1600 pixels in size. Captured using the Resonon PIKA XC2 camera, these images span 462 spectral bands from 400 to 1000 nm.
This dataset is associated with TODOS: Thermal sensOr Data-driven Occupancy Estimation System for Smart Buildings. TODOS is a novel system for estimating occupancy in intelligent buildings that uses a low-cost, low-power thermal sensor array along with a passive infrared sensor. We introduce a novel data processing pipeline that allows us to automatically extract features from the thermal images using an artificial neural network. Through an extensive experimental evaluation, we show that TODOS provides occupancy detection accuracy of 98% to 100% under different scenarios.
Data diversity and volume are crucial to training deep learning models successfully, yet in medical imaging the difficulty and cost of data collection and annotation are especially high. In robotic surgery in particular, data scarcity and imbalance have heavily affected model accuracy and limited the design and deployment of deep learning-based surgical applications such as surgical instrument segmentation.

Data augmentation is commonly used to increase the size and diversity of datasets in machine learning. It is also of particular importance for evaluating the robustness of existing machine learning methods. With progress in geometrical and 3D machine learning, many methods exist to augment a 3D object, from generating random orientations to exploring different perspectives of an object. In high-precision applications, the machine learning model must be robust to small perturbations of the input object.
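To illustrate the kind of 3D augmentation mentioned in the last entry (random orientations plus small perturbations), here is a minimal sketch in plain Python. It assumes the object is given as a list of (x, y, z) points. The names `random_rotation_matrix`, `augment_points`, and `jitter_sigma` are hypothetical, chosen for this example only, and are not taken from any of the listed datasets.

```python
import math
import random


def random_rotation_matrix(rng):
    """Return a random 3x3 rotation matrix built from a random unit quaternion."""
    # Normalizing four Gaussian samples yields a uniformly distributed unit quaternion.
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    w, x, y, z = (c / norm for c in q)
    # Standard unit-quaternion to rotation-matrix conversion.
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def augment_points(points, rng, jitter_sigma=0.0):
    """Apply one random rotation to all points, then add Gaussian jitter per coordinate."""
    rot = random_rotation_matrix(rng)
    augmented = []
    for p in points:
        rotated = [sum(rot[i][j] * p[j] for j in range(3)) for i in range(3)]
        # jitter_sigma is an assumed knob for the "small perturbation" size.
        augmented.append(tuple(c + rng.gauss(0.0, jitter_sigma) for c in rotated))
    return augmented


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the augmentation is reproducible
    shape = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
    for point in augment_points(shape, rng, jitter_sigma=1e-3):
        print(point)
```

With `jitter_sigma=0.0` the transform is a pure rotation, so distances between points are preserved. That makes it a convenient check for whether a model's predictions change under reorientation alone, before any perturbation is added.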