Dataset Search

Four sets of remote sensing precipitation data in the upper basin of Yalong River Basin

This dataset supports the research presented in the paper "Comprehensive Hydrological Evaluation of Satellite-Based Precipitation Products: Integrating Model Parameter and Structural Uncertainty." It includes precipitation data from four satellite-based gridded products, prepared for hydrological modeling over 21 sub-basins within the study area. The dataset is designed to drive three hydrological models with different structures: HYMOD, SWAT, and BTOPMC.

Categories:

Weather

A High-Power Modular Radio Frequency Converter with Wide-Range Power Regulation Under ZVS Operation

Radio frequency converters (RFCs) are widely used in the semiconductor manufacturing industry, which requires high power, high efficiency, and a wide power regulation range. To meet these demands, this paper proposes efficient modular power amplifiers (PAs), in which multiple RFC modules are constructed using full-bridge class-D PAs. The constant-current output characteristics of individual modules and the system-level power delivery capability are comprehensively analyzed.

Categories:

Power and Energy

A Wearable Vision-based System for Fall Detection

The CSV data files in the ZIP archive are analytical datasets extracted and processed from the RUG-EGO-FALL dataset, intended to support fall detection research using wearable first-person perspective devices. The data includes visual motion information for each video frame, calculated using the ORB (Oriented FAST and Rotated BRIEF) feature point algorithm in combination with the Lucas-Kanade optical flow method.

Categories:

BadSTR: Backdoor Attack on Scene Text Recognition in IoT

Recent research has shown that non-sequential tasks based on deep neural networks (DNNs), such as image classification and object detection, are vulnerable to backdoor attacks, leading to incorrect model predictions. As a crucial task in computer vision, Scene Text Recognition (STR) is widely used in IoT fields such as intelligent transportation systems and intelligent surveillance. Therefore, a high degree of security is needed to ensure the accuracy of the system for text recognition. However, there are currently no studies on STR backdoor attacks.

Categories:

Security

Dense prediction dataset

The NYUD V2 data set is comprised of video sequences from a variety of indoor scenes as recorded by both the RGB and Depth cameras from the Microsoft Kinect. It features 1449 densely labeled pairs of aligned RGB and depth images, 464 new scenes taken from 3 cities, and 407,024 new unlabeled frames.

Categories:

Artificial Intelligence

4-element umbrella antenna array dataset

To generate the training and validation datasets, the full-wave electromagnetic solver FEKO is utilized. The phase of the reference antenna (Antenna 1) is fixed at 0°, while the phases of the other three antennas are varied in 30° steps within the range [0°, 360°]. This results in 13×13×13 = 2197 distinct phase combinations. For each combination, the far-field radiation intensities are extracted on the XOY, YOZ, and XOZ planes, yielding a total of 2197 × 3 data samples. For testing, 343 new phase combinations are generated within the interval [15°, 195°] using 30° steps (i.e., 7×7×7).
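The sample counts quoted in this description follow from enumerating the phase grid directly. The snippet below is a minimal sketch, assuming both endpoints of each interval are included (so 0° and 360° count as separate grid values, as the 13-value count implies). The helper name `phase_grid` is hypothetical and not part of the dataset.

```python
from itertools import product


def phase_grid(start_deg, stop_deg, step_deg, n_antennas=3):
    """Enumerate phase combinations for the non-reference antennas.

    Endpoints are inclusive (an assumption consistent with the stated counts).
    """
    values = list(range(start_deg, stop_deg + 1, step_deg))
    return list(product(values, repeat=n_antennas))


train_val = phase_grid(0, 360, 30)  # 13 phase values per antenna
test = phase_grid(15, 195, 30)      # 7 phase values per antenna

# One radiation-intensity sample per principal plane and phase combination.
planes = ("XOY", "YOZ", "XOZ")
n_samples = len(train_val) * len(planes)

print(len(train_val), n_samples, len(test))
```

Because the test grid is offset by 15°, none of its combinations coincide with a training or validation combination under these assumptions.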

Categories:

Artificial Intelligence

PE_VAE_GAN: A Network for Adaptive ERT Image Reconstruction

This paper proposed a PE-VAE-GAN network that adaptively selected image reconstruction networks based on flow pattern classification, significantly improving the quality of Electrical Resistance Tomography (ERT) reconstructed images. To address insufficient feature extraction from voltage data, we presented a pseudo-image encoding method that converted the one-dimensional voltage signals into two-dimensional grayscale images.
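The listing does not say how the pseudo-image encoding is done. As an illustration of the general idea only, the sketch below maps a one-dimensional voltage vector to a two-dimensional grid of 8-bit gray levels. The min-max scaling, row-major fill, zero padding, and the function name `encode_pseudo_image` are all assumptions, not the paper's method.

```python
import math


def encode_pseudo_image(voltages, width=None):
    """Map a 1-D voltage vector to a 2-D grid of 8-bit gray levels.

    Assumed scheme: min-max scaling to [0, 255], row-major fill,
    zero padding of any unused trailing cells.
    """
    n = len(voltages)
    if n == 0:
        raise ValueError("voltages must not be empty")
    if width is None:
        width = math.ceil(math.sqrt(n))  # near-square image by default
    height = math.ceil(n / width)
    lo, hi = min(voltages), max(voltages)
    span = (hi - lo) or 1.0  # avoid division by zero for constant input
    pixels = [round(255 * (v - lo) / span) for v in voltages]
    pixels += [0] * (width * height - n)
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


# Hypothetical five-sample measurement vector, for illustration only.
image = encode_pseudo_image([0.0, 0.5, 1.0, 0.25, 0.75], width=3)
for row in image:
    print(row)
```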

Categories:

PASA

We evaluate the performance of our proposed method using four benchmark datasets: MNIST, CIFAR-10, Traffic-sign Recognition (TSR), and Room-occupancy Detection (ROD). Each dataset is divided into training and test sets, with specific proportions as described below. MNIST: This dataset consists of grayscale images of handwritten digits, with 10 distinct classes. It includes 60,000 training images and 10,000 test images, each formatted as a 28x28 pixel grayscale map. CIFAR-10: Unlike MNIST, CIFAR-10 is a dataset of color images.

Categories:

Computer Vision

Bitcoin Tweets 2022

Bitcoin (₿) is a cryptocurrency invented in 2008 by an unknown person or group of people using the pseudonym Satoshi Nakamoto. The currency came into use in 2009, when its implementation was released as open-source software.

Categories:

Vibration Signal Datasets for Bearing Fault Diagnosis and Out-of-Distribution Detection

This dataset comprises vibration signals collected from bearing test rigs under both healthy and faulty conditions, designed to support research in fault diagnosis and out-of-distribution (OOD) detection. The data includes:

CWRU Dataset: Signals from the Case Western Reserve University bearing test platform, sampled at 12 kHz, covering normal operation and three fault types (inner race, outer race, and rolling element faults) with varying severities (0.007–0.021 inches). OOD samples are explicitly labeled for validation.

Categories:

Dataset For PerturbVFL

Vertical Federated Learning (VFL) enables multiple organizations to collaboratively train machine learning models without sharing raw data, particularly suited for tabular datasets with aligned sample IDs but disjoint feature spaces. Despite its growing relevance in privacy-sensitive sectors such as finance and healthcare, publicly available benchmarks for VFL on tabular data remain limited. This paper introduces and categorizes a collection of real-world tabular datasets tailored for VFL research, highlighting their feature distribution, domain applicability, and security relevance.

Categories:

Artificial Intelligence

Micronutrient Deficiency Data

Paper: Assessment of Inference Improvements for Facial Micronutrient Deficiency Detection using Attention-Enhanced YOLOv5

Authors: Amey Agarwal, Shreya Rathod, Riva Rodrigues, Nirmitee Sarode, Dhananjay R. Kalbande

Description

This is a dataset of 7 classes: 6 facial skin problems and 1 null class.

A facial skin problem may be identified in an image and marked using Bounding Box Annotation.

Acne Class indicates deficiency of Vitamin D

Blackhead and Nodules are types of acne

Categories:

imdg10

This dataset contains data collected from 10 participants using a liquid metal data glove equipped with 12 sensors. The data is primarily aimed at facilitating research in the fields of human-computer interaction, motion tracking, and related areas. The use of a liquid metal data glove offers unique advantages in terms of flexibility and sensitivity, providing rich and detailed information about finger movements.

Categories:

Wearable Sensing

Earthquake precursory anomaly datasets

This dataset integrates multi-source geophysical precursor data from seismically active regions. Through denoising, normalization, and anomaly correction, it provides standardized records containing spatial and temporal parameters and anomalous signal intensities. It aims to support the detection of seismic precursor anomalies, the study of their mechanisms, and the development of early-warning models, and to provide a highly reliable data foundation for seismic anomaly detection and risk assessment.

Categories:

Geo-Sensing

11 VOCs dataset of G919 electronic nose

This dataset contains the original resistance data and gas response data for 11 VOCs, measured with 8 commercial gas sensors and collected by the self-developed G919 electronic nose device. The eleven VOCs are acetone, ethanol, butyl acetate, methanol, dimethylbenzene, isopropanol, methylbenzene, benzaldehyde, hexane, n-propanol, and ethylene glycol. The eight commercial gas sensors are MQ-2, MQ-3, MQ-4, MQ-5, MQ-6, MQ-7, MQ-8, and MQ-9.

Categories:

Machine Learning

ComGPT

The Dolphins network is based on observations of associations between pairs of dolphins, where edges represent associations between dolphins. The Football network is based on college football matches. Nodes represent college football teams, communities correspond to different leagues, and edges correspond to games played between them. The Polbooks network is a co-purchasing network for books. Each node in the dataset represents a book about US politics. An edge between two books indicates that they are often purchased together by customers.

Categories:

Social Sciences

Cross-subject Cognitive State Evaluation in Aviation Multi-Task Scenarios

This repository contains resources for EEG data processing and cognitive load recognition using a Multi-Head Attention EEGNet model. It includes original EEG data, MATLAB code for preprocessing, and Python code for classification.

With ethics approval from our institution, this study recruited 30 subjects aged 18 to 29. Written informed consent was obtained from all participants. Participants were selected under a standardized and rigorous protocol and had to meet the following requirements:

Categories:

Artificial Intelligence

Valorant Champions Tour 2024: Pacific and EMEA Round Data

This dataset comprises 1301 rounds collected from the 2024 Valorant Champions Tour Pacific and EMEA. Specifically, each round contains complete information on the number of ultimate abilities available to Team A and Team B, the average number of ultimate points until ultimate ability for each team, and each team's total loadout value. Each round's outcomes are labeled and weighed against the predicted outcomes in future logistic regression modeling.

Categories:

Machine Learning

A word-level Wi-Fi CSI based Deep Bangladeshi Sign Language Dataset(WiBaSL)

WiFi-based human sensing has shown remarkable potential to detect sign language gestures in a non-intrusive manner. However, most previous works focus on American Sign Language detection, ignoring applications in other widely used languages such as Bangla Sign Language. There is also a lack of collected sign language gestures for Activities of Daily Life (ADL), which are necessary for instructing children with Autism Spectrum Disorder (ASD).

Categories:

TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension

We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples. TriviaQA includes 95K question-answer pairs authored by trivia enthusiasts and independently gathered evidence documents, six per question on average, that provide high quality distant supervision for answering the questions.

Categories:

Artificial Intelligence

Motor Imagery 4-class Dataset

Fourteen participants completed this experiment. All participants were university students, physically healthy, with no history of neurological or psychiatric disorders, and were all right-handed. Written informed consent was obtained from each participant after a detailed explanation of the study protocol. All procedures were conducted in accordance with the Declaration of Helsinki. Due to issues encountered during the data collection process, the data from Participant S001 is deemed unreliable and has been excluded from further analysis.

Categories:

Biophysiological Signals

Fed3DL

The CIFAR-10 and CIFAR-100 datasets comprise 32x32-pixel color images in 10 and 100 categories, respectively.

The Tiny-ImageNet dataset consists of 200 categories with approximately 120,000 samples; each class contains 500 training images, 50 validation images, and 50 test images, and each image is 64x64 pixels.
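The Tiny-ImageNet total can be checked from the per-class counts given in the description. This is a small sketch using only the numbers stated there; the variable names are illustrative.

```python
# Per-class image counts for Tiny-ImageNet, as stated in the description.
classes = 200
per_class = {"train": 500, "val": 50, "test": 50}

# Images per split across all classes, and the overall total.
totals = {split: classes * n for split, n in per_class.items()}
total_images = sum(totals.values())

print(totals, total_images)
```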

Categories:

Computer Vision

A Portable 6D Surgical Instrument Magnetic Localization Dataset2

This dataset provides 6D magnetic localization data for surgical instrument tracking, focusing on position and orientation estimation in minimally invasive procedures. It includes various trajectory experiments such as square, circular, saddle-shaped, and helical paths, along with simulated minimally invasive knee surgery and needle sampling experiments. Additionally, it contains dynamic error correction verification data. Data is collected using 16 LIS3MDL magnetometers at 300 Hz, offering both raw and filtered data for algorithm validation.

Categories:

Sensors
