Dataset Search

Numerical framework for multi-stage control of surface polishing

We consider the automation of the polishing process for manufactured components, which is typically an iterative, multi-stage process that depends heavily on the practitioner’s expertise and visual inspection to guide decisions on polishing pad changes and fine-tuning of control parameters. We use a model-free, on-policy actor-critic reinforcement learning (RL) algorithm to determine the choice of pad, downforce, rotational speed, and polishing duration for each stage, as well as the total number of polishing/inspection stages.

Categories:

Artificial Intelligence

Spread Spectrum Time Domain Reflectometry tests on Microwave Breast Phantom

Spread spectrum time domain reflectometry (SSTDR) is proposed to replace the VNA or UWB pulsed systems and switches in a microwave imaging system. These tests evaluate an SSTDR system (Keysight N7081A) from 2-4 GHz. 16 ultrawideband (UWB) antennas were placed in contact with the breast phantom. The McGill breast phantom is a hemispherical carbon-based phantom with the electrical properties of fat. A cylindrical hole allows for the insertion of a plug with fat properties or fat+tumor properties. These were both measured and provided in the attached data set.

Categories:

Medical Imaging

Air Temperature in East London

This dataset accompanies the IEEE IoT Journal paper titled "A Dual System IoT Strategy for Hyperlocal Spatial-Temporal Microclimate Monitoring in Urban Environments Using LoRa." It is intended for validating bespoke sensors against commercial sensors. The data were collected using two different types of sensors deployed at eight locations in East London, starting on August 1, 2023, and covering a period of one year.

Categories:

Climate Change/Environmental

Model development dataset

Data were acquired using a data acquisition interface on a flow control unit in a laboratory. The data were transformed into two Excel spreadsheets, which were later used in MATLAB. The dataset also includes three MATLAB scripts. The first is the code for the experiment in which the ANN models were developed. The second imports the data from the Excel spreadsheets, and the third prepares the data for Simulink in order to test the models.

Categories:

Artificial Intelligence

Household appliance-usage preferences, appliance energy consumption and hourly renewable-energy production, per season and date

Twelve (12) realistic datasets encapsulating residents’ preferences, with each dataset representing the appliance-usage preferences expressed for a variant set of households by their respective residents for a specific season and day. The preferences were extracted from the REFIT dataset, a public 500MB dataset which contains real kW readings of the power output for the most energy-intensive shiftable/real-time appliances in 20 households in the UK, between September 2013 and July 2015.

Categories:

IoT

The Channel Fading Characteristic Measurement Dataset of the North Campus of Xidian University

The measurements in this study were carried out at Xidian University's north campus in China. The building density and height in this area are typical of urban environments, and there are fewer uncertainties that could affect the experimental results. Figure shows an aerial view of the measurement environment, including the chosen Tx and Rx locations. Centered at each receiver, a square with a side length 15 times the wavelength of the transmitted signal was constructed. The receiving antenna was moved inside each square following the path shown in the figure.

Categories:

Power and Energy

StyleBench

Style Transfer Evaluation Benchmark (StyleBench) from the submitted paper "StyleShot: A SnapShot on any Style".
To comprehensively evaluate the effectiveness and generalization ability of style transfer methods, we build StyleBench that covers 73 distinct styles, ranging from paintings, flat illustrations, 3D rendering to sculptures with varying materials. For each style, we collect 5-7 distinct images with variations.
In total, our StyleBench contains 490 images across diverse styles.

Categories:

Machine Learning

Synthetic Data for Smart Meter Attack Detection

This dataset contains synthetic smart meter data with simulated cyber attacks, designed to support research in anomaly detection, cybersecurity, and energy consumption analysis. The dataset is based on 159 users from the Smart Meters in London dataset, selected for their regular consumption patterns. This larger dataset can be found in

https://www.kaggle.com/datasets/jeanmidev/smart-meters-in-london,

which is a refactorised version of the data found in

https://data.london.gov.uk/dataset/smartmeter-energy-use-data-in-london-households.

Categories:

LSApp: Large dataset of Sequential mobile App usage

During the study period, with the help of 292 participants, we were able to collect 599,635 app usage records. Here, we summarize the main characteristics of the participants based on the submitted surveys. 59% of the participants were female and 50% were aged between 25 and 34. Participants came from a range of educational backgrounds, from high school diploma to PhD. In particular, 32% of them had a college degree, followed by 30% with a bachelor's degree. A smartphone was the main device used for connecting to the Internet for 53% of the participants, followed by a laptop (25%).

Categories:

Machine Learning

App Usage Behavior Modeling and Prediction

The Tsinghua App Usage Dataset is a large-scale mobile application usage dataset collected over one week in one of China’s largest cities. It contains anonymized app usage logs from 1,000 users, capturing detailed information on 2,000 identified apps across 9,800 base stations. Each record includes user ID, timestamp, base station location, app ID, and traffic consumption, allowing for comprehensive analysis of individual and regional mobile usage patterns.
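The record layout described above can be sketched as a small data structure with one example aggregation. This is only an illustration: the field names, types, units, and sample values are assumptions, since the description lists the kinds of information in each record but not the file format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    # Field names, types, and units are assumptions for illustration;
    # the description only lists what each record contains.
    user_id: int
    timestamp: int      # assumed: seconds since some reference time
    base_station: str   # assumed: an opaque base station identifier
    app_id: int
    traffic_bytes: int  # assumed unit: bytes


def traffic_per_app(records):
    """Sum traffic consumption per app ID across all records."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec.app_id] += rec.traffic_bytes
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical records, not taken from the dataset.
    sample = [
        UsageRecord(1, 0, "bs-1", 10, 100),
        UsageRecord(2, 5, "bs-2", 10, 50),
        UsageRecord(1, 9, "bs-1", 11, 7),
    ]
    print(traffic_per_app(sample))
```

Grouping by `base_station` instead of `app_id` would give the regional view mentioned in the description.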

Categories:

Machine Learning

Munsell Re-renotation: Revised

The dataset contains a revised version of the psychophysical color difference Munsell Re-renotation dataset.

Munsell Re-renotation is a psychophysical dataset describing large color differences, featuring 2986 colors characterized by standard colorimetric coordinates (x, y, Y) and coordinates within the Munsell system (H, V, C). The Munsell Re-renotation iteration enhances the uniformity of the system compared to its predecessor, the Munsell Renotation dataset.

Categories:

Other

SaPGAN

With the rapid advancement of large language models (LLMs), Model-as-a-Service (MaaS) has emerged as a powerful paradigm, enabling providers to deliver pre-trained models, computational resources, and database management within a unified platform.

Categories:

Artificial Intelligence

Process bus scalability with MU and emulated SV

This dataset presents Wireshark captures of data packets for the scalability analysis of the Process Bus in Digital Substations, based on Sampled Values traffic over redundant local area networks (LAN A and LAN B). LAN A is a 100 Mbps network that saturates at close to 10 Sampled Values subscriptions, while LAN B is a 1 Gbps network that is not saturated and is used as a reference for the loss of data packets in each case. Emulated SV data is provided by software and a laboratory signal injector.

Categories:

Power and Energy

Fluid Flow Images

The dataset comprises images generated using computational fluid dynamics (CFD) simulations for two cases: flow past an elliptic cylinder and flow past an aerofoil. Here are the details:

Elliptic Cylinder Dataset:

Images: 124 for low-speed and 124 for high-speed.
Conditions: Simulated for Reynolds numbers of 200 (low-speed) and 5000 (high-speed).

Aerofoil Dataset:

Images: 250 for low-speed and 250 for high-speed.
Conditions: Simulated under similar Reynolds number settings of 200 and 5000 for laminar and turbulent flows, respectively.

Categories:

the data in the paper

Transformers, especially dry-type transformers, which are increasingly used in place of oil-type transformers, are among the major pieces of equipment in electric energy generation, transmission, and distribution networks. Transformer insulation strength may degrade due to partial discharge (PD), which can eventually result in insulation failure.

Categories:

Other

Ammonia, Acetone, Formaldehyde Dataset

This dataset relates to gas sensing and time-series classification. It has three columns: Time, I1, and Target. The Target column gives the type of gas measured, using the following classification:

• 1 for Ammonia (NH₃)

• 2 for Acetone (C₃H₆O) & Formaldehyde (CH₂O)

The data were processed in several stages for accuracy and consistency. Raw data were first gathered using a Metal Oxide Semiconductor (MOS) sensor. After acquisition, the data were cleaned and processed, including Savitzky-Golay filtering to remove noise and artifacts and to smooth the signal.
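As a rough illustration of the smoothing step mentioned above, the sketch below applies a 5-point quadratic Savitzky-Golay filter in plain Python. The window length, polynomial order, edge handling, and sample values are all assumptions chosen for illustration; the dataset description does not state the filter settings that were actually used.

```python
# Standard 5-point quadratic/cubic Savitzky-Golay smoothing coefficients.
# Window length and order are illustrative assumptions, not the dataset's settings.
SG5_COEFFS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol5(signal):
    """Smooth a sequence with a 5-point Savitzky-Golay filter.

    The first and last two samples are returned unchanged because the
    full window does not fit there (a simple edge policy chosen for brevity).
    """
    values = [float(v) for v in signal]
    n = len(values)
    if n < 5:
        return values
    smoothed = list(values)
    for i in range(2, n - 2):
        window = values[i - 2 : i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG5_COEFFS, window))
    return smoothed


if __name__ == "__main__":
    # Hypothetical sensor readings (column I1) with one noisy spike.
    i1 = [1.0, 1.1, 1.2, 3.0, 1.4, 1.5, 1.6]
    print(savgol5(i1))
```

Because the filter fits a local quadratic, it leaves smooth trends largely intact while damping isolated spikes, which is why it is a common choice for sensor time series.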

Categories:

Machine Learning

Nurturing the Future of English Language Education: Between Technology, Tradition and Gaps across Indonesian Education Levels

New technology solutions, including tablets and advanced applications, exist in modern classrooms across Indonesia, but they typically miss the core educational objectives. Elementary school children memorizing the emoticon "happy" reflects a lack of comprehension, while high school students find themselves overwhelmed by IoT data and teachers work to adapt to AI-based educational requirements.

Categories:

The data for PVs and loads

The data for PVs and loads are sourced from “Multi-agent reinforcement learning for active voltage control on power distribution networks” and were collected from Jan. 1st, 2012 to Dec. 31st, 2014 at 3-minute intervals. In this paper, the data is described as follows:

Categories:

Electric Utility

The data for PVs and loads from “Multi-agent reinforcement learning for active voltage control on power distribution networks”

The data for PVs and loads are sourced from “Multi-agent reinforcement learning for active voltage control on power distribution networks” and were collected from Jan. 1st, 2012 to Dec. 31st, 2014 at 3-minute intervals. In this paper, the data is described as follows:

Categories:

Electric Utility

Stroke Prognosis Dataset Taizhou and Fuyong

The Fuyong dataset records 134 stroke patients who received treatment at the Shenzhen Fuyong People's Hospital between March 1, 2022, and September 31, 2024. In addition, the Taizhou dataset includes medical records of 435 stroke patients treated at the Affiliated Taizhou People's Hospital of Nanjing Medical University between January 1, 2020, and December 31, 2023. Both datasets use pre- and post-thrombolysis NIHSS scores as a metric for evaluating the immediate efficacy of the thrombolytic intervention.

Categories:

Human-to-machine Application Packet Arrival Time

The uploaded datasets contain packet arrival times from six distinct human-to-machine (H2M) applications for traffic modeling collected from a VR-based H2M experimental platform. The master device comprises VR gloves, each with two orientation sensors on the thumb and wrist (9-DOF) and five flexible sensors per finger for movement and force tracking. The sensors sample at 200 Hz, transmitting control signals via a customized Bluetooth interface (30 m range). Each control instance has 93 elements.
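For traffic modeling, packet arrival times are usually converted to inter-arrival times first. The sketch below shows that step, together with the nominal interval implied by the 200 Hz sampling rate. The function names, the assumption that timestamps are plain numbers in a single unit, and the sample values are hypothetical; the description does not specify the file layout.

```python
SAMPLING_RATE_HZ = 200                       # stated in the description
NOMINAL_INTERVAL_S = 1 / SAMPLING_RATE_HZ    # 0.005 s between samples


def inter_arrival_times(arrival_times):
    """Return gaps between consecutive arrival times (same unit as the input)."""
    ordered = sorted(arrival_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def mean(values):
    """Arithmetic mean, or 0.0 for an empty sequence."""
    return sum(values) / len(values) if values else 0.0


if __name__ == "__main__":
    # Hypothetical arrival times in milliseconds, not taken from the dataset.
    arrivals_ms = [0, 5, 10, 16, 20]
    gaps = inter_arrival_times(arrivals_ms)
    print(gaps, mean(gaps))
```

Comparing the measured mean gap against the nominal interval gives a quick indication of how much jitter the wireless link adds.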

Categories:

IoT

LeafNet: A large-scale dataset for training image-text models in leaf disease identification

The PlantVillage dataset, with over 54,000 images spanning 14 plant species and 26 disease types, has been widely used for leaf disease classification. However, it is limited in both scale and diversity. To address these limitations, we developed LeafNet, a large-scale dataset designed to support foundation models for leaf disease diagnosis. LeafNet comprises over 186,000 images from 22 crop species, covering 43 fungal diseases, 8 bacterial diseases, 2 mould (oomycete) diseases, 6 viral diseases, and 3 mite-induced diseases, categorized into 97 classes.

Categories:

Cryptococcus neoformans spontaneous Raman spectroscopy

This dataset is from our paper "Bridging Lab-to-Clinic: Microbiological Screening via Swin-Ultra Transformer with Transfer Learning", which aims to validate the extension of the lab-verified bacterial classification model to the gene-type screening of unseen pathogens in clinical settings.

Categories:

Biomedical and Health Sciences

Supplementary Material - Supporting Usability Inspection Education with a Tailored ChatGPT: An Empirical Study on Learning Heuristic Evaluation

Experimental Setup:

We conducted the experiment over two separate days. On the first day, the participants inspected the object Canva (Web version), while on the second day, they inspected the object Skoob (Web version). Our study had 31 participants (undergraduate students enrolled in the HCI - Human-Computer Interaction Course).

Categories:

Education and Learning Technologies
