Machine Learning

IoTForge Pro

The growing use of Industrial Internet of Things (IIoT) technologies has increased the need for strong security measures against cyberattacks. This research introduces IoTForge Pro, a comprehensive security testbed designed to generate a diverse and extensive intrusion dataset for IIoT environments. The testbed simulates various IIoT scenarios, incorporating network topologies and communication protocols to create realistic attack vectors and normal traffic patterns.

Categories: Artificial Intelligence, Wireless Networking, IoT, Machine Learning, Sensors, Communications, Remote Sensing, Security

338 Views

Bone Cement Removal with Audio-Monitoring and Erosion Depth

This dataset comprises extensive multi-modal data related to the experimental study of ultrasonically excited pulsating fluid jets used for bone cement removal. Conducted at the Institute of Geonics, Ostrava, Czech Republic, the study explores the effect of varying standoff distances on erosion profiles, under controlled parameters including a fixed nozzle diameter, sonotrode frequency, supply pressure, and robot arm velocity. The dataset includes numerical data representing ablation profiles, captured as a large CSV file, and audio recordings captured using a high-resolution microphone.

Categories: Artificial Intelligence, Signal Processing, Machine Learning, Sensors, Biomedical and Health Sciences

202 Views

COVID-19 on YouTube: A Data-Driven Analysis of Sentiment, Toxicity, and Content Recommendations

Please cite the following paper when using this dataset:

Vanessa Su and Nirmalya Thakur, “COVID-19 on YouTube: A Data-Driven Analysis of Sentiment, Toxicity, and Content Recommendations”, Proceedings of the IEEE 15th Annual Computing and Communication Workshop and Conference 2025, Las Vegas, USA, Jan 06-08, 2025 (Paper accepted for publication, Preprint: https://arxiv.org/abs/2412.17180).

Categories: Artificial Intelligence, Education and Learning Technologies, Machine Learning, Computational Intelligence, COVID-19, Demographic, Health

152 Views

WRIVA Public Data

The IARPA WRIVA program aims to develop software systems that can create photorealistic, navigable 3D site models from a highly limited corpus of imagery, including ground-level, surveillance-height, airborne-altitude, and satellite imagery. Additionally, where imagery lacks metadata indicating geolocation or camera parameters, or is corrupted by artifacts, WRIVA seeks to detect and correct these factors so that the imagery can be used in site-modelling and other downstream image processing and analysis algorithms.

Categories: Machine Learning, Image Fusion, Computer Vision

627 Views

SpringProd and ApacheProd - executable text-code datasets

M. Kacmajor and J.D. Kelleher, "ExTra: Evaluation of Automatically Generated Source Code Using Execution Traces" (submitted to IEEE TSE)

Categories: Artificial Intelligence, Machine Learning

27 Views

SpringTC - an executable text-code dataset

M. Kacmajor and J.D. Kelleher, "ExTra: Evaluation of Automatically Generated Source Code Using Execution Traces" (submitted to IEEE TSE)

Categories: Artificial Intelligence, Machine Learning

49 Views

Palmer Penguins 100k

To provide machine learning and data science experts with a more robust dataset for model training, the well-known Palmer Penguins dataset has been expanded from its original 344 rows to 100,000 rows. This increase was achieved using an adversarial random forest technique, which generates additional synthetic data while maintaining key patterns and features. The method achieved an accuracy of 88%, indicating that the expanded dataset remains realistic and suitable for classification tasks.

Categories: Machine Learning, Social Sciences

387 Views

MobRFFI: A WiFi RF Fingerprinting Dataset with Granular Multi-Receiver Signal Capture

MobRFFI is a WiFi device fingerprinting and re-identification dataset collected in the Orbit testbed facility in July and August 2024. The dataset contains raw IQ samples of WiFi transmissions captured at 25 Msps on channel 11 (2462 MHz) in the 2.4 GHz band, using Ettus Research N210r4 USRPs as receivers and a set of WiFi nodes equipped with Atheros AR5212 chipsets as transmitters. The data collection spans two days (July 19 and August 8, 2024) and includes 12,068 capture files totaling 5.7 TB of data.
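
As a rough illustration of working with raw IQ captures like these, the sketch below reads samples from a flat binary file and converts a sample count into capture time at the stated 25 Msps rate. The on-disk layout is an assumption, not something the listing confirms: the code assumes headerless, interleaved 32-bit float I/Q (NumPy `complex64`), and the helper names `load_iq` and `capture_duration_s` are hypothetical. Check the dataset's own documentation for the actual format before relying on it.

```python
import numpy as np

SAMPLE_RATE_HZ = 25e6  # 25 Msps, as stated in the dataset description


def load_iq(path, dtype=np.complex64, max_samples=None):
    """Read raw IQ samples from a flat binary file.

    ASSUMPTION: the file is headerless, interleaved 32-bit float I/Q
    (complex64). The real MobRFFI capture format may differ.
    """
    count = -1 if max_samples is None else max_samples
    return np.fromfile(path, dtype=dtype, count=count)


def capture_duration_s(num_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a number of complex samples into seconds of capture."""
    return num_samples / sample_rate_hz
```

Passing `max_samples` reads only the start of a file, which helps when inspecting large captures without loading them fully into memory.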

Categories: Wireless Networking, Digital signal processing, IoT, Machine Learning

179 Views

Low voltage DC series arc fault

This dataset was collected to investigate detection algorithms for DC series arc faults on resource-constrained devices. The data consist of current signals collected from a test bench built to simulate arc faults under various conditions according to the UL1699B standard. The conditions include the power mode regulated by the electronic load in the system, which simulates different system dynamics under series and parallel topologies.

Categories: Signal Processing, Machine Learning, Power and Energy, Arc Flash

97 Views

GNSS Interference Spectrum Controlled Low-Frequency

Jamming devices present a significant threat by disrupting signals from the global navigation satellite system (GNSS), compromising the robustness of accurate positioning. The detection of anomalies within frequency snapshots is crucial to counteract these interferences effectively. A critical preliminary measure involves the reliable classification of interferences and characterization and localization of jamming devices.

Categories: Artificial Intelligence, Signal Processing, Machine Learning, Sensors, Standards Research Data

259 Views
