Machine Learning

This dataset is designed for curve fitting, a key step in the reconstruction of implicit curves. It contains point cloud data sampled directly from curves, as well as the code needed to generate point cloud data from these curves.


X-CANIDS Dataset (In-Vehicle Signal Dataset)

In March 2024, our paper "X-CANIDS: Signal-Aware Explainable Intrusion Detection System for Controller Area Network-Based In-Vehicle Network" was published in IEEE Transactions on Vehicular Technology. Here we publish the dataset used in the article. We hope it facilitates further research using deserialized signals as well as raw CAN messages.

Real-world data collection. Our benign driving dataset is unique in that it has been collected from real-world environments.


Inertial sensors are widely used in a variety of applications. A common task is orientation estimation, which is typically handled by attitude and heading reference system (AHRS) algorithms. These algorithms rely on the gyroscope readings, use accelerometer measurements to update the attitude angles, and use magnetometer measurements to update the heading angle. In indoor environments, magnetometers suffer from interference that degrades their performance.
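As a rough illustration of the accelerometer-based attitude update mentioned above, the sketch below computes roll and pitch from a single accelerometer sample. It is not taken from the dataset or its authors: the function name `attitude_from_accel` is hypothetical, and the sign convention (a level sensor at rest reads roughly +g on its z axis) is an assumption that varies between devices.

```python
import math

def attitude_from_accel(ax: float, ay: float, az: float) -> tuple[float, float]:
    """Return (roll, pitch) in radians from one accelerometer sample.

    Assumes the sensor is not accelerating, so the reading is dominated by
    gravity, and that a level sensor reads about (0, 0, +g). Both points
    are assumptions for illustration only.
    """
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.sqrt(ay * ay + az * az))
    return roll, pitch

# A level sensor at rest gives zero roll and zero pitch.
roll, pitch = attitude_from_accel(0.0, 0.0, 9.81)
```

Heading cannot be recovered this way because gravity carries no information about rotation around the vertical axis, which is why a magnetometer (or a substitute for it indoors) is needed.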


This dataset comprises audio recordings of ultra-high-frequency ambient noise stored in the lossless waveform format (WAV). The recordings were sampled at a rate of 2.048 MHz and are provided at a downsampled audio rate of 48 kHz for compatibility and practical use. The total length of the dataset is 01:30:29, or approximately 260 million data points. (2024-03-30)
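The stated figures can be cross-checked with a few lines of arithmetic. The sketch below assumes a single channel and reads 01:30:29 as hours:minutes:seconds; the helper name `duration_seconds` is hypothetical.

```python
def duration_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

total_seconds = duration_seconds("01:30:29")   # 5429 seconds
samples_at_48k = total_seconds * 48_000        # 260,592,000 samples
downsampling_ratio = 2_048_000 / 48_000        # about 42.67, not an integer
```

The sample count at 48 kHz agrees with the "approximately 260 million data points" in the description. The non-integer ratio suggests that the downsampling involved resampling rather than plain decimation, although the description does not say which method was used.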


Plasma-based semiconductor processing is highly sensitive, so even minor changes in the procedure can have serious consequences. Equipment anomalies can be monitored and classified using fault detection and classification (FDC). However, class imbalance in semiconductor process data poses a significant obstacle to the introduction of FDC into semiconductor equipment. Machine learning models can overfit because of the diversity and imbalance of the normal and abnormal data.


In our ever-expanding world of advanced satellite and communications systems such as 5G, passive radiometer sensors used in Earth observation face a growing challenge: the risk of radio frequency interference (RFI) caused by anthropogenic signals. To address this, we urgently need effective methods to quantify the impacts of 5G on Earth-observing radiometers. Unfortunately, the lack of substantial datasets in the radio frequency (RF) domain, especially for active/passive coexistence, hinders progress.


Anomaly detection plays a crucial role in various domains, including but not limited to cybersecurity, space science, finance, and healthcare. However, the lack of standardized benchmark datasets hinders the comparative evaluation of anomaly detection algorithms. In this work, we address this gap by presenting a curated collection of preprocessed spacecraft anomaly datasets drawn from multiple sources. These datasets cover a diverse range of anomalies and real-world scenarios for spacecraft.


This dataset contains sea surface temperature reanalysis data from three products: Extended Reconstruction Sea Surface Temperature version 5 (ERSST v.5) from January 1854 to December 2022, Hadley Centre Global Sea Ice and Sea Surface Temperature (HadISST) from January 1870 to December 2022, and COBE-SST2 Sea Surface Temperature and Ice (COBE-SST2) from January 1854 to December 2022. All data is re-gridded from its initial grid via bilinear interpolation to a common spatial resolution of 2.0° × 2.0°, on a grid spanning 88°N to 88°S and 0°E to 358°E. The dataset is in NetCDF4 format.
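To make the target grid concrete, the sketch below enumerates the coordinates implied by the description and counts the time steps per product. It assumes that 88°N, 88°S, 0°E, and 358°E are the first and last grid points and that the data is monthly; neither is confirmed by the description, and the helper name `month_count` is hypothetical.

```python
# Target grid implied by the description: 2.0 degree spacing.
latitudes = [88 - 2 * i for i in range(89)]   # 88, 86, ..., -88
longitudes = [2 * j for j in range(180)]      # 0, 2, ..., 358

def month_count(first_year: int, last_year: int) -> int:
    """Number of monthly steps from January of first_year to December of last_year."""
    return (last_year - first_year + 1) * 12

grid_shape = (len(latitudes), len(longitudes))  # (89, 180)
steps_from_1854 = month_count(1854, 2022)       # ERSST v.5 and COBE-SST2 period
steps_from_1870 = month_count(1870, 2022)       # HadISST period
```

Under these assumptions each field has 89 × 180 grid points, with 2028 monthly steps for the products starting in 1854 and 1836 for HadISST.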


Image representation of a malware-benign dataset. The malware samples were compiled from various malware repositories: The Malware-Repo, TheZoo, Malware Bazar, Malware Database, and TekDefense. Benign samples were sourced from system applications of Microsoft Windows 10 and 11, as well as open-source software repositories such as SourceForge, PortableFreeware, CNET, and FileForum. The samples were validated by scanning them with the VirusTotal malware scanning service. The samples were pre-processed by transforming each binary into a grayscale image following the rules from Nataraj (2011).
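The general idea of the binary-to-image transformation can be sketched as follows: each byte becomes one 8-bit grayscale pixel, and the byte stream is wrapped into rows of a fixed width. This is only an illustration, not the dataset's actual pipeline. The function name `bytes_to_grayscale_rows` is hypothetical, and both the fixed default width and the zero padding of the last row are assumptions.

```python
def bytes_to_grayscale_rows(data: bytes, width: int = 256) -> list[list[int]]:
    """Wrap a byte string into rows of `width` pixels with values 0-255.

    The last row is padded with zeros (black) when the data does not
    fill it completely; this padding is an assumption of the sketch.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [0] * (width - len(row))
        rows.append(row)
    return rows

# Five bytes wrapped at width 2 give three rows, the last one padded.
image = bytes_to_grayscale_rows(bytes([0, 255, 16, 32, 7]), width=2)
```

The resulting list of rows can be handed to an imaging library to save an actual image file. A real pipeline would also need a rule for choosing the width, which this sketch leaves as a parameter.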

