Machine Learning

A dataset of 5000 simulated grating biosensor reflection spectra in Excel format. Generated with Lumerical FDTD, it includes 11 parameters (thickness, RI, peak wavelength, FWHM, reflectance, etc.) and is suited to data visualization, sensor response exploration, and AI/ML benchmarking. The full dataset in Excel format is coming soon. Follow this repository to be notified when it is released. In the meantime, browse the README for more information about the project.

This dataset provides measurements of cerebral blood flow using Radio Frequency (RF) sensors operating in the Ultra-Wideband (UWB) frequency range, enabling non-invasive monitoring of cerebral hemodynamics. It includes blood flow feature data from two arterial networks, Arterial Network A and Arterial Network B. Statistical features were manually extracted from the RF sensor data, while autonomous feature extraction was performed using a Stacked Autoencoder (SAE) with architectures such as 32-16-32, 64-32-16-32-64, and 128-64-32-16-32-64-128.
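
The architecture strings above can be unpacked with a small sketch. It assumes each hyphen-separated number is a hidden-layer width, with the smallest value as the bottleneck; the listing does not state this, nor the input dimension, and the helper names `parse_architecture` and `describe` are hypothetical.

```python
def parse_architecture(spec: str) -> list[int]:
    """Turn a string such as "32-16-32" into a list of layer widths."""
    return [int(width) for width in spec.split("-")]


def describe(spec: str) -> dict:
    """Split the widths into encoder and decoder halves around the bottleneck."""
    widths = parse_architecture(spec)
    # Assumption: the narrowest layer is the bottleneck (the learned feature vector).
    bottleneck_index = widths.index(min(widths))
    return {
        "encoder": widths[: bottleneck_index + 1],
        "decoder": widths[bottleneck_index:],
        "bottleneck": widths[bottleneck_index],
        "symmetric": widths == widths[::-1],
    }


# The three architectures named in the dataset description.
ARCHITECTURES = ["32-16-32", "64-32-16-32-64", "128-64-32-16-32-64-128"]

for spec in ARCHITECTURES:
    info = describe(spec)
    print(spec, "encoder:", info["encoder"], "bottleneck:", info["bottleneck"])
```

Under this reading, all three architectures are mirrored around the same bottleneck width and differ only in the depth of the encoder and decoder.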

The DALHOUSIE NIMS LAB ATTACK IOT DATASET 2025-1 comprises four prevalent types of attacks, namely Portscan, Slowloris, Synflood, and Vulnerability Scan, on nine distinct Internet of Things (IoT) devices. These attacks are very common in IoT ecosystems because they often serve as precursors to more sophisticated attack vectors. By supporting analysis of attack traffic characteristics and IoT device responses, the dataset helps shed light on IoT ecosystem vulnerabilities.

This dataset comprises Terahertz (THz) images collected to support the research presented in the IEEE Access paper titled "Diagnosing Grass Seed Infestation: Convolutional Neural Network Based Terahertz Imaging". The dataset is intended for the detection and classification of grass seeds embedded in biological samples, specifically ham, covered with varying thicknesses of wool. The images were captured at different frequencies within the THz spectrum, providing data for the development of deep-learning models for seed detection.

This dataset contains human motion data collected using inertial measurement units (IMUs), including accelerometer and gyroscope readings, from participants performing specific activities. The data was gathered under controlled conditions with verbal informed consent and includes diverse motion patterns that can be used for research in human activity recognition, wearable sensor applications, and machine learning algorithm development. Each sample is labeled and processed to ensure consistency, with raw and augmented data available for use. 

Network telescopes collect and record unsolicited Internet-wide traffic destined to a routed but unused address space usually referred to as “Darknet” or “blackhole” address space. Among the largest network telescopes in the US, Merit Network operates one that receives unsolicited internet traffic on around 475k unused IP addresses. On an average day, the network telescope receives approximately 41.5k packets per second and around 17M bits per second. Description of the attached dataset:

1. Data Source:

This dataset contains the testing results presented in the manuscript "Exploring the Potential of Offline LLMs in Data Science: A Study on Code Generation for Data Analysis", and it aims to assess the capabilities of offline LLMs in code generation for data analytics tasks. The dataset is best used after reading the manuscript thoroughly. A total of 250 testing results were generated for each of the two LLMs evaluated, and the two sets were merged to form this dataset.

The PermGuard dataset is a carefully crafted Android Malware dataset that maps Android permissions to exploitation techniques, providing valuable insights into how malware can exploit these permissions. It consists of 55,911 benign and 55,911 malware apps, creating a balanced dataset for analysis. APK files were sourced from AndroZoo, including applications scanned between January 1, 2019, and July 1, 2024. A novel construction method extracts Android permissions and links them to exploitation techniques, enabling a deeper understanding of permission misuse.
