Machine Learning

The DALHOUSIE NIMS LAB ATTACK IOT DATASET 2025-1 comprises four prevalent types of attacks, namely Portscan, Slowloris, Synflood, and Vulnerability Scan, on nine distinct Internet of Things (IoT) devices. These attacks are very common in IoT ecosystems because they often serve as precursors to more sophisticated attack vectors. By analyzing the traffic characteristics of each attack vector and the responses of the IoT devices, our dataset helps shed light on vulnerabilities in IoT ecosystems.

71 Views

This dataset comprises Terahertz (THz) images collected to support the research presented in the IEEE Access paper titled Diagnosing Grass Seed Infestation: Convolutional Neural Network Based Terahertz Imaging. The dataset is intended for the detection and classification of grass seeds embedded in biological samples, specifically ham, covered with varying thicknesses of wool. The images were captured at different frequencies within the THz spectrum, providing valuable data for the development of deep-learning models for seed detection.

18 Views

This dataset contains human motion data collected using inertial measurement units (IMUs), including accelerometer and gyroscope readings, from participants performing specific activities. The data was gathered under controlled conditions with verbal informed consent and includes diverse motion patterns that can be used for research in human activity recognition, wearable sensor applications, and machine learning algorithm development. Each sample is labeled and processed to ensure consistency, with raw and augmented data available for use. 

29 Views

Network telescopes collect and record unsolicited Internet-wide traffic destined for a routed but unused address space, usually referred to as “Darknet” or “blackhole” address space. Among the largest network telescopes in the US, Merit Network operates one that receives unsolicited Internet traffic on around 475k unused IP addresses. On an average day, the network telescope receives approximately 41.5k packets per second and around 17M bits per second. Description of the attached dataset:

1. Data Source:

103 Views

This dataset contains the testing results presented in the manuscript "Exploring the Potential of Offline LLMs in Data Science: A Study on Code Generation for Data Analysis", and aims to assess the capabilities of offline LLMs in code generation for data analytics tasks. The dataset is best used after a thorough reading of the manuscript. A total of 250 testing results were generated and then merged to create this dataset.

34 Views

The PermGuard dataset is a carefully crafted Android Malware dataset that maps Android permissions to exploitation techniques, providing valuable insights into how malware can exploit these permissions. It consists of 55,911 benign and 55,911 malware apps, creating a balanced dataset for analysis. APK files were sourced from AndroZoo, including applications scanned between January 1, 2019, and July 1, 2024. A novel construction method extracts Android permissions and links them to exploitation techniques, enabling a deeper understanding of permission misuse.

341 Views

The data is collected from an IoT sensor node deployed at a pilot farm in Narrabri, Australia. The dataset includes soil characteristics such as soil moisture and soil temperature at depths of 20, 40, and 60 cm. The sensor node also provides information about environmental influencers, which are critical in constructing machine learning models to predict evapotranspiration in diverse soil and environmental conditions.

606 Views

The necessity for strong security measures to fend off cyberattacks has increased due to the growing use of Industrial Internet of Things (IIoT) technologies. This research introduces IoTForge Pro, a comprehensive security testbed designed to generate a diverse and extensive intrusion dataset for IIoT environments. The testbed simulates various IIoT scenarios, incorporating network topologies and communication protocols to create realistic attack vectors and normal traffic patterns.

250 Views

Please cite the following paper when using this dataset:

Vanessa Su and Nirmalya Thakur, “COVID-19 on YouTube: A Data-Driven Analysis of Sentiment, Toxicity, and Content Recommendations”, Proceedings of the IEEE 15th Annual Computing and Communication Workshop and Conference 2025, Las Vegas, USA, Jan 06-08, 2025 (Paper accepted for publication, Preprint: https://arxiv.org/abs/2412.17180).

Abstract:

112 Views

M. Kacmajor and J.D. Kelleher, "ExTra: Evaluation of Automatically Generated Source Code Using Execution Traces" (submitted to IEEE TSE)

19 Views
