The DroneDetect dataset covers 7 different models of popular Unmanned Aerial Systems (UAS): the new DJI Mavic 2 Air S, DJI Mavic Pro, DJI Mavic Pro 2, DJI Inspire 2, DJI Mavic Mini, DJI Phantom 4 and the Parrot Disco. Recordings were collected using a Nuand BladeRF SDR and the open source software GNURadio. The dataset includes 4 subsets of data: the UAS signals in the presence of Bluetooth interference, in the presence of Wi-Fi signals, in the presence of both, and with no interference.


Sample rate: 60 MS/s (complex samples per second)

Bandwidth: 28 MHz

Centre Freq: 2.4375 GHz

Each recording consists of 1.2 x 10^8 complex samples, equating to 2 seconds of recording time. Data is saved into ‘.dat’ files, with the complex data stored as interleaved floats. ‘’ is included so that the data can be loaded into Python and further split into smaller samples 20 ms in length.
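As a rough illustration of the loading and splitting step described above, the following sketch reads one recording with NumPy. It assumes the interleaved floats are 32-bit and ordered I, Q, I, Q, ..., and it assumes a rate of 60 million complex samples per second, which is what 1.2 x 10^8 samples in 2 seconds implies. The function names are illustrative and are not taken from the loader shipped with the dataset.

```python
import numpy as np

SAMPLE_RATE = 60e6       # complex samples per second (1.2e8 samples / 2 s)
SEGMENT_SECONDS = 20e-3  # segment length given in the description


def load_iq(path, dtype=np.float32):
    """Read interleaved I/Q floats and return a 1-D complex array.

    The 32-bit float width is an assumption; pass another dtype if needed.
    """
    raw = np.fromfile(path, dtype=dtype)
    raw = raw[: raw.size - raw.size % 2]  # drop a trailing unpaired float
    return raw[0::2] + 1j * raw[1::2]


def split_segments(iq, sample_rate=SAMPLE_RATE, seconds=SEGMENT_SECONDS):
    """Split a recording into equal-length rows, discarding any remainder."""
    seg_len = int(round(sample_rate * seconds))
    n_segments = iq.size // seg_len
    return iq[: n_segments * seg_len].reshape(n_segments, seg_len)
```

Under these assumptions, a full 2-second recording would split into 100 segments of 1,200,000 complex samples each. A whole recording occupies close to 1 GB in memory, so processing one file at a time is advisable.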

Files are categorised by interference, then by flight mode:

Switched on = ON

Hovering = HO

Flying = FY

Each file name uses an interference identifier: 00 for a clean signal, 01 for Bluetooth only, 10 for Wi-Fi only and 11 for concurrent Bluetooth and Wi-Fi interference. An example file name for the Mavic Mini switched on in the presence of Bluetooth and Wi-Fi interference would be:

MIN + 11 + 00 + 00 = MIN_1100_00.dat
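The naming scheme can be decoded with a small helper such as the sketch below. The pattern is inferred from the single example MIN_1100_00.dat, so it is an assumption rather than a specification. The description does not define the two trailing two-digit fields, so the helper returns them unchanged under placeholder names.

```python
import re

# Interference identifiers as listed in the description.
INTERFERENCE = {
    "00": "clean",
    "01": "Bluetooth",
    "10": "Wi-Fi",
    "11": "Bluetooth and Wi-Fi",
}

# Pattern inferred from the example MIN_1100_00.dat (an assumption).
NAME_RE = re.compile(r"^([A-Za-z0-9]+)_(\d{2})(\d{2})_(\d{2})\.dat$")


def parse_name(filename):
    """Split a recording file name into its four parts."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    model, interference, third, fourth = match.groups()
    return {
        "model": model,
        "interference": INTERFERENCE.get(interference, "unknown"),
        # Not defined in the description, so passed through as-is.
        "third_field": third,
        "fourth_field": fourth,
    }
```

For example, `parse_name("MIN_1100_00.dat")` reports the model code `MIN` and the interference type `Bluetooth and Wi-Fi`.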





The .zip archive contains a folder ‘tasks’ and a .csv file, “analysis_results.csv”, which is a comma-delimited table with 4077 entries. Each subfolder of the ‘tasks’ folder represents an analysis task of a unique sample. The association between tasks and samples is shown in the analysis_results.csv table, which contains the analysis results per sample. Each row in the table represents a botnet sample and holds information such as the analysis task id, the file hash, the URL of the server from which the sample was captured, and the analysis results for that sample. For each task id, the corresponding folder contains: 1) the results of the analysis (analysis_result.json); 2) the captured traffic (capture.pcap); 3) the recorded system calls (syscalls.json) and 4) the botnet sample file (ELF binary) with the original filename. Depending on the IoT botnet sample analysed, the network traffic may include port scanning, exploitation, C2 communications and DDoS traffic.
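One way to walk this layout is sketched below, using only the Python standard library. It makes three assumptions that the description does not confirm: the task id column is named `task_id` (a hypothetical name, check the CSV header), each subfolder of ‘tasks’ is named after its task id, and the CSV sits next to the ‘tasks’ folder in the extracted archive.

```python
import csv
from pathlib import Path

# File names listed in the description. The sample binary keeps its
# original name, so it is whatever else is found in the task folder.
KNOWN_FILES = ("analysis_result.json", "capture.pcap", "syscalls.json")


def iter_tasks(root, id_column="task_id"):
    """Yield (row, folder, sample_files) for each row of the results table.

    `id_column` is a hypothetical column name, not taken from the dataset.
    """
    root = Path(root)
    with open(root / "analysis_results.csv", newline="") as handle:
        for row in csv.DictReader(handle):  # comma-delimited by default
            folder = root / "tasks" / row[id_column]
            if folder.is_dir():
                samples = [p for p in folder.iterdir()
                           if p.name not in KNOWN_FILES]
            else:
                samples = []
            yield row, folder, samples
```

Because the folders contain live botnet binaries, a loop like this should only list or hash the sample files, never execute them.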




Dataset used in the article "The Reverse Problem of Keystroke Dynamics: Guessing Typed Text with Keystroke Timings". It includes CSV files with dataset results summaries, the evaluated sentences, detailed results, and scores. The results data contains training and evaluation ARFF files for each user, with features of synthetic and legitimate samples as described in the article. The source data comes from three free-text keystroke dynamics datasets used in previous studies: one by the authors (LSIA) and two by other, unrelated groups (KM, and PROSODY, which is subdivided into GAY, GUN, and REVIEW).


Dataset including over 40,000 generated images of malicious binaries for malware classification in machine learning as outlined in NARAD - A Novel Auto-learn Real-time Fuzzy Machine Learning Anomaly Detection and Classification System.


This dataset supports researchers in validating solutions such as Intrusion Detection Systems (IDS) based on artificial intelligence and machine learning techniques for the detection and categorization of threats in Cyber Physical Systems (CPS). To that end, data have been acquired from a water distribution hardware-in-the-loop testbed which emulates water passage between nine tanks via solenoid valves, pumps, and pressure and flow sensors. The testbed is composed of a real partition which is virtually connected to a simulated one.


This dataset is related to the paper "A hardware-in-the-loop Water Distribution Testbed (WDT) dataset for cyber-physical security testing".
We provide four different acquisitions:
1) A normal acquisition without attacks ("normal.csv" for network traffic and "dataset_norm.csv" for physical measures)
2) Three acquisitions where different types of attacks and physical faults are reproduced ("attack_1.csv", "attack_2.csv" and "attack_3.csv" for network traffic and "dataset_att_1.csv", "dataset_att_2.csv" and "dataset_att_3.csv" for physical measures)
In addition to .csv files we provide four .pcap files ("attack_1.pcap", "attack_2.pcap", "attack_3.pcap" and "normal.pcap") which refer to network acquisitions for the four previous scenarios.
A README.xlsx file summarizes the key features of the entire dataset.
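The file naming above is regular enough to capture in a small manifest helper, sketched here. The file names come from the description; the flat directory layout and the function names are assumptions made for illustration.

```python
from pathlib import Path

SCENARIOS = ("normal", "attack_1", "attack_2", "attack_3")


def wdt_files(scenario):
    """Return the three file names described for one acquisition."""
    if scenario not in SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario!r}")
    if scenario == "normal":
        physical = "dataset_norm.csv"
    else:
        physical = "dataset_att_" + scenario.split("_")[1] + ".csv"
    return {
        "network_csv": scenario + ".csv",
        "physical_csv": physical,
        "pcap": scenario + ".pcap",
    }


def missing_files(root):
    """List expected files absent under `root` (flat layout assumed)."""
    root = Path(root)
    expected = [name for s in SCENARIOS for name in wdt_files(s).values()]
    expected.append("README.xlsx")
    return [name for name in expected if not (root / name).exists()]
```

Calling `missing_files` on the extracted folder gives a quick completeness check before any analysis is started.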


Supplemental material for paper "Energy Efficiency Analysis of Post-Quantum Cryptographic Algorithms."


Please see README file for instructions and information about the content of these files.


9/11 hijackers network dataset [20]: The 9/11 hijackers network incorporates 61 nodes (each node is a terrorist involved in the 9/11 attacks on the World Trade Center in 2001). The dataset was prepared from news reports, and ties range from ‘at school with’ to ‘on the same plane’. The data consists of a 19 × 19 terrorist-by-terrorist mode matrix of trusted prior contacts, together with a 1-mode matrix of 61 edges of other involved associates.


The S3 dataset contains the behaviour (sensors, statistics of applications, and voice) of 21 volunteers interacting with their smartphones for more than 60 days. The users are diverse: males and females aged 18 to 70 were considered in the dataset generation. This wide age range is a key aspect, because age affects smartphone usage. To generate the dataset, the volunteers installed a prototype of the smartphone application on their Android mobile phones.



The dataset is compressed into a zip file. Unzip it in the desired location; inside the main folder you will find a file with the instructions and the details of the database.


The Development of an Internet of Things (IoT) Network Traffic Dataset with Simulated Attack Data.

Abstract— This research focuses on the requirements for and the creation of an intrusion detection system (IDS) dataset for an Internet of Things (IoT) network domain.

DARPA is releasing these files in the public domain to stimulate further research. Their release implies no obligation or desire to support additional work in this space. The data is released as-is. DARPA makes no warranties as to the correctness, accuracy, or usefulness of the released data. In fact, since the data was produced by research prototypes, it is practically guaranteed to be imperfect.

The data containing red team activities is divided into three sets, one for each of the three days of evaluation: 23Sep19, 24Sep19, and 25Sep19. A fourth set (23Sep19-night) contains no threats; it holds data from the first night of evaluations, when clients were left running unattended overnight to collect additional baseline data.

During the initial one thousand client test, each mainframe server hosted fifty Windows clients. Half of the clients were taken down from each server for data collection, reducing the number of clients to five hundred, which resulted in a client machine naming continuity gap (e.g. Sys001-Sys025, Sys051-Sys075, …, Sys951-Sys975).
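The resulting gaps in the client names can be reproduced with a short sketch. It assumes that the first 25 clients on each server were the ones kept, which matches the example ranges above but is not stated outright; the function name and parameters are illustrative.

```python
def retained_clients(total=1000, per_server=50, kept_per_server=25):
    """Return the client names left after half of each server's clients
    were removed.

    Assumes the first `kept_per_server` clients on each server were kept,
    consistent with the ranges Sys001-Sys025, Sys051-Sys075, ...
    """
    names = []
    for first in range(1, total + 1, per_server):
        for number in range(first, first + kept_per_server):
            names.append(f"Sys{number:03d}")
    return names
```

With the default values this yields 500 names across 20 servers, and names such as Sys026 to Sys050 do not appear.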

A full description of the contents, including message formats and file structure can be found in the file attached to this page and included in the root directory of the OpTC.tar.gz.