MQTT-IoT-IDS2020: MQTT Internet of Things Intrusion Detection Dataset

- Submitted by: Hanan Hindy
- DOI: 10.21227/bhxy-ep04
Abstract
The Message Queuing Telemetry Transport (MQTT) protocol is one of the most widely used standards in Internet of Things (IoT) machine-to-machine communication. The growing number of available IoT devices and protocols in use reinforces the need for new and robust Intrusion Detection Systems (IDS). However, building an IoT IDS requires datasets with which to process, train and evaluate the models. The dataset presented in this paper is the first to simulate an MQTT-based network. It is generated using a simulated MQTT network architecture comprising twelve sensors, a broker, a simulated camera, and an attacker. Five scenarios are recorded: (1) normal operation, (2) aggressive scan, (3) UDP scan, (4) Sparta SSH brute-force, and (5) MQTT brute-force attack. The raw pcap files are saved, and features are then extracted from them at three abstraction levels: (a) packet features, (b) unidirectional flow features and (c) bidirectional flow features. The CSV feature files in the dataset are suited to Machine Learning (ML) usage, while the raw pcap files are suitable for deeper analysis of MQTT IoT network communication and the associated attacks.
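To make the difference between unidirectional and bidirectional flows concrete, the following is a minimal sketch of how packet records could be grouped at each level. It is not the extraction code used to build the dataset: the packet field names (`src`, `sport`, `dst`, `dport`, `proto`, `length`), the peer address 10.0.0.1, and the two aggregated features are assumptions chosen for illustration only.

```python
from collections import defaultdict

# Hypothetical packet records. Field names and the 10.0.0.1 peer address are
# illustrative assumptions, not the dataset's actual columns or hosts.
packets = [
    {"src": "192.168.2.5", "sport": 40000, "dst": "10.0.0.1", "dport": 1883, "proto": "tcp", "length": 60},
    {"src": "192.168.2.5", "sport": 40000, "dst": "10.0.0.1", "dport": 1883, "proto": "tcp", "length": 80},
    {"src": "10.0.0.1", "sport": 1883, "dst": "192.168.2.5", "dport": 40000, "proto": "tcp", "length": 54},
]

def uni_key(p):
    """Unidirectional flow key: the ordered 5-tuple, so A->B and B->A differ."""
    return (p["src"], p["sport"], p["dst"], p["dport"], p["proto"])

def bi_key(p):
    """Bidirectional flow key: endpoints are sorted, so A->B and B->A match."""
    a = (p["src"], p["sport"])
    b = (p["dst"], p["dport"])
    return (min(a, b), max(a, b), p["proto"])

def aggregate(pkts, key_fn):
    """Group packets by key and compute two simple per-flow features."""
    flows = defaultdict(lambda: {"num_pkts": 0, "total_bytes": 0})
    for p in pkts:
        flow = flows[key_fn(p)]
        flow["num_pkts"] += 1
        flow["total_bytes"] += p["length"]
    return dict(flows)

uni_flows = aggregate(packets, uni_key)  # two flows, one per direction
bi_flows = aggregate(packets, bi_key)    # one flow covering both directions
```

With the three sample packets above, the unidirectional grouping yields two flows and the bidirectional grouping yields one, which is why bidirectional features need separate forward and backward values.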
Instructions:
The dataset consists of 5 pcap files: normal.pcap, sparta.pcap, scan_A.pcap, mqtt_bruteforce.pcap and scan_sU.pcap. Each file is a recording of one scenario: normal operation, Sparta SSH brute-force, aggressive scan, MQTT brute-force and UDP scan, respectively. The attack pcap files also contain background normal operation traffic. The attacker IP address is “192.168.2.5”. Basic packet features are extracted from the pcap files into CSV files that share the names of the corresponding pcap files. The features include flags, length, MQTT message parameters, etc. Unidirectional and bidirectional flow features are then extracted. Note that for the bidirectional flows, some features (marked with *) have two values, one for the forward flow and one for the backward flow. Both values are recorded and distinguished by the prefix “fwd_” for forward and “bwd_” for backward.
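The points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names `ip_src`, `ip_dst` and `num_pkts`, the 10.0.0.x addresses, and the inline sample rows are hypothetical and do not come from the dataset. Only the file-to-scenario mapping, the attacker IP, and the “fwd_”/“bwd_” prefixes are taken from the instructions. Labelling by attacker IP is one plausible way to separate attack traffic from the background normal traffic in the attack files, not necessarily the authors' procedure.

```python
import csv
import io

ATTACKER_IP = "192.168.2.5"  # stated in the dataset instructions

# File-to-scenario mapping as listed in the instructions.
SCENARIOS = {
    "normal.pcap": "normal operation",
    "sparta.pcap": "Sparta SSH brute-force",
    "scan_A.pcap": "aggressive scan",
    "mqtt_bruteforce.pcap": "MQTT brute-force",
    "scan_sU.pcap": "UDP scan",
}

def label_row(row, scenario, src_col="ip_src", dst_col="ip_dst"):
    """Label a row with the file's scenario if the attacker is an endpoint.

    The column names are assumptions; adjust them to the real CSV header.
    """
    if ATTACKER_IP in (row.get(src_col), row.get(dst_col)):
        return scenario
    return "normal operation"

def split_directional(row):
    """Split prefixed bidirectional features into forward and backward dicts."""
    fwd = {k[len("fwd_"):]: v for k, v in row.items() if k.startswith("fwd_")}
    bwd = {k[len("bwd_"):]: v for k, v in row.items() if k.startswith("bwd_")}
    return fwd, bwd

# Hypothetical two-row sample standing in for a bidirectional flow CSV.
sample_csv = (
    "ip_src,ip_dst,fwd_num_pkts,bwd_num_pkts\n"
    "192.168.2.5,10.0.0.1,4,2\n"
    "10.0.0.2,10.0.0.1,3,3\n"
)
rows = list(csv.DictReader(io.StringIO(sample_csv)))
labels = [label_row(r, SCENARIOS["scan_A.pcap"]) for r in rows]
fwd, bwd = split_directional(rows[0])
```

To use this on the real files, replace the `io.StringIO` sample with an opened CSV file and pass the actual source and destination column names to `label_row`.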
In reply to how i can download csv file by Khizra Arooj
First, create an IEEE DataPort account. Then log in to your account. After that, you can access the datasets.