X-IIoTID: A Connectivity- and Device-agnostic Intrusion Dataset for Industrial Internet of Things
Industrial Internet of Things (IIoT) systems are high-value cyber targets due to the nature of the devices and connectivity protocols they deploy. They are easy to compromise and, as they are connected on a large scale with high-value data content, the compromise of any single device can extend to the whole system and disrupt critical functions. Various security solutions exist that detect and mitigate intrusions. However, as they lack the capability to deal with an IIoT's co-existing heterogeneity and interoperability, developing new universal security solutions to fit its requirements is critical. This is challenging due to the scarcity of accurate data about IIoT systems' activities, connectivities and attack behaviors. In addition, owing to their multi-platform connectivity protocols and multi-vendor devices, collecting and creating such data is also challenging. To tackle these issues, we propose a holistic approach for generating an appropriate intrusion dataset for an IIoT, called X-IIoTID: a connectivity- and device-agnostic intrusion dataset that fits the heterogeneity and interoperability of IIoT systems. It includes the behaviors of new IIoT connectivity protocols, activities of recent devices, diverse attack types and scenarios, and various attack protocols. It defines an attack taxonomy and consists of multi-view features, such as network traffic, host resources, logs and alerts. X-IIoTID is evaluated using popular machine and deep learning algorithms and compared with eighteen intrusion datasets to verify its novelty.
The X-IIoTID dataset is a carefully constructed simulation of recent attackers' tactics, techniques and procedures together with realistic IIoT system activity. It covers the devices of industrial control loops (i.e., sensors, actuators and controllers); edge, mobile and cloud traffic and activities; the behaviours of new connectivity protocols (e.g., MQTT, CoAP, WebSocket) and services; diverse communication patterns (Machine-to-Machine, Human-to-Machine, and Machine-to-Human); and large volumes of network traffic and system events. Its features are connectivity- and device-agnostic, making it suitable for the heterogeneous nature and interoperability demands of IIoT systems. These features were extracted from network traffic, system logs, application logs, device resources (CPU, input/output, memory, and others), and commercial intrusion detection systems' logs (OSSEC and Zeek/Bro).
The final version of the X-IIoTID dataset has 820,834 instances (421,417 normal and 399,417 attack) with 68 features, including three label levels: normal vs. attack, normal vs. attack sub-category, and normal vs. attack sub-sub-category.
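As a quick sanity check on these figures, the sketch below verifies that the two class counts sum to the stated total and shows one way the three label levels could be tallied. The column names `label_binary`, `label_category` and `label_type` are hypothetical placeholders, not the dataset's actual headers, and the sample rows are illustrative rather than taken from the data.

```python
from collections import Counter

# Counts as reported in the dataset description.
NORMAL = 421_417
ATTACK = 399_417
TOTAL = 820_834

# Hypothetical column names for the three label levels
# (assumed for illustration; not the dataset's real headers).
LABEL_COLUMNS = ("label_binary", "label_category", "label_type")


def class_balance(normal: int, attack: int) -> dict:
    """Return the total and the share of normal and attack instances."""
    total = normal + attack
    return {"total": total, "normal": normal / total, "attack": attack / total}


def tally_labels(rows, columns=LABEL_COLUMNS):
    """Count the values seen at each label level for an iterable of dict rows."""
    rows = list(rows)
    return {col: Counter(row[col] for row in rows) for col in columns}


balance = class_balance(NORMAL, ATTACK)

# Illustrative rows only; real rows would be read from the dataset file.
sample_rows = [
    {"label_binary": "Normal", "label_category": "Normal", "label_type": "Normal"},
    {"label_binary": "Attack", "label_category": "Reconnaissance", "label_type": "Generic scanning"},
    {"label_binary": "Attack", "label_category": "Exploitation", "label_type": "Reverse shell"},
]
counts = tally_labels(sample_rows)
```

The split is close to balanced, roughly 51% normal and 49% attack, so plain accuracy is a less misleading metric at the binary level than it would be on a heavily skewed dataset. The finer label levels may still be imbalanced.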
The attack types in the dataset, grouped by category:
Reconnaissance: generic scanning, scanning vulnerabilities, WebSocket fuzzing, and discovering CoAP resources.
Weaponisation: brute force attack, dictionary attack, and malicious insider.
Exploitation: reverse shell and Man-in-the-Middle.
Lateral Movement: MQTT cloud broker-subscription, Modbus-register reading, and TCP Relay attack.
Command and Control.
Tampering: poisoning of cloud data (i.e., false data injection) and fake notification.
Ransom Denial of Service attack.
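The categories listed above can be encoded as a simple two-level mapping, which is convenient when collapsing fine-grained attack labels into their parent category. This is a minimal sketch: the strings follow the wording of the list above and are not guaranteed to match the label values stored in the dataset files, and `category_of` is a hypothetical helper.

```python
# Attack taxonomy as listed above: category -> attack types.
# Categories listed without sub-types map to an empty list.
# Spelling follows the description, not necessarily the dataset's label values.
TAXONOMY = {
    "Reconnaissance": [
        "Generic scanning",
        "Scanning vulnerabilities",
        "WebSocket fuzzing",
        "Discovering CoAP resources",
    ],
    "Weaponisation": ["Brute force attack", "Dictionary attack", "Malicious insider"],
    "Exploitation": ["Reverse shell", "Man-in-the-Middle"],
    "Lateral Movement": [
        "MQTT cloud broker-subscription",
        "Modbus-register reading",
        "TCP Relay attack",
    ],
    "Command and Control": [],
    "Tampering": ["Poisoning of cloud data", "Fake notification"],
    "Ransom Denial of Service": [],
}

# Reverse index: lower-cased attack type -> category.
_TYPE_TO_CATEGORY = {
    attack_type.lower(): category
    for category, attack_types in TAXONOMY.items()
    for attack_type in attack_types
}


def category_of(attack_type: str):
    """Return the parent category of an attack type, or None if it is not listed."""
    return _TYPE_TO_CATEGORY.get(attack_type.strip().lower())
```

For example, `category_of("reverse shell")` returns `"Exploitation"`, while an unlisted name returns `None`.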
The full description of the dataset: https://ieeexplore.ieee.org/document/9504604