Cyber-security Modbus ICS dataset
- Submitted by: Tiago Cruz
- Last updated: Thu, 01/31/2019 - 10:23
- DOI: 10.21227/pjff-1a03
Abstract
This dataset was generated on a small-scale process automation scenario using MODBUS/TCP equipment, for research on the application of machine learning (ML) techniques to cybersecurity in Industrial Control Systems (ICS). The testbed emulates a cyber-physical system (CPS) process controlled by a SCADA system using the MODBUS/TCP protocol. It consists of a liquid pump simulated by an electric motor controlled by a variable frequency drive (allowing for multiple rotor speeds), which is in turn controlled by a Programmable Logic Controller (PLC). The motor speed is determined by a set of predefined liquid temperature thresholds. The temperature is measured by a MODBUS Remote Terminal Unit (RTU) device acting as a temperature gauge, which is simulated by a potentiometer connected to an Arduino. The PLC communicates horizontally with the RTU, which gives insight into how this type of communication may affect the overall system. The PLC also communicates with the Human-Machine Interface (HMI) controlling the system. The testbed is depicted in the included image.
The provided sample corresponds to roughly one third of the total available captured traces. The full network trace datasets are available at:
https://github.com/tjcruz-dei/ICS_PCAPS/releases/tag/MODBUSTCP%231
This dataset was produced as part of the research effort for the ATENA H2020 EC project (H2020-DS-2015-1 700581).
Citation Request
Frazão, I. and Pedro Henriques Abreu and Tiago Cruz and Araújo, H. and Simões, P., "Denial of Service Attacks: Detecting the frailties of machine learning algorithms in the Classification Process", in 13th International Conference on Critical Information Infrastructures Security (CRITIS 2018), ed. Springer, Kaunas, Lithuania, September 24-26, 2018, Springer series on Security and Cryptology, 2018. DOI: 10.1007/978-3-030-05849-4_19
All the attack use cases provided at the aforementioned URL (https://github.com/tjcruz-dei/ICS_PCAPS/releases/tag/MODBUSTCP%231) are organized into three folders:
1- “capture1” contains the captured traces for the following scenarios:
-nominal state (no attacks, normal testbed operation - folder “clean”)
-ARP-based, Man-in-the-Middle attack (folder “mitm”)
-Modbus query flooding (folders “modbusQuery*”)
-ICMP flooding (folder “pingFloodDDoS”)
-TCP SYN flooding (folder “tcpSYNFloodDDoS”)
For each file, the naming format is “<capture interface>dump-<attack>-<attack subtype>-<attack duration>-<capture duration>”. For instance:
-“eth2dump-mitm-change-15m-0,5h_1.pcap” refers to a capture on interface eth2 for a MITM attack where data is changed on the fly. The attack ran for 15 minutes within a 0.5-hour capture timeframe.
2- “captures2” (hereby uploaded) and “captures3” contain the captured traces for the following scenarios, with several attack and capture time spans:
-Modbus query flooding (folders “modbusQueryFlooding”)
-ICMP flooding (folder “pingFloodDDoS”)
-TCP SYN flooding (folder “tcpSYNFloodDDoS”)
The file naming format is the one described in the previous item.
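The naming convention above can be decoded programmatically when labelling traces. The following is a minimal Python sketch, not part of the dataset tooling: the names `CaptureName`, `parse_capture_name` and `duration_to_seconds` are hypothetical, the trailing “_1” in the example is assumed to be a run or sequence suffix (the format string does not describe it), and the duration units are assumed to be “m” for minutes and “h” for hours with a comma as decimal separator. Files that do not carry all five fields are simply not matched.

```python
import re
from typing import NamedTuple, Optional


class CaptureName(NamedTuple):
    """Fields of a trace file name (hypothetical helper type)."""
    interface: str
    attack: str
    subtype: str
    attack_duration: str
    capture_duration: str
    suffix: Optional[str]  # assumed run/sequence suffix such as "1"


# <interface>dump-<attack>-<subtype>-<attack duration>-<capture duration>[_<suffix>].pcap
_PATTERN = re.compile(
    r"^(?P<interface>[^-]+?)dump-(?P<attack>[^-]+)-(?P<subtype>[^-]+)-"
    r"(?P<attack_duration>[^-]+)-(?P<capture_duration>[^-_]+)"
    r"(?:_(?P<suffix>[^.]+))?\.pcap$"
)

# Assumed unit letters: "m" = minutes, "h" = hours.
_UNIT_SECONDS = {"m": 60.0, "h": 3600.0}


def parse_capture_name(filename: str) -> Optional[CaptureName]:
    """Split a file name into its fields, or return None if it does not match."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    return CaptureName(**match.groupdict())


def duration_to_seconds(text: str) -> float:
    """Convert a duration such as "15m" or "0,5h" into seconds."""
    value, unit = text[:-1], text[-1]
    return float(value.replace(",", ".")) * _UNIT_SECONDS[unit]


info = parse_capture_name("eth2dump-mitm-change-15m-0,5h_1.pcap")
```

Here `info.attack` is `"mitm"` and `duration_to_seconds(info.attack_duration)` gives the attack duration in seconds, which can serve as a starting point for labelling the packets of a capture.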
Note: The sample provided on this site corresponds to the compressed contents of the second folder (“captures2”). If you need further information or assistance, please contact:
-pha@dei.uc.pt (Pedro Abreu)
-tjcruz@dei.uc.pt (Tiago Cruz)
Attribute Information
All trace captures are encoded in the PCAP file format, version 2.4 (header bytes “d4 c3 b2 a1”) [2]. Each file includes the following data:
-a global header with the structure:
typedef struct pcap_hdr_s {
    guint32 magic_number;   /* magic number */
    guint16 version_major;  /* major version number */
    guint16 version_minor;  /* minor version number */
    gint32  thiszone;       /* GMT to local correction */
    guint32 sigfigs;        /* accuracy of timestamps */
    guint32 snaplen;        /* max length of captured packets, in octets */
    guint32 network;        /* data link type */
} pcap_hdr_t;
-a set of captured network packets, each one encoded with a header (next shown), followed by the packet data, as a data blob of “incl_len” bytes:
typedef struct pcaprec_hdr_s {
    guint32 ts_sec;    /* timestamp seconds */
    guint32 ts_usec;   /* timestamp microseconds */
    guint32 incl_len;  /* number of octets of packet saved in file */
    guint32 orig_len;  /* actual length of packet */
} pcaprec_hdr_t;
The pcap format is open and universally supported by tools such as Wireshark [3] or tcpdump [4]. These tools provide their own packet dissectors (to decode the packet contents).
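For readers who want to inspect the files without those tools, the two header structures above map directly onto fixed-size binary records. The following Python sketch reads them with the standard `struct` module. It assumes the little-endian, microsecond-resolution variant implied by the header bytes “d4 c3 b2 a1” and rejects anything else; the function names `read_global_header` and `iter_packets` are hypothetical, and the demo capture at the end is synthetic data built in memory, not taken from the dataset.

```python
import io
import struct
from typing import BinaryIO, Dict, Iterator, Tuple

# Little-endian layouts mirroring pcap_hdr_t (24 bytes) and pcaprec_hdr_t (16 bytes).
GLOBAL_HDR = struct.Struct("<IHHiIII")
RECORD_HDR = struct.Struct("<IIII")
PCAP_MAGIC = 0xA1B2C3D4  # stored on disk as d4 c3 b2 a1

_GLOBAL_FIELDS = ("magic_number", "version_major", "version_minor",
                  "thiszone", "sigfigs", "snaplen", "network")


def read_global_header(stream: BinaryIO) -> Dict[str, int]:
    """Read and validate the global header at the start of a capture."""
    raw = stream.read(GLOBAL_HDR.size)
    if len(raw) != GLOBAL_HDR.size:
        raise ValueError("file too short for a pcap global header")
    header = dict(zip(_GLOBAL_FIELDS, GLOBAL_HDR.unpack(raw)))
    if header["magic_number"] != PCAP_MAGIC:
        raise ValueError("unsupported byte order or pcap variant")
    return header


def iter_packets(stream: BinaryIO) -> Iterator[Tuple[int, int, int, bytes]]:
    """Yield (ts_sec, ts_usec, orig_len, data) for each packet record."""
    while True:
        raw = stream.read(RECORD_HDR.size)
        if len(raw) < RECORD_HDR.size:
            return  # clean end of file, or a truncated record header
        ts_sec, ts_usec, incl_len, orig_len = RECORD_HDR.unpack(raw)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            return  # truncated packet data
        yield ts_sec, ts_usec, orig_len, data


# Synthetic one-packet capture, built in memory purely for illustration.
demo = io.BytesIO(
    GLOBAL_HDR.pack(PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)
    + RECORD_HDR.pack(10, 500000, 3, 3)
    + b"abc"
)
header = read_global_header(demo)
packets = list(iter_packets(demo))
```

With a real trace, the same two calls would be applied to a file opened in binary mode, for example `open(path, "rb")`. Decoding the packet contents still requires a dissector.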
The packet structure includes the items which are used for classification, according to the paper. The traces were “then processed for feature extraction within Matlab, where a total of 68 features were extracted”:
-packet timestamps (“ts_usec”, in packet headers)
-inter-packet arrival times (deltas of “ts_usec” relative to the start of capture, in packet headers)
-binary features defining which protocols were involved (requires dissection)
-every field of the Ethernet, ARP, IP, ICMP, UDP, TCP and MODBUS over TCP headers (requires dissection)
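As an illustration of the first three feature groups, the sketch below derives timing values and binary protocol flags from raw packet records. It is not the authors' Matlab pipeline and does not reproduce the 68 features. The assumptions are: timestamps are combined from “ts_sec” and “ts_usec” into microseconds, because “ts_usec” alone wraps every second; frames are untagged Ethernet II (data link type 1, no VLAN tags); and MODBUS/TCP is recognized by TCP port 502, its registered port. The function names `arrival_features` and `protocol_flags` are hypothetical.

```python
import struct
from typing import Dict, List, Tuple


def arrival_features(stamps: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return (offset from capture start, delta to previous packet) in microseconds."""
    if not stamps:
        return []
    absolute = [sec * 1_000_000 + usec for sec, usec in stamps]
    start = previous = absolute[0]
    features = []
    for t in absolute:
        features.append((t - start, t - previous))
        previous = t
    return features


def protocol_flags(frame: bytes) -> Dict[str, bool]:
    """Flag which protocols appear in an untagged Ethernet II frame."""
    flags = dict.fromkeys(("arp", "ip", "icmp", "udp", "tcp", "modbus"), False)
    if len(frame) < 14:
        return flags  # too short for an Ethernet header
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == 0x0806:
        flags["arp"] = True
    elif ethertype == 0x0800 and len(frame) >= 34:
        flags["ip"] = True
        header_len = (frame[14] & 0x0F) * 4  # IPv4 IHL, in bytes
        protocol = frame[23]                 # IPv4 protocol field
        flags["icmp"] = protocol == 1
        flags["udp"] = protocol == 17
        if protocol == 6:
            flags["tcp"] = True
            l4 = 14 + header_len
            if len(frame) >= l4 + 4:
                sport, dport = struct.unpack_from("!HH", frame, l4)
                # Port-based guess: assumes MODBUS/TCP runs on port 502.
                flags["modbus"] = 502 in (sport, dport)
    return flags
```

The fourth feature group (every header field of each protocol) needs a full dissector such as the ones shipped with Wireshark or tcpdump.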