Dataset of Port Scanning Attacks on Emulation Testbed and Hardware-in-the-loop Testbed

Citation Author(s):
Hao Huang, Texas A&M University
Patrick Wlazlo, Texas A&M University
Abhijeet Sahu, Texas A&M University
Adele Walker, Texas A&M University
Ana Goulart, Texas A&M University
Katherine Davis, Texas A&M University
Laura Swiler, Sandia National Laboratories
Thomas Tarman, Sandia National Laboratories
Eric Vugrin, Sandia National Laboratories
Submitted by: Hao Huang
Last updated: Thu, 05/12/2022 - 09:45
DOI: 10.21227/cva5-nd75
Data Format: Excel
License: Creative Commons Attribution


Categories: Artificial Intelligence, Machine Learning, Electric Utility, Smart Grid, Security, Communications
Keywords: testbeds, port scanning, reconnaissance attacks, cyber experimentation reproducibility

Abstract

Port scanning is a popular method for mapping a remote network or identifying operating systems and applications. It allows attackers to discover and exploit vulnerabilities in the network. The dataset was generated by performing four port scanning attack scenarios on an 8-substation supervisory control and data acquisition (SCADA) system in three different environments: minimega at Sandia National Laboratories (SNL), the Common Open Research Emulator (CORE) at Texas A&M University, and the hardware-in-the-loop RESLab Testbed at Texas A&M University. The purpose of these experiments is to reproduce the attack scenarios in different environments and to validate the emulated communication system against a hardware-based system, especially the industrial programmable logic controllers (PLCs). The dataset can help the community study the behavior of port scanning attacks and validate methodologies for detecting and preventing them.

Instructions:

There are four folders in this dataset: SNL_MathModel, SNL_minimega_emulation, TAMU_CORE_emulation, and TAMU_RESLab_physical. Each folder contains the processed data from the corresponding environment under the port scanning scenarios described below.
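As a quick way to check an extracted copy of the archive against the layout above, the following minimal sketch lists the spreadsheet files under each of the four folders. The extraction path, the function name `index_dataset`, and the `.xlsx`/`.xls` extensions are assumptions for illustration; the page only states that the data format is Excel.

```python
from pathlib import Path

# Top-level folder names as listed in the dataset instructions.
ENVIRONMENT_FOLDERS = (
    "SNL_MathModel",
    "SNL_minimega_emulation",
    "TAMU_CORE_emulation",
    "TAMU_RESLab_physical",
)

# Assumed file extensions; the dataset page only says "Excel".
SPREADSHEET_SUFFIXES = {".xlsx", ".xls"}


def index_dataset(root):
    """Map each environment folder to the spreadsheet files found beneath it.

    `root` is the (assumed) directory where the archive was extracted.
    A folder that is missing from the extraction maps to an empty list.
    """
    root = Path(root)
    index = {}
    for name in ENVIRONMENT_FOLDERS:
        folder = root / name
        if not folder.is_dir():
            index[name] = []
            continue
        files = [
            p for p in folder.rglob("*")
            if p.is_file() and p.suffix.lower() in SPREADSHEET_SUFFIXES
        ]
        # Store paths relative to the root so the output is portable.
        index[name] = sorted(str(p.relative_to(root)) for p in files)
    return index
```

Calling `index_dataset("path/to/extracted/archive")` returns a dictionary keyed by the four folder names, which makes it easy to spot a missing scenario folder before any analysis.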

SNL_minimega_emulation: This folder contains four datasets generated from minimega, split across two subfolders. The 100 sample folder contains two datasets for sequential port scanning with a 0% packet loss rate, one using the Fast Loud scanning strategy and one using the Slow Stealthy scanning strategy, each with 100 samples. The 1000 sample folder contains two datasets for sequential port scanning with a 10% packet loss rate, one using the Fast Loud scanning strategy and one using the Slow Stealthy scanning strategy, each with 1000 samples. Each dataset reports the number of Open, Closed, and Filtered (Inconclusive) ports discovered in the system at 1-second intervals for each sample (run).

TAMU_CORE_emulation: This folder contains four datasets generated from CORE, one in each of four subfolders named after the scenario. The Random_fast_10percent Drop folder contains the dataset for random port scanning with a 10% packet loss rate using the Fast Loud scanning strategy, with 1000 samples. The Random_slow_10percent_Drop folder contains the dataset for random port scanning with a 10% packet loss rate using the Slow Stealthy scanning strategy, with 1000 samples. The Sequential_fast_0percent_Drop folder contains the dataset for sequential port scanning with a 0% packet loss rate using the Fast Loud scanning strategy, with 1000 samples. The Sequential_slow_0percent_Drop folder contains the dataset for sequential port scanning with a 0% packet loss rate using the Slow Stealthy scanning strategy, with 1000 samples. Each dataset reports the number of Open, Closed, and Filtered (Inconclusive) ports discovered in the system at 1-second intervals for each sample (run).

TAMU_RESLab_physical: This folder contains four datasets generated from the RESLab Testbed, one in each of four subfolders named after the scenario. The Random_fast_10percent_Drop folder contains the dataset for random port scanning with a 10% packet loss rate using the Fast Loud scanning strategy, with 1000 samples. The Random_slow_10percent_Drop folder contains the dataset for random port scanning with a 10% packet loss rate using the Slow Stealthy scanning strategy, with 1000 samples. The Sequential_fast_0percent_Drop folder contains the dataset for sequential port scanning with a 0% packet loss rate using the Fast Loud scanning strategy, with 1000 samples. The Sequential_slow_0percent_Drop folder contains the dataset for sequential port scanning with a 0% packet loss rate using the Slow Stealthy scanning strategy, with 1000 samples. Each dataset reports the number of Open, Closed, and Filtered (Inconclusive) ports discovered in the system at 1-second intervals for each sample (run).
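Because the TAMU subfolder names encode the scenario, they can be turned into structured metadata when iterating over the data. The sketch below is one possible way to do this; the `Scenario` type and `parse_scenario` function are hypothetical names introduced here, and the pattern assumes only the naming scheme shown in the folder names above (it accepts either an underscore or a space before "Drop", since both spellings appear in the instructions).

```python
import re
from typing import NamedTuple


class Scenario(NamedTuple):
    order: str         # "Random" or "Sequential" scanning order
    strategy: str      # "fast" (Fast Loud) or "slow" (Slow Stealthy)
    loss_percent: int  # packet loss rate in percent


# Matches names such as "Sequential_slow_0percent_Drop".
_SCENARIO_PATTERN = re.compile(
    r"^(Random|Sequential)_(fast|slow)_(\d+)percent[_ ]Drop$"
)


def parse_scenario(folder_name):
    """Parse a scenario folder name into a Scenario tuple.

    Raises ValueError when the name does not follow the assumed scheme.
    """
    match = _SCENARIO_PATTERN.match(folder_name)
    if match is None:
        raise ValueError(f"unrecognized scenario folder name: {folder_name!r}")
    order, strategy, loss = match.groups()
    return Scenario(order, strategy, int(loss))
```

For example, `parse_scenario("Random_fast_10percent_Drop")` yields a tuple with order `"Random"`, strategy `"fast"`, and a loss rate of 10 percent, which can then be used as a grouping key when comparing CORE and RESLab results.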

SNL_MathModel: This folder contains two datasets generated from a mathematical model that describes the progress of port scanning by an attacker and intrusion detection by a defender. Both consider random port scanning with a 10% packet loss rate, one using the Fast Loud scanning strategy and the other using the Slow Stealthy scanning strategy. The datasets give the average number of discovered Open ports and Closed ports in the system at 10-second intervals.
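Since the emulation and testbed datasets are recorded per run at 1-second intervals while the math model reports averages at 10-second intervals, comparing them requires averaging across runs and then sampling every tenth second. The following is a minimal sketch of that step, not the authors' procedure. It assumes each run has already been loaded as a list of per-second port counts, that all runs have equal length, and that the first entry corresponds to t = 1 s; the actual spreadsheet layout may differ, and the function names are hypothetical.

```python
def mean_curve(runs):
    """Average per-second counts across runs.

    `runs` is a list of equal-length lists, one list per sample (run),
    where each entry is the number of discovered ports at that second.
    """
    if not runs:
        raise ValueError("at least one run is required")
    length = len(runs[0])
    if any(len(run) != length for run in runs):
        raise ValueError("all runs must have the same length")
    # zip(*runs) groups the values of all runs at the same time step.
    return [sum(values) / len(runs) for values in zip(*runs)]


def sample_every(curve, step=10):
    """Keep the value at every `step`-th second.

    Assumes index 0 holds t = 1 s, so index step - 1 holds t = step s.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    return curve[step - 1::step]
```

With these helpers, `sample_every(mean_curve(open_port_runs), step=10)` produces a curve on the same 10-second grid as the math model output, so the two can be compared point by point.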

Funding Agency:

LDRD Program at the Sandia National Laboratories

Dataset Files

SNL-TAMU data.zip (4.82 MB)

Documentation

Attachment	Size
Portscanning Attacks Datasets in Emulation Testbed and HIL Testbet.pdf	1.09 MB
