Dataset of Port Scanning Attacks on Emulation Testbed and Hardware-in-the-loop Testbed

Citation Author(s):
Hao Huang, Texas A&M University
Patrick Wlazlo, Texas A&M University
Abhijeet Sahu, Texas A&M University
Adele Walker, Texas A&M University
Ana Goulart, Texas A&M University
Katherine Davis, Texas A&M University
Laura Swiler, Sandia National Laboratories
Thomas Tarman, Sandia National Laboratories
Eric Vugrin, Sandia National Laboratories
Submitted by: Hao Huang
Last updated: Thu, 05/12/2022 - 09:45
DOI: 10.21227/cva5-nd75

Abstract 

Port scanning is a popular method for mapping a remote network and identifying the operating systems and applications running on it. It allows attackers to discover and then exploit vulnerabilities in the network. This dataset was generated by performing four port scanning attack scenarios on an 8-substation supervisory control and data acquisition (SCADA) system in three different environments: minimega at Sandia National Laboratories (SNL), the Common Open Research Emulator (CORE) at Texas A&M University, and the hardware-in-the-loop RESLab Testbed at Texas A&M University. The purpose of these experiments is to reproduce the attack scenarios in different environments and to validate the emulated communication systems against a hardware-based system, especially its industrial programmable logic controllers (PLCs). The dataset can help the community study the behavior of port scanning attacks and validate methodologies for detecting and preventing them.

Instructions: 

This dataset contains four folders: SNL_MathModel, SNL_minimega_emulation, TAMU_CORE_emulation, and TAMU_RESLab_physical. Each folder holds the processed data from the corresponding environment or model for the port scanning scenarios listed below.

  • SNL_minimega_emulation: This folder contains four datasets generated from minimega, split across two subfolders. The 100 sample folder contains two datasets for sequential port scanning with a 0% packet loss rate, one using the Fast Loud scanning strategy and one using the Slow Stealthy scanning strategy, each with 100 samples. The 1000 sample folder contains two datasets for sequential port scanning with a 10% packet loss rate, again one per scanning strategy, each with 1000 samples. Each dataset records the number of ports discovered as Open, Closed, or Filtered (Inconclusive) in the given system at 1-second intervals for each sample (run).
  • TAMU_CORE_emulation: This folder contains four datasets generated from CORE, one per subfolder, with each subfolder named for its scenario. The Random_fast_10percent_Drop folder covers random port scanning with a 10% packet loss rate using the Fast Loud scanning strategy. The Random_slow_10percent_Drop folder covers random port scanning with a 10% packet loss rate using the Slow Stealthy scanning strategy. The Sequential_fast_0percent_Drop folder covers sequential port scanning with a 0% packet loss rate using the Fast Loud scanning strategy. The Sequential_slow_0percent_Drop folder covers sequential port scanning with a 0% packet loss rate using the Slow Stealthy scanning strategy. Every scenario has 1000 samples. Each dataset records the number of ports discovered as Open, Closed, or Filtered (Inconclusive) in the given system at 1-second intervals for each sample (run).
  • TAMU_RESLab_physical: This folder contains four datasets generated from the RESLab Testbed, one per subfolder, with each subfolder named for its scenario. The Random_fast_10percent_Drop folder covers random port scanning with a 10% packet loss rate using the Fast Loud scanning strategy. The Random_slow_10percent_Drop folder covers random port scanning with a 10% packet loss rate using the Slow Stealthy scanning strategy. The Sequential_fast_0percent_Drop folder covers sequential port scanning with a 0% packet loss rate using the Fast Loud scanning strategy. The Sequential_slow_0percent_Drop folder covers sequential port scanning with a 0% packet loss rate using the Slow Stealthy scanning strategy. Every scenario has 1000 samples. Each dataset records the number of ports discovered as Open, Closed, or Filtered (Inconclusive) in the given system at 1-second intervals for each sample (run).
  • SNL_MathModel: This folder contains two datasets generated from a mathematical model that describes an attacker's port scanning progress and a defender's intrusion detection. Both consider random port scanning with a 10% packet loss rate, one using the Fast Loud scanning strategy and the other using the Slow Stealthy scanning strategy. The datasets give the average number of discovered Open and Closed ports in the given system at 10-second intervals.
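
The folder descriptions above suggest two common processing steps: reading the scenario parameters from a folder name, and averaging the per-run, 1-second port counts so they can be set beside the 10-second averages in SNL_MathModel. The Python sketch below illustrates both under stated assumptions. The file format of the datasets is not specified here, so the code works on in-memory lists; the function names (parse_scenario, mean_per_step, downsample) are hypothetical, the sample numbers are made up for illustration, and aligning the two series by keeping every tenth value from t = 0 is an assumption rather than a documented convention.

```python
import re
from statistics import mean

# Folder-name pattern described for TAMU_CORE_emulation and
# TAMU_RESLab_physical, e.g. "Sequential_fast_0percent_Drop".
# A space before "Drop" is tolerated as well (assumption).
SCENARIO_RE = re.compile(
    r"^(Random|Sequential)_(fast|slow)_(\d+)percent[_ ]Drop$"
)

def parse_scenario(folder_name):
    """Split a scenario folder name into scan order, strategy, and loss rate."""
    m = SCENARIO_RE.match(folder_name)
    if m is None:
        raise ValueError(f"unrecognized scenario folder: {folder_name!r}")
    order, speed, loss = m.groups()
    strategy = "Fast Loud" if speed == "fast" else "Slow Stealthy"
    return {
        "order": order.lower(),
        "strategy": strategy,
        "loss_percent": int(loss),
    }

def mean_per_step(runs):
    """Average equal-length per-second count series across runs."""
    if not runs:
        raise ValueError("no runs given")
    length = len(runs[0])
    if any(len(r) != length for r in runs):
        raise ValueError("runs must have equal length")
    # zip(*runs) groups the values of all runs at the same time step.
    return [mean(step) for step in zip(*runs)]

def downsample(series, step=10):
    """Keep every `step`-th value, starting at index 0."""
    return series[::step]

# Made-up illustrative data, NOT taken from the dataset:
# two runs, each giving the number of Open ports found per second.
runs = [
    [0, 1, 2, 2, 3],
    [0, 0, 2, 3, 3],
]
avg_open = mean_per_step(runs)

print(parse_scenario("Random_slow_10percent_Drop"))
print(avg_open)
print(downsample(avg_open, step=2))
```

With real data, each entry of runs would be one sample (run) loaded from a scenario folder, and downsample(avg_open, step=10) would reduce the 1-second averages to a 10-second grid.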

 

Funding Agency: 
LDRD Program at Sandia National Laboratories