Dataset for Image-Based Traffic Classification in SDN

Citation Author(s):
Hicham YZZOGH
Submitted by:
Hicham Yzzogh
Last updated:
Sun, 05/12/2024 - 12:16
DOI:
10.21227/722d-7p84

Abstract 

Flow-to-image conversion is a pivotal preprocessing step in image-based intrusion detection systems (IDS), where the representation of network flow data significantly influences classifier performance. In this study, we examine how three distinct methods of transforming flow data into images affect classifier performance, using a subset of the InSDN dataset that covers five traffic classes: Normal traffic and four attack types (DoS, DDoS, Probe, and BFA).

Method 1 converts each row (flow) into a bar chart: the feature values are normalized and rendered as a Matplotlib-generated image. The target variable containing the label is excluded from the conversion.

Method 2 uses the Image Generator for Tabular Data (IGTD) framework with Euclidean distance. IGTD transforms tabular data into grayscale images whose spatial layout suits Convolutional Neural Networks (CNNs): it assigns features to pixels so that the ranking of pairwise feature distances matches the ranking of pairwise pixel distances as closely as possible. Through iterative optimization, IGTD rearranges the feature-to-pixel assignment to minimize the discrepancy between the two rankings, so that similar features end up close together in the resulting image.

Method 3 follows the same IGTD procedure but uses Manhattan distance instead of Euclidean distance.

By evaluating classifiers trained on the images generated by each method, we aim to determine how the choice of flow-to-image conversion technique affects classification accuracy in intrusion detection. Our findings shed light on the suitability of each method for IDS applications and contribute to the optimization of network security systems.
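
As an illustration of Method 1, the following is a minimal sketch and not the authors' actual script. It assumes that all feature columns are numeric, that the target column is named `Label`, that normalization is column-wise min-max scaling, and that a 64x64 pixel image is wanted; none of these details are stated in the abstract. The function names are hypothetical.

```python
import io

LABEL_COLUMN = "Label"  # assumed name of the target column; adjust to the CSV header


def split_features(record, label_column=LABEL_COLUMN):
    """Separate the numeric feature values of one flow from its class label."""
    features = [float(v) for k, v in record.items() if k != label_column]
    return features, record[label_column]


def normalize_columns(rows):
    """Min-max scale each column of equal-length numeric rows to [0, 1].

    Constant columns are mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return [
        [(v - lo) / span if span else 0.0 for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


def render_bar_chart(heights, size_px=64):
    """Render one normalized flow as a bar chart and return the PNG bytes."""
    # Imported here so the helpers above stay usable without Matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # headless backend, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(size_px / 100, size_px / 100), dpi=100)
    ax.bar(range(len(heights)), heights, width=1.0, color="black")
    ax.set_ylim(0, 1)
    ax.axis("off")  # keep only the bars; ticks and frame carry no information
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png")
    plt.close(fig)  # release the figure to keep memory flat over many flows
    return buffer.getvalue()
```

A typical use would read the CSV with `csv.DictReader`, call `split_features` on each record so the label never reaches the image, normalize the collected feature rows once, and write each `render_bar_chart` result into the subfolder named after the flow's label.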

Instructions: 

The dataset contains a CSV file and three folders, one per conversion method: flow-to-bar charts, IGTD based on Euclidean distance, and IGTD based on Manhattan distance. Each folder holds the images produced by its method, organized into five subfolders, one per traffic class (DoS, DDoS, Probe, BFA, and Normal).
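
The folder layout described above can be indexed with a few lines of standard-library Python. This is a sketch under assumptions: the dataset root path is supplied by the caller, the folder names are taken as found on disk (they are not listed here), and the image file extensions in `IMAGE_SUFFIXES` are a guess. `index_images` and `count_by_class` are hypothetical helper names.

```python
from collections import Counter
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed image formats


def index_images(root):
    """Return (method, class_label, path) tuples for every image under root.

    Expects root/<method folder>/<class folder>/<image files>; files placed
    directly in root, such as the CSV, are skipped.
    """
    records = []
    for method_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for class_dir in sorted(p for p in method_dir.iterdir() if p.is_dir()):
            for image in sorted(class_dir.iterdir()):
                if image.suffix.lower() in IMAGE_SUFFIXES:
                    records.append((method_dir.name, class_dir.name, image))
    return records


def count_by_class(records):
    """Count images per (method, class_label) pair to check class balance."""
    return Counter((method, label) for method, label, _ in records)
```

Because the class label is taken from the subfolder name, the same index can feed any image classifier, and the per-class counts make it easy to confirm that the three method folders contain the same number of flows per class.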