Datasets
Standard Dataset
Network Image datasets for IDS using Anomaly Detection
- Citation Author(s):
- Submitted by:
- Yonatan Embiza ...
- Last updated:
- Thu, 08/08/2024 - 23:14
- DOI:
- 10.21227/09ng-rv70
- Data Format:
- Research Article Link:
- Links:
- License:
- Categories:
- Keywords:
Abstract
As the world increasingly becomes more interconnected, the demand for safety and security is ever-increasing, particularly for industrial networks. This has prompted numerous researchers to investigate different methodologies and techniques suitable for intrusion detection systems (IDS) requirements. Over the years, many studies have proposed various solutions in this regard, including signature-based and machine learning (ML)-based systems. More recently, researchers are considering deep learning (DL)-based anomaly detection approaches. Most proposed works in this research field aim to achieve either one or a combination of high accuracy, considerably low false alarm rates (FARs), high classification specificity and detection sensitivity, lightweight DL models, or other ML and DL-related performance measurement metrics. In this study, we propose a novel method to convert a raw dataset to an image dataset to magnify patterns by utilizing the Short-Term Fourier transform (STFT). The resulting high-quality image dataset allowed us to devise an anomaly detection system for IDS using a simple lightweight convolutional neural network (CNN) that classifies denial of service and distributed denial of service. The proposed methods were evaluated using a modern dataset, CSE-CIC-IDS2018, and a legacy dataset, NSLKDD. We have also applied a combined dataset to assess the generalization of the proposed model across various datasets. Our experimental results have demonstrated that the proposed methods achieved high accuracy and considerably low FARs with high specificity and sensitivity. The resulting loss and accuracy curves have demonstrated the efficacy of our raw dataset to image dataset conversion methodology, which is evident as an excellent generalization of the proposed lightweight CNN model was observed, effectively avoiding overfitting. This holds for both the modern and legacy datasets, including their mixed versions.
The use of the dataset is straightforward
Comments
√