DDoS Attack Dataset
- Submitted by: Radhakrishna Va...
- Last updated: Mon, 07/10/2023 - 02:38
- DOI: 10.21227/hg6t-z226
Abstract
Iman Sharafaldin et al. generated real-time network traffic and made it available on the Canadian Institute for Cybersecurity (CIC) website. The researchers published the traffic data in both PCAP and CSV formats. The traffic was generated over two days: the training day was January 12th, 2018, and the testing day was March 11th, 2018. On the training day, 12 attacks were executed and captured alongside the background traffic flows: SNMP, TFTP, SSDP, DNS, SYN, LDAP, NTP, NetBIOS, MSSQL, UDP, WebDDoS, and UDP-Lag. On the testing day, the background traffic flows were recorded together with seven attacks: PortScan, NetBIOS, LDAP, MSSQL, UDP, UDP-Lag, and SYN. The original dataset has more than 5 crore (50 million) instances.
From the PCAP files of this dataset, we randomly extracted 4 lakh (400,000) network traffic instances using CICFlowMeter, which produced 84 features. We then eliminated 24 of these features on the basis of the following properties: (i) zero standard deviation, (ii) static features, (iii) redundant features, and (iv) information gain. The final dataset therefore has 60 features.
This DDoS attack dataset can be used to evaluate the performance of machine learning classifiers and deep learning models. The training dataset is balanced, consisting of 200,000 normal and 200,000 DDoS network traffic instances. The testing dataset consists of nearly 40,000 traffic instances, including both normal and DDoS traffic. We ran experiments with the 60 features on the 4 lakh training instances and the 40k testing instances. Results for seven classifiers are included in the accompanying document and can be treated as baseline results.
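The elimination criteria named in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure, which is not described in detail here; it only shows one plausible reading of "zero standard deviation", "redundant features", and "information gain" on a toy table. The column names, the toy values, and the helper functions are all hypothetical, and redundancy is simplified to exact duplication of an earlier column.

```python
import math
from collections import Counter
from statistics import pstdev


def zero_std_columns(table):
    """Return names of columns whose values never vary."""
    return [name for name, values in table.items() if pstdev(values) == 0]


def duplicate_columns(table):
    """Return names of columns that exactly repeat an earlier column."""
    seen, duplicates = set(), []
    for name, values in table.items():
        key = tuple(values)
        if key in seen:
            duplicates.append(name)
        else:
            seen.add(key)
    return duplicates


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of a discrete feature with respect to the labels.

    Continuous features would need to be binned first; that step is omitted.
    """
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data: hypothetical column names, not the dataset's real feature names.
features = {
    "flow_duration": [10, 250, 12, 300],
    "fwd_packets": [2, 40, 3, 55],
    "bwd_psh_flags": [0, 0, 0, 0],        # never varies
    "fwd_packets_copy": [2, 40, 3, 55],   # repeats fwd_packets
    "is_udp": [0, 1, 0, 1],
}
labels = ["BENIGN", "DDoS", "BENIGN", "DDoS"]

dropped = set(zero_std_columns(features)) | set(duplicate_columns(features))
kept = {name: values for name, values in features.items() if name not in dropped}
gains = {name: information_gain(values, labels) for name, values in kept.items()}
```

On this toy table, `dropped` contains the constant column and the duplicated column, and `gains` ranks the remaining columns so that low-scoring ones could be reviewed for removal.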
Dataset Files
- Testing data CICDDOS2019_Testing_40K_60_Features.csv (8.25 MB)
- Training data Binary_Class_Balanced_Dataset_4Lac_60_Features.csv (85.68 MB)
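A quick way to check the class balance of either file is to count the values in its label column. The sketch below assumes the label column is named `Label` and that classes are stored as strings; neither detail is confirmed on this page, so inspect the CSV header first. The in-memory sample stands in for the real file so the snippet runs without a download.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file, label_column):
    """Count how often each class label occurs in an open CSV file."""
    reader = csv.DictReader(csv_file)
    if label_column not in (reader.fieldnames or []):
        raise KeyError(f"column {label_column!r} not found in CSV header")
    return Counter(row[label_column].strip() for row in reader)


# Tiny in-memory stand-in for the real file; column names are assumptions.
sample = io.StringIO(
    "Flow Duration,Label\n"
    "10,BENIGN\n"
    "250,DDoS\n"
    "12,BENIGN\n"
    "300,DDoS\n"
)
counts = count_labels(sample, "Label")

# With the downloaded training file, the call would look like this:
# with open("Binary_Class_Balanced_Dataset_4Lac_60_Features.csv", newline="") as f:
#     counts = count_labels(f, "Label")
```

For the training file, a balanced result would show two classes with equal counts, matching the 200,000 normal and 200,000 DDoS instances stated in the abstract.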
Documentation
Attachment | Size |
---|---|
dataset_upload.pdf | 242.12 KB |
Comments
This dataset can be used to evaluate the performance of machine learning and deep learning models for detection of DDoS attacks.
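As a rough illustration of such an evaluation, the sketch below computes accuracy, precision, recall, and F1 for a binary detector from true and predicted labels. The choice of metrics, the label strings, and the toy predictions are assumptions made for illustration; the baseline document may report different measures.

```python
def binary_metrics(y_true, y_pred, positive="DDoS"):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels and predictions, not results from this dataset.
y_true = ["DDoS", "DDoS", "BENIGN", "BENIGN", "DDoS", "BENIGN"]
y_pred = ["DDoS", "BENIGN", "BENIGN", "DDoS", "DDoS", "BENIGN"]
metrics = binary_metrics(y_true, y_pred)
```

In practice, `y_pred` would come from a classifier trained on the 4 lakh training instances and applied to the testing file.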