DDoS Attack Dataset

Citation Author(s):
Raghupathi Manthena, VNR VJIET
Radhakrishna Vangipuram, VNR VJIET
Submitted by:
Radhakrishna Va...
Last updated:
Mon, 07/10/2023 - 02:38
DOI:
10.21227/hg6t-z226

Abstract 

Iman Sharafaldin et al. generated real-time network traffic and made it publicly available on the Canadian Institute for Cybersecurity website, in both PCAP and CSV formats. The traffic was generated over two days: the training day was January 12th, 2018, and the testing day was March 11th, 2018. On the training day, background traffic was captured together with 12 attacks: SNMP, TFTP, SSDP, DNS, SYN, LDAP, NTP, NetBIOS, MSSQL, UDP, WebDDoS, and UDP-Lag. On the testing day, background traffic was recorded together with seven attacks: PortScan, NetBIOS, LDAP, MSSQL, UDP, UDP-Lag, and SYN. The original dataset has more than 50 million (5 crore) instances.

From the PCAP files of this dataset, we randomly extracted 400,000 (4 lakh) network traffic instances using CICFlowMeter, which produced 84 features. We eliminated 24 of these features on the basis of the following properties: (i) zero standard deviation, (ii) static features, (iii) redundant features, and (iv) information gain. The final dataset therefore has 60 features.

This DDoS attack dataset can be used to evaluate the performance of machine learning classifiers and deep learning models. The training dataset is balanced, with 200,000 normal and 200,000 DDoS network traffic instances. The testing dataset has nearly 40,000 traffic instances, covering both normal and DDoS network traffic. We ran experiments on the 60 features using the 400,000 training instances and the 40,000 testing instances. Results for seven classifiers are included in the accompanying document and can be treated as baseline results.
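The four elimination criteria named in the abstract can be illustrated with a small sketch. This is not the authors' procedure, which the abstract does not spell out. It assumes that "static" and "zero standard deviation" both mean a constant column, that "redundant" means an exact duplicate of an earlier column, and that information gain is computed after equal-width binning with a hypothetical cut-off of 0.01. The column names and values are toy data, not taken from the dataset.

```python
import math
from collections import Counter
from statistics import pstdev


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, bins=10):
    """Information gain of one numeric feature after equal-width binning."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant column: everything falls in one bin
    groups = {}
    for v, y in zip(values, labels):
        idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
        groups.setdefault(idx, []).append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_features(columns, labels, min_gain=0.01):
    """Return the names of features that survive the illustrative filters."""
    kept, seen = [], set()
    for name, values in columns.items():
        if pstdev(values) == 0:  # (i)/(ii): zero standard deviation, i.e. static
            continue
        key = tuple(values)
        if key in seen:  # (iii): exact duplicate of a column already kept (assumed meaning)
            continue
        if information_gain(values, labels) < min_gain:  # (iv): assumed threshold
            continue
        seen.add(key)
        kept.append(name)
    return kept


# Toy data with hypothetical column names: four normal and four DDoS flows.
labels = ["normal"] * 4 + ["ddos"] * 4
columns = {
    "feat_informative": [10.0, 12.0, 11.0, 13.0, 90.0, 95.0, 88.0, 92.0],
    "feat_constant": [0.0] * 8,
    "feat_copy": [10.0, 12.0, 11.0, 13.0, 90.0, 95.0, 88.0, 92.0],
    "feat_noise": [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
}

print(select_features(columns, labels))  # ['feat_informative']
```

On real CICFlowMeter output, a correlation threshold would probably be a more useful notion of redundancy than exact duplication, and the bin count and gain cut-off would need tuning; both are left here as simple placeholders.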


Comments

This dataset can be used to evaluate the performance of machine learning and deep learning models for detection of DDoS attacks.

Submitted by Radhakrishna Va... on Mon, 07/10/2023 - 02:37