Statistical analysis ToN_IoT Datasets

Submitted by: Tim Booij
Last updated: Sat, 03/27/2021 - 12:35


The Internet of Things (IoT) is reshaping our connected world through the prevalence of lightweight devices connected to the Internet and their communication technologies. Research on intrusion detection in the IoT domain is therefore highly significant. Network intrusion datasets are fundamental to this research, as many attack detection strategies have to be trained and evaluated using these datasets. In this paper, we present a description, statistical analysis, and machine learning evaluation of an IoT dataset called ToN_IoT, and compare it to other recent datasets. This comparison shows not only the importance of heterogeneity within these datasets, but also why even the slightest differences between datasets can have a huge impact on industry applications. In a cross-training experiment, we show that including different data collection methods and a large diversity of monitored features is crucial for IoT network intrusion datasets to be useful to industry. We also explain that the practical application of IoT datasets in operational environments requires the standardization of feature descriptions and cyberattack classes. This can only be achieved through a joint effort by the research community to start creating such standards.


The Python and R scripts in the files create the datasets. Running them also requires the original ToN_IoT datasets.


AI security

Submitted by Chaofei Li on Wed, 08/25/2021 - 23:07

How can I get the Python and R scripts?

Submitted by Chaofei Li on Thu, 08/26/2021 - 02:32

Currently they are only available through IEEE Dataport subscription, but we are working on open access. I have sent you a DM!

Submitted by Tim Booij on Fri, 08/27/2021 - 09:56

Hi Tim, are you able to confirm if there is now public access to the Python scripts?

Submitted by Heath Carson on Sun, 09/26/2021 - 01:15

Hi, I'm not able to download the dataset. Is this publicly available now?

Submitted by Sudhandar Balak... on Thu, 03/03/2022 - 12:09