Standard Dataset
N1 dataset in the paper "A Data Embedding Scheme for Efficient Program Behavior Modeling with Neural Networks" accepted by IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI)
- Submitted by: Sunwoo Ahn
- Last updated: Wed, 12/29/2021 - 01:05
- DOI: 10.21227/dt54-qg81
Abstract
This dataset is used in the experiments of the paper "A Data Embedding Scheme for Efficient Program Behavior Modeling with Neural Networks", accepted by IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI). The tar.gz files contain system calls and their relevant branch sequences. For a detailed description, please refer to the paper.
The data are stored in a binary format (not human-readable). A Python script for reading the binary data is included in the tar.gz file.
You can run the script as "python data_utils.py ${data_path} ${max_packet}".
${data_path} is the directory that contains the binary data.
${max_packet} is the length of the branch sequence to read for each system call. Note that 0 means "read the entire branch sequence".
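The steps above (unpack an archive, then run the reader on the resulting directory) could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names `extract_archive` and `build_command` are hypothetical, the destination directory is arbitrary, and it assumes that `data_utils.py` takes the two positional arguments described above and that the extracted directory is the one holding the binary data. The internal layout of the archives is not documented here, so the path passed as `${data_path}` may need adjusting.

```python
import sys
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Unpack one downloaded .tar.gz archive into dest_dir and return that directory."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where this Python version provides it.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest, **extra)
    return dest


def build_command(data_path, max_packet=0, script="data_utils.py"):
    """Build the argument list for the reader script (hypothetical helper).

    max_packet=0 follows the description above: read the entire branch sequence.
    """
    if max_packet < 0:
        raise ValueError("max_packet must be 0 (read all) or a positive length")
    return [sys.executable, str(script), str(data_path), str(max_packet)]


# Example usage (assumed local paths; adjust to where the files were downloaded):
#   import subprocess
#   data_dir = extract_archive("MySQL_train.tar.gz", "data/MySQL_train")
#   subprocess.run(build_command(data_dir, max_packet=0), check=True)
```

Building the command as a list, rather than a single shell string, avoids quoting problems when the data directory contains spaces.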
Dataset Files
- GNU_Screen.tar.gz (108.69 MB)
- MySQL_train.tar.gz (5.74 MB)
- MySQL_benign.tar.gz (32.64 MB)
- MySQL_attack1.tar.gz (146.91 MB)
- MySQL_attack2.tar.gz (103.97 MB)
- data_utils.py (2.86 kB)