

Standard Dataset

N1 dataset from the paper "A Data Embedding Scheme for Efficient Program Behavior Modeling with Neural Networks", accepted by IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI)

Citation Author(s):
Sunwoo Ahn (Seoul National University)
Submitted by:
Sunwoo Ahn
Last updated:
DOI:
10.21227/dt54-qg81

Abstract

This dataset was used in the experiments of the paper "A Data Embedding Scheme for Efficient Program Behavior Modeling with Neural Networks", accepted by IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI). The tar.gz file contains system calls and their associated branch sequences. For a detailed description, please refer to the paper.

Instructions:

The data is stored in a binary format (not human-readable). A Python script for reading the binary files is included in the tar.gz file.

You can run the script as "python data_utils.py ${data_path}, ${max_packet}"

${data_path} is the directory that contains the binary data files.

${max_packet} is the length of the branch sequence to read for each system call. A value of 0 means "read the entire branch sequence".
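The invocation and the meaning of ${max_packet} can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code shipped in the archive: build_command and limit_branches are hypothetical helpers, the two arguments are assumed to be passed as separate positional arguments (without the comma shown in the command above), and a non-zero ${max_packet} is assumed to keep the first entries of each branch sequence. Check data_utils.py itself for the actual behavior.

```python
import sys
from typing import List, Sequence


def build_command(data_path: str, max_packet: int = 0,
                  script: str = "data_utils.py") -> List[str]:
    """Build the argument list for running the reader script.

    Hypothetical helper: assumes the script takes data_path and
    max_packet as two separate positional arguments.
    """
    if max_packet < 0:
        raise ValueError("max_packet must be 0 (read all) or a positive length")
    return [sys.executable, script, data_path, str(max_packet)]


def limit_branches(branches: Sequence[int], max_packet: int) -> List[int]:
    """Illustrate the max_packet rule on an in-memory sequence.

    Assumption: a non-zero max_packet keeps the first max_packet entries;
    0 keeps the whole sequence, as stated in the instructions.
    """
    if max_packet == 0:
        return list(branches)
    return list(branches[:max_packet])


if __name__ == "__main__":
    # "./data" is a placeholder for the directory extracted from the tar.gz file.
    cmd = build_command("./data", max_packet=0)
    print(" ".join(cmd))
    # To actually run the script: subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single shell string avoids quoting problems when the data directory path contains spaces.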