N1 dataset in the paper "A Data Embedding Scheme for Efficient Program Behavior Modeling with Neural Networks" accepted by IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI)

Citation Author(s):
Sunwoo Ahn
Seoul National University
Submitted by:
Sunwoo Ahn
Last updated:
Wed, 12/29/2021 - 01:05
DOI:
10.21227/dt54-qg81
License:
0

Abstract 

This dataset was used in the experiments of the paper "A Data Embedding Scheme for Efficient Program Behavior Modeling with Neural Networks", accepted by IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI). The tar.gz file contains system calls and their associated branch sequences. For a detailed description, please refer to the paper.

Instructions: 

The data are stored in a binary (not human-readable) format. A Python script for reading the binary files is included in the tar.gz file.
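Unpacking the archive can be sketched with Python's standard tarfile module. This is only an illustration: the function name extract_archive, the archive path, and the destination directory are placeholders chosen here, since this page does not name the archive file.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return its member names.

    Hypothetical helper: both arguments are placeholders supplied by the
    caller; nothing here is specific to this dataset.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()   # list the archive contents
        tar.extractall(dest)     # unpack everything under dest
    return names
```

Printing the returned names is a quick way to locate data_utils.py and the binary data directory after extraction.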

Run the script as "python data_utils.py ${data_path} ${max_packet}"

${data_path} is the directory that contains the binary data files.

${max_packet} is the length of the branch sequence to read for each system call. A value of 0 means "read the entire branch sequence".
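As a rough illustration of the invocation described above, the following Python sketch assembles the command line and mimics the documented meaning of ${max_packet} on an in-memory list. The helpers build_command and limit_branches are hypothetical and are not part of data_utils.py; the real reading logic and file layout are defined by the script shipped in the archive. The sketch assumes that the two arguments are separated by whitespace and that a positive ${max_packet} keeps the first entries of each sequence.

```python
import sys


def build_command(data_path, max_packet=0, script="data_utils.py"):
    """Return the argument list for running the bundled reader script.

    Hypothetical helper: it only assembles the command from the
    instructions and does not read any data itself.
    """
    if max_packet < 0:
        raise ValueError("max_packet must be 0 (read all) or a positive length")
    return [sys.executable, script, str(data_path), str(max_packet)]


def limit_branches(branch_sequence, max_packet):
    """Mimic the documented max_packet semantics on a plain Python list.

    Assumption: a positive max_packet keeps the first max_packet entries;
    0 keeps the whole sequence, as stated in the instructions.
    """
    if max_packet < 0:
        raise ValueError("max_packet must be 0 (read all) or a positive length")
    if max_packet == 0:
        return list(branch_sequence)
    return list(branch_sequence[:max_packet])


if __name__ == "__main__":
    # Show the arguments that follow the interpreter path.
    print(build_command("./data", 0)[1:])
    # Toy sequence standing in for one system call's branch records.
    print(limit_branches([10, 11, 12, 13], 2))
```

The list returned by build_command can be passed to subprocess.run(..., check=True) from the directory where the archive was extracted.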