Malware Analysis Datasets: Raw PE as Image

Citation Author(s):
Angelo
Oliveira
Submitted by:
Angelo Oliveira
Last updated:
Thu, 11/07/2019 - 11:45
DOI:
10.21227/8brp-j220
Data Format:
License:
0
0 ratings - Please login to submit your rating.

Abstract 

This dataset is part of my PhD research on malware detection and classification using Deep Learning. It contains static analysis data: Raw PE byte stream rescaled to a 32 x 32 greyscale image using the Nearest Neighbor Interpolation algorithm and then flattened to a 1024 bytes vector. PE malware examples were downloaded from virusshare.com. PE goodware examples were downloaded from portableapps.com and from Windows 7 x86 directories.

Instructions: 

* FEATURES *

Column name: hash
Description: MD5 hash of the example
Type: 32 bytes string

Column name: pix_0
Description: The first greyscale pixel value
Type: Integer (0-255)

Column name: pix_1023
Description: The last greyscale pixel value
Type: Integer (0-255)

Column name: malware
Description: Class
Type: 0 (Goodware) or 1 (Malware)

* ACKNOWLEDGMENTS *

We would like to thank: Cuckoo Sandbox for developing such an amazing dynamic analysis environment!
VirusShare! Because sharing is caring!
Universidade Nove de Julho for supporting this research.
Coordination for the Improvement of Higher Education Personnel (CAPES) for supporting this research.

* CITATIONS *

Please refer to the dataset DOI.
Please feel free to contact me for any further information.

Dataset Files

LOGIN TO ACCESS DATASET FILES
Open Access dataset files are accessible to all logged in  users. Don't have a login?  Create a free IEEE account.  IEEE Membership is not required.