
The dataset stores a random sampling distribution whose support has cardinality 4,294,967,296 (i.e., 2^32). The source generator is a fixed symmetric-key cryptographic function with a 64-bit input and a 32-bit output. A total of 17,179,869,184 (i.e., 2^34) randomly chosen inputs were used to produce the sampling distribution. The integer-valued distribution is stored as 4,294,967,296 (i.e., 2^32) entries of one byte each, for 4 GiB in total.
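
The construction described above can be illustrated at a much smaller scale. The following is a minimal sketch, not the generator used for the dataset: `toy_generator` is a hypothetical stand-in built from keyed BLAKE2b, the output width is reduced so the example runs quickly, and saturating each count at 255 is an assumption about how a one-byte entry might handle overflow.

```python
import hashlib
import random


def toy_generator(key: bytes, x: int, out_bits: int) -> int:
    """Hypothetical stand-in for the keyed function: 64-bit input, out_bits-bit output."""
    digest = hashlib.blake2b(x.to_bytes(8, "little"), key=key, digest_size=8).digest()
    return int.from_bytes(digest, "little") & ((1 << out_bits) - 1)


def build_histogram(key: bytes, out_bits: int, num_samples: int, seed: int) -> bytearray:
    """Count how often each output value occurs, using one byte per entry."""
    rng = random.Random(seed)            # seeded only to make the sketch reproducible
    counts = bytearray(1 << out_bits)    # one entry per possible output value
    for _ in range(num_samples):
        x = rng.getrandbits(64)          # randomly chosen 64-bit input
        y = toy_generator(key, x, out_bits)
        if counts[y] < 255:              # assumption: saturate rather than wrap around
            counts[y] += 1
    return counts


if __name__ == "__main__":
    # Same ratio as the dataset (four samples per entry on average), tiny scale.
    hist = build_histogram(b"demo-key", out_bits=8, num_samples=1024, seed=1)
    print(len(hist), "entries, largest count:", max(hist))
```

At the dataset's actual scale the ratio of inputs to entries is 2^34 / 2^32 = 4, so each entry holds a small count on average and one byte per entry is ample in practice.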


This dataset is a product of my research on machine learning for Android security. The data was obtained by mapping each analyzed application to a binary vector of the permissions it uses (1 = used, 0 = not used). In addition, the malware and benign samples are divided by "Type": 1 for malware and 0 for non-malware.
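
The mapping described above can be sketched as follows. This is only an illustration under stated assumptions: the three-permission `VOCABULARY` and the helper names `permissions_to_vector` and `make_row` are hypothetical and do not reflect the actual columns or tooling behind the dataset.

```python
# Illustrative permission vocabulary (assumption: the real dataset uses its own, larger list).
VOCABULARY = [
    "android.permission.INTERNET",
    "android.permission.READ_CONTACTS",
    "android.permission.SEND_SMS",
]


def permissions_to_vector(used, vocabulary=VOCABULARY):
    """Return 1 for each vocabulary permission the app uses, 0 otherwise."""
    used_set = set(used)
    return [1 if perm in used_set else 0 for perm in vocabulary]


def make_row(used, is_malware, vocabulary=VOCABULARY):
    """Build one sample: the permission bits plus the "Type" label (1 malware, 0 non-malware)."""
    row = dict(zip(vocabulary, permissions_to_vector(used, vocabulary)))
    row["Type"] = 1 if is_malware else 0
    return row


if __name__ == "__main__":
    sample = make_row(
        ["android.permission.SEND_SMS", "android.permission.INTERNET"],
        is_malware=True,
    )
    print(sample)
```

Permissions outside the vocabulary are simply ignored in this sketch, which keeps every row the same length regardless of what an application requests.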

When I did my research, datasets of malware and benign Android applications were not available, so I am sharing part of my research results with the community for future work.

