Costas arrays are permutation matrices that meet an added condition, the Costas condition: when the array is used as a frequency-hop scheme, at most one signal bin overlaps another under any time-and-frequency offset.  Databases to various orders have been available for many years.  The database here is far more extensive than any available before it, and a powerful, easy-to-use Windows utility with a GUI accompanies it.
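
The Costas condition can be checked directly on a permutation. The following is a minimal Python sketch, not part of the database or its Windows utility; the function name is_costas and the one-line permutation representation (perm[i] is the row of the dot in column i) are assumptions made for illustration. It tests that, for every column shift, the row differences between pairs of dots are all distinct.

```python
def is_costas(perm):
    """Return True if perm describes a Costas array.

    perm[i] is the row index of the single dot in column i
    (an assumed representation; 0-based or 1-based rows both work).
    """
    n = len(perm)
    # Must be a permutation of n consecutive integers.
    if n == 0 or sorted(perm) != list(range(min(perm), min(perm) + n)):
        return False
    # For each column shift d, the row differences perm[i + d] - perm[i]
    # must all be distinct; otherwise two displacement vectors coincide
    # and a shifted copy would overlap the original in more than one bin.
    for d in range(1, n):
        diffs = [perm[i + d] - perm[i] for i in range(n - d)]
        if len(set(diffs)) != len(diffs):
            return False
    return True
```

For example, is_costas([2, 4, 3, 1]) returns True, while is_costas([1, 2, 3, 4]) returns False because every adjacent pair has the same row difference.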


Download the file.  This file contains the instructions as a PDF file, the extraction and analysis utility in its own ZIP file, and several information files, including an enumeration database in an Excel file.


Unpack this file in the folder that you want to be the location of your Costas array database.  Be sure to unpack subfolders, so that you see subfolders /Searches and /Generated when you are done.  Folder /Searches contains all Costas arrays to order 29, and folder /Generated contains all generated Costas arrays to order 100.  The extraction and analysis utility, which comes in its own ZIP file, may be extracted in place or, if the database is on a network drive or other location inconvenient for DLLs, in its own folder anywhere on a local drive such as your C:\ drive.  See the Instructions PDF for details.


Then, as you need them, add these eleven files, each of which holds more data for the /Generated folder.


This is a file that was produced by the extraction/analysis utility: Frequency hop LUB list; useful with PLL-based waveform generators.


For further information, see the file Costas Arrays to Order 1030 INSTRUCTIONS.pdf


The dataset stores a random sampling distribution whose support has cardinality 4,294,967,296 (i.e., two raised to the power of thirty-two). Specifically, the source generator is fixed as a symmetric-key cryptographic function with 64-bit input and 32-bit output. A total of 17,179,869,184 (i.e., two raised to the power of thirty-four) randomly chosen inputs are used to produce the sampling distribution. The integer-valued sampling distribution is stored as 4,294,967,296 entries, each occupying one byte.
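
The construction described above can be illustrated at a much smaller scale. The sketch below is not the generator used for the dataset: the actual cryptographic function is not named here, so a truncated SHA-256 serves as a stand-in, and the function names, the seed, and the choice to cap counts at 255 are all assumptions. It draws random 64-bit inputs, maps each to an output value, and counts occurrences in a one-byte-per-entry table.

```python
import hashlib
import random


def stand_in_function(x, out_bits):
    """Hypothetical stand-in for the 64-bit-in, 32-bit-out function.

    Truncated SHA-256 is used only for illustration; it is NOT the
    symmetric-key function behind the dataset.
    """
    digest = hashlib.sha256(x.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - out_bits)


def sampling_distribution(num_samples, out_bits, seed=0):
    """Count how often each output value occurs, one byte per entry."""
    rng = random.Random(seed)
    counts = bytearray(1 << out_bits)  # entry i holds the count for output i
    for _ in range(num_samples):
        x = rng.getrandbits(64)  # randomly chosen 64-bit input
        y = stand_in_function(x, out_bits)
        if counts[y] < 255:  # assumption: saturate rather than overflow
            counts[y] += 1
    return counts
```

Calling sampling_distribution(1 << 12, 10) keeps the same ratio of four samples per entry as the full dataset (two to the thirty-four inputs over two to the thirty-two entries), which is why one byte per entry is enough in practice.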


The dataset file is 4 GB in size: it contains 4,294,967,296 entries and each entry occupies one byte in storage. The MD5 checksum is 4ee9 a09a a509 fd70 4152 2fd2 f263 ae25. The SHA256 checksum is d9a4 fb8d d9f0 de29 b1e2 3316 c78d 8e65 4ec7 d60f 7ebc ec9e ee57 6fa2 e392 3b57. Note that the checksums above are displayed in groups of four hexadecimal digits.
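
A downloaded copy can be checked against the size and checksums above. The following is a minimal sketch using Python's standard hashlib module; the function names and the file path passed to verify are hypothetical. The grouped checksums are joined into plain hexadecimal strings before comparison, and the file is read in chunks so that the 4 GB file never has to fit in memory.

```python
import hashlib
import os

EXPECTED_SIZE = 2 ** 32  # 4,294,967,296 one-byte entries
EXPECTED_MD5 = "4ee9 a09a a509 fd70 4152 2fd2 f263 ae25"
EXPECTED_SHA256 = ("d9a4 fb8d d9f0 de29 b1e2 3316 c78d 8e65 "
                   "4ec7 d60f 7ebc ec9e ee57 6fa2 e392 3b57")


def normalize(grouped):
    """Remove the spaces between the four-digit groups."""
    return "".join(grouped.split()).lower()


def file_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, sha256_hex) of a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()


def verify(path):
    """Return True if size and both checksums match the published values."""
    if os.path.getsize(path) != EXPECTED_SIZE:
        return False
    md5_hex, sha256_hex = file_digests(path)
    return (md5_hex == normalize(EXPECTED_MD5)
            and sha256_hex == normalize(EXPECTED_SHA256))
```

The size is checked first because it is cheap and rules out a truncated download before any hashing is done.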


This dataset is a result of my research on machine learning in Android security. The data was obtained by mapping each analyzed application to a binary vector of the permissions it uses {1=used, 0=not used}. In addition, the samples are labeled by "Type": 1 for malware and 0 for non-malware.
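
The mapping described above can be sketched in a few lines of Python. This is an illustration only, not the process used to build the dataset: the three-permission vocabulary, its ordering, and the function name to_row are hypothetical, and the real dataset covers its own set of permissions. Each application becomes a row of 0/1 flags followed by its "Type" label.

```python
# Hypothetical permission vocabulary; the dataset defines its own columns.
PERMISSIONS = [
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
]


def to_row(used_permissions, is_malware, vocabulary=PERMISSIONS):
    """Build one row: a 0/1 flag per permission, then the Type label."""
    used = set(used_permissions)
    # 1 = the application uses the permission, 0 = it does not.
    features = [1 if p in used else 0 for p in vocabulary]
    # Type: 1 = malware, 0 = non-malware.
    return features + [1 if is_malware else 0]
```

With this vocabulary, to_row(["android.permission.SEND_SMS"], True) returns [0, 1, 0, 1].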

When I did my research, datasets of malware and benign Android applications were not available, so I am giving part of my research results to the community for future work.