This data was used for the paper: J. Wang, P. Aubry, and A. Yarovoy, "Three-Dimensional Short-Range Imaging With Irregular MIMO Arrays Using NUFFT-Based Range Migration Algorithm," IEEE Transactions on Geoscience and Remote Sensing. It includes two synthetic electromagnetic datasets and one experimentally measured dataset with multiple-input multiple-output (MIMO) arrays.

Instructions: 

The detailed instructions about the dataset can be found in the readme file.

Categories:
65 Views

Authors' multimedia video of a VO2 patch with a resistive heater electrically isolated by a thin SiO2 layer. An increase in temperature (by Joule heating) induces the phase transition, which is observed as a change in color.

Categories:
114 Views

The user graph structure of the Sina network.

Categories:
167 Views

 

Instructions: 

The data in xssed.csv come from XSSed (http://www.XSSed.com).

The data in normal_example.csv come from DMOZ (http://www.dmoztools.net/).

The data are in URL form. All IP addresses and domain names have been removed.
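
The listing does not describe how the IP addresses and domain names were removed. As a minimal sketch only, assuming the host part is simply dropped and the path, query, and fragment are kept, the step could look like this in Python (the function name strip_host is hypothetical and not part of the dataset):

```python
from urllib.parse import urlsplit


def strip_host(url: str) -> str:
    """Drop scheme and host (IP or domain), keep path, query, fragment.

    Assumption: this mirrors the "IP address and domain name removed"
    step only loosely; the dataset authors' exact procedure is unknown.
    """
    parts = urlsplit(url)
    result = parts.path
    if parts.query:
        result += "?" + parts.query
    if parts.fragment:
        result += "#" + parts.fragment
    return result
```

Calling strip_host on a full URL returns only the part after the host, which is the form the two CSV files are described as using.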

Categories:
881 Views

Information:

This dataset was created for research on blockchain anomaly and fraud detection and donated to the IEEE DataPort online community.

https://github.com/epicprojects/blockchain-anomaly-detection


Instructions: 

A directed acyclic graph is built from the bitcoin transaction data, and metadata is extracted from it to create this dataset.

 

DIMENSIONS:

  • tx_hash: Hash of the bitcoin transaction.
  • indegree: Number of transactions that are inputs of tx_hash.
  • outdegree: Number of transactions that are outputs of tx_hash.
  • in_btc: Number of bitcoins on each incoming edge to tx_hash.
  • out_btc: Number of bitcoins on each outgoing edge from tx_hash.
  • total_btc: Net number of bitcoins flowing in and out from tx_hash.
  • mean_in_btc: Average number of bitcoins flowing in for tx_hash.
  • mean_out_btc: Average number of bitcoins flowing out for tx_hash.
  • in-malicious: Will be 1 if the tx_hash is an input of a malicious transaction.
  • out-malicious: Will be 1 if the tx_hash is an output of a malicious transaction.
  • is-malicious: Will be 1 if the tx_hash is a malicious transaction.
  • out_and_tx_malicious: Will be 1 if the tx_hash is a malicious transaction or an output of a malicious transaction.
  • all_malicious: Will be 1 if the tx_hash is a malicious transaction, or an output or an input of a malicious transaction.
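
The two combined labels follow directly from the three basic flags as defined above. A minimal Python sketch of that relationship, assuming each record is available as a dictionary keyed by the column names listed here (the helper name derive_labels is hypothetical):

```python
def derive_labels(row: dict) -> dict:
    """Recompute the combined labels from the three basic flags.

    Assumption: `row` maps the column names from the list above to 0/1.
    """
    is_mal = bool(row["is-malicious"])
    out_mal = bool(row["out-malicious"])
    in_mal = bool(row["in-malicious"])
    return {
        # malicious itself, or an output of a malicious transaction
        "out_and_tx_malicious": int(is_mal or out_mal),
        # malicious itself, or an output or input of a malicious transaction
        "all_malicious": int(is_mal or out_mal or in_mal),
    }
```

A check of this kind can be used to confirm that the combined columns in a downloaded copy agree with the basic flags.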

 

REFERENCES:

  1. https://arxiv.org/abs/1611.03942
  2. https://arxiv.org/abs/1611.03941
  3. https://arxiv.org/abs/1107.4524
  4. http://anonymity-in-bitcoin.blogspot.com/2011/09/code-datasets-and-spsn1...
  5. http://snap.stanford.edu/class/cs224w-2013/projects2013/cs224w-030-final...

 

 

Categories:
830 Views

This dataset consists of 8 features extracted from 70,000 monochromatic still images adapted from Stanford's Genome Project database and labeled in two classes: with LSB steganography (1) and without LSB steganography (0). The features are Kurtosis, Skewness, Standard Deviation, Range, Median, Geometric Mean, Hjorth Mobility, and Hjorth Complexity, all extracted from the histograms of the still images, including random spatial transformations. The steganographic function embeds five types of payloads, from 0.1 to 0.5.

Instructions: 

This dataset consists of 8 features extracted from 70,000 monochromatic still images adapted from Stanford's Genome Project database and labeled in two classes: with (1) and without (0) LSB steganography. The training and testing datasets each contain 8 columns holding the following features as numeric quantities: Kurtosis, Skewness, Standard Deviation, Range, Median, Geometric Mean, Hjorth Mobility, and Hjorth Complexity. A ninth column gives the class of the observation: 0 for a non-steganogram and 1 for a steganogram. All the features were extracted from the histograms of the still images. The dataset can be read and processed with Pandas in Python, R, or Matlab.

The steganographic function embeds five types of payloads, from 0.1 to 0.5. The training dataset includes 56,000 labeled images (with and without LSB steganography), of which 5,600 images make up the subset for each payload. The testing dataset has 14,000 observations and is divided in the same proportions as the training dataset.
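
The listing names the eight histogram features but not the exact formulas used. The Python sketch below shows one common way to compute them from a list of 8-bit pixel values. It rests on several assumptions: population (not sample) statistics, non-excess kurtosis, a geometric mean taken over non-empty bins only, and Hjorth parameters computed from first and second differences of the histogram. The function names are hypothetical.

```python
import math
from statistics import mean, median, pstdev, pvariance


def hjorth(signal):
    """Return (mobility, complexity) from first and second differences."""
    d1 = [b - a for a, b in zip(signal, signal[1:])]
    d2 = [b - a for a, b in zip(d1, d1[1:])]
    var0, var1, var2 = pvariance(signal), pvariance(d1), pvariance(d2)
    mobility = math.sqrt(var1 / var0)
    complexity = math.sqrt(var2 / var1) / mobility
    return mobility, complexity


def histogram_features(pixels, bins=256):
    """Compute the eight named features from the grey-level histogram."""
    hist = [0] * bins
    for p in pixels:
        hist[p] += 1
    mu, sigma = mean(hist), pstdev(hist)
    # Assumption: zero-count bins are skipped so the logarithm is defined.
    positive = [h for h in hist if h > 0]
    mobility, complexity = hjorth(hist)
    return {
        "kurtosis": mean(((h - mu) / sigma) ** 4 for h in hist),
        "skewness": mean(((h - mu) / sigma) ** 3 for h in hist),
        "std": sigma,
        "range": max(hist) - min(hist),
        "median": median(hist),
        "geometric_mean": math.exp(mean(math.log(h) for h in positive)),
        "hjorth_mobility": mobility,
        "hjorth_complexity": complexity,
    }
```

A perfectly flat histogram has zero variance and would make these ratios undefined, so real code would need to guard against that case.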

Categories:
565 Views

This dataset is part of my PhD research on malware detection and classification using Deep Learning. It contains static analysis data: Top-1000 imported functions extracted from the 'pe_imports' elements of Cuckoo Sandbox reports. PE malware examples were downloaded from virusshare.com. PE goodware examples were downloaded from portableapps.com and from Windows 7 x86 directories.

Instructions: 

* FEATURES *

Column name: hash
Description: MD5 hash of the example
Type: 32-character hexadecimal string

Column name: GetProcAddress
Description: Most imported function (1st)
Type: 0 (Not imported) or 1 (Imported)

...

Column name: LookupAccountSidW
Description: Least imported function (1000th)
Type: 0 (Not imported) or 1 (Imported)

Column name: malware
Description: Class
Type: 0 (Goodware) or 1 (Malware)
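
The feature layout above is a binary indicator per imported function. As a minimal Python sketch, assuming the vocabulary is ranked by the number of samples that import each function (the listing does not state the ranking rule) and using hypothetical helper names:

```python
from collections import Counter


def top_functions(samples, k=1000):
    """Rank function names by how many samples import them; keep the top k.

    Assumption: each sample is an iterable of imported function names.
    """
    counts = Counter()
    for imports in samples:
        counts.update(set(imports))  # count each function once per sample
    return [name for name, _ in counts.most_common(k)]


def import_vector(imports, vocabulary):
    """Return 1 (Imported) or 0 (Not imported) for each vocabulary entry."""
    present = set(imports)
    return [1 if name in present else 0 for name in vocabulary]
```

For example, with a three-name vocabulary of GetProcAddress, LoadLibraryA (an illustrative name, not confirmed as a column), and LookupAccountSidW, a sample importing only the first two would map to a vector with 1 in the first two positions and 0 in the third.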

* ACKNOWLEDGMENTS *

We would like to thank: Cuckoo Sandbox for developing such an amazing dynamic analysis environment!
VirusShare! Because sharing is caring!
Universidade Nove de Julho for supporting this research.
Coordination for the Improvement of Higher Education Personnel (CAPES) for supporting this research.

* CITATIONS *

Please refer to the dataset DOI.
Please feel free to contact me for any further information.

Categories:
2894 Views

This dataset is part of my PhD research on malware detection and classification using Deep Learning. It contains static analysis data: the raw PE byte stream rescaled to a 32 x 32 greyscale image using the Nearest Neighbor Interpolation algorithm and then flattened to a 1024-byte vector. PE malware examples were downloaded from virusshare.com. PE goodware examples were downloaded from portableapps.com and from Windows 7 x86 directories.

Instructions: 

* FEATURES *

Column name: hash
Description: MD5 hash of the example
Type: 32-character hexadecimal string

Column name: pix_0
Description: The first greyscale pixel value
Type: Integer (0-255)

Column name: pix_1023
Description: The last greyscale pixel value
Type: Integer (0-255)

Column name: malware
Description: Class
Type: 0 (Goodware) or 1 (Malware)
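
The listing does not say how the byte stream is laid out as a 2-D image before rescaling. The Python sketch below is one possible reading, not the author's pipeline: it assumes the bytes are arranged row by row into a roughly square image, zero-padded at the end, and then sampled down (or up) to 32 x 32 with nearest-neighbor interpolation. The function name bytes_to_pixels is hypothetical.

```python
import math


def bytes_to_pixels(data: bytes, side: int = 32) -> list[int]:
    """Rescale a byte stream to side x side pixels, flattened row by row."""
    if not data:
        raise ValueError("empty byte stream")
    # Assumption: near-square source image, width = ceil(sqrt(len(data))).
    width = math.isqrt(len(data) - 1) + 1
    height = -(-len(data) // width)  # ceiling division
    # Assumption: the final partial row is padded with zero bytes.
    padded = data + bytes(width * height - len(data))
    pixels = []
    for r in range(side):
        src_r = r * height // side  # nearest source row
        for c in range(side):
            src_c = c * width // side  # nearest source column
            pixels.append(padded[src_r * width + src_c])
    return pixels
```

The result has 1024 integers in the range 0-255, matching the pix_0 to pix_1023 columns described above.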

* ACKNOWLEDGMENTS *

We would like to thank: Cuckoo Sandbox for developing such an amazing dynamic analysis environment!
VirusShare! Because sharing is caring!
Universidade Nove de Julho for supporting this research.
Coordination for the Improvement of Higher Education Personnel (CAPES) for supporting this research.

* CITATIONS *

Please refer to the dataset DOI.
Please feel free to contact me for any further information.

Categories:
910 Views

This dataset is part of my PhD research on malware detection and classification using Deep Learning. It contains static analysis data (PE Section Headers of the .text, .code and CODE sections) extracted from the 'pe_sections' elements of Cuckoo Sandbox reports. PE malware examples were downloaded from virusshare.com. PE goodware examples were downloaded from portableapps.com and from Windows 7 x86 directories.

Instructions: 

* FEATURES *

Column name: hash
Description: MD5 hash of the example
Type: 32-character hexadecimal string

Column name: size_of_data
Description: The size of the section on disk
Type: Integer

Column name: virtual_address
Description: Memory address of the first byte of the section relative to the image base
Type: Integer

Column name: entropy
Description: Calculated entropy of the section
Type: Float

Column name: virtual_size
Description: The size of the section when loaded into memory
Type: Integer

Column name: malware
Description: Class
Type: 0 (Goodware) or 1 (Malware)
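
The entropy column is described only as "calculated entropy of the section". A common choice for PE sections is the Shannon entropy of the byte distribution, in bits per byte, and the Python sketch below assumes that definition. Whether the values in the dataset were computed exactly this way is not confirmed by the listing.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Under this definition a section filled with a single repeated byte scores 0, while uniformly distributed bytes approach 8, which is why high values are often associated with packed or encrypted code.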

* ACKNOWLEDGMENTS *

We would like to thank: Cuckoo Sandbox for developing such an amazing dynamic analysis environment!
VirusShare! Because sharing is caring!
Universidade Nove de Julho for supporting this research.
Coordination for the Improvement of Higher Education Personnel (CAPES) for supporting this research.

* CITATIONS *

Please refer to the dataset DOI.
Please feel free to contact me for any further information.

Categories:
1244 Views

ASNM datasets include records consisting of many features that express various properties and characteristics of TCP communications. These features are called Advanced Security Network Metrics (ASNM) and were designed to distinguish legitimate connections from malicious ones (especially intrusions).

Instructions: 

ASNM datasets were created one by one during our long-term research. The following listing contains references to descriptions of particular datasets with their download locations:

 

  • ASNM-NPBO Dataset - contains non-payload-based obfuscation techniques applied to malicious traffic and some legitimate traffic. It was created in 2015.
  • ASNM-TUN Dataset - contains tunneling obfuscation techniques applied to malicious traffic. It was created in 2014.
  • ASNM-CDX-2009 Dataset - contains ASNM features extracted from tcpdumps of the CDX 2009 dataset. It lacks a few of the newer ASNM features. It was created in 2013.
Categories:
2472 Views
