Security

This dataset is part of my PhD research on malware detection and classification using Deep Learning. It contains static analysis data: the top 1,000 imported functions extracted from the 'pe_imports' elements of Cuckoo Sandbox reports. PE malware examples were downloaded from virusshare.com. PE goodware examples were downloaded from portableapps.com and from Windows 7 x86 directories.

78 views
  • Machine Learning
  • Last Updated On: 
    Fri, 11/08/2019 - 05:43

    This dataset is part of my PhD research on malware detection and classification using Deep Learning. It contains static analysis data: the raw PE byte stream rescaled to a 32 x 32 greyscale image using the Nearest Neighbor Interpolation algorithm and then flattened to a 1,024-byte vector. PE malware examples were downloaded from virusshare.com. PE goodware examples were downloaded from portableapps.com and from Windows 7 x86 directories.
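The rescaling step described above can be illustrated with a minimal sketch. The dataset description does not say how the byte stream is laid out before rescaling, so this example assumes (hypothetically) that the bytes are zero-padded into the smallest square image that holds them and then sampled with nearest-neighbor interpolation. The function name `pe_bytes_to_vector` is illustrative and not part of the dataset tooling.

```python
import math


def pe_bytes_to_vector(raw: bytes, side: int = 32) -> bytes:
    """Rescale a raw byte stream to a side x side greyscale image and flatten it.

    Assumption (not stated in the dataset description): the bytes are first
    laid out row by row in the smallest square that holds them, with zero padding.
    """
    if not raw:
        raise ValueError("empty input")

    # Smallest square width that fits all bytes: ceil(sqrt(len(raw))).
    width = math.isqrt(len(raw) - 1) + 1
    padded = raw + bytes(width * width - len(raw))

    out = bytearray()
    for row in range(side):
        # Nearest-neighbor: map each target row/column to one source row/column.
        src_row = (row * width) // side
        for col in range(side):
            src_col = (col * width) // side
            out.append(padded[src_row * width + src_col])
    return bytes(out)  # side * side bytes, 1,024 for side = 32
```

Each byte of the result is one greyscale pixel value in the range 0 to 255, so the output can be fed to a model as a flat vector or reshaped back to 32 x 32.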

    68 views
  • Machine Learning
  • Last Updated On: 
    Thu, 11/07/2019 - 11:45

    This dataset is part of my PhD research on malware detection and classification using Deep Learning. It contains static analysis data (PE Section Headers of the .text, .code and CODE sections) extracted from the 'pe_sections' elements of Cuckoo Sandbox reports. PE malware examples were downloaded from virusshare.com. PE goodware examples were downloaded from portableapps.com and from Windows 7 x86 directories.

    122 views
  • Machine Learning
  • Last Updated On: 
    Wed, 11/06/2019 - 06:10

    ASNM datasets include records consisting of many features that express various properties and characteristics of TCP communications. These features are called Advanced Security Network Metrics (ASNM) and were designed to distinguish legitimate connections from malicious ones (especially intrusions).

    73 views
  • Machine Learning
  • Last Updated On: 
    Sun, 11/03/2019 - 01:04

    This dataset is part of our research on malware detection and classification using Deep Learning. It contains 42,797 malware API call sequences and 1,079 goodware API call sequences. Each API call sequence is composed of the first 100 non-repeated consecutive API calls associated with the parent process, extracted from the 'calls' elements of Cuckoo Sandbox reports.
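One way to read "the first 100 non-repeated consecutive API calls" is: collapse each run of identical consecutive calls into a single call, then keep the first 100 entries. The sketch below follows that interpretation, which is an assumption rather than the authors' confirmed procedure. It takes a plain list of API names; how those names are pulled from a Cuckoo report is left out, and the function name is hypothetical.

```python
from itertools import groupby, islice


def first_non_repeated_calls(api_names, limit=100):
    """Collapse consecutive duplicate API names, then keep the first `limit`.

    Only back-to-back repeats are removed; a call may still appear again
    later in the sequence after a different call.
    """
    # groupby groups adjacent equal items, so each key is one collapsed run.
    collapsed = (name for name, _run in groupby(api_names))
    return list(islice(collapsed, limit))
```

For example, the hypothetical trace `["A", "A", "B", "A", "C", "C"]` would be reduced to `["A", "B", "A", "C"]`.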

    180 views
  • Machine Learning
  • Last Updated On: 
    Wed, 11/06/2019 - 06:18

    Collecting and analysing heterogeneous data sources from the Internet of Things (IoT) and Industrial IoT (IIoT) is essential for training and validating the fidelity of machine learning-based cybersecurity applications. However, analysing those data sources remains a major challenge, in particular reducing the high-dimensional space and selecting important features and observations from different data sources.

    47 views
  • Artificial Intelligence
  • Last Updated On: 
    Wed, 10/16/2019 - 03:10

    The Boğaziçi University DDoS dataset (BOUN DDoS) was generated at Boğaziçi University with the Hping3 traffic generator by flooding TCP SYN and UDP packets. The dataset includes attack-free user traffic as well as attack traffic and is suitable for evaluating network-based DDoS detection methods. The attacks target one victim server connected to the backbone router of the campus. Attack packets have randomly generated spoofed source IP addresses. The data trace was recorded on the backbone and included over 2000 active hosts. The average packet rate was around 18000 packets per second.

    45 views
  • Security
  • Last Updated On: 
    Wed, 10/09/2019 - 10:07

    We created various types of network attacks in an Internet of Things (IoT) environment for academic purposes. Two typical smart home devices, SKT NUGU (NU 100) and EZVIZ Wi-Fi Camera (C2C Mini O Plus 1080P), were used. All devices, including some laptops and smartphones, were connected to the same wireless network. The dataset consists of 42 raw network packet files (pcap) captured at different points in time.

    * The packet files were captured using the monitor mode of a wireless network adapter. The wireless headers were removed with Aircrack-ng.

    742 views
  • Security
  • Last Updated On: 
    Fri, 09/27/2019 - 04:57

    This dataset contains Cyber Threat Intelligence (CTI) data generated from public security reports and malware repositories.

    The dataset is stored in a structured format (JSON) and includes approximately 640,000 records from 612 security reports published from January 2008 to June 2019.

    The dataset contains several data types, such as URLs, hosts, IP addresses, e-mail accounts, hashes (MD5, SHA1, and SHA256), common vulnerabilities and exposures (CVE), registry entries, file names ending with specific extensions, and program database (PDB) paths.

    269 views
  • Security
  • Last Updated On: 
    Sun, 10/06/2019 - 07:02

    The FFT-75 dataset contains randomly sampled, potentially overlapping file fragments from 75 popular file types (see details below). To the best of our knowledge, it is the most diverse and balanced dataset available. The dataset is labeled with class IDs and is ready for training supervised machine learning models. We distinguish 6 scenarios with different granularity and provide variants with 512-byte and 4096-byte blocks. In each case, we sampled a balanced dataset and split the data as follows: 80% for training, 10% for testing and 10% for validation.
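The sampling and splitting described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the dataset authors' pipeline: fragment offsets are drawn uniformly at random (so fragments may overlap), and the split is a plain shuffle without per-class balancing. Both function names are hypothetical.

```python
import random


def sample_fragments(data: bytes, block_size: int, count: int, rng: random.Random):
    """Draw `count` fragments of `block_size` bytes at random offsets.

    Offsets are independent, so fragments can overlap each other.
    """
    if len(data) < block_size:
        raise ValueError("file is shorter than one block")
    last_start = len(data) - block_size
    starts = [rng.randint(0, last_start) for _ in range(count)]
    return [data[s:s + block_size] for s in starts]


def split_80_10_10(items, rng: random.Random):
    """Shuffle and split into 80% training, 10% testing, 10% validation."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_test = len(shuffled) // 10
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder goes to validation
    return train, test, validation
```

Passing a seeded `random.Random` instance keeps the sampling reproducible; a balanced dataset would additionally require drawing the same number of fragments per file type.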

    214 views
  • Security
  • Last Updated On: 
    Wed, 08/07/2019 - 16:56
