Zip File containing .json

Network traffic analysis, i.e., the set of procedures for distilling information from network traffic, is the enabler for highly valuable profiling information and the workhorse for several key network management tasks. Although the rising share of traffic generated by mobile and hand-held devices is transforming the field, existing design solutions are mainly evaluated on private traffic traces, and only a few public datasets are available, which limits repeatability and further advances on the topic.

  • Communications
  • Last Updated On: Mon, 10/07/2019 - 10:02

    This dataset contains Cyber Threat Intelligence (CTI) data generated from public security reports and malware repositories.

    The dataset is stored in a structured format (JSON) and includes approximately 640,000 records from 612 security reports published from January 2008 to June 2019.

    Several data types are contained in this dataset such as URL, host, IP address, e-mail account, hashes (MD5, SHA1, and SHA256), common vulnerabilities and exposures (CVE), registry, file names ending with specific extensions, and the program database (PDB) path.
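As a rough illustration of how the indicator types listed above could be told apart once the records are loaded, the sketch below classifies a raw string by its shape. It is not the dataset authors' method, and the dataset's actual JSON schema is not described here, so the function name `classify_indicator`, the type labels, and the patterns are all assumptions made for illustration. The only facts relied on are the standard lengths of hex-encoded MD5, SHA1, and SHA256 digests (32, 40, and 64 characters) and the `CVE-YYYY-NNNN` identifier format.

```python
import ipaddress
import re

# Hypothetical labels and patterns; the dataset's real schema may differ.
HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256"}  # hex digest lengths
HEX_RE = re.compile(r"[0-9a-fA-F]+")
CVE_RE = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)
URL_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*://\S+")
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def classify_indicator(value: str) -> str:
    """Guess the indicator type of a single string from its shape."""
    value = value.strip()

    # Hashes: pure hex strings of a known digest length.
    if HEX_RE.fullmatch(value) and len(value) in HASH_LENGTHS:
        return HASH_LENGTHS[len(value)]

    if CVE_RE.fullmatch(value):
        return "cve"

    # IPv4 or IPv6 address, validated by the standard library.
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        pass

    if URL_RE.fullmatch(value):
        return "url"
    if EMAIL_RE.fullmatch(value):
        return "email"

    # Program database (PDB) paths end with the .pdb extension.
    if value.lower().endswith(".pdb"):
        return "pdb_path"

    return "other"
```

For example, `classify_indicator("CVE-2017-0144")` returns `"cve"` and `classify_indicator("192.0.2.1")` returns `"ip"`. Shape-based checks like these are only a first pass: hosts, registry entries, and file names would need rules derived from the real records.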

  • Security
  • Last Updated On: Sun, 10/06/2019 - 07:02

    WiFi measurements dataset for WiFi fingerprint indoor localization, collected on the ground and first floors of the Escuela Técnica Superior de Ingeniería Informática in Seville, Spain. The facility covers approximately 24,000 m², although only accessible areas were surveyed.

  • Communications
  • Last Updated On: Tue, 09/10/2019 - 08:49

    Code duplicates in large code corpora have adverse effects on the evaluation and use of machine learning models that rely on them. Most existing corpora suffer from this problem to some extent. This dataset contains a "duplication" index for some of the existing corpora in Big Code research. The method for collecting this dataset is described in "The Adverse Effects of Code Duplication in Machine Learning Models of Code" by Allamanis [arXiv, to appear in SPLASH 2019].


  • Computational Intelligence
  • Last Updated On: Thu, 06/27/2019 - 11:47