*.pcap (zip)
This dataset results from a five-month Cloud Telescope Internet Background Radiation collection experiment conducted from October 2023 to February 2024.
A total of 130 EC2 instances (sensors) were deployed across all 26 AWS regions commercially available at the time, 5 sensors per region.
A Cloud Telescope sensor does not serve information. All traffic arriving at the sensor is unsolicited and potentially malicious. Sensors were configured to allow all unsolicited traffic.
This dataset results from a 47-day Cloud Telescope Internet Background Radiation collection experiment conducted during August and September 2023. A total of 260 EC2 instances (sensors) were deployed across all 26 AWS regions commercially available at the time, 10 sensors per region. A Cloud Telescope sensor does not serve information. All traffic arriving at the sensor is unsolicited and potentially malicious. Sensors were configured to allow all unsolicited traffic.
This dataset contains .pcap files collected during the execution of variant calling on a large number of human genomes using a cluster. The GATK4 variant calling pipeline was executed using AVAH in two testbeds, CloudLab and FABRIC. A 16-node cluster was used on CloudLab, and an 8-node cluster was used on FABRIC. The files were collected by running tcpdump on the network interfaces of the nodes.
Update (07/30/2023): The dataset has been updated to be more realistic with specific characteristics described in [*].
We collect encrypted traffic from six widely-used Instant Messaging Applications (IMAs) installed on an Android device for descriptive and statistical analysis, as presented in our papers [*][**]. In particular, we collect traffic from:
1. Microsoft Teams,
2. Discord,
3. Facebook Messenger,
4. Signal,
5. Telegram, and
6. WhatsApp.
This dataset is intended for identifying videos in internet traffic. It was prepared using Wireshark and comprises two types of traffic data: VPN (Virtual Private Network), or encrypted, traffic and Non-VPN, or unencrypted, traffic. The dataset consists of the data streams (.pcap) of 43 videos. Each video was played 50 times in both VPN and Non-VPN mode. The streams were obtained by setting up a dummy client on a PC that plays a YouTube video while Wireshark captures the internet traffic.
The dataset contains traffic generated by single requests to DNS and DNS over Encryption servers, as well as network traffic generated by browsers to multiple DNS over HTTPS (DoH) servers. It also contains logs and CSV files listing the queried domains. The IP addresses of the DoH servers are provided in the readme so that users can easily label the data extracted from the pcap files. The dataset may be used for machine learning purposes (DNS over HTTPS identification).
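As a rough illustration of the labeling step mentioned above, the following Python sketch tags a flow as DoH when one of its endpoints matches a known DoH server address. The addresses, the `label_flow` helper, and the flow tuples are hypothetical placeholders, not taken from the dataset; the real server IPs are the ones listed in the dataset readme, and extracting flows from the pcap files is assumed to happen elsewhere.

```python
import ipaddress

# Hypothetical placeholder addresses (documentation ranges, RFC 5737).
# Replace them with the DoH server IPs listed in the dataset readme.
DOH_SERVER_IPS = {"192.0.2.10", "198.51.100.20"}


def label_flow(src_ip, dst_ip, doh_ips=DOH_SERVER_IPS):
    """Return 'doh' if either endpoint is a known DoH server, else 'other'."""
    # Normalize through ipaddress so equivalent spellings compare equal.
    known = {ipaddress.ip_address(ip) for ip in doh_ips}
    endpoints = {ipaddress.ip_address(src_ip), ipaddress.ip_address(dst_ip)}
    return "doh" if endpoints & known else "other"


# Hypothetical (source, destination) pairs already extracted from a capture.
flows = [("10.0.0.5", "192.0.2.10"), ("10.0.0.5", "203.0.113.7")]
labels = [label_flow(src, dst) for src, dst in flows]
print(labels)
```

Matching on IP address alone is a simplification: a server that offers DoH may also serve other HTTPS traffic, so labels produced this way are best treated as a starting point.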
Smart speakers and voice-based virtual assistants are core components for the success of the IoT paradigm. Unfortunately, they are vulnerable to various privacy threats exploiting machine learning to analyze the generated encrypted traffic. To cope with that, deep adversarial learning approaches can be used to build black-box countermeasures altering the network traffic (e.g., via packet padding) and its statistical information.
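To make the idea of packet padding concrete, here is a minimal Python sketch that rounds each packet length up to a multiple of a fixed block size, which hides the exact original sizes from a traffic classifier. This is a simple fixed-block scheme chosen for illustration, not the deep adversarial learning approach described above; the `pad_length` helper, the block size of 128 bytes, and the 1500-byte cap are all assumptions.

```python
def pad_length(length, block=128, mtu=1500):
    """Round a packet length up to the next multiple of `block`.

    The result is capped at `mtu`, but never below the original length.
    Both `block` and `mtu` are illustrative defaults.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    padded = -(-length // block) * block  # ceiling division, then scale back
    return max(length, min(padded, mtu))


# Hypothetical packet sizes in bytes.
original = [100, 128, 129, 1490]
padded = [pad_length(n) for n in original]
print(padded)
```

After padding, packets of 100 and 128 bytes become indistinguishable by size, at the cost of extra bytes on the wire.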
This dataset is a sample of the dataset used for the paper "A network analysis on cloud gaming: Stadia, GeForce Now and PSNow" and contains samples of the gaming sessions.
To access further data, please contact Gianluca Perna at: gianluca.perna@polito.it
Due to the large number of vulnerabilities in information systems and the continuous activity of attackers, techniques for malicious traffic detection are required to identify and protect against cyber-attacks. Therefore, it is important to intentionally operate a cyber environment to be invaded and compromised in order to allow security professionals to analyze the evolution of the various attacks and exploited vulnerabilities.
This dataset includes cyber attacks recorded in the HoneySELK environment in 2016, 2017, and 2018.