*.pcap (zip)

NATed Network Traffic Collection Datasets of Terminal Devices

An experimental environment with a three-level network topology was constructed based on a typical home network. An optical network terminal (Bell XE-140W-TD) was used as the starting point for Internet access, and a Gigabit wired connection was established with the core router (Huawei TC30) via a shielded Cat 6 cable. This router is a Huawei beta version that can obtain specific device information and the corresponding traffic after NAT. The experimental evaluation of traffic attribution association in subsequent chapters will use this as ground-truth label data for comparison.

Categories:

Artificial Intelligence

Cloud Telescope Internet Background Radiation - October 2023 - February 2024

This dataset results from a 5-month Cloud Telescope Internet Background Radiation collection experiment conducted from October 2023 to February 2024.
A total of 130 EC2 instances (sensors) were deployed across all 26 AWS regions commercially available at the time, 5 sensors per region.
A Cloud Telescope sensor does not serve information. All traffic arriving to the sensor is unsolicited, and potentially malicious. Sensors were configured to allow all unsolicited traffic.

Categories:

Cloud Telescope Internet Background Radiation August 2023

This dataset results from a 47-day Cloud Telescope Internet Background Radiation collection experiment conducted during August and September 2023. A total of 260 EC2 instances (sensors) were deployed across all 26 AWS regions commercially available at the time, 10 sensors per region. A Cloud Telescope sensor does not serve information. All traffic arriving at the sensor is unsolicited and potentially malicious. Sensors were configured to allow all unsolicited traffic.

Categories:

A Dataset of Network Traffic Collected During Large-Scale Human Genome Sequence Analysis

This dataset contains .pcap files collected during the execution of variant calling on a large number of human genomes using a cluster. The GATK4 variant calling pipeline was executed using AVAH in two testbeds, CloudLab and FABRIC. A 16-node cluster was used on CloudLab, and an 8-node cluster was used on FABRIC. The files were collected by running tcpdump on the network interfaces of the nodes.
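As a rough illustration of what such capture files hold, the sketch below walks the packet records of a classic pcap file using only the Python standard library. It assumes the classic libpcap layout (a 24-byte global header followed by 16-byte record headers) and does not handle pcapng. The function name `read_pcap_records` is a hypothetical helper, not something shipped with the dataset.

```python
import struct

MAGIC_USEC = 0xA1B2C3D4  # classic pcap, microsecond timestamps
MAGIC_NSEC = 0xA1B23C4D  # classic pcap, nanosecond timestamps


def read_pcap_records(data):
    """Yield (timestamp, captured_len, original_len) for each packet record."""
    if len(data) < 24:
        raise ValueError("too short to hold the 24-byte pcap global header")
    # The magic number is stored in the byte order of the capturing host.
    for endian in ("<", ">"):
        magic = struct.unpack(endian + "I", data[:4])[0]
        if magic in (MAGIC_USEC, MAGIC_NSEC):
            break
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled)")
    ticks_per_second = 1_000_000_000 if magic == MAGIC_NSEC else 1_000_000
    offset = 24
    while offset + 16 <= len(data):
        # Record header: seconds, sub-second ticks, captured and original length.
        ts_sec, ts_frac, incl_len, orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16])
        offset += 16 + incl_len
        if offset > len(data):
            break  # final record is truncated; stop quietly
        yield ts_sec + ts_frac / ticks_per_second, incl_len, orig_len
```

To use it on a file, read the bytes in binary mode and pass them in, for example `list(read_pcap_records(open(path, "rb").read()))`, where `path` points at one of the extracted captures.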

Categories:

Encrypted Mobile Instant Messaging Traffic Dataset

Update (07/30/2023): The dataset has been updated to be more realistic with specific characteristics described in [*].

We collect encrypted traffic from six widely-used Instant Messaging Applications (IMAs) installed on an Android device for descriptive and statistical analysis, as presented in our papers [*][**]. In particular, we collect traffic from:

1. Microsoft Teams,

2. Discord,

3. Facebook Messenger,

4. Signal,

5. Telegram, and

6. WhatsApp.

Categories:

Communications

Video identification in encrypted network traffic dataset (VPN)

This dataset is used for identifying videos in Internet traffic. The dataset was prepared using Wireshark. It comprises two types of traffic data: VPN (Virtual Private Network) or encrypted traffic, and Non-VPN or unencrypted traffic. The dataset consists of the data streams (.pcap) of 43 videos. Each video is played 50 times in both VPN and Non-VPN mode. The streams were obtained by setting up a dummy client on a PC that plays a YouTube video while Wireshark captures the Internet traffic.

Categories:

Video identification in encrypted network traffic dataset

This dataset is used for identifying videos in Internet traffic. The dataset was prepared using Wireshark. It comprises two types of traffic data: VPN (Virtual Private Network) or encrypted traffic, and Non-VPN or unencrypted traffic. The dataset consists of the data streams (.pcap) of 43 videos. Each video is played 50 times in both VPN and Non-VPN mode. The streams were obtained by setting up a dummy client on a PC that plays a YouTube video while Wireshark captures the Internet traffic.

Categories:

DNS Over HTTPS network traffic

The dataset contains traffic generated by single requests towards DNS and DNS over Encryption servers, as well as network traffic generated by browsers towards multiple DNS over HTTPS (DoH) servers. The dataset also contains logs and CSV files with the queried domains. The IP addresses of the DoH servers are provided in the readme so that users can easily label the data extracted from the pcap files. The dataset may be used for Machine Learning purposes (DNS over HTTPS identification).
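The labeling step described above can be sketched as follows. This is a minimal illustration, assuming flows have already been extracted from the pcap files as destination address and port pairs. The addresses in `DOH_SERVERS` are placeholders from the RFC 5737 documentation ranges, not the real resolvers (those are listed in the dataset's readme), and `label_flow` is a hypothetical helper name.

```python
from ipaddress import ip_address

# Placeholder addresses only; replace with the resolver IPs from the readme.
DOH_SERVERS = {ip_address("192.0.2.53"), ip_address("198.51.100.53")}


def label_flow(dst_ip, dst_port, doh_servers=DOH_SERVERS):
    """Return 'doh' for HTTPS flows to a listed resolver, otherwise 'other'."""
    # DoH runs over HTTPS, so only port 443 traffic to a known resolver counts.
    if dst_port == 443 and ip_address(dst_ip) in doh_servers:
        return "doh"
    return "other"
```

Matching on both address and port keeps plain DNS traffic (port 53) to the same host from being labeled as DoH.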

Categories:

Machine Learning

Google Home Pcap

Smart speakers and voice-based virtual assistants are core components for the success of the IoT paradigm. Unfortunately, they are vulnerable to various privacy threats exploiting machine learning to analyze the generated encrypted traffic. To cope with that, deep adversarial learning approaches can be used to build black-box countermeasures altering the network traffic (e.g., via packet padding) and its statistical information.
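To make the idea of packet padding concrete, here is a deliberately simple sketch. It is not the deep adversarial approach mentioned above: it only rounds each packet length up to a fixed block size so that nearby lengths become indistinguishable. The function name `pad_length`, the 128-byte block, and the 1500-byte cap are all illustrative assumptions.

```python
def pad_length(length, block=128, mtu=1500):
    """Round a positive packet length up to a multiple of block, capped at mtu."""
    padded = -(-length // block) * block  # integer ceiling division
    return min(padded, mtu)


def pad_trace(lengths, block=128, mtu=1500):
    """Apply pad_length to every packet length in a trace."""
    return [pad_length(n, block, mtu) for n in lengths]
```

With the defaults, lengths of 100 and 128 bytes both become 128, so a classifier that relies on exact packet sizes sees fewer distinct values, at the cost of extra bytes on the wire.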

Categories:

A network analysis on cloud gaming: Stadia, GeForce Now and PSNow

This dataset is a sample of the dataset used for the paper "A network analysis on cloud gaming: Stadia, GeForce Now and PSNow" and represents samples of the gaming sessions.

To access further data, please contact Gianluca Perna at: gianluca.perna@polito.it

Categories:

Cloud Computing
