Dalhousie NIMS Lab VPN 2024 Dataset

Citation Author(s): Haotian Liu, Riyad Alshammari, Nur Zincir-Heywood
Submitted by: Haotian Liu
Last updated: Thu, 09/12/2024 - 15:00
DOI: 10.21227/08yh-pz10

Abstract 

This dataset contains real-world VPN-encrypted traffic flows captured from five applications in four service categories, reflecting typical usage patterns of everyday users.

We used a set of automated scripts to simulate real-world user interactions with these applications, which keeps noise and irrelevant network traffic low.

 

The dataset consists of flow data collected from four service categories:

Instant Messaging (Slack for both desktop and Android)

Streaming (Twitch for desktop, and TikTok for Android)

Browsing (Chrome for both desktop and Android)

Cloud Storage Service (Google Drive for both desktop and Android)

 

These flows were extracted from .pcap captures with Tranalyzer. The dataset is organized into service-specific folders, each containing both the “with VPN” and “without VPN” scenarios.
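As a rough illustration of navigating this layout, the sketch below walks a local copy of the dataset and groups the flow files by the folder they sit in. It does not assume any particular folder names: the exact names of the service and scenario folders are given in the readme file, and the function name `index_flow_files` and the root path are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def index_flow_files(root):
    """Map each folder (relative to root) to the sorted .txt files inside it.

    Assumption: flow files are the .txt files below `root`. A readme stored
    as .txt would be listed too, so filter the result if needed.
    """
    root = Path(root)
    index = defaultdict(list)
    for path in sorted(root.rglob("*.txt")):
        folder = path.parent.relative_to(root).as_posix()
        index[folder].append(path.name)
    return dict(index)


if __name__ == "__main__":
    # "dataset_root" is a placeholder for wherever the archive was unpacked.
    for folder, files in index_flow_files("dataset_root").items():
        print(folder, len(files))
```

The relative folder path of each file can then serve as a label source (service category and VPN scenario) once the actual folder names have been checked against the readme.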

Our paper describes the setup and methodology in detail, and the readme file explains the dataset's structure.

This dataset is a resource for studying the patterns of VPN-encrypted traffic in real-world contexts.

 

Instructions: 

Data Format: The data files are tab-separated .txt files and can be safely converted to .csv. The first row of each flow file contains the names of the flow-level characteristics, and the first column lists the flows.
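One possible way to do the conversion is sketched below using only the Python standard library. The function name `flows_txt_to_csv` and the file names are illustrative assumptions, not part of the dataset. The reader treats quote characters literally so that tab-separated fields are copied unchanged, and the CSV writer quotes any field that happens to contain a comma.

```python
import csv


def flows_txt_to_csv(txt_path, csv_path):
    """Convert one tab-separated flow file to CSV.

    Returns (header, row_count), where header is the first row
    (the flow-level characteristic names) and row_count is the
    number of rows that follow it.
    """
    with open(txt_path, newline="", encoding="utf-8") as src, \
         open(csv_path, "w", newline="", encoding="utf-8") as dst:
        # QUOTE_NONE: split on tabs only, do not interpret quote characters.
        reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        writer = csv.writer(dst)  # default quoting protects embedded commas
        header = next(reader)
        writer.writerow(header)
        row_count = 0
        for row in reader:
            writer.writerow(row)
            row_count += 1
    return header, row_count


if __name__ == "__main__":
    # Placeholder file names; substitute a real flow file from the dataset.
    header, n = flows_txt_to_csv("flows.txt", "flows.csv")
    print(len(header), "columns,", n, "flows")
```

Streaming row by row keeps memory use flat for large flow files; a dataframe library would work equally well if the files fit in memory.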

If you use this dataset in your research, please cite the accompanying IEEE paper: "Edge-Cloud VPN Traffic Analysis Over Cross Platforms".

Please also include the dataset DOI in your citations.