network service traffic

Submitted by: Kai-Cheng Chiu
Last updated: Tue, 05/17/2022 - 22:17


The dataset consists of two parts: private and public traffic.

The private traffic is self-captured network traffic from several applications and services, such as YouTube, Skype, and streaming video, in 16 categories in total.

The public traffic is an open VPN dataset covering numerous VPN and non-VPN network services, in 24 categories in total.



The zip file contains two folders: public and private.

Each text file in a folder holds the traffic of a single service, and each line within a text file represents a single packet written as a string of bytes.

Every two hexadecimal characters form one byte value, such as '00' or 'd5'.
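As an illustration of the line format described above, the following minimal sketch decodes one line of a traffic text file into raw bytes. The function name `parse_packet_line` is hypothetical, and the sketch assumes each line contains only hexadecimal characters plus a trailing newline.

```python
def parse_packet_line(line: str) -> bytes:
    """Decode one text line (two hex characters per byte) into raw bytes."""
    # strip() drops the trailing newline; bytes.fromhex raises ValueError
    # if the remaining text is not a valid sequence of hex digit pairs.
    return bytes.fromhex(line.strip())


packet = parse_packet_line("00d5\n")
print(list(packet))  # byte values as integers in the range 0 to 255
```

Reading a whole service file then amounts to calling this function on every non-empty line.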

Packet lengths vary, so the raw packets cannot be fed directly to a machine learning algorithm that expects fixed-length input.

The _normalized_traffic.csv file in each folder holds the sampled, normalized traffic as floating-point values ranging from 0 to 1 (scaled from the original byte values of 0 to 255).

Every record in the csv file has a fixed length of 1500 (the MTU): longer packets are truncated and shorter ones are padded.
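The normalization described above can be sketched roughly as follows. This is not the author's conversion script: the function name `normalize_packet` is hypothetical, and the padding value of 0 is an assumption, since the description does not say which byte is used for padding.

```python
MTU = 1500  # fixed record length stated in the dataset description


def normalize_packet(packet: bytes, length: int = MTU, pad_value: int = 0) -> list:
    """Truncate or pad a packet to `length` bytes and scale each byte to [0, 1]."""
    fixed = packet[:length]                                      # cut off extra bytes
    fixed = fixed + bytes([pad_value]) * (length - len(fixed))   # pad short packets (assumed value 0)
    return [b / 255 for b in fixed]                              # 0..255 -> 0.0..1.0


record = normalize_packet(bytes.fromhex("00d5ff"))
print(len(record), record[:4])
```

Dividing by 255 maps the byte value 255 to exactly 1.0, which matches the stated range of 0 to 1.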

The last column of each csv file is the service category, encoded as an integer whose range depends on the number of services.
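A minimal sketch of splitting each csv record into features and a label is shown below. It assumes a comma-delimited file with no header row and a label that may be written either as `3` or `3.0`; neither detail is confirmed by the description. The name `read_records` is hypothetical, and a small in-memory sample stands in for a real file.

```python
import csv
import io


def read_records(handle):
    """Yield (features, label) pairs from an open normalized-traffic csv."""
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        features = [float(v) for v in row[:-1]]  # normalized byte values
        label = int(float(row[-1]))              # last column: service category
        yield features, label


# Tiny stand-in for a real file: 3 feature columns instead of 1500.
sample = io.StringIO("0.0,0.5,1.0,3\n0.25,0.75,0.0,7\n")
records = list(read_records(sample))
print(records)
```

With a real file, the same function can be given a handle opened with `open(path, newline="")`.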

The raw_private zip file contains the raw pcap files of each self-captured service's traffic before conversion.


Why can't I access this dataset? I've subscribed.

Submitted by Jin Wang on Fri, 11/06/2020 - 23:16