5G Traffic Datasets

Citation Author(s):
Kwangwoon University
Submitted by:
Yong-Hoon Choi
Last updated:
Mon, 10/02/2023 - 23:17


We created a 5G dataset by measuring traffic directly on the 5G network of a major mobile operator in South Korea. The mobile terminal used for the measurements is the Samsung Galaxy A90 5G, equipped with a Qualcomm Snapdragon X50 5G modem. We installed PCAPdroid, a packet-sniffing application, on the terminal from Google Play. Traffic was measured sequentially, one application at a time, on two stationary terminals (only one terminal was used for non-interactive services). No background traffic was mixed in, so that each trace reflects the characteristics of a single traffic type. The dataset covers the traffic types listed in the table below, including resource-intensive video traffic, which has the greatest impact on 5G network planning and provisioning.

The video streaming dataset contains traffic measured while watching Netflix and Amazon Prime Video, two representative over-the-top (OTT) services, on the mobile devices. The live streaming dataset was measured while watching YouTube Live and two popular South Korean live broadcast services (Naver NOW and AfreecaTV). Video conferencing traffic was measured during live meetings on Zoom, MS Teams, and Google Meet. Two types of metaverse traffic were collected: Zepeto and Roblox. Zepeto traffic was collected while staying in 'Camping' for about 15 hours. Roblox traffic was collected by playing 'Collect All Pets' for about 25 hours using an auto-clicker. We also collected two types of mobile network gaming traffic. The first is cloud gaming, in which video games run on remote servers and are streamed directly to the user's device. The second is a typical mobile game connected to the Internet.

The dataset was collected from May to October 2022, has a total length of 328 hours, and is provided in CSV format. It is a timestamped time series of packet header information. Because each record includes source and destination addresses, the traffic can be analyzed further by application.
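As a rough illustration of how a timestamped packet-header CSV of this kind can be turned into a time series, the sketch below sums packet lengths into 1-second bins and totals bytes per source address. The column names (`Time`, `Source`, `Destination`, `Protocol`, `Length`) are assumptions modeled on a Wireshark-style export, not the confirmed header of these files, and the sample rows are synthetic. Adjust both to match the actual CSV.

```python
import csv
import io
from collections import defaultdict

# Assumed column names (Wireshark-style export); check the real CSV header.
TIME_COL = "Time"      # seconds since the start of the capture
LEN_COL = "Length"     # packet length in bytes
SRC_COL = "Source"     # source address


def bytes_per_second(rows, time_col=TIME_COL, len_col=LEN_COL):
    """Sum packet lengths into 1-second bins keyed by integer second."""
    bins = defaultdict(int)
    for row in rows:
        second = int(float(row[time_col]))
        bins[second] += int(row[len_col])
    return dict(sorted(bins.items()))


def bytes_per_source(rows, src_col=SRC_COL, len_col=LEN_COL):
    """Total bytes sent by each source address."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[src_col]] += int(row[len_col])
    return dict(totals)


# Synthetic rows for illustration only; they are not taken from the dataset.
SAMPLE = """Time,Source,Destination,Protocol,Length
0.10,10.0.0.2,203.0.113.5,UDP,1200
0.75,203.0.113.5,10.0.0.2,UDP,1350
1.20,203.0.113.5,10.0.0.2,UDP,1350
2.90,10.0.0.2,203.0.113.5,UDP,90
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
per_second = bytes_per_second(rows)
per_source = bytes_per_source(rows)
print(per_second)   # bytes observed in each 1-second bin
print(per_source)   # bytes sent per source address
```

To use it on a real file, replace the `io.StringIO(SAMPLE)` reader with `open(path, newline="")`. For multi-gigabyte files, it is likely better to iterate over the reader directly than to build a list of all rows.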


All files have been converted to CSV format so that they can be used directly in machine learning pipelines. The detailed composition of the dataset is presented in the table below:

(Note: A machine learning model that generates 5G traffic after training on this dataset is available on IEEE Code Ocean under the title "ML-Based 5G Traffic Generation for Practical Simulations Using Open Datasets".)

Type                Application         Protocol  Duration     File size
------------------  ------------------  --------  -----------  ---------
Live Streaming      YouTube Live        GQUIC     20h 19m 38s  0.73 GB
Live Streaming      AfreecaTV           TCP       20h 14m 00s  4.06 GB
Live Streaming      Naver NOW           TCP       33h 50m 34s  12.48 GB
Stored Streaming    YouTube             QUIC      22h 59m 51s  1.12 GB
Stored Streaming    Netflix             TCP       24h 43m 02s  0.74 GB
Stored Streaming    Amazon Prime Video  TCP       32h 39m 10s  1.54 GB
Video Conferencing  Zoom                UDP       26h 12m 53s  3.36 GB
Video Conferencing  MS Teams            UDP       28h 17m 27s  3.71 GB
Video Conferencing  Google Meet         UDP       24h 01m 40s  4.41 GB
Metaverse           Zepeto              TCP       15h 28m 36s  0.16 GB
Metaverse           Roblox              RakNet    25h 04m 11s  0.11 GB
Online Game         Teamfight Tactics   UDP       13h 46m 53s  0.24 GB
Online Game         Battleground        UDP       16h 02m 57s  0.38 GB
Game Streaming      GeForce Now         UDP       12h 26m 21s  7.05 GB
Game Streaming      KT GameBox          UDP       12h 23m 26s  4.36 GB
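The per-application figures in the table above can be cross-checked against the stated total length. The following sketch copies the durations and file sizes from the table and sums them. The helper name `to_seconds` is introduced here for illustration only.

```python
import re

# (application, duration, file size in GB), copied from the table above.
TRACES = [
    ("YouTube Live", "20h 19m 38s", 0.73),
    ("AfreecaTV", "20h 14m 00s", 4.06),
    ("Naver NOW", "33h 50m 34s", 12.48),
    ("YouTube", "22h 59m 51s", 1.12),
    ("Netflix", "24h 43m 02s", 0.74),
    ("Amazon Prime Video", "32h 39m 10s", 1.54),
    ("Zoom", "26h 12m 53s", 3.36),
    ("MS Teams", "28h 17m 27s", 3.71),
    ("Google Meet", "24h 01m 40s", 4.41),
    ("Zepeto", "15h 28m 36s", 0.16),
    ("Roblox", "25h 04m 11s", 0.11),
    ("Teamfight Tactics", "13h 46m 53s", 0.24),
    ("Battleground", "16h 02m 57s", 0.38),
    ("GeForce Now", "12h 26m 21s", 7.05),
    ("KT GameBox", "12h 23m 26s", 4.36),
]


def to_seconds(text):
    """Convert a duration such as '20h 19m 38s' to seconds."""
    h, m, s = map(int, re.fullmatch(r"(\d+)h (\d+)m (\d+)s", text).groups())
    return h * 3600 + m * 60 + s


total_s = sum(to_seconds(duration) for _, duration, _ in TRACES)
total_gb = round(sum(size for _, _, size in TRACES), 2)

hours, rem = divmod(total_s, 3600)
minutes, seconds = divmod(rem, 60)
print(f"{hours}h {minutes:02d}m {seconds:02d}s, {total_gb} GB")
```

The durations add up to 328h 30m 39s, consistent with the total length of 328 hours given in the description, and the CSV files total about 44.45 GB. Note that the file sizes describe the CSV files themselves, not the volume of traffic carried on the network.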


Funding Agency: 
Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korea government (MSIT); National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT)
Grant Number: 
No. 2021-0-00092, No. 2021R1F1A1064080