Encrypted Mobile Instant Messaging Traffic Dataset
- Submitted by: Zolboo Erdenebaatar
- Last updated: Sun, 07/30/2023 - 02:45
- DOI: 10.21227/aer2-kq52
Abstract
Update (07/30/2023): The dataset has been updated to be more realistic with specific characteristics described in [*].
We collect encrypted traffic from six widely used Instant Messaging Applications (IMAs) installed on an Android device for descriptive and statistical analysis, as presented in our papers [*][**]. In particular, we collect traffic from:
1. Microsoft Teams,
2. Discord,
3. Facebook Messenger,
4. Signal,
5. Telegram, and
6. WhatsApp.
The encrypted traffic collected from each of these applications is stored as individual .pcap files. For our research, we extract flows from these .pcap files using Tranalyzer and build statistical models; the flow dataset for each IMA is therefore also included in this dataset. Furthermore, we collect common encrypted mobile traffic that does not result from any IMA, and we use it to test whether IMA and non-IMA traffic can be distinguished. In particular, we collect traffic resulting from web browsing, video streaming, and sending emails. We also save any background traffic that does not correspond to any of these activities. These sets of encrypted traffic are likewise saved as .pcap files, and their flow datasets are included in this dataset. Lastly, we include the text conversations used to generate this dataset for reproduction purposes.
Further details about the method we used to generate and label this traffic set are presented in our demo paper [**] below.
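As a minimal sketch of how the flow datasets might be compared, the Python snippet below reads a tab-separated flow table and computes a per-column count and mean. It assumes the flow files are tab-separated text whose first non-empty line is a header, possibly prefixed with `%`. The column names `duration` and `numBytesSnt` and the sample rows are illustrative assumptions only; check the README and the actual flow files for the real layout.

```python
import csv
import io
from statistics import mean


def read_flows(text_stream):
    """Yield one dict per flow from a tab-separated flow table.

    Assumption: the first non-empty line is the header row, and its
    first field may carry a leading '%' that is stripped here.
    """
    header = None
    for row in csv.reader(text_stream, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        if header is None:
            header = [name.lstrip("%").strip() for name in row]
            continue
        yield dict(zip(header, row))


def summarize(flows, column):
    """Return (count, mean) of the numeric values found in `column`."""
    values = []
    for flow in flows:
        try:
            values.append(float(flow[column]))
        except (KeyError, ValueError):
            continue  # column missing or value not numeric
    return len(values), (mean(values) if values else None)


if __name__ == "__main__":
    # Hypothetical in-memory sample; real flow files have many more columns.
    sample = "%flowInd\tduration\tnumBytesSnt\n1\t2.0\t100\n2\t4.0\t300\n"
    flows = list(read_flows(io.StringIO(sample)))
    print(summarize(flows, "duration"))
```

Running the same summary over an IMA flow file and over the non-IMA flow file gives a first, rough comparison of the two traffic classes before any statistical model is built.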
[*] Zolboo Erdenebaatar, Riyad Alshammari, Bis Nandy, Nabil Seddigh, Nur Zincir-Heywood, and Marwa Elsayed. Depicting Instant Messaging Encrypted Traffic Characteristics through an Empirical Study. In 2023 International Conference on Computer Communications and Networks (ICCCN), 2023.
[*] Zolboo Erdenebaatar, Riyad Alshammari, Nur Zincir-Heywood, Marwa Elsayed, Bis Nandy, and Nabil Seddigh. Analyzing traffic characteristics of instant messaging applications on Android smartphones. In 2023 IEEE/IFIP Network Operations and Management Symposium (NOMS 2023), 2023.
[**] Zolboo Erdenebaatar, Riyad Alshammari, Nur Zincir-Heywood, Marwa Elsayed, Bis Nandy, and Nabil Seddigh. Instant messaging application encrypted traffic generation system. In 2023 IEEE/IFIP Network Operations and Management Symposium (NOMS 2023), 2023.
The .pcap files for each scenario can be extracted from the corresponding .zip archive and are ready to use.
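The extraction step can be sketched in Python with the standard-library `zipfile` module. The helper names `list_pcaps` and `extract_pcaps` and the output directory are hypothetical, and the internal folder layout of each archive is not assumed; the code simply selects every member whose name ends in `.pcap`.

```python
import zipfile
from pathlib import Path


def list_pcaps(archive_path):
    """Return the sorted names of all .pcap members in a .zip archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".pcap")
        )


def extract_pcaps(archive_path, out_dir):
    """Extract only the .pcap members into out_dir and return their paths."""
    out_dir = Path(out_dir)
    members = list_pcaps(archive_path)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(out_dir, members=members)
    return [out_dir / name for name in members]
```

For example, `extract_pcaps("whatsapp_encrypted_traffic.zip", "pcaps/whatsapp")` would unpack the captures from the smallest archive in the file list into a local folder of your choosing.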
Please contact one of these authors to get access to the source code: zolboo@dal.ca or zincir@cs.dal.ca.
Please refer to the README file for further information.
Dataset Files
- discord_encrypted_traffic.zip (83.48 MB)
- messenger_encrypted_traffic.zip (64.01 MB)
- non_ima_encrypted_traffic.zip (598.55 MB)
- signal_encrypted_traffic.zip (163.50 MB)
- teams_encrypted_traffic.zip (460.05 MB)
- telegram_encrypted_traffic.zip (125.32 MB)
- whatsapp_encrypted_traffic.zip (8.48 MB)
Documentation
Attachment | Size |
---|---|
README.md | 2.6 KB |