Website Fingerprinting Dataset of Browsing Network Traffic for Desktop and Mobile Webpages
- Submitted by: Mohamad Amar Ir...
- Last updated: Mon, 10/21/2024 - 14:57
- DOI: 10.21227/8drg-rn32
Abstract
This dataset contains Tor cell files extracted from browsing simulations run with Tor Browser. The simulations cover both desktop and mobile webpages. The data were collected with the WFP-Collector tool (https://github.com/irsyadpage/WFP-Collector). All the configuration necessary to perform the simulations is detailed in the tool repository.
The webpage URLs are taken from the first 100 websites listed at https://dataforseo.com/free-seo-stats/top-1000-websites.
Each webpage URL is visited 90 times in each browsing mode (desktop and mobile).
The captured network traffic is then extracted into Tor cells without removing SENDME cells. Each Tor cell file contains the network request and response traces with the relevant timestamps.
The files are named "X-Y.cell", where "X" is the webpage URL and "Y" is the instance number. The desktop and mobile datasets use the same webpage URLs to ensure comparable content.
Each file contains a list of timestamps and cell directions for one webpage instance.
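As an illustration of the naming scheme, the following minimal Python sketch splits a file name into its "X" and "Y" parts. The function name `parse_cell_name` is hypothetical and not part of WFP-Collector. It assumes that "Y" is a non-negative integer and that "X" may itself contain hyphens, so it splits on the last hyphen only.

```python
from pathlib import Path


def parse_cell_name(path):
    """Return (label, instance) for a file named 'X-Y.cell'.

    Assumption: Y is a non-negative integer and X may contain hyphens,
    so the name is split on the LAST hyphen only.
    """
    name = Path(path).name
    suffix = ".cell"
    if not name.endswith(suffix):
        raise ValueError(f"not a .cell file: {name}")
    # rpartition splits on the last '-' and returns (before, sep, after).
    label, sep, instance = name[: -len(suffix)].rpartition("-")
    if not sep or not label or not instance.isdigit():
        raise ValueError(f"unexpected file name: {name}")
    return label, int(instance)
```

For example, `parse_cell_name("desktop/my-site.org-3.cell")` returns the label `"my-site.org"` and the instance number `3`.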
To read the files:
1. Choose the folder "desktop" or "mobile".
2. In the chosen folder, iterate over the files.
3. Use the class number and the instance number from the file name to determine the appropriate data ingestion process (e.g. feature selection or feature extraction).
4. In each file, iterate over the lines to read the timestamp, cell size (always 1 in this dataset), and cell direction.
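The steps above can be sketched in Python as follows. This is a rough illustration, not the authors' loader. It assumes each line holds two whitespace-separated fields, a timestamp followed by a signed integer whose sign is the cell direction and whose magnitude is the cell size; check this against the actual files before relying on it. The function names `read_cell_file` and `load_mode` are hypothetical.

```python
from pathlib import Path


def read_cell_file(path):
    """Read one .cell file into a list of (timestamp, size, direction).

    Assumed line layout: '<timestamp> <signed value>', where the sign is
    the cell direction and the magnitude is the cell size (1 here).
    """
    cells = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            timestamp = float(fields[0])
            signed = int(float(fields[1]))
            direction = 1 if signed > 0 else -1
            cells.append((timestamp, abs(signed), direction))
    return cells


def load_mode(root, mode):
    """Load every 'X-Y.cell' file under root/mode ('desktop' or 'mobile').

    Returns a dict keyed by (label, instance number).
    """
    traces = {}
    for path in sorted(Path(root, mode).glob("*.cell")):
        # Split on the last hyphen, since the label may contain hyphens.
        label, sep, instance = path.stem.rpartition("-")
        if not sep or not instance.isdigit():
            continue  # file name does not follow the X-Y.cell scheme
        traces[(label, int(instance))] = read_cell_file(path)
    return traces
```

A call such as `load_mode("dataset", "desktop")`, where `"dataset"` is a placeholder for wherever the archive was unpacked, yields one trace per file; the label and instance number in each key can then drive feature selection or feature extraction.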