Standard Dataset
ADFLOW: SITE-D|IMG-D|TEXT-D
- Submitted by: Yuxuan Shang
- Last updated: Sat, 07/20/2024 - 08:46
- DOI: 10.21227/w9fj-9f96
Abstract
This upload contains parts of the datasets used in the paper "ADFLOW: Integrated and Comprehensive Ad Detection Considering Relationships Among Webpage Elements": SITE-D, IMG-D, and TEXT-D, plus some PageGraphs extracted from websites in SITE-D.
The IMG-D dataset is large and we have not yet finished organizing it, so only a portion of it is included here. Similarly, because the full PageGraph dataset used in our experiments is very large, we have uploaded the PageGraphs of only a few hundred websites.
Instructions:
The SITE-D table contains a list of 14,250 websites used for all site-level experiments in the paper. IMG-D includes a portion of the ad and non-ad images we collected from SITE-D. TEXT-D contains the ad and non-ad text we collected from SITE-D.
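As a starting point for working with the SITE-D table, the sketch below reads a site list and returns the unique, lower-cased entries. It is only an illustration: the CSV layout, the column name `site`, and the helper `load_site_list` are assumptions, since the file format is not described on this page, and the sample rows are made up rather than taken from the dataset.

```python
import csv
import io


def load_site_list(handle, column="site"):
    """Return unique, lower-cased site names from a CSV file object.

    The column name is an assumption; adjust it to match the actual
    header of the SITE-D table.
    """
    reader = csv.DictReader(handle)
    seen = set()
    sites = []
    for row in reader:
        site = (row.get(column) or "").strip().lower()
        # Skip empty cells and entries already seen (case-insensitive).
        if site and site not in seen:
            seen.add(site)
            sites.append(site)
    return sites


# Tiny in-memory stand-in for the real table (made-up rows).
sample = io.StringIO("site\nexample.com\nExample.com\nexample.org\n")
sites = load_site_list(sample)
print(len(sites), sites)
```

With the real file, the same function could be called on an opened file object, for example `load_site_list(open(path, newline=""))`, where `path` points to wherever the SITE-D table was saved. If the loader and the header match the file, the result should contain the 14,250 sites mentioned above.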