ADFLOW: SITE-D|IMG-D|TEXT-D

Citation Author(s):
Yuxuan Shang
Submitted by:
Yuxuan Shang
Last updated:
Sat, 07/20/2024 - 08:46
DOI:
10.21227/w9fj-9f96

Abstract 

This upload contains part of the datasets used in the paper "ADFLOW: Integrated and Comprehensive Ad Detection Considering Relationships Among Webpage Elements": SITE-D, IMG-D, and TEXT-D, plus some PageGraphs extracted from websites in SITE-D.

The IMG-D dataset is large and we have not yet finished organizing it, so only a portion is included here. Similarly, because the full PageGraph dataset used in our experiments is very large, we have uploaded the PageGraphs of only a few hundred websites.

Instructions: 

The SITE-D table contains a list of 14,250 websites used for all site-level experiments in the paper. IMG-D includes a portion of the ad and non-ad images we collected from SITE-D. TEXT-D contains the ad and non-ad text we collected from SITE-D.
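
As a quick sanity check after downloading, the SITE-D list can be loaded and its size compared with the 14,250 websites stated above. The following is only a minimal sketch: it assumes the table is a CSV file with one site per row, and the file name `site_d.csv`, the column index, and the header row are hypothetical, since the file layout is not described here.

```python
import csv
from pathlib import Path

# Site count stated in the dataset description.
EXPECTED_SITE_COUNT = 14_250


def load_site_list(path, column=0, has_header=True):
    """Read one column of site names from a CSV file.

    The CSV layout (column index, header row) is an assumption,
    not something documented by the dataset page.
    Names are lower-cased and de-duplicated in first-seen order.
    """
    sites = []
    seen = set()
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the header row, if any
        for row in reader:
            if len(row) <= column:
                continue  # blank or short row
            site = row[column].strip().lower()
            if site and site not in seen:
                seen.add(site)
                sites.append(site)
    return sites


def matches_expected_count(sites, expected=EXPECTED_SITE_COUNT):
    """Return True when the number of loaded sites equals the expected count."""
    return len(sites) == expected
```

For example, `matches_expected_count(load_site_list("site_d.csv"))` would report whether the (hypothetical) file holds the full list. A mismatch may simply mean the real file uses a different layout than assumed here.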