Dataset for "Integrating Deep Learning Approaches for Identifying News Reprint Relation"

Citation Author(s):
Yin Luo
Fangfang Wang
Jun Chen
Lei Wang
Daniel Dajun Zeng
Submitted by:
Fangfang Wang
Last updated:
DOI:
10.21227/vwam-dn13
Research Article Link:

Abstract

# of original news: 30
# of candidate news: 25,899
# of reprinted news (no source label): 4,234 (537)

 

Instructions:

This dataset was constructed for news reprint relation identification. It was crawled daily from more than 3,000 news portals between January 1, 2018 and June 30, 2018. It consists of 30 popular original news items in the fields of finance, sports, and technology, together with 25,899 candidate news items that were selected by keyword matching. The reprint relation between each original news item and each of its candidate news items was manually labeled: if the candidate reprints the original, the relation is labeled 1; otherwise it is labeled 0.
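
The labeling scheme above can be illustrated with a minimal Python sketch. The page does not describe the file format, so the record type `CandidatePair`, its field names, and the toy records are all hypothetical and not part of the dataset. The only real figures are the two counts from the abstract, and the class-balance estimate assumes that the 4,234 reprinted items are a subset of the 25,899 candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePair:
    """One (original, candidate) pair; field names are hypothetical."""
    original_id: int   # assumed identifier of the original news item
    candidate_id: int  # assumed identifier of the candidate news item
    label: int         # 1 = candidate reprints the original, 0 = it does not


def reprint_rate(pairs):
    """Return the fraction of pairs labeled 1 (0.0 for an empty list)."""
    if not pairs:
        return 0.0
    return sum(p.label for p in pairs) / len(pairs)


# Toy records for illustration only; these are NOT taken from the dataset.
toy_pairs = [
    CandidatePair(original_id=1, candidate_id=101, label=1),
    CandidatePair(original_id=1, candidate_id=102, label=0),
    CandidatePair(original_id=2, candidate_id=201, label=0),
    CandidatePair(original_id=2, candidate_id=202, label=1),
]
toy_rate = reprint_rate(toy_pairs)

# Counts as stated in the abstract.
N_CANDIDATES = 25899
N_REPRINTS = 4234

# Share of positive (label 1) pairs, assuming reprints are a subset of candidates.
positive_share = N_REPRINTS / N_CANDIDATES
print(f"positive share: {positive_share:.1%}")
```

Under that assumption roughly one candidate in six is a reprint, so a classifier trained on these labels would likely need to account for class imbalance.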