DFND : Dravidian_Fake News Data

Citation Author(s):
Eduri Raja, National Institute of Technology Silchar, India
Badal Soni, National Institute of Technology Silchar, India
Samir Kumar Borgohain, National Institute of Technology Silchar, India
Submitted by:
Eduri Raja
Last updated:
Tue, 06/20/2023 - 07:23
DOI:
10.21227/nj13-t949

Abstract 

 

DFND is a dataset for detecting fake news in four Dravidian languages: Telugu, Kannada, Tamil, and Malayalam. We collected the data from two kinds of sources: real news articles were scraped from news websites such as Eenadu, Dinamalar, Kannadaprabha, and Malayala Manorama; fake news articles were scraped from fact-checking websites such as Factly and Fact Crescendo. The collection period runs from January 2021 to December 2022. After collection, we preprocessed the data with a script we designed, and the preprocessed data was then annotated by experts in each of the corresponding languages. The released DFND dataset is the preprocessed version. It contains more than 27,000 news articles, of which 50% are fake and 50% are real.

Instructions: 

The DFND.zip archive contains the dataset for all four Dravidian languages. It has four folders: (1) Telugu, (2) Tamil, (3) Kannada, and (4) Malayalam. Each folder has two files: (1) fake.csv and (2) true.csv.

Each CSV file has two columns: text and label.

text: A claim published in the media by a person or an organization.

label: The class of the sample (fake or real news).
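
The layout described above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: it assumes the archive has been extracted so that the four language folders sit directly under one root directory, that the folder and file names match the spelling used in these instructions, that the files are UTF-8 encoded, and that each file has a header row naming the text and label columns. The function name load_dfnd and the added language and source_file fields are illustrative only.

```python
import csv
from pathlib import Path

# Assumed layout after extracting DFND.zip (names taken from the description
# above; the exact spelling and case on disk may differ):
#   <root>/<Language>/fake.csv and <root>/<Language>/true.csv
LANGUAGES = ("Telugu", "Tamil", "Kannada", "Malayalam")
FILES = ("fake.csv", "true.csv")


def load_dfnd(root):
    """Read every available CSV under root into a list of row dicts."""
    rows = []
    for language in LANGUAGES:
        for name in FILES:
            path = Path(root) / language / name
            if not path.is_file():
                continue  # skip anything missing instead of failing
            # utf-8-sig is an assumption: it reads plain UTF-8 and also
            # tolerates a leading byte-order mark.
            with path.open(newline="", encoding="utf-8-sig") as handle:
                for record in csv.DictReader(handle):
                    rows.append({
                        "language": language,     # added for convenience
                        "source_file": name,      # fake.csv or true.csv
                        "text": record.get("text", ""),
                        "label": record.get("label", ""),
                    })
    return rows
```

Calling load_dfnd("DFND") on the extracted folder returns one dictionary per article. Keeping the language and source file on each row makes it easy to check the class balance per language or to compare the label column against the file the row came from.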