Name: Annotated Arabic Extremism Tweets
Creator: Saja Aldera
License: https://creativecommons.org/licenses/by/4.0/
Keywords: Artificial Intelligence

Abstract

We present an Arabic Twitter dataset for online extremism detection, consisting of 89,816 tweets with associated metadata, published by 52,929 unique users. Three experts manually annotated the dataset, achieving a Gwet’s AC1 score of 0.6, indicating substantial inter-annotator agreement. We further analyzed the tweet metadata to identify important features. Of the tweets, 50,279 (56%) from 22,858 unique users were labeled as extremist, and 39,537 (44%) from 30,911 unique users were labeled as non-extremist. We applied Shannon’s entropy measure to check the dataset’s balance and obtained 0.98, which indicates that the dataset is well balanced.
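
As a rough check of the balance figure, the sketch below computes the Shannon entropy of the two class counts given in the abstract. The function name `shannon_entropy` is illustrative and not taken from the dataset authors, and a base-2 logarithm is assumed, so a perfectly balanced two-class dataset scores 1.0. Computed this way, the value lands close to the reported 0.98; any small difference may come from rounding or from a slightly different formula.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (base 2) of a list of class counts."""
    total = sum(counts)
    # Skip empty classes: they contribute nothing to the entropy.
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


# Class counts taken from the abstract.
extremist, non_extremist = 50_279, 39_537
balance = shannon_entropy([extremist, non_extremist])
print(f"Shannon entropy: {balance:.4f}")
```

Values near 1.0 mean the two labels are close to evenly split; a value near 0 would mean one label dominates.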

Instructions:

Important Notes:

> Twitter's content redistribution policy restricts the sharing of tweet information other than tweet IDs and/or user IDs. Twitter wants researchers to always pull fresh data, because a user might delete a tweet or make their profile protected.

> Only the tweet IDs and annotations are available.

> If you need the full dataset, please contact me at: saaldera@ksu.edu.sa
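
Because only tweet IDs are distributed, the tweet text has to be fetched again from Twitter. Below is a minimal sketch of the batching step only, assuming the Twitter API v2 tweet lookup limit of 100 IDs per request. The helper name `batch_ids` and the sample IDs are hypothetical; the request code and credentials are left out.

```python
from typing import Iterable, Iterator, List


def batch_ids(tweet_ids: Iterable[str], size: int = 100) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` tweet IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for tweet_id in tweet_ids:
        # Keep IDs as strings so large values are never rounded.
        batch.append(str(tweet_id))
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


# Placeholder IDs for illustration; real IDs come from the dataset file.
sample_ids = [str(1000 + i) for i in range(250)]
batches = list(batch_ids(sample_ids))
print([len(b) for b in batches])
```

Tweets that have been deleted or protected since collection will not be returned, so the re-fetched set may be smaller than the annotated one.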

Comments

Thank you

Submitted by Yaser Altalhi on Tue, 10/05/2021 - 05:36

You're welcome

Submitted by Saja Aldera on Mon, 10/18/2021 - 17:52

Thank you for sharing. I am looking for the complete dataset to use in my research.

Submitted by Yanis AZIB on Sun, 11/13/2022 - 07:33

Could you please share the dataset with me so I can test my model? I have already sent a request to your email.

Submitted by Zerrouki Khadidja on Fri, 11/25/2022 - 15:23

I am just a student and I need it for learning only.

Submitted by omar mohamed on Fri, 12/20/2024 - 03:25

Dataset Files

Annotated Arabic Extremism Content.xlsx (2.38 MB)
