COVID-19 Vaccine Misinformation Aspects

- Submitted by: Heba Ismail
- Last updated: Wed, 02/26/2025 - 07:44
- DOI: 10.21227/n4s4-6278
Abstract
The COVID-19 Vaccine Misinformation Aspects Dataset contains 3,824 English tweets discussing COVID-19 vaccine misinformation, collected from Twitter/X between December 31, 2020, and July 8, 2021. Each tweet is manually annotated with one or more of four misinformation aspects: (1) Vaccine Constituent, (2) Adverse Effects, (3) Agenda-Driven Narratives, and (4) Efficacy and Clinical Trials. Multiple raters validated the annotations, and only labels on which the raters reached consensus are included in the final dataset. The annotation guidelines draw on reliable medical sources, including the Centers for Disease Control and Prevention (CDC).
This dataset is designed to facilitate natural language processing (NLP) research, particularly in misinformation detection, aspect-based sentiment analysis (ABSA), and social media discourse analysis. It supports studies in public health communication, machine learning, and computational social science, offering insights into vaccine hesitancy and misinformation trends. By enabling the development of AI-driven misinformation detection models, this dataset contributes to evidence-based public health strategies for combating vaccine-related misinformation.
Key Dataset Information:
- Time Period: December 31, 2020 – July 8, 2021
- Source: Twitter/X
- Annotations: Each tweet is labeled with one or more misinformation aspects, validated against reliable medical sources (CDC).
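Because a tweet can carry more than one aspect, the labels lend themselves to a multi-hot encoding for multi-label classification. The following is a minimal sketch only: the four aspect names come from the description above, while the semicolon-separated label format, the `encode_aspects` helper, and the sample rows are assumptions made for illustration. The actual file layout is described in the readme.

```python
# The four aspect names come from the dataset description. Everything else
# (the semicolon-separated label format, the sample rows) is assumed for
# illustration and may not match the released files.
ASPECTS = [
    "Vaccine Constituent",
    "Adverse Effects",
    "Agenda-Driven Narratives",
    "Efficacy and Clinical Trials",
]


def encode_aspects(label_field, separator=";"):
    """Turn a label string such as 'Adverse Effects; Vaccine Constituent'
    into a multi-hot vector ordered like ASPECTS."""
    labels = {part.strip() for part in label_field.split(separator) if part.strip()}
    unknown = labels - set(ASPECTS)
    if unknown:
        raise ValueError(f"Unrecognized aspect label(s): {sorted(unknown)}")
    return [1 if aspect in labels else 0 for aspect in ASPECTS]


# Hypothetical annotated rows (placeholders, not real tweets from the dataset).
sample_rows = [
    ("placeholder tweet text A", "Adverse Effects"),
    ("placeholder tweet text B", "Vaccine Constituent; Agenda-Driven Narratives"),
]

encoded = [(text, encode_aspects(labels)) for text, labels in sample_rows]
for text, vector in encoded:
    print(vector, text)
```

Each vector has one position per aspect, so a tweet labeled with two aspects has two entries set to 1. Rejecting unrecognized labels, rather than silently dropping them, helps catch spelling differences between this sketch and the real annotation files.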
Please read the readme file for instructions.
Documentation
Attachment | Size |
---|---|
| | 172.39 KB |