COVIDSentiRO
- Citation Author(s):
- Submitted by: Alexandra Ciobotaru
- Last updated: Mon, 07/08/2024 - 15:59
- DOI: 10.21227/a7az-aw35
- Data Format:
- License:
- Categories:
- Keywords:
Abstract
COVIDSentiRO contains 19319 Romanian tweets collected between 01.01.2021 and 28.02.2022 using query words related to COVID-19 vaccination. Each tweet has an associated timestamp and is labelled as positive, negative or neutral, with the label predicted using the SART dataset for sentiment analysis.
Instructions:
The COVIDSentiRO dataset contains 19319 Romanian tweets regarding COVID-19. After predicting each tweet's sentiment using the SART dataset, 9718 tweets received the label "negative", 8039 "neutral" and 1562 "positive".
Each tweet is associated with its publication timestamp, its predicted sentiment label and the corresponding probability.
For anonymity reasons, we removed usernames from this dataset.
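
As a quick sanity check on the label counts reported above, the following minimal sketch confirms that the three classes add up to the stated total and computes each class's share. It uses only the numbers given in this description; the variable names are illustrative and do not correspond to anything in the dataset files.

```python
# Label counts as reported in the dataset description.
label_counts = {"negative": 9718, "neutral": 8039, "positive": 1562}

# Sum of the three classes; the description states 19319 tweets in total.
total = sum(label_counts.values())

# Share of each label as a percentage, rounded to one decimal place.
label_shares = {
    label: round(100 * count / total, 1)
    for label, count in label_counts.items()
}

for label, share in label_shares.items():
    print(f"{label}: {label_counts[label]} tweets ({share}%)")
```

The classes are noticeably imbalanced, with positive tweets forming a small minority, which may be worth taking into account when the labels are used for training or evaluation.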