SART
- Submitted by: Alexandra Ciobotaru
- Last updated: Mon, 07/08/2024 - 15:59
- DOI: 10.21227/5fnc-tk84
Abstract
SART contains 3900 tweets labelled by the polarity of the sentiment expressed: positive, negative or neutral. Each class contains 1300 tweets, and the dataset is split into train/validation/test CSV files.
Instructions:
# SART - Sentiment Analysis from Romanian Tweets
This dataset contains Romanian-language tweets labelled 0 (Negative), 1 (Neutral) or 2 (Positive). Each class contains 1300 tweets, for 3900 in total, and the dataset is split into train/validation/test CSV files: 3120 tweets for training, 390 for validation and 390 for testing.
| Class Name | No. of labelled tweets |
| ------- | --- |
| Negative | 1300 |
| Positive | 1300 |
| Neutral | 1300 |
To protect the confidentiality of Twitter users, usernames have been removed from this dataset.
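
As a quick sanity check after downloading, the label encoding and split sizes described above can be verified with a few lines of Python. This is a minimal sketch, not part of the dataset itself: the column name `label` is an assumption (the CSV header is not documented here), and `count_labels` and `check_split_sizes` are hypothetical helper names.

```python
import csv
from collections import Counter

# Label encoding as stated in the dataset description.
LABEL_NAMES = {0: "Negative", 1: "Neutral", 2: "Positive"}


def count_labels(csv_file, label_column="label"):
    """Count tweets per class in an open CSV file.

    `label_column` is an assumed column name; change it to match
    the actual header of the downloaded files.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter(LABEL_NAMES[int(row[label_column])] for row in reader)
    return dict(counts)


def check_split_sizes(train=3120, validation=390, test=390, per_class=1300):
    """Check that the documented split sizes add up to the class totals.

    Returns a (matches, total) pair, where `total` is the sum of the splits.
    """
    total = train + validation + test
    return total == per_class * len(LABEL_NAMES), total
```

To use it, open one of the split files (for example with `open(path, newline="", encoding="utf-8")`, where `path` points to your local copy) and pass the file object to `count_labels`. With the documented figures, `check_split_sizes()` confirms that 3120 + 390 + 390 equals 3 × 1300.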