Standard Dataset
Explainable Sentiment Analysis Dataset
- Submitted by: Donghao Huang
- Last updated: Sat, 02/01/2025 - 06:32
- DOI: 10.21227/hx7g-vv29
Abstract
The Explainable Sentiment Analysis Dataset provides annotated sentiment classification data for Amazon Reviews and IMDB Movie Reviews, supporting the evaluation of sentiment analysis models with a focus on explainability. It includes ground-truth sentiment labels, model-generated predictions, and fine-grained classification results obtained from several large language models (LLMs), both proprietary (GPT-4o and GPT-4o-mini) and open-source (the full DeepSeek-R1 model and its distilled variants).
The dataset is structured into ground-truths (human-annotated sentiment labels) and results (LLM-generated predictions), allowing direct comparisons between human and model performance. It supports multi-level sentiment classification, ranging from binary (positive/negative) to five-class sentiment categorization (e.g., strongly positive to strongly negative).
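As an illustration of the comparison described above, the following minimal sketch scores model predictions against ground-truth labels at the five-class level and again after collapsing to binary. The label strings, the `FIVE_TO_BINARY` mapping, the helper names, and the sample lists are all hypothetical; the actual file layout and label vocabulary are documented in README.md and may differ.

```python
# Hypothetical five-class label names; the dataset's real label strings may differ.
FIVE_TO_BINARY = {
    "strongly positive": "positive",
    "positive": "positive",
    "negative": "negative",
    "strongly negative": "negative",
    # "neutral" has no binary counterpart and is handled in collapse_to_binary.
}


def accuracy(truths, preds):
    """Fraction of positions where the prediction equals the ground truth."""
    if len(truths) != len(preds):
        raise ValueError("truths and preds must have the same length")
    if not truths:
        return 0.0
    return sum(t == p for t, p in zip(truths, preds)) / len(truths)


def collapse_to_binary(truths, preds):
    """Map five-class labels to binary.

    Pairs whose ground truth is neutral are skipped. A prediction with no
    binary counterpart becomes None, so it counts as an error.
    """
    pairs = [
        (FIVE_TO_BINARY[t], FIVE_TO_BINARY.get(p))
        for t, p in zip(truths, preds)
        if t in FIVE_TO_BINARY
    ]
    return [t for t, _ in pairs], [p for _, p in pairs]


# Made-up sample labels, for illustration only (not taken from the dataset).
truths = ["strongly positive", "negative", "neutral", "positive", "strongly negative"]
preds = ["positive", "negative", "neutral", "positive", "negative"]

five_class_acc = accuracy(truths, preds)
binary_truths, binary_preds = collapse_to_binary(truths, preds)
binary_acc = accuracy(binary_truths, binary_preds)

print(f"five-class accuracy: {five_class_acc:.2f}")  # 0.60
print(f"binary accuracy:     {binary_acc:.2f}")      # 1.00
```

In this made-up sample the model gets the polarity right every time but misjudges intensity twice, which is why the binary score is higher than the five-class score.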
Each model’s output includes structured sentiment predictions along with textual explanations, enabling deeper insights into the reasoning process behind sentiment classification. Additionally, the dataset captures explanation content from DeepSeek-R1 models, enhancing transparency and interpretability in sentiment analysis.
This dataset serves as a benchmark for evaluating the explainability, accuracy, and efficiency of sentiment classification models and is particularly useful for researchers, NLP practitioners, and developers interested in improving trustworthy AI applications in sentiment analysis.
Refer to README.md.
Documentation
Attachment | Size |
---|---|
README.md | 3.29 KB |