Data Set MR and sst-Binary

- Citation Author(s): Jiajing Zhang (Anhui Jianzhu University)
- Submitted by: Jinlan Chen
- Last updated:
- DOI: 10.21227/bp27-xy39
- Categories:
- Keywords:
Abstract
MR is a text dataset of movie reviews for binary sentiment classification in which each review is a single sentence. The corpus contains 5,331 positive and 5,331 negative reviews (10,662 in total), with an average length of 20.39 tokens. SST-2 is a subset of the Stanford Sentiment Treebank in which each example is labeled positive or negative; it contains 9,613 utterances with an average length of 20.32 tokens.
Instructions:
MR is a textual dataset of movie reviews for binary sentiment classification, where each review contains only one sentence.
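The per-class counts and average token length quoted in the abstract can be checked with a few lines of Python once the sentences are loaded. The following is only a minimal sketch: the function name `corpus_stats`, the 0/1 label encoding, and whitespace tokenization are assumptions made for illustration, since this page does not specify the file layout or the tokenizer used. The toy sentences are invented and are not taken from MR or SST-2.

```python
from collections import Counter


def corpus_stats(examples):
    """Return (label_counts, mean_tokens) for (sentence, label) pairs.

    Tokens are counted by splitting on whitespace, which is an assumption;
    the dataset's own tokenization may give slightly different averages.
    """
    label_counts = Counter()
    total_tokens = 0
    n_sentences = 0
    for sentence, label in examples:
        label_counts[label] += 1
        total_tokens += len(sentence.split())
        n_sentences += 1
    # Guard against an empty input to avoid dividing by zero.
    mean_tokens = total_tokens / n_sentences if n_sentences else 0.0
    return label_counts, mean_tokens


# Invented toy examples (1 = positive, 0 = negative; hypothetical encoding).
toy = [
    ("a warm and funny film", 1),
    ("dull from start to finish", 0),
    ("not bad", 1),
]

counts, mean_len = corpus_stats(toy)
print(counts[1], counts[0], mean_len)
```

Run on the full MR corpus, the same function would be expected to report equal counts for the two classes if the loaded files match the description above; the mean length may differ from the quoted figure depending on how tokens are split.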