Data Set MR and sst-Binary

Citation Author(s):
Jiajing Zhang
Anhui Jianzhu University
Submitted by:
Jinlan Chen
Last updated:
Sat, 12/30/2023 - 10:04
DOI:
10.21227/bp27-xy39

Abstract 

MR is a text dataset of movie reviews for binary sentiment classification, in which each review consists of a single sentence. The corpus contains 5,331 positive and 5,331 negative reviews, with an average length of 20.39 tokens. SST-2 is the binary-labeled subset of the Stanford Sentiment Treebank, in which each example is labeled positive or negative; it contains 9,613 utterances with an average length of 20.32 tokens.
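The class counts and average lengths quoted above could be checked with a short script along the following lines. This is only a sketch: the file layout of the distributed archive is not described here, so the function takes already-loaded (sentence, label) pairs, and the helper name `corpus_stats` is hypothetical. It also assumes whitespace tokenization, which may not match the tokenizer behind the reported averages.

```python
from collections import Counter


def corpus_stats(examples):
    """Return (per-label counts, mean token length) for (sentence, label) pairs.

    Assumption: tokens are whitespace-separated. The tokenizer behind the
    averages reported for MR and SST-2 is not stated, so values computed
    this way may differ slightly from the published ones.
    """
    label_counts = Counter()
    total_tokens = 0
    n_examples = 0
    for sentence, label in examples:
        label_counts[label] += 1
        total_tokens += len(sentence.split())
        n_examples += 1
    mean_len = total_tokens / n_examples if n_examples else 0.0
    return dict(label_counts), mean_len


# Toy stand-in data (not taken from MR or SST-2); 1 = positive, 0 = negative.
toy = [
    ("a gripping and heartfelt film", 1),
    ("dull from start to finish", 0),
    ("one of the year's best", 1),
    ("a tedious mess", 0),
]
counts, mean_len = corpus_stats(toy)
print(counts, mean_len)
```

Run over the full MR corpus, the counts would be expected to come out balanced (5,331 per class) if the files match the description above.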

