Data Set MR and sst-Binary

Citation Author(s): Jiajing Zhang (Anhui Jianzhu University)
Submitted by: Jinlan Chen
Last updated: Sat, 12/30/2023 - 10:04
DOI: 10.21227/bp27-xy39
License: Creative Commons Attribution

Categories: Other
Keywords: movie reviews; Stanford Sentiment Treebank

Abstract

MR is a textual dataset of movie reviews for binary sentiment classification, where each review contains only one sentence. The corpus has 5,331 positive and 5,331 negative reviews with an average length of 20.39 tokens. SST-2 is a subset of the Stanford Sentiment Treebank, where the data are labeled positive or negative, and contains 9,613 utterances with an average length of 20.32 tokens.
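The class counts and average token lengths quoted above could be checked with a short script along the following lines. This is only a sketch: the page does not say how tokens were counted, so whitespace tokenization is an assumption, and the function name `corpus_stats` and the two toy sentences are hypothetical stand-ins, not content taken from the dataset files.

```python
from collections import Counter


def corpus_stats(examples):
    """Summarize a labeled corpus of (label, sentence) pairs.

    Assumption: a "token" is a whitespace-separated string. The dataset
    page does not state which tokenizer produced its averages.
    """
    examples = list(examples)
    # Count how many sentences carry each label (e.g. 1 = positive, 0 = negative).
    label_counts = Counter(label for label, _ in examples)
    # Total whitespace tokens across all sentences.
    total_tokens = sum(len(text.split()) for _, text in examples)
    avg_tokens = total_tokens / len(examples) if examples else 0.0
    return {
        "n": len(examples),
        "label_counts": dict(label_counts),
        "avg_tokens": avg_tokens,
    }


# Hypothetical toy data, used only to show the call; not from MR or SST-2.
toy = [
    (1, "a charming and often affecting journey"),
    (0, "the plot is thin and the jokes fall flat"),
]
stats = corpus_stats(toy)
print(stats)
```

To run this on the actual files, the (label, sentence) pairs would first have to be parsed from text_all.txt or sst2-dev.txt; the file layout is not described on this page, so that parsing step is left out here.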

Comments

The data come from https://github.com/FKarl/short-text-classification/tree/main/data

Submitted by Jinlan Chen on Sat, 12/30/2023 - 10:23

Dataset Files

text_all.txt (1.18 MB)
sst2-dev.txt (89.90 kB)
