Open Access
ChnSentiCorp
- Submitted by: Wenbin Zheng
- Last updated: Tue, 01/04/2022 - 08:58
- DOI: 10.21227/yfwt-wr77
Abstract
ChnSentiCorp is a large-scale corpus of Chinese hotel reviews compiled by Tan Songbo. It contains 10,000 reviews, collected and organized automatically from Trip.com.
Instructions:
The dataset is organized into four subsets:
- ChnSentiCorp-Htl-ba-2000: balanced corpus, 1,000 positive and 1,000 negative reviews
- ChnSentiCorp-Htl-ba-4000: balanced corpus, 2,000 positive and 2,000 negative reviews
- ChnSentiCorp-Htl-ba-6000: balanced corpus, 3,000 positive and 3,000 negative reviews
- ChnSentiCorp-Htl-unba-10000: unbalanced corpus, 7,000 positive reviews out of 10,000
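The subset sizes above can be written down as a small Python table, which may help when choosing a subset or sanity-checking a loader. This is only a sketch built from the figures stated here: the `Subset` class and `majority_baseline` helper are hypothetical names, the totals are read off the subset names, and the negative count of the unbalanced subset is an assumption (every review is taken to be either positive or negative). It does not read the actual files, whose layout is not described on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subset:
    """Hypothetical record describing one ChnSentiCorp subset."""
    name: str
    total: int     # total reviews, read off the number in the subset name
    positive: int  # positive reviews, as stated in the description

    @property
    def negative(self) -> int:
        # Assumption: each review is either positive or negative.
        return self.total - self.positive

    @property
    def balanced(self) -> bool:
        return self.positive == self.negative


SUBSETS = [
    Subset("ChnSentiCorp-Htl-ba-2000", 2000, 1000),
    Subset("ChnSentiCorp-Htl-ba-4000", 4000, 2000),
    Subset("ChnSentiCorp-Htl-ba-6000", 6000, 3000),
    Subset("ChnSentiCorp-Htl-unba-10000", 10000, 7000),
]


def majority_baseline(subset: Subset) -> float:
    """Accuracy of always predicting the more frequent class."""
    return max(subset.positive, subset.negative) / subset.total


if __name__ == "__main__":
    for s in SUBSETS:
        print(f"{s.name}: +{s.positive} / -{s.negative}, "
              f"balanced={s.balanced}, "
              f"majority baseline={majority_baseline(s):.2f}")
```

Under these assumptions, a classifier that always predicts "positive" would already be right on 70% of the unbalanced subset, so accuracy figures on that subset are not directly comparable with those on the balanced ones.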