ChnSentiCorp

Citation Author(s):
Songbo Tan
Submitted by:
Wenbin Zheng
Last updated:
Tue, 01/04/2022 - 08:58
DOI:
10.21227/yfwt-wr77
License:
0

Abstract 

ChnSentiCorp is a large-scale corpus of Chinese hotel reviews compiled by Songbo Tan. It contains 10,000 reviews, which were collected automatically from Trip.com and then organized into subsets.

Instructions: 

The dataset is organized into 4 subsets: ChnSentiCorp-Htl-ba-2000 (balanced corpus, 1,000 positive and 1,000 negative reviews), ChnSentiCorp-Htl-ba-4000 (balanced corpus, 2,000 positive and 2,000 negative reviews), ChnSentiCorp-Htl-ba-6000 (balanced corpus, 3,000 positive and 3,000 negative reviews), and ChnSentiCorp-Htl-unba-10000 (unbalanced corpus, 7,000 positive reviews, which leaves 3,000 negative reviews out of the 10,000 total).
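The subset layout can be sketched in Python as follows. This is only an illustration, not part of the dataset: the `pos`/`neg` folder names, the `*.txt` file pattern, the `gb18030` encoding, and the helper names `load_subset` and `majority_baseline` are all assumptions, so check them against the files you actually download. The 3,000 negative count for the unbalanced subset is inferred from the 10,000 total rather than stated directly.

```python
from pathlib import Path

# Subset sizes as (positive, negative), taken from the description above.
# The negative count for the unbalanced subset is inferred: 10,000 - 7,000.
SUBSETS = {
    "ChnSentiCorp-Htl-ba-2000": (1000, 1000),
    "ChnSentiCorp-Htl-ba-4000": (2000, 2000),
    "ChnSentiCorp-Htl-ba-6000": (3000, 3000),
    "ChnSentiCorp-Htl-unba-10000": (7000, 3000),
}


def majority_baseline(pos, neg):
    """Accuracy obtained by always predicting the larger class."""
    return max(pos, neg) / (pos + neg)


def load_subset(root, encoding="gb18030"):
    """Return (text, label) pairs, with label 1 = positive and 0 = negative.

    Assumes a hypothetical layout of root/pos/*.txt and root/neg/*.txt;
    the real archive may be organized or encoded differently.
    """
    samples = []
    for label, folder in ((1, "pos"), (0, "neg")):
        for path in sorted(Path(root, folder).glob("*.txt")):
            text = path.read_text(encoding=encoding, errors="ignore")
            samples.append((text.strip(), label))
    return samples


if __name__ == "__main__":
    for name, (pos, neg) in SUBSETS.items():
        print(f"{name}: {pos + neg} reviews, "
              f"majority-class baseline {majority_baseline(pos, neg):.2f}")
```

The baseline figure is a reminder that on the unbalanced subset a classifier that always answers "positive" is already right 70% of the time, so accuracy there should be read against that floor rather than against 50%.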
