weibo_senti_100k and THUCNews

Citation Author(s):
Maosong Sun
Submitted by:
YUANYUAN Zhang
Last updated:
Wed, 07/27/2022 - 09:26
DOI:
10.21227/abj8-y636

Abstract 

The weibo_senti_100k sentiment classification dataset is a two-class dataset of 100,000 Sina Weibo posts with an average length of 42.9 words. The THUCNews news classification dataset contains 50,000 items in 10 categories, with an average length of 534.53 words.

Instructions: 

The weibo_senti_100k sentiment classification dataset contains 100,000 Sina Weibo posts with an average length of 42.9 words. It has two categories, positive emotion and negative emotion, with about 50,000 posts each.

The THUCNews news classification dataset contains 50,000 items with an average length of 534.53 words. It has 10 categories: finance, real estate, home furnishing, education, technology, fashion, political events, sports, games, and entertainment. Each category has 5,000 items.
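
After downloading, it can be useful to check a copy of the data against the category counts and average lengths quoted on this page. The following is a minimal Python sketch of such a check, not an official loader. This page does not describe the file layout, so the sketch assumes the data has already been read into (label, text) pairs. The function name `summarize` and the toy samples are hypothetical. Length is counted in characters, which is also an assumption, because the page does not say how "words" were measured.

```python
from collections import Counter


def summarize(samples):
    """Return (per-label counts, mean text length) for (label, text) pairs.

    Length is measured in characters. This is an assumption: the dataset
    page does not say how its "words" figure was computed.
    """
    samples = list(samples)
    counts = Counter(label for label, _ in samples)
    if not samples:
        return counts, 0.0
    mean_len = sum(len(text) for _, text in samples) / len(samples)
    return counts, mean_len


# Made-up toy samples for illustration only; these are not from the dataset.
toy = [
    ("positive", "今天天气真好"),
    ("negative", "服务太差了"),
    ("positive", "很开心"),
]

counts, mean_len = summarize(toy)
print(dict(counts), round(mean_len, 2))
```

On the full data, the counts would be expected to come out near 50,000 per label for weibo_senti_100k and 5,000 per category for THUCNews. The mean length matches the quoted figures only if the same length measure is used.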