
This dataset is built around the Jilin Baishan Incident and is intended to improve the emotion prediction capabilities of large language models. Approximately 3.5 million raw comments were collected via the Weibo API; each record includes a user identifier, the text content, a timestamp, and interaction metrics. Preprocessing consisted of normalization, Chinese tokenization, stopword removal, deduplication, and the removal of anomalous samples.
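The preprocessing steps listed above could be sketched roughly as follows. This is a minimal illustration, not the pipeline actually used to build the dataset: the stopword set, the URL-stripping rule, the `min_tokens` threshold, and the character-level `naive_tokenize` function are all assumptions made for the example. In practice a dedicated Chinese word segmenter (jieba is a common choice) would likely be passed in place of the placeholder tokenizer.

```python
import re
import unicodedata

# Hypothetical stopword set; the dataset's actual list is not specified.
STOPWORDS = {"的", "了", "是"}

URL_RE = re.compile(r"https?://\S+")
SPACE_RE = re.compile(r"\s+")


def normalize(text: str) -> str:
    """Fold full-width characters (NFKC), strip URLs, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = URL_RE.sub("", text)
    return SPACE_RE.sub(" ", text).strip()


def naive_tokenize(text: str) -> list:
    """Placeholder tokenizer: one token per non-space character.

    Stands in for a real Chinese word segmenter, which is an
    external dependency and therefore left out of this sketch.
    """
    return [ch for ch in text if not ch.isspace()]


def preprocess(comments, tokenize=naive_tokenize, min_tokens=1):
    """Normalize, deduplicate, tokenize, and filter a list of comments.

    Comments with fewer than `min_tokens` tokens after stopword removal
    are treated as anomalous and dropped (an assumed criterion).
    """
    seen = set()
    cleaned = []
    for raw in comments:
        text = normalize(raw)
        if text in seen:  # exact-match deduplication on normalized text
            continue
        seen.add(text)
        tokens = [t for t in tokenize(text) if t not in STOPWORDS]
        if len(tokens) < min_tokens:
            continue
        cleaned.append(tokens)
    return cleaned
```

Deduplicating on the normalized text rather than the raw string means that comments differing only in spacing or full-width punctuation collapse into one sample, which is usually the desired behavior for reposted content.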
- Categories: