SocialCD-3K
- Submitted by: Hongzhi Qi
- Last updated: Sun, 06/09/2024 - 05:42
- DOI: 10.21227/jb3w-j696
Abstract
We collected the data by crawling comments from the “Zoufan” blog on the Weibo social platform. A team of qualified psychologists then annotated the data. Strict preprocessing measures were applied to protect users’ privacy.
SocialCD-3K (Cognitive Distortion Classification)
- Labels and Number of Samples:
  - All-or-nothing thinking: 77
  - Over-generalization: 141
  - Mental filter: 378
  - Disqualifying the positive: 27
  - Mind reading: 121
  - The fortune teller error: 652
  - Magnification: 321
  - Emotional reasoning: 16
  - Should statements: 84
  - Labeling and mislabeling: 1961
  - Blaming oneself: 188
  - Blaming others: 27
- Data Split:
  - Training set: 2725 samples
  - Test set: 682 samples
- Average Number of Labels per Sample: 1.71
- Average Number of Words per Post: 42.56
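The figures listed above can be summarized with a few lines of arithmetic. The following is a minimal sketch, not part of the dataset's own tooling: the variable names are illustrative, and the numbers are copied by hand from the listing. It derives the total sample count, the training fraction, and each label's share of all label assignments, which makes the class imbalance easier to see. The ratio of total label assignments to total samples computed here comes out near 1.17 rather than the stated 1.71. The listing does not say how its average was computed, so the sketch only reports the value and does not attempt to reconcile the two.

```python
# Label counts copied from the listing above (hand-transcribed).
label_counts = {
    "All-or-nothing thinking": 77,
    "Over-generalization": 141,
    "Mental filter": 378,
    "Disqualifying the positive": 27,
    "Mind reading": 121,
    "The fortune teller error": 652,
    "Magnification": 321,
    "Emotional reasoning": 16,
    "Should statements": 84,
    "Labeling and mislabeling": 1961,
    "Blaming oneself": 188,
    "Blaming others": 27,
}
train_size, test_size = 2725, 682

total_samples = train_size + test_size
total_labels = sum(label_counts.values())

# Fraction of samples in the training split.
train_fraction = train_size / total_samples

# Label assignments per sample implied by the listed counts.
labels_per_sample = total_labels / total_samples

# Share of all label assignments held by each label, largest first.
shares = sorted(
    ((name, count / total_labels) for name, count in label_counts.items()),
    key=lambda item: item[1],
    reverse=True,
)

# Ratio between the most and least frequent labels.
imbalance_ratio = max(label_counts.values()) / min(label_counts.values())

print(f"samples: {total_samples}, label assignments: {total_labels}")
print(f"training fraction: {train_fraction:.3f}")
print(f"label assignments per sample: {labels_per_sample:.2f}")
print(f"most/least frequent label ratio: {imbalance_ratio:.1f}")
for name, share in shares:
    print(f"{name:28s} {share:6.1%}")
```

The split works out to roughly 80% training and 20% test, and a single label ("Labeling and mislabeling") accounts for close to half of all label assignments, so any model trained on this data will likely need some form of imbalance handling.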
Funding Agency:
National Natural Science Foundation of China
Grant Number:
72174152, 72304212 and 82071546