Dataset

Citation Author(s):
Jing Jie Tan
Universiti Tunku Abdul Rahman
Submitted by:
Jing Jie Tan
Last updated:
Wed, 01/22/2025 - 04:43
DOI:
10.21227/dyfp-7f45

Abstract 

The Essays-Big5 and Kaggle-MBTI datasets pair free-form text with psychological labels for personality research. Essays-Big5 contains over 2,000 personal essays annotated with Big Five personality traits, which supports the study of linguistic patterns correlated with each personality dimension. Its splits are stratified by trait distribution so that each split reflects the overall label balance. Kaggle-MBTI provides 8,000 social media posts labeled with Myers-Briggs Type Indicator (MBTI) profiles, and its splits are likewise stratified to preserve type proportions. Together, the two datasets provide annotated text with stratified splits for training and evaluating personality models in natural language processing, across both essay and social media contexts.

Instructions: 

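# Requires the Hugging Face `datasets` package (pip install datasets)
from datasets import load_dataset

# Download the Essays-Big5 dataset from the Hugging Face Hub
ds = load_dataset("jingjietan/essays-big5")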
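
The abstract describes the splits as stratified by label distribution. As a rough illustration of what that means, the following is a minimal sketch of a stratified split using only the Python standard library. It is not the procedure used to build these datasets: the function name `stratified_split`, the `test_fraction` and `seed` parameters, and the toy labels are all assumptions made for this example. For the Big Five traits, which are annotated per trait, a single stratification key would have to be derived first (for instance by joining the trait labels into one string), which is also an assumption.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so each label keeps roughly the same share in both parts.

    Hypothetical helper for illustration only; not the dataset author's code.
    """
    # Group example indices by their label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    # Sample the test portion separately within each label group.
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy MBTI-style labels, made up for this example (not taken from the dataset).
labels = ["INTJ"] * 10 + ["ENFP"] * 20 + ["ISTP"] * 5
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=42)

# Each type contributes about 20% of its examples to the test part.
test_counts = Counter(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), dict(test_counts))
```

Because sampling happens inside each label group, rare types stay represented in both parts in about the same proportion as in the full set, which a plain random split does not guarantee.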

Dataset Files

    Files have not been uploaded for this dataset