Citation Author(s):
Jing Jie Tan (Universiti Tunku Abdul Rahman)
Submitted by:
Jing Jie Tan
DOI:
10.21227/dyfp-7f45

Abstract

The Essays-Big5 and Kaggle-MBTI datasets pair free-form text with psychological labels for personality research. Essays-Big5 contains over 2,000 personal essays annotated with Big Five personality traits, which supports the study of linguistic patterns correlated with each personality dimension. Its splits are stratified by trait distribution so that the traits are represented in balanced proportions across splits. Kaggle-MBTI contains 8,000 social media posts labeled with Myers-Briggs Type Indicator (MBTI) profiles, and its splits are likewise stratified to preserve type proportions. Together, the two datasets provide balanced, annotated data for training and evaluating personality models on text from different contexts.
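
The abstract does not describe how the stratified splits were produced, so the following is only a minimal sketch of the general idea, not the author's procedure. The function name `stratified_split`, its parameters, and the toy labels are hypothetical. For the Big Five case, one possible approach (an assumption) would be to stratify on the combined tuple of trait labels rather than on a single label.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so each label keeps roughly the same share in both parts."""
    rng = random.Random(seed)

    # Group example indices by their label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Take the same fraction from every label group.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels (hypothetical, not taken from the dataset): 70% "INFP", 30% "ESTJ".
labels = ["INFP"] * 70 + ["ESTJ"] * 30
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=0)

train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print(train_counts, test_counts)
```

Splitting each label group separately is what keeps the 70/30 ratio the same in both parts; a plain random split would only match it approximately.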

Instructions:

# Load Essays-Big5 from the Hugging Face Hub (requires the `datasets` package).
from datasets import load_dataset

ds = load_dataset("jingjietan/essays-big5")

Dataset Files

Files have not been uploaded for this dataset