Using social media and personality traits to assess software developers' emotions

Citation Author(s):
Centre of Informatics and Systems, University of Coimbra, Polo II, Pinhal de Marrocos, 3030-290 Coimbra, Portugal
Faculty of Psychology and Educational Sciences, University of Coimbra, Colégio Novo Street, 3001-802 Coimbra, Portugal
Faculty of Psychology and Educational Sciences, University of Coimbra, Colégio Novo Street, 3001-802 Coimbra, Portugal
Faculty of Psychology and Educational Sciences, University of Coimbra, Colégio Novo Street, 3001-802 Coimbra, Portugal
Department of Informatics and Applied Mathematics, Federal University of Rio Grande do Norte, 59072-970, Natal, Brazil
Faculty of Psychology and Educational Sciences, University of Coimbra, Colégio Novo Street, 3001-802 Coimbra, Portugal
Centre of Informatics and Systems, University of Coimbra, Polo II, Pinhal de Marrocos, 3030-290 Coimbra, Portugal
Submitted by:
Leo Silva
Last updated:
Tue, 05/10/2022 - 10:57


Companion data for the paper "Using social media and personality traits to assess software developers’ emotions", submitted to the IEEE Access journal, 2022. It contains the anonymized dataset used in the study, the answers to the demographic survey, the answers to the Big Five Inventory, the experiment protocol, the manual analyses by psychologists and participants, and all generated charts and data analyses.


Companion DATA


Using social media and personality traits to assess software developers’ emotions


Leo Moreira Silva

Marília Gurgel Castro

Miriam Bernardino Silva

Milena Nestor Santos

Uirá Kulesza

Margarida Lima

Henrique Madeira


    IEEE Access


The folders contain:



analyzed_tweets_by_psychologists.csv: file containing the manual analysis done by psychologists

analyzed_tweets_by_participants.csv: file containing the manual analysis done by participants

analyzed_tweets_by_psychologists_solved_divergencies.csv: file containing the psychologists' manual analysis of the 51 tweets with divergent classifications



alldata.json: contains the dataset used in the paper



committee_response.pdf: the acceptance response of the Research Ethics and Deontology Committee of the Faculty of Psychology and Educational Sciences of the University of Coimbra.

committee_submission_form.pdf: the project submitted to the committee.

consent_form.pdf: declaration of free and informed consent filled out by participants.

data_protection_declaration.pdf: personal data and privacy declaration, in accordance with the European Union General Data Protection Regulation (GDPR).



General - Charts.ipynb: notebook file containing all charts produced in the study, including those in the paper

Statistics - Lexicons and Ensembles.ipynb: notebook file with the statistics for the five lexicons and ensembles used in the study

Statistics - Linear Regression.ipynb: notebook file with the multiple linear regression results

Statistics - Polynomial Regression.ipynb: notebook file with the polynomial regression results

Statistics - Psychologists versus Participants.ipynb: notebook file with the statistics comparing the psychologists' and the participants' manual analyses

Statistics - Working x Non-working.ipynb: notebook file containing the statistical analysis of the tweets posted during the working period and those posted outside it



Demographic_Survey.pdf: survey inviting participants to enroll in the study. It collects demographic data and the participants' authorization to access their public tweets

Demographic_Survey_answers.xlsx: participants' demographic survey answers

ibf_pt_br.doc: the Portuguese version of the Big Five Inventory (BFI) instrument to infer participants' Big Five polarity traits

ibf_answers.xlsx: participants' and psychologists' answers for BFI

Experiment Protocol.pdf: file containing the explanation of the experiment protocol.


We have removed all sensitive data from the dataset to protect participants' privacy and anonymity.

We have removed all sensitive data from the demographic survey answers to protect participants' privacy and anonymity.
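
The listing above does not describe the internal layout of alldata.json or the CSV files, so a first step is to inspect them before running any analysis. The following is a minimal sketch in Python using only the standard library. The function names (describe_json, describe_csv, describe_folder) are hypothetical helpers introduced here for illustration. The sketch assumes the files are UTF-8 encoded and that the CSV files are comma-separated; neither is confirmed by the description above, so adjust the encoding and delimiter if needed.

```python
import csv
import json
from pathlib import Path


def describe_json(path):
    """Summarize the top-level structure of a JSON file without assuming a schema."""
    # "utf-8-sig" also accepts plain UTF-8 and tolerates a leading byte-order mark.
    with open(path, encoding="utf-8-sig") as handle:
        data = json.load(handle)
    if isinstance(data, dict):
        # Report at most the first ten keys, in sorted order.
        return {"type": "object", "size": len(data), "keys": sorted(data)[:10]}
    if isinstance(data, list):
        return {"type": "array", "size": len(data), "keys": []}
    return {"type": type(data).__name__, "size": 1, "keys": []}


def describe_csv(path, delimiter=","):
    """Return the header row and the number of data rows of a CSV file."""
    # The comma delimiter is an assumption; pass another one if the file differs.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, [])
        row_count = sum(1 for _ in reader)
    return {"columns": header, "rows": row_count}


def describe_folder(root):
    """Describe every .json and .csv file found under the extracted archive root."""
    summary = {}
    for path in sorted(Path(root).rglob("*")):
        if path.suffix.lower() == ".json":
            summary[str(path)] = describe_json(path)
        elif path.suffix.lower() == ".csv":
            summary[str(path)] = describe_csv(path)
    return summary
```

Calling describe_folder on the directory where the archive was extracted returns one summary per file, which can guide how the data is loaded into the notebooks' tooling afterwards.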