EmoSurv: A typing biometric (Keystroke dynamics) dataset with emotion labels created using computer keyboards

Citation Author(s):
Aicha Maalej, University of Sfax
Ilhem Kallel, University of Sfax
Submitted by:
Aicha Maalej
Last updated:
Tue, 10/13/2020 - 20:49
DOI:
10.21227/eae6-pk42

Abstract 

EmoSurv is a dataset containing keystroke data along with emotion labels. Timing and frequency data are recorded while participants type free and fixed texts before and after specific emotions are induced. The emotions are Anger, Happiness, Calmness, Sadness, and a Neutral state.

First, data is collected while the participant is in a neutral state. Then the participant watches an emotion-eliciting video. Once the emotion is induced, the participant types another fixed text and another free text.

The EmoSurv dataset was collected by developing a dynamic web application, also named EmoSurv, and sharing it via e-mail, Facebook, and word of mouth.

  • We named it EmoSurv because our objective is to investigate human emotional states while typing on a keyboard. During data collection, we made sure that only trials using physical keyboards were accepted.

  • A total of 124 participants visited EmoSurv and logged information.

Kindly cite our paper when you wish to use this dataset:

 

A. Maalej and I. Kallel, "Does Keystroke Dynamics tell us about Emotions? A Systematic Literature Review and Dataset Construction," 2020 16th International Conference on Intelligent Environments (IE), Madrid, Spain, 2020, pp. 60-67, doi: 10.1109/IE49459.2020.9155004.

Instructions: 

The dataset contains 4 .csv files:

  • File 1: Fixed Text Typing Dataset, collected while a participant is typing a fixed text. It includes the following features: User Id, Emotion Index, Index, Key Code, key Down, key Up, D1U1, D1U2, D1D2, U1D2, U1U2, D1U3, D1D3, and Answer.

  • File 2: Free Text Typing Dataset, collected while a participant is typing a free text. It includes the following features: User Id, Emotion Index, Index, Key Code, key Down, key Up, D1U1, D1U2, D1D2, U1D2, U1U2, D1U3, D1D3, and Answer.

  • File 3: Frequency Dataset, which includes frequency-related features: User ID, textIndex, EmotionIndex, DelFreq, LeftFreq, and TotTime.

  • File 4: Participants Information Dataset, which includes demographic information: UserID, TypeWith, TypistType, PCTimeAverage, AgeRange, gender, status, degree, and country.
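Because every participant carries the same ID across the four files, records from one file can be joined to the demographic information in File 4. The following is a minimal sketch using only the Python standard library. The two in-memory CSV snippets are made-up rows for illustration, the cell values (such as "two hands") are hypothetical encodings, and the column spelling `UserID` is an assumption that may need adjusting to match the actual file headers.

```python
import csv
import io

# Made-up rows for illustration only; the real files contain more
# columns and rows, and their exact header spelling may differ.
PARTICIPANTS_CSV = """UserID,TypeWith,AgeRange
1,two hands,20-29
2,one hand,30-39
"""

FREQUENCY_CSV = """UserID,textIndex,EmotionIndex,DelFreq
1,FI,N,0.02
1,FR,H,0.05
2,FI,N,0.00
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_user(frequency_rows, participant_rows, key="UserID"):
    """Attach each participant's demographic fields to their frequency rows."""
    by_user = {row[key]: row for row in participant_rows}
    joined = []
    for row in frequency_rows:
        info = by_user.get(row[key], {})  # empty if the ID is unknown
        joined.append({**info, **row})
    return joined


joined = join_on_user(read_rows(FREQUENCY_CSV), read_rows(PARTICIPANTS_CSV))
for row in joined:
    print(row["UserID"], row["EmotionIndex"], row["AgeRange"], row["DelFreq"])
```

To use the real files, `read_rows` could be replaced by opening each .csv file with `csv.DictReader`; the join logic stays the same.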

NOTE:

  • UserID: each participant is allocated the same ID in the 4 files.

  • Emotion Index: H (for Happy), S (for Sad), A (for Angry), C (for Calm), and N (for Neutral state).

  • Key Code: the key pressed by the participant.

  • Key Down: the exact timestamp of the key down event.

  • Key Up: the exact timestamp of the key up event.

  • TextIndex: the type of text typed, either FI (for Fixed text) or FR (for Free text).

  • D1U1 (DT1): Time between key 1 down and key 1 up.

  • D1U2 (Dig2): Time between key 1 down and key 2 up.

  • D1D2 (Dig1): Time between key 1 down and key 2 down.

  • U1D2 (FT1 / FT2): Time between key 1 up and key 2 down.

  • U1U2 (Dig3): Time between key 1 up and key 2 up.

  • D1U3 (Trig2): Time between key 1 down and key 3 up.

  • D1D3 (Trig1): Time between key 1 down and key 3 down.

  • Answer: Takes “R” (right answer) if the participant answered the accuracy question correctly and “W” (wrong answer) if the participant answered it incorrectly. (The accuracy question is a multiple-choice question related to the video that the participant has watched.)

  • DelFreq: Relative frequency of the delete key.

  • LeftFreq: Relative frequency of the backspace key.

  • Typing speed: Number of keys pressed in each task divided by the time spent from the first key press to the last key release (in the same task).

  • TypeWith: specifies if the participant types using one hand or two hands

  • TypistType: specifies whether the participant uses one finger, two fingers, or is a touch typist (multiple fingers) to type a text.

  • PCTimeAverage: the average time a user spends on a computer per day.

  • AgeRange: 16-19, 20-29, 30-39, or >= 40 years old.

  • Gender: Male or Female.

  • Status: Student or Professional.

  • Degree: College/University or High school.

  • Country: Place of residence.
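The frequency-related features described in the NOTE list can be illustrated with a short sketch. Several points here are assumptions rather than facts from the dataset description: "relative frequency" is read as the key's count divided by the total number of keys in the task, TotTime is read as the time from the first key press to the last key release, and the key codes 46 (Delete) and 8 (Backspace) follow the JavaScript `keyCode` convention, which the dataset may or may not use. The function name `frequency_features` is hypothetical.

```python
from collections import Counter

# Assumed key codes (JavaScript keyCode convention); not confirmed
# by the dataset description.
DELETE_CODE = 46
BACKSPACE_CODE = 8


def frequency_features(events):
    """Compute frequency features for one typing task.

    events: list of (key_code, key_down, key_up) tuples, with
    timestamps in a common unit (the unit itself is an assumption).
    """
    if not events:
        raise ValueError("at least one key event is required")
    counts = Counter(code for code, _, _ in events)
    total_keys = len(events)
    first_down = min(down for _, down, _ in events)
    last_up = max(up for _, _, up in events)
    tot_time = last_up - first_down
    return {
        "DelFreq": counts[DELETE_CODE] / total_keys,
        "LeftFreq": counts[BACKSPACE_CODE] / total_keys,
        "TotTime": tot_time,
        # Keys per time unit; 0.0 guards against a zero-length task.
        "TypingSpeed": total_keys / tot_time if tot_time > 0 else 0.0,
    }


# Four made-up key events, one of which is a backspace.
task = [(72, 0, 90), (73, 150, 230), (8, 400, 470), (73, 600, 700)]
features = frequency_features(task)
print(features)
```

With these four made-up events, one key out of four is a backspace, so LeftFreq is 0.25, DelFreq is 0, and the typing speed is 4 keys over 700 time units.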

The following figure represents how the timing features are calculated.
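The timing features listed in the NOTE section are differences between the key-down and key-up timestamps of up to three consecutive keys. A minimal sketch of that calculation follows; the function name `timing_features`, the input layout, and the choice to emit `None` when a second or third key is not available are assumptions made for illustration and do not describe how the authors computed the published values.

```python
def _diff(later, earlier):
    """Return later - earlier, or None if the later event does not exist."""
    return None if later is None else later - earlier


def timing_features(keys):
    """Compute timing features for each key in a typing sequence.

    keys: list of (key_down, key_up) timestamps for consecutive keys.
    Returns one dict per starting key; features that need a second or
    third key are None near the end of the sequence.
    """
    rows = []
    for i, (d1, u1) in enumerate(keys):
        d2, u2 = keys[i + 1] if i + 1 < len(keys) else (None, None)
        d3, u3 = keys[i + 2] if i + 2 < len(keys) else (None, None)
        rows.append({
            "D1U1": u1 - d1,          # key 1 down -> key 1 up
            "D1U2": _diff(u2, d1),    # key 1 down -> key 2 up
            "D1D2": _diff(d2, d1),    # key 1 down -> key 2 down
            "U1D2": _diff(d2, u1),    # key 1 up   -> key 2 down
            "U1U2": _diff(u2, u1),    # key 1 up   -> key 2 up
            "D1U3": _diff(u3, d1),    # key 1 down -> key 3 up
            "D1D3": _diff(d3, d1),    # key 1 down -> key 3 down
        })
    return rows


# Three made-up key presses as (key_down, key_up) timestamps.
rows = timing_features([(0, 90), (150, 230), (300, 380)])
print(rows[0])
```

Note that U1D2 is negative whenever the second key is pressed before the first one is released, which happens with overlapping keystrokes.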

 

Grant of License

We grant You a non-exclusive, non-transferable, revocable license to use the EmoSurv Dataset solely for Your non-commercial, educational, and research purposes, but without any right to copy or reproduce, publish or otherwise make available to the public or communicate to the public, sell, rent or lend the whole or any constituent part of the EmoSurv Dataset.