D3TEC Dataset

Citation Author(s):
Luis F. Brenes, Tecnologico de Monterrey, School of Engineering and Sciences.
Luis A. Trejo, Tecnologico de Monterrey, School of Engineering and Sciences.
Jose Antonio Cantoral-Ceballos, Tecnologico de Monterrey, School of Engineering and Sciences.
Daniela Aguilar De-León, Tecnologico de Monterrey, School of Medicine and Health Sciences.
Fresia Paloma Hernández-Moreno, Tecnologico de Monterrey, School of Medicine and Health Sciences.
Submitted by:
Luis Felipe Brenes
Last updated:
Tue, 11/26/2024 - 18:08
DOI:
10.21227/0m32-t378

Abstract 

 

Depression is a mental health condition that affects millions of people worldwide. Although common, it remains difficult to diagnose due to its heterogeneous symptomatology. Mental health questionnaires are currently the most widely used method to screen for depression; these, however, are subjective because they depend on patients’ self-assessments. Researchers have been interested in finding an accurate way of identifying depression through an objective biomarker. Recent developments in neural networks and deep learning have made it possible to classify depression through the computational analysis of voice recordings. However, this approach depends heavily on the availability of datasets to train and test deep learning models, and these are scarce and cover very few languages. This study proposes a protocol for the collection of a new dataset for deep learning research on voice depression classification, featuring Spanish speakers, professional and smartphone microphones, and a high-quality recording standard.

This work aims to create a high-quality voice depression dataset by recording Spanish speakers with a professional microphone under strict audio quality standards. The same speech is also captured with a smartphone microphone to support further research on smartphone applications for depression identification. Our methodology involves the strategic collection of depressed and non-depressed voice recordings. Three types of data are collected: voice recordings, depression labels (using the PHQ-9 questionnaire), and additional data that could potentially influence speech. Recordings are captured with professional-grade and smartphone microphones simultaneously to ensure versatility and practical applicability. Several considerations and guidelines are described to ensure high audio quality and avoid potential bias in deep learning research.
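
As a rough illustration of how the three types of data could be tied together per participant, the following Python sketch pairs the two simultaneous recordings with a PHQ-9 score. It is not the dataset's actual schema: the class name, field names, and file names are hypothetical, and the cutoff of 10 is a commonly cited PHQ-9 screening threshold assumed here for illustration, not a labeling rule stated in this description. The only fixed facts it relies on are that the PHQ-9 has nine items, each scored 0 to 3.

```python
from dataclasses import dataclass, field
from typing import Dict, List

PHQ9_ITEMS = 9          # the PHQ-9 questionnaire has nine items
PHQ9_MAX_PER_ITEM = 3   # each item is scored from 0 to 3


def phq9_total(item_scores: List[int]) -> int:
    """Validate the nine item scores and return their sum (0 to 27)."""
    if len(item_scores) != PHQ9_ITEMS:
        raise ValueError("PHQ-9 requires exactly nine item scores")
    if any(not 0 <= s <= PHQ9_MAX_PER_ITEM for s in item_scores):
        raise ValueError("each PHQ-9 item score must be between 0 and 3")
    return sum(item_scores)


@dataclass
class Recording:
    """Hypothetical per-participant record; field names are assumptions."""
    participant_id: str
    professional_wav: str        # path to the professional-microphone audio
    smartphone_wav: str          # path to the simultaneous smartphone audio
    phq9_items: List[int]        # nine self-reported item scores
    extra: Dict[str, str] = field(default_factory=dict)  # additional data

    def is_depressed(self, cutoff: int = 10) -> bool:
        # Assumed binary label: total score at or above the cutoff.
        return phq9_total(self.phq9_items) >= cutoff


# Example with made-up values
rec = Recording("p001", "p001_pro.wav", "p001_phone.wav",
                [1, 2, 1, 0, 3, 1, 2, 0, 1])
print(phq9_total(rec.phq9_items), rec.is_depressed())
```

Keeping the raw item scores rather than only a binary label leaves the choice of cutoff open to whoever trains a model. For the actual file layout and labeling procedure, the methods sections of the "D3TEC Dataset.pdf" file remain the reference.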

This data collection effort immediately enables new research topics on depression classification. Some potential uses include deep learning research on Spanish speakers, an evaluation of the impact of audio quality on developing audio classification models, and an evaluation of the applicability of voice depression classification technology on smartphone applications. A preliminary experimentation section is included to showcase the potential research areas that the creation of this dataset enables.

This research marks a significant step towards the objective and automated classification of depression in voice recordings. By focusing on the underrepresented demographic of Spanish speakers, including smartphone recordings, and addressing current data limitations in audio quality, this study lays the groundwork for future advancements in deep learning-driven mental health diagnosis.

Instructions: 

Refer to the methods sections of the "D3TEC Dataset.pdf" file.
