FLAMENCO Learning Disabilities Dataset

Citation Author(s):
Nikolaos Pavlidis, Democritus University of Thrace
Vasileios Perifanis, Democritus University of Thrace
Submitted by:
NIKOLAOS PAVLIDIS
Last updated:
Thu, 01/11/2024 - 12:49
DOI:
10.21227/vv67-ka20
License:
0

Abstract 

In the context of the FLAMENCO project, we have released a dataset designed for predicting potential deficiencies in children's communication skills, tailored for Federated Learning. This dataset specifically focuses on addressing two prevalent deficiencies in communication skill development in children: autism and intellectual disability. For each deficiency, two CSV files are provided—one for training machine learning models and another for testing them. Each entry in these CSV files includes the following details:

- case_id: An anonymized identifier used to distinguish cases.

- client_id: Identifies the client to which the case belongs; useful for splitting the dataset in federated settings.

- Communication skill scores: A series of scores, such as Verbalization, Voicing, and Syntax, derived from the child's performance in specialised gamified exercises and computed with the assistance of expert clinicians.

- target: The diagnosis label. It is -1 when no clinician's diagnosis is available for the case, 0 when no deficiency was diagnosed, and 1 when a clinician made a positive diagnosis of communication deficiency.
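
The fields described above suggest a simple way to prepare the data for a federated experiment: group cases by client_id and decide what to do with cases whose target is -1. The following is a minimal sketch using only the Python standard library, not the authors' own loading code. The sample rows, the score values and their range, and the helper names load_cases and split_by_client are all hypothetical; the actual file names and the full set of score columns should be taken from the released CSV files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows for illustration only. The real files may contain
# more score columns, and the score values here are made up.
SAMPLE_CSV = """case_id,client_id,Verbalization,Voicing,Syntax,target
c001,1,0.8,0.6,0.7,0
c002,1,0.3,0.4,0.2,1
c003,2,0.5,0.5,0.6,-1
c004,2,0.2,0.1,0.3,1
"""


def load_cases(handle):
    """Read rows from an open CSV handle and convert target to int."""
    rows = []
    for row in csv.DictReader(handle):
        row["target"] = int(row["target"])
        rows.append(row)
    return rows


def split_by_client(rows, keep_unlabelled=False):
    """Group rows by client_id; drop target == -1 rows unless asked to keep them."""
    clients = defaultdict(list)
    for row in rows:
        if row["target"] == -1 and not keep_unlabelled:
            continue  # no clinician's diagnosis available for this case
        clients[row["client_id"]].append(row)
    return dict(clients)


# For a real file, pass open(path, newline="") instead of the StringIO object.
rows = load_cases(io.StringIO(SAMPLE_CSV))
clients = split_by_client(rows)
for client_id, cases in sorted(clients.items()):
    print(client_id, [case["case_id"] for case in cases])
```

Dropping the unlabelled cases is only one option; passing keep_unlabelled=True retains them, for example for semi-supervised use.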

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 957406 (TERMINET).
