Children Arabic Utterances for Mispronunciation Detection

Citation Author(s):
Mona Sadik (Faculty of Computer and Information Sciences, Ain Shams University)
Sherin Moussa (Faculty of Computer and Information Sciences, Ain Shams University)
Submitted by:
Sherin Moussa
Last updated:
DOI:
10.21227/p5k8-6m10
Data Format:

Abstract

Audio samples were recorded from 27 Egyptian children (14 boys and 13 girls, aged 7 to 12), each pronouncing 16 Arabic words. The recordings are organized into two top-level folders, Correct and Wrong, according to the pronunciation. The dataset was collected and annotated for segmental pronunciation errors by Arabic linguistics experts from NahdetMisr Publishing House (https://nahdetmisr.com/).

We would like to acknowledge NahdetMisr Publishing House for their generous support and collaboration in providing the required resources and expertise, which greatly contributed to the success of this research project.

 

For more details, please contact:

Mona A. Sadik and Sherin M. Moussa

Faculty of Computer and Information Sciences,

Ain Shams University

mona.sadik@cis.asu.edu.eg, sherinmoussa@cis.asu.edu.eg

 

Instructions:

Audio samples were recorded from 27 Egyptian children (14 boys and 13 girls, aged 7 to 12), each pronouncing 16 Arabic words. The recordings are organized into two top-level folders, Correct and Wrong, according to the pronunciation. Each of these folders is further split into one subfolder per speaker (27 speakers in total), and each speaker subfolder contains the .wav files of the pronounced words. The collected pronunciations were processed in Audacity to obtain mono .wav files with a sampling rate of 16 kHz. The dataset was collected and annotated for segmental pronunciation errors by Arabic linguistics experts from NahdetMisr Publishing House (https://nahdetmisr.com/).

The 16 words and their indices are:

Index   Word      Index   Word
1       عين       27      بكى
9       شرب       29      رسم
10      خرج       30      كتب
11      دخل       31      فتح
14      عائلة     32      غسل
21      مسجد      33      قرأ
23      درج       36      دب
26      ضحك       40      حصان
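
The folder layout described above (Correct / Wrong, then one subfolder per speaker, then .wav files) can be traversed with a short script. The following is a minimal sketch in Python using only the standard library, not an official loader for this dataset. It assumes the top-level folders are named exactly "Correct" and "Wrong" and that files end in ".wav"; the root path "path/to/dataset" and the function names index_dataset and has_expected_format are hypothetical. The format check only confirms the mono, 16 kHz properties stated in the instructions.

```python
import wave
from pathlib import Path

# Assumed folder names, taken from the "(Correct / Wrong)" wording above.
LABELS = ("Correct", "Wrong")


def index_dataset(root):
    """Return a list of (label, speaker, path) tuples for every .wav file.

    Expects the layout root/<label>/<speaker>/<file>.wav; label folders
    that do not exist are skipped.
    """
    records = []
    for label in LABELS:
        label_dir = Path(root) / label
        if not label_dir.is_dir():
            continue
        speaker_dirs = sorted(p for p in label_dir.iterdir() if p.is_dir())
        for speaker_dir in speaker_dirs:
            for wav_path in sorted(speaker_dir.glob("*.wav")):
                records.append((label, speaker_dir.name, wav_path))
    return records


def has_expected_format(wav_path, channels=1, rate=16000):
    """Check that a WAV file is mono with a 16 kHz sampling rate."""
    with wave.open(str(wav_path), "rb") as wav:
        return wav.getnchannels() == channels and wav.getframerate() == rate


if __name__ == "__main__":
    # Hypothetical location of the extracted dataset.
    records = index_dataset("path/to/dataset")
    mismatched = [path for _, _, path in records if not has_expected_format(path)]
    print(f"{len(records)} files indexed, {len(mismatched)} with unexpected format")
```

The (label, speaker, path) tuples keep the speaker identity alongside each label, which makes it straightforward to split training and test data by speaker rather than by file.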
