JUMLA-QSL-22: A dataset of Qatari sign language sentences

Citation Author(s):
Oussama El Ghoul, Mada Center, Qatar
Achraf Othman, Mada Center, Qatar
Maryam Aziz, Mada Center, Qatar
Sammy Sedrati, Mada Center, Qatar
Submitted by: Mada Edge
Last updated: Wed, 11/02/2022 - 16:42
DOI: 10.21227/ckzp-3754

Abstract 

Sign languages are the most common mode of communication with and between hearing-impaired individuals. In the Arab world, Arabic sign language is used in several dialects, each with its own set of rules for the gestures used. As research on natural language processing has advanced, models have been developed to translate between sign language and spoken language. However, Arabic sign language has rarely been studied because few datasets are available for it.

The aim of this project is to improve accessibility for hearing-impaired individuals by bridging the communication gap using the Jumla dataset. The dataset supplies a large sample of Arabic sign language in the Qatari dialect: 6300 records collected over a period of 5 months. Seven participants took part in the study: 5 hearing-impaired individuals and 2 sign language interpreters. Each participant was given one sentence at a time from a list of 900 sentences (7 participants × 900 sentences = 6300 records) and was recorded signing it in Qatari sign language. The videos were recorded from four angles (front, left side, right side, and top view) using four true-depth cameras.

Instructions: 

The dataset has 7 folders, one for each participant; each folder contains 900 sub-folders and a .csv file. Each sub-folder stores the recordings of one sentence signed by that participant: four videos, one from each angle. The contents of the dataset are described below:

i. Participant_YY folder: There are 7 such folders, one for each participant.

ii. JUMLA-QSL-22-Participant_YY.csv: Each participant has a .csv file that contains the Arabic sentences they signed and their respective codes.

iii. f_YYxxx: Within each participant folder, there are 900 subfolders labelled with the sentence code. For example, the folder ‘f_AT374’ corresponds to the code ‘374’ in JUMLA-QSL-22-Participant_AT.csv, which refers to the sentence ‘يوم الثلاثاء’ (‘Tuesday’); this folder therefore contains the videos of the participant signing that sentence. Each subfolder contains four files, one for each recorded view of the participant signing the sentence:

1.      recF.svo: This video file contains the front angle of the participant signing. The RGB video was recorded with a frame rate of 60 fps and a resolution of 2560 x 720 pixels.

2.      recT.svo: This video file contains the top angle of the participant signing. The RGB video was recorded with a frame rate of 60 fps and a resolution of 2560 x 720 pixels.

3.      recL.bag: This video file contains the left angle of the participant signing. The RGB video was recorded with a frame rate of 60 fps and a resolution of 640 x 480 pixels.

4.      recR.bag: This video file contains the right angle of the participant signing. The RGB video was recorded with a frame rate of 60 fps and a resolution of 640 x 480 pixels.
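The folder layout described above can be checked after download with a short script. The following is a minimal Python sketch, not an official tool for this dataset: the function names are hypothetical, and it assumes that the participant code in a folder name such as ‘f_AT374’ is alphabetic and that the sentence code is numeric. It only inspects file names; it does not open the .svo or .bag recordings, and it does not read the .csv files, whose column layout is not described here.

```python
import re
from pathlib import Path

# The four recordings expected in every sentence folder, per the list above.
EXPECTED_FILES = {
    "recF.svo": "front",
    "recT.svo": "top",
    "recL.bag": "left",
    "recR.bag": "right",
}

# Assumption: folder names follow 'f_' + alphabetic participant code
# + numeric sentence code, as in the example 'f_AT374'.
FOLDER_RE = re.compile(r"^f_([A-Za-z]+)(\d+)$")


def parse_sentence_folder(name):
    """Split 'f_AT374' into ('AT', '374'); return None if the name does not match."""
    match = FOLDER_RE.match(name)
    if match is None:
        return None
    # The sentence code is kept as a string so any zero padding is preserved.
    return match.group(1), match.group(2)


def missing_views(sentence_dir):
    """Return the sorted names of expected recordings absent from one sentence folder."""
    present = {p.name for p in Path(sentence_dir).iterdir() if p.is_file()}
    return sorted(set(EXPECTED_FILES) - present)


def scan_participant(participant_dir):
    """Map each sentence code in a participant folder to its missing recordings."""
    report = {}
    for sub in sorted(Path(participant_dir).iterdir()):
        if not sub.is_dir():
            continue
        parsed = parse_sentence_folder(sub.name)
        if parsed is None:
            continue  # skip anything that is not a sentence folder
        _, sentence_code = parsed
        report[sentence_code] = missing_views(sub)
    return report
```

Calling `scan_participant` on one downloaded participant folder returns a dictionary keyed by sentence code; an empty list for a code means all four views are present. Because the upload is large, a check like this may help confirm that a partial download is complete for the sentences of interest.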

 

Comments

Thanks for your outstanding efforts on this topic; they are really appreciated.

I would like to raise what I am about 90% sure is an issue with downloading the dataset: the link on IEEE DataPort offers only .svo and .bag files, and I could not find any .csv files for the participants, even though the description says the dataset should include them.

Submitted by Muhammad Al-Brham on Mon, 01/16/2023 - 05:03

Thank you for your comment. We have uploaded all the .csv files. Kindly note that we are still uploading data, as the total size exceeds 3 TB.

Submitted by Mada Edge on Mon, 03/06/2023 - 05:11

Thank you for the effort. Could you add a link to the research paper related to the dataset?

Submitted by soumaya belfeki on Mon, 05/01/2023 - 06:04

Hi Soumaya, the paper analyzing the data is under review. We will announce it once published.

Submitted by Achraf Othman on Wed, 09/13/2023 - 01:10

How can I download the entire dataset at once?

Submitted by Ziad Eldamarany on Mon, 12/18/2023 - 06:49
