SYPHAXAR Dataset

Citation Author(s):
Houssem Turki, National Engineering School of Sfax (ENIS), University of Sfax, Tunisia
Mohamed Elleuch, National School of Computer Science (ENSI), University of Manouba, Tunisia
Mongi Kherallah, Faculty of Sciences, University of Sfax, Tunisia
Submitted by:
Mohamed Elleuch
Last updated:
Tue, 09/12/2023 - 12:40
DOI:
10.21227/dpsa-q406

Abstract 

SYPHAXAR is a dataset for Arabic text detection in the wild. It was collected in the city of Sfax, Tunisia, the second largest Tunisian city after the capital. A total of 3078 images were collected manually, one by one, each capturing real-world challenges of natural-scene text detection along 15 different routes, including ring roads, intersections, and roundabouts. The annotated images contain more than 31000 objects, each enclosed in a bounding box. The estimated overall distance covered is around 422 kilometers; all of the routes traveled have commercial and service activities along both the outward and return directions. A notable contribution and challenge of the SYPHAXAR dataset is its inclusion of natural-scene text in the Arabic language, which covers most of the challenges found in state-of-the-art datasets.

Instructions: 

Steps to use this Dataset:

1.     Download the “SYPHAXAR.zip” file to your device.

2.     Extract the “SYPHAXAR.zip” file at a particular location.

3.     You will find three folders: “Images” contains all the images of the dataset, and “Annotations” and “YOLO_Annotations” contain all the annotations (line-level and word-level) of the dataset.
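
The steps above can be sketched in Python as follows. This is a minimal, illustrative sketch rather than an official loader: it assumes the archive unpacks directly into the folders named above, and that the files in “YOLO_Annotations” follow the common YOLO text convention of one "class x_center y_center width height" line per object with coordinates normalized to the image size. Neither assumption is confirmed on this page, and the function names are hypothetical.

```python
import zipfile
from collections import Counter
from pathlib import Path


def extract_and_summarize(zip_path, dest_dir):
    """Extract the archive and count the files under each top-level folder."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)

    counts = Counter()
    for path in dest.rglob("*"):
        if path.is_file():
            parts = path.relative_to(dest).parts
            # Files sitting directly in dest_dir are grouped under ".".
            counts[parts[0] if len(parts) > 1 else "."] += 1
    return dict(counts)


def yolo_line_to_pixel_box(line, image_width, image_height):
    """Convert one normalized YOLO line to (class_id, (x1, y1, x2, y2)) in pixels.

    Assumption: the line has the form "class x_center y_center width height".
    """
    class_id, xc, yc, w, h = line.split()[:5]
    xc, w = float(xc) * image_width, float(w) * image_width
    yc, h = float(yc) * image_height, float(h) * image_height
    return int(class_id), (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


# Example usage (paths are placeholders):
#   summary = extract_and_summarize("SYPHAXAR.zip", "SYPHAXAR")
#   print(summary)  # file counts per extracted top-level folder
```

If the archive wraps everything in a single parent folder, the summary will report that folder instead, which is a quick way to check the actual layout before writing a data loader.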
