Arabic Speech

This dataset, SADA, contains audio recordings sourced from more than 57 TV shows provided by the Saudi Broadcasting Authority. About 667 hours of recordings have been published. The recordings are in Arabic; most are in Saudi dialects, and some are in other dialects. To make SADA easier to use, the dataset is split into training, validation, and testing sets. The validation and testing sets each contain around 10 hours of audio segments, while the training set contains 418 hours.
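As a rough sanity check on the figures quoted above, the following sketch adds up the three splits and compares them with the published total. The variable and function names are illustrative only and are not part of the dataset. All durations are the approximate values from the description, and the description does not say how the hours outside the three splits are accounted for, so the last value is simply a difference.

```python
# Approximate durations in hours, copied from the dataset description.
# Names below are hypothetical and chosen only for this illustration.
TOTAL_PUBLISHED_HOURS = 667
SPLIT_HOURS = {"train": 418, "validation": 10, "test": 10}


def split_summary(split_hours, total_hours):
    """Return (hours covered by the splits, share of each split, remaining hours)."""
    segmented = sum(split_hours.values())
    shares = {name: hours / segmented for name, hours in split_hours.items()}
    remaining = total_hours - segmented
    return segmented, shares, remaining


segmented, shares, remaining = split_summary(SPLIT_HOURS, TOTAL_PUBLISHED_HOURS)

print(f"Hours covered by the three splits: ~{segmented}")
for name, share in shares.items():
    print(f"  {name}: {share:.1%} of the split audio")
print(f"Published hours not covered by the splits: ~{remaining}")
```

Under these figures the training set makes up most of the split audio, with validation and testing each contributing a small, equal share.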
