Speech Dataset in Hindi Language

Citation Author(s): Shivam Shukla
Submitted by: Shivam Shukla
Last updated: Tue, 06/09/2020 - 09:49
DOI: 10.21227/5vgy-yb08
Data Format: *.zip
Links: Speaker-Recognition-Using-GMM-MFCC-Python3

Keywords: Speech Dataset, Speech Processing, Speech Recognition, Speaker Identification

Abstract

The dataset covers 100 speakers, each with 5 voice samples for training and 1 voice sample for testing, for a total of 600 voice samples. The samples were collected in different audio formats such as mpeg, mp4, mp3, and ogg, and were then preprocessed and converted into .wav format. Each voice sample is 5-10 seconds long; because the lengths differ, parameters should be tuned before use. The whole dataset is 600 MB in size and 1 hour 40 minutes in duration. It can be used for speech synthesis, speaker identification, speaker recognition, speech recognition, and similar tasks. Preprocessing of the data is required.
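Because the clips vary between 5 and 10 seconds, it can help to inspect the actual durations before tuning any parameters. The stated totals are consistent with the upper bound: 600 samples of 10 seconds each come to 6000 seconds, or 1 hour 40 minutes. The following is a minimal sketch, not the author's code, using only Python's standard `wave` module. The function names and the `wav_dir` argument are hypothetical, and the sketch assumes the unzipped samples are uncompressed PCM .wav files.

```python
import wave
from pathlib import Path

# Counts taken from the abstract: 100 speakers, 5 training + 1 testing sample each.
SPEAKERS = 100
TRAIN_PER_SPEAKER = 5
TEST_PER_SPEAKER = 1
EXPECTED_SAMPLES = SPEAKERS * (TRAIN_PER_SPEAKER + TEST_PER_SPEAKER)  # 600


def wav_duration_seconds(path):
    """Return the duration of a PCM .wav file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
    return frames / float(rate)


def summarize_durations(wav_dir):
    """Collect count, min, max and total duration over all .wav files under wav_dir."""
    durations = [
        wav_duration_seconds(p) for p in sorted(Path(wav_dir).rglob("*.wav"))
    ]
    if not durations:
        return {"count": 0, "min": 0.0, "max": 0.0, "total": 0.0}
    return {
        "count": len(durations),
        "min": min(durations),
        "max": max(durations),
        "total": sum(durations),
    }
```

Calling `summarize_durations` on the unzipped folder and comparing `count` with `EXPECTED_SAMPLES` gives a quick check that the extraction is complete, and the `min` and `max` values show how much padding or trimming a fixed-length model input would need.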

Instructions:

-> Download the Dataset

-> Unzip the files

-> Add the voice_samples._path.txt file to your training model so that it can load the data from that location.

-> Make changes to your path.txt file according to your needs.

Thank you.
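The steps above could be sketched roughly as follows. This is an illustration under assumptions, not the author's loader: the layout of the path file is not documented here, so the sketch assumes a plain text file with one relative sample path per line. The names `load_sample_paths`, `split_existing`, `list_file`, and `dataset_root` are hypothetical.

```python
from pathlib import Path


def load_sample_paths(list_file, dataset_root):
    """Read one relative path per line (assumed format) and resolve it against dataset_root."""
    root = Path(dataset_root)
    paths = []
    with open(list_file, "r", encoding="utf-8") as handle:
        for line in handle:
            entry = line.strip()
            if not entry:  # skip blank lines
                continue
            paths.append(root / entry)
    return paths


def split_existing(paths):
    """Separate the listed paths into files that exist and files that are missing."""
    found = [p for p in paths if p.is_file()]
    missing = [p for p in paths if not p.is_file()]
    return found, missing
```

Passing the path file named in the instructions as `list_file` and the unzipped folder as `dataset_root` returns the listed samples. Any entries reported as missing indicate lines in the path file that still need to be adjusted to the local directory layout.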

Engin Butun Thu, 08/06/2020 - 14:37

I am not able to download.

Prapti Trivedi Sat, 11/28/2020 - 10:24

Thanks

TSAI JAN CHANG Mon, 12/21/2020 - 13:25

Thanks

Neekhil Rj Tue, 10/05/2021 - 03:15

Unable to download.

Sridhar Koneru Sat, 12/18/2021 - 12:47

Very nice

Neekhil Rj Sat, 12/18/2021 - 15:51

Thanks

Neekhil Rj Tue, 04/26/2022 - 02:25

How do I download this dataset? Please help.

Dataset Files

Dataset.zip (452.52 MB)

Open Access dataset files are accessible to all logged in users. Don't have a login? Create a free IEEE account. IEEE Membership is not required.
