Speech Dataset | IEEE DataPort

A reading and self-presentation speech characteristics dataset

The following dataset consists of samples of acoustic characteristics of 356 Russian-speaking subjects and measured psychological traits. All the recordings (5701 samples) were processed and acoustic characteristics were calculated.

Categories:

Signal Processing

Speech Dataset in Hindi Language

100 Speakers each consisting of 5 voice samples for training data and 1 voice sample for testing data. Total of 600 voice samples collected in different audio formats like mpeg, mp4, mp3, ogg etc. These samples were than preprocessed and converted into .wav format. Each voice sample has a time duration of 5-10 seconds due to different lengths tuning of parameters should be done before usage. Whole Dataset size is 600mb and duration is 1 hour 40 minutes. This dataset can be used for speech synthesis, speaker identification.

Categories:

Category

Speech Dataset in Hindi Language

100 Speakers each consisting of 5 voice samples for training data and 1 voice sample for testing data. Total of 600 voice samples collected in different audio formats like mpeg, mp4, mp3, ogg etc. These samples were than preprocessed and converted into .wav format. Each voice sample has a time duration of 5-10 seconds due to different lengths tuning of parameters should be done before usage. Whole Dataset size is 600mb and duration is 1 hour 40 minutes. This dataset can be used for speech synthesis, speaker identification.

Categories:

Category

iNoise Indian Noise Database

Speech Processing in noisy condition allows researcher to build solutions that work in real world conditions. Environmental noise in Indian conditions are very different from typical noise seen in most western countries. This dataset is a collection of various noises, both indoor and outdoor ollected over a period of several months. The audio files are of the format RIFF (little-endian) data, WAVE audio, Microsoft PCM, 8 bit, mono 11025 Hz and have been recorded using the Dialogic CTI card.

Categories:

Signal Processing

Test Set From 10.1016/j.specom.2019.06.002

This is the noisy-speech test set used in the original Deep Xi paper: https://doi.org/10.1016/j.specom.2019.06.002. The clean speech and noise used to create the noisy-speech set are also available. The clean-speech recordings are from the TSP speech database (http://www-mmsp.ece.mcgill.ca/Documents/Data/TSP-Speech-Database/TSP-Speech-Database.pdf).

Categories:

Category

Deep Xi Test Set (depreciated, please see Deep Xi dataset)

Noisy-speech set used to test Deep Xi (https://github.com/anicolson/DeepXi). The clean speech and noise used to create the noisy-speech set are also available. The clean-speech recordings are from Librispeech test-clean (http://www.openslr.org/12/).

Categories:

Category

A Manitoban Speech Dataset

The following dataset consists of utterances, recorded using 24 volunteers raised in the Province of Manitoba, Canada. To provide a repeatable set of test words that would cover all of the phonemes, the Edinburg Machine Readable Phonetic Alphabet (MRPA) [KiGr08], consisting of 44 words is used. Each recording consists of one word uttered by the volunteer and recorded in one continuous session.

Categories:

Signal Processing