Speech Processing

The dataset and source code used in the paper "Pick the Better and Leave the Rest: Leveraging Multiple Retrieved Results to Guide Response Generation".

This dataset consists of samples of acoustic characteristics of 356 Russian-speaking subjects, together with their measured psychological traits. All the recordings (5701 samples) were processed, and their acoustic characteristics were calculated.

The dataset covers 100 speakers, each with 5 voice samples for training and 1 voice sample for testing, for a total of 600 voice samples. The samples were collected in different audio formats (mpeg, mp4, mp3, ogg, etc.) and were then preprocessed and converted into .wav format. Each voice sample lasts 5-10 seconds; because the lengths differ, parameters should be tuned before use. The whole dataset is 600 MB in size and 1 hour 40 minutes in duration. It can be used for speech synthesis, speaker identification, speaker recognition, speech recognition, etc.
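Since the clips vary between 5 and 10 seconds, one possible way to get fixed-size inputs is to pad or trim every clip to a common length. The sketch below is only an illustration under stated assumptions: the helper `pad_or_trim`, the 16 kHz sample rate, and the 10-second target are hypothetical choices, not properties documented for this dataset.

```python
def pad_or_trim(samples, target_len, pad_value=0):
    """Return a new list with exactly `target_len` elements.

    Longer inputs are cut at the end; shorter inputs are padded at the
    end with `pad_value` (silence for PCM audio when pad_value is 0).
    """
    if target_len < 0:
        raise ValueError("target_len must be non-negative")
    if len(samples) >= target_len:
        return list(samples[:target_len])
    return list(samples) + [pad_value] * (target_len - len(samples))


# Assumed values for illustration only; check the actual .wav headers.
SAMPLE_RATE = 16000   # samples per second (hypothetical)
TARGET_SECONDS = 10   # upper end of the stated 5-10 second range
target_len = SAMPLE_RATE * TARGET_SECONDS

short_clip = [1] * (5 * SAMPLE_RATE)    # stands in for a 5-second clip
long_clip = [1] * (12 * SAMPLE_RATE)    # stands in for an over-long clip

fixed_short = pad_or_trim(short_clip, target_len)
fixed_long = pad_or_trim(long_clip, target_len)
```

Padding to the longest expected duration keeps all of the recorded speech; trimming to the shortest duration instead would avoid padded silence at the cost of discarding audio.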

This dataset consists of utterances recorded by 24 volunteers raised in the Province of Manitoba, Canada. To provide a repeatable set of test words that covers all of the phonemes, the Edinburgh Machine Readable Phonetic Alphabet (MRPA) [KiGr08], consisting of 44 words, is used. Each recording consists of one word uttered by the volunteer and recorded in one continuous session.

Speech/Music Discrimination (SMD) systems are used to mark the audio signal segments that contain normal human speech and to separate them from segments that do not contain speech (such as silence, music, and noise). Under this definition, an SMD system can be considered a specific or accurate type of speech activity detection system.
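To make the idea of marking segments concrete, the sketch below merges per-frame class labels into labeled time segments. It assumes that some classifier has already assigned a label to every frame; the function name `frames_to_segments`, the label strings, and the 0.5-second frame length are hypothetical and are not taken from this dataset.

```python
from itertools import groupby


def frames_to_segments(frame_labels, frame_seconds):
    """Merge runs of identical frame labels into (start, end, label) tuples.

    Times are in seconds and assume equal-length, non-overlapping frames.
    """
    segments = []
    start = 0  # index of the first frame in the current run
    for label, run in groupby(frame_labels):
        n = len(list(run))
        segments.append((start * frame_seconds, (start + n) * frame_seconds, label))
        start += n
    return segments


# Hypothetical per-frame output of a speech/music discriminator.
labels = ["silence", "speech", "speech", "music", "music", "music", "speech"]
segments = frames_to_segments(labels, 0.5)
```

A speech activity detector in the usual sense would then keep only the segments labeled as speech.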
