Speech Processing

PD_dataset

The development of automated techniques for detecting Parkinson's disease (PD) from speech analysis has attracted considerable interest, especially because of its potential use in health tele-monitoring. Because of the drawbacks of the α-Synuclein Seed Amplification Assay technique, researchers are looking more closely at speech signals as a potential alternative for PD detection. This proposal describes an investigation into identifying PD that emphasizes using both voiced and unvoiced source material.

Categories:

Signal Processing

REGen_data(Retrieval Generation Chat dataset)

The dataset and source code used in the paper "Pick the Better and Leave the Rest: Leveraging Multiple Retrieved Results to Guide Response Generation".

Categories:

Artificial Intelligence

A reading and self-presentation speech characteristics dataset

This dataset consists of acoustic characteristics and measured psychological traits for 356 Russian-speaking subjects. All the recordings (5701 samples) were processed and their acoustic characteristics were calculated.

Categories:

Signal Processing

Speech Dataset in Hindi Language

The dataset covers 100 speakers, each with 5 voice samples for training and 1 voice sample for testing, for a total of 600 voice samples. The samples were collected in different audio formats (mpeg, mp4, mp3, ogg, etc.), then preprocessed and converted into .wav format. Each voice sample lasts 5-10 seconds; because the lengths differ, parameters should be tuned before use. The whole dataset is 600 MB in size and 1 hour 40 minutes in duration. It can be used for speech synthesis and speaker identification.
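One common way to cope with clips of differing lengths is to truncate or zero-pad every clip to a fixed number of samples before feature extraction. The following is a minimal sketch of that idea, not a procedure documented for this dataset: the helper `fix_length`, the 16 kHz sample rate, and the 10-second target are all assumptions chosen for illustration, and the real sample rate should be read from the .wav headers.

```python
def fix_length(samples, target_len, pad_value=0):
    """Return a copy of `samples` with exactly `target_len` items.

    Hypothetical helper: longer clips are truncated, shorter clips are
    padded at the end with `pad_value` (silence for PCM audio).
    """
    if target_len < 0:
        raise ValueError("target_len must be non-negative")
    clipped = list(samples[:target_len])  # truncate long clips
    clipped.extend([pad_value] * (target_len - len(clipped)))  # pad short clips
    return clipped


SAMPLE_RATE = 16000   # assumed value; check the actual .wav headers
TARGET_SECONDS = 10   # upper end of the 5-10 second range quoted above

# Stand-in for a decoded 5-second recording (real data would come from a .wav file).
short_clip = [1] * (5 * SAMPLE_RATE)
padded = fix_length(short_clip, TARGET_SECONDS * SAMPLE_RATE)
print(len(padded))
```

Padding to the longest expected duration keeps all of the speech, at the cost of trailing silence in shorter clips; truncating to the shortest duration is the opposite trade-off.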

Categories:

A Manitoban Speech Dataset

This dataset consists of utterances recorded by 24 volunteers raised in the Province of Manitoba, Canada. To provide a repeatable set of test words that covers all of the phonemes, the Edinburgh Machine Readable Phonetic Alphabet (MRPA) [KiGr08], consisting of 44 words, is used. Each recording consists of one word uttered by the volunteer and recorded in one continuous session.

Categories:

Signal Processing

Long-Term Multi-Band Frequency-Domain Mean-Crossing Rate (FDMCR) Feature

Speech/Music Discrimination (SMD) systems are used to mark the audio signal segments that contain normal human speech and to separate them from segments that do not (such as silence, music, and noise). Under this definition, SMD systems can be considered a more specific type of speech activity detection system.
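The listing does not define the FDMCR feature itself. As a rough illustration of what a mean-crossing rate measured in the frequency domain could look like, the sketch below computes the magnitude spectrum of a single frame and counts how often adjacent bins fall on opposite sides of the spectrum's mean. This is an assumed, generic formulation rather than the authors' method: it omits the long-term and multi-band aspects named in the title, and the function names and the 32-sample test tone are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short demo frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def mean_crossing_rate(values):
    """Fraction of adjacent pairs that lie on opposite sides of the mean."""
    if len(values) < 2:
        return 0.0
    mean = sum(values) / len(values)
    above = [v > mean for v in values]
    crossings = sum(a != b for a, b in zip(above, above[1:]))
    return crossings / (len(values) - 1)


# Hypothetical test frame: a pure tone occupying a single frequency bin.
tone = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
print(mean_crossing_rate(magnitude_spectrum(tone)))
```

A spectrum dominated by a few isolated peaks crosses its mean only a handful of times, while a spectrum that alternates around its mean crosses it often; a classifier would still need labelled data to decide which range corresponds to speech.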

Categories:
