A Manitoban Speech Dataset

Citation Author(s): Sina Sedigh, Witold Kinsner
Submitted by: Sina Sedigh
Last updated: Fri, 11/22/2024 - 15:23
DOI: 10.21227/H2KM16
Data Format: WAV
Research Article Link: Speech segmentation using multifractal measures and amplification of signal fea…

Categories: Signal Processing

Keywords: Speech Processing, Speech Dataset, Controlled Recording Environment

Abstract

This dataset consists of utterances recorded from 24 volunteers raised in the Province of Manitoba, Canada. To provide a repeatable set of test words that covers all of the phonemes, the Edinburgh Machine Readable Phonetic Alphabet (MRPA) [KiGr08], consisting of 44 words, is used. Each recording consists of one word uttered by the volunteer, and all of a volunteer's words were recorded in one continuous session. All recordings were conducted in an anechoic chamber at the Applied Electromagnetic Laboratory at the University of Manitoba, using a Blue Yeti microphone at a sampling frequency of 44.1 kilo-samples per second (kSps).

Instructions:

All recordings are stored in the waveform audio file format (WAV), which is uncompressed and can be loaded easily into software such as Matlab or Audacity for analysis. Each participant is assigned an alphanumeric identifier, beginning with M for male participants and F for female participants. The volunteers who participated in this study consist of 12 male and 12 female participants, all raised in the Province of Manitoba. The male volunteers were aged between 19 and 60 years, and the female volunteers between 18 and 44 years. Each participant was recorded in one continuous session of approximately 15 to 20 minutes. The recording sessions took place between 10 AM and 3 PM, from March 27, 2017, to September 27, 2017. The 44 utterances recorded for each participant are stored in a file named with that participant's alphanumeric identifier, along with a recording of the silence before and after the recordings. In addition, a file named “Readme” is included, which contains the age range of the participants, as well as the date, time, weather, humidity, and pressure during the recording session. For further details, please refer to the attached PDF file or contact the authors at sedighs@myumanitoba.ca or witold.kinsner@umanitoba.ca.
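As an alternative to Matlab or Audacity, the recordings can also be inspected programmatically. The following is a minimal Python sketch, not part of the dataset itself, that reads a WAV header with the standard-library `wave` module and checks it against the stated 44.1 kSps sampling frequency. It assumes the files are plain PCM WAV. The function names `participant_sex` and `describe_wav`, and the example identifier and path in the comments, are hypothetical; the exact identifier and file naming should be taken from the Readme.

```python
import wave
from pathlib import Path

# Sampling frequency stated in the dataset description (44.1 kSps).
EXPECTED_RATE_HZ = 44_100


def participant_sex(participant_id: str) -> str:
    """Infer the participant group from the identifier's first letter.

    The description says identifiers start with M (male) or F (female);
    the rest of the identifier format is an assumption.
    """
    prefix = participant_id.strip()[:1].upper()
    if prefix == "M":
        return "male"
    if prefix == "F":
        return "female"
    raise ValueError(f"Unexpected participant identifier: {participant_id!r}")


def describe_wav(path) -> dict:
    """Read basic properties from a PCM WAV file header."""
    with wave.open(str(Path(path)), "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
            "matches_expected_rate": rate == EXPECTED_RATE_HZ,
        }


# Hypothetical usage; replace the path with a real file from the dataset:
# info = describe_wav("M01/word01.wav")
# print(participant_sex("M01"), info)
```

A file whose `matches_expected_rate` value is false was probably resampled or converted after download and is worth re-checking before analysis.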

Reference

[KiGr08] Witold Kinsner and Warren Grieder, “Speech segmentation using multifractal measures and amplification of signal features,” in Proc. Seventh IEEE Cognitive Informatics Conf., ICCI08 (Stanford, CA; 14-16 August 2008), 7 pp., 2008. Retrieved July 1, 2015 from IEEE Xplore at: http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=4639188&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D463918