Synthetic vowels generated with 1D and 3D acoustic models

Citation Author(s): Rémi Blandin, Simon Stone, Angélique Remacle, Vincent Didone, Peter Birkholz
Submitted by: Remi Blandin
Last updated: Tue, 09/12/2023 - 09:29
DOI: 10.21227/vdjq-3k31
Research Article Link: A Comparative Study of 3D and 1D Acoustic Simulations of the Higher Frequencies…

Categories: Digital signal processing

Keywords: articulatory synthesis, wideband speech, multimodal method

Abstract

This dataset contains the synthetic stimuli used in the study published in the paper "A Comparative Study of 3D and 1D Acoustic
Simulations of the Higher Frequencies of Speech". The goal of the study was to evaluate, with two perceptual experiments, the accuracy of
simulated acoustic wave propagation in the vocal tract in a source-filter synthesis paradigm. The high frequencies (above 4 kHz) of the stimuli were
generated by three different methods: a source-filter method relying on a 1D acoustic model, a source-filter method relying on a 3D acoustic model,
and a bandwidth extension algorithm with no physical basis. In each case, the low-frequency portion was generated with a 3D acoustic model.
The data and code used to generate the stimuli are provided in this dataset.

Instructions:

# stimuli_recordings
This folder contains:
- the recordings of the stimuli in the experimental condition in the subfolder "stimuli"
- a background noise recording "background_noise.wav"
- the 94 dB calibration signal recording "calibration.wav"
- various documents presenting acoustical analyses of the stimuli
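
The calibration recording can be used to express the level of any other recording in this folder in dB. The sketch below is only an illustration in Python (standard library only), not part of the dataset's code. It assumes that "94 dB" means 94 dB SPL, that all recordings were made with the same gain, and it uses synthetic sine tones in place of the actual WAV files; the names `rms` and `level_db` are hypothetical.

```python
import math

# Assumption: the calibration signal corresponds to 94 dB SPL.
CALIBRATION_LEVEL_DB = 94.0


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_db(samples, calibration_samples,
             calibration_level_db=CALIBRATION_LEVEL_DB):
    """Level of `samples` in dB, referenced to the calibration recording.

    Both signals must have been recorded with the same gain settings.
    """
    ratio = rms(samples) / rms(calibration_samples)
    return calibration_level_db + 20.0 * math.log10(ratio)


# Synthetic stand-ins for the recordings: a 1 kHz tone lasting one second,
# and a copy of it at half the amplitude.
fs = 44100
calibration = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]
stimulus = [0.5 * s for s in calibration]

print(round(level_db(stimulus, calibration), 2))  # prints 87.98
```

Halving the amplitude lowers the level by about 6.02 dB, which is why the synthetic stimulus comes out at 87.98 dB. With the real files, the two sample lists would be read from "calibration.wav" and from a recording in "stimuli".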

# stimuli
This folder contains:
- a recording of a real speaker used as a reference for the fundamental frequency and the amplitude of the stimuli
- the synthetic stimuli in the folder "dev"
- a Matlab script, "StimuliGeneration.m", to generate the stimuli (requires the Audio, DSP System, Image Processing and Signal Processing toolboxes)
- two Octave scripts for the test interfaces: "pair_comparison_interface.m" and "naturalness_evaluation_interface.m"

# transfer_functions
This folder contains:
- the transfer functions computed with the transmission line model in the folder "1d"
- the transfer functions computed with the multimodal model in the folder "multimodal"
- the blended transfer functions in the folder "blended"
- plots of the transfer functions in the folder "pdf"
- a script to plot the transfer functions: "PlotTransferFunctions.m"
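
This README does not describe how the blended transfer functions were built. As a rough illustration of the idea only, the Python sketch below crossfades two magnitude responses around 4 kHz, the boundary between low and high frequencies given in the abstract. The linear crossfade, the 500 Hz transition width, the choice of the multimodal model for the low band and the 1D model for the high band, and the function name `blend_magnitudes` are all assumptions, not the procedure used to produce the files in "blended".

```python
def blend_magnitudes(freqs_hz, low_band, high_band,
                     crossover_hz=4000.0, width_hz=500.0):
    """Crossfade linearly from `low_band` to `high_band` around `crossover_hz`.

    All three input sequences must have the same length; `width_hz` is the
    (hypothetical) width of the transition region.
    """
    start_hz = crossover_hz - width_hz / 2.0
    blended = []
    for f, low, high in zip(freqs_hz, low_band, high_band):
        # Weight of the high-band model: 0 below the transition, 1 above it.
        w = min(1.0, max(0.0, (f - start_hz) / width_hz))
        blended.append((1.0 - w) * low + w * high)
    return blended


# Toy magnitudes standing in for the multimodal (low band) and 1D (high band)
# transfer functions sampled at the same frequencies.
freqs = [3000.0, 3750.0, 4000.0, 4250.0, 5000.0]
multimodal = [1.0] * 5
one_d = [3.0] * 5

print(blend_magnitudes(freqs, multimodal, one_d))
# prints [1.0, 1.0, 2.0, 3.0, 3.0]
```

The output follows the low-band values below 3750 Hz, the high-band values above 4250 Hz, and a weighted mix in between.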

# speaker-files
This folder contains the speaker files, which hold the parameters of the VocalTractLab articulatory model used to generate the transfer functions.
To compute the transmission line transfer function with VocalTractLab, set the following options:
- Radiation impedance: Piston in wall
- Additional options: all OFF
- Energy losses:
   - Boundary layer resistance ON
   - Heat conduction losses ON
   - Soft walls ON
   - Hagen-Poiseuille resistance OFF

# include
This folder contains dependencies for the generation of the stimuli, in particular the artificial bandwidth extension algorithms, the glottal flow model and the VocalTractLab API.