Deep Xi dataset
- Submitted by: Aaron Nicolson
- Last updated: Mon, 08/10/2020 - 22:54
- DOI: 10.21227/3adt-pb04
Abstract
The training, validation, and test sets used for Deep Xi (https://github.com/anicolson/DeepXi).
Training set:
The clean-speech recordings are from the train-clean-100 set of Librispeech (http://www.openslr.org/12/) and from the CSTR VCTK corpus (https://homepages.inf.ed.ac.uk/jyamagis/page3/page58/page58.html). The recordings from VCTK speakers p232 and p257 are excluded, as they are used in the test set of the DEMAND Voicebank dataset (http://ssw9.talp.cat/papers/ssw9_PS2-4_Valentini-Botinhao.pdf).
The noise recordings are from the following sources:
- the Environmental Background Noise dataset (https://personal.utdallas.edu/~nxk019000/VAD-dataset/)
- the Nonspeech dataset (http://web.cse.ohio-state.edu/pnl/corpus/HuNonspeech/HuCorpus.html)
- the QUT-NOISE dataset (https://research.qut.edu.au/saivt/databases/qut-noise-databases-and-protocols/)
- multiple Freesound packs (https://freesound.org/)
- the noise set of the MUSAN corpus (https://www.openslr.org/17/)
- the RSG-10 noise database (http://www.steeneken.nl/wp-content/uploads/2014/04/RSG-10_Noise-data-base.pdf); voice babble, F16, and factory (welding) are excluded, as they are used in the Deep Xi Test Set and the Test Set From 10.1016/J.SPECOM.2019.06.002
- the Urban Sound dataset (http://www.justinsalamon.com/uploads/4/3/9/4/4394963/salamon_urbansound_acmmm14.pdf); street music recording no. 26,270 is excluded, as it is used in the Deep Xi Test Set and the Test Set From 10.1016/J.SPECOM.2019.06.002
Note that the clean-speech and noise recordings used for this training set are separate from those used in the Deep Xi test set, the Test Set From 10.1016/J.SPECOM.2019.06.002, and the DEMAND Voicebank test set (http://ssw9.talp.cat/papers/ssw9_PS2-4_Valentini-Botinhao.pdf).
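The speaker exclusion described above can be illustrated with a small filter. This is only a sketch, not the procedure used to build the dataset: it assumes VCTK-style file names such as `p232_001.wav`, where the speaker ID precedes the first underscore, and the helper names `is_excluded_vctk` and `filter_training_files` are hypothetical.

```python
from pathlib import PurePosixPath

# Speakers held out because they appear in the DEMAND Voicebank test set.
EXCLUDED_VCTK_SPEAKERS = {"p232", "p257"}


def is_excluded_vctk(path):
    """Return True if the file belongs to an excluded VCTK speaker.

    Assumption: the file name looks like '<speaker>_<utterance>.wav'.
    """
    stem = PurePosixPath(path).stem          # e.g. 'p232_001'
    speaker = stem.split("_", 1)[0]          # e.g. 'p232'
    return speaker in EXCLUDED_VCTK_SPEAKERS


def filter_training_files(paths):
    """Keep only the files that are allowed in the training set."""
    return [p for p in paths if not is_excluded_vctk(p)]
```

Running such a check over a rebuilt training list is a cheap way to confirm that no held-out speaker has leaked into it.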
Test set:
Noisy-speech set used to test Deep Xi (https://github.com/anicolson/DeepXi). The clean speech and noise used to create the noisy-speech set are also available. The clean-speech recordings are from Librispeech test-clean (http://www.openslr.org/12/). The noise recordings are from the RSG-10 noise database (http://www.steeneken.nl/wp-content/uploads/2014/04/RSG-10_Noise-data-base.pdf) and the Urban Sound dataset (http://www.justinsalamon.com/uploads/4/3/9/4/4394963/salamon_urbansound_acmmm14.pdf).
The noise recordings are as follows:
26270 - A recording of street music. It is recording no. 26,270 from the Urban Sound dataset.
SIGNAL019 - A recording of voice babble from the RSG-10 dataset.
SIGNAL020 - A recording of an F16 fighter jet from the RSG-10 dataset.
SIGNAL021 - A recording of factory welding from the RSG-10 dataset.
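Because the clean speech and noise are distributed alongside the noisy speech, a noisy-speech signal can in principle be rebuilt by mixing the two. The sketch below shows one common way to mix at a target signal-to-noise ratio (SNR). It is an illustration under stated assumptions only: this page does not state the SNR levels or the mixing procedure used for the test set, and the function name `mix_at_snr` and its plain-list sample representation are hypothetical.

```python
import math


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean` so that their power ratio equals `snr_db`.

    Both inputs are sequences of float samples; `noise` is truncated to the
    length of `clean`. Returns (noisy, scaled_noise).
    """
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as clean")
    noise = noise[:len(clean)]

    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(x * x for x in noise) / len(noise)
    if noise_power == 0.0:
        raise ValueError("noise is silent")

    # Choose gain so that clean_power / (gain**2 * noise_power) == 10**(snr_db / 10).
    gain = math.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    scaled_noise = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled_noise)]
    return noisy, scaled_noise
```

In practice the samples would be read from the WAV files with an audio library, and the SNR values should be taken from the Deep Xi repository rather than guessed.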
The directories are pre-configured for Deep Xi, as seen here: https://github.com/anicolson/DeepXi/tree/master/set.
Dataset Files
- deep_xi_dataset.zip (21.69 GB)
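Given the size of the archive, it can be useful to inspect its layout before extracting it. The following is a minimal sketch using Python's standard `zipfile` module; the helper name `top_level_entries` is hypothetical, and no particular directory layout inside the archive is assumed.

```python
import zipfile


def top_level_entries(zip_path):
    """Return the sorted top-level names in the archive without extracting it."""
    with zipfile.ZipFile(zip_path) as archive:
        # Each member name uses '/' as separator; keep only the first component.
        return sorted({name.split("/", 1)[0] for name in archive.namelist() if name})
```

For example, `top_level_entries("deep_xi_dataset.zip")` lists the top-level folders, which can then be compared with the `set` directory layout expected by Deep Xi.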