FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection
A speech dataset for fake speech detection. The fake speech is generated by 8 well-known, recent deep-learning-based open-source tools and 8 commercial speech synthesis products. All speech is in Mandarin Chinese or English. The dataset contains more than 127,890 synthetic speech samples and 14,400 natural speech samples.
With the development of audio synthesis techniques, state-of-the-art synthesis methods based on Generative Adversarial Networks (GANs) have been proposed. Whether automatic speaker verification (ASV) systems are vulnerable to GAN-synthesized audio urgently needs to be verified. We present a publicly available set of GAN-synthesized audio samples generated with several open-source schemes (WaveGAN, TifGAN, GANSynth, MelGAN), which allows researchers to assess the impact of GAN-synthesized audio on the security of ASV systems.