deepfake detection
Introduced here is the Extended-Length Audio Dataset for Synthetic Voice Detection and Speaker Recognition (ELAD-SVDSR), a resource designed to advance research in synthetic voice (DeepFake) detection and automatic speaker recognition (ASR). It features around 45-minute audio recordings from 36 participants, each of whom read aloud different newspaper articles during controlled sessions, captured with five different high-quality microphones. Synthetic voices generated from 20 subjects of this dataset using open-source and commercial software are also included.
- Categories:
Recent advances in generative visual content have led to a quantum leap in the quality of artificially generated Deepfake content. Especially, diffusion models are causing growing concerns among communities due to their ever-increasing realism. However, quantifying the realism of generated content is still challenging. Existing evaluation metrics, such as Inception Score and Fréchet inception distance, fall short on benchmarking diffusion models due to the versatility of the generated images.
- Categories: