singer recognition

Vocal92: Multimodal Audio Dataset with a Cappella Solo Singing and Speech

We present Vocal92, a multimodal audio dataset of a cappella solo singing and speech, spanning approximately 146.73 hours of recordings sourced from volunteers. To the best of our knowledge, it is the first dataset to focus specifically on a cappella solo singing and speech. We also use two current state-of-the-art models to build a baseline system for singer recognition.

Categories: Artificial Intelligence
Signal Processing
