Dataset Entries from this Author

The Human voice Natural Language from On-demand media (HENLO) dataset is a high-quality emotional speech dataset created to address the need for representative and realistic data in speech emotion recognition research. Unlike many existing datasets, which rely on simulated emotions performed by untrained speakers or directed participants, HENLO sources its data from professionally produced films and podcasts available on Media On-Demand (MOD).

A clean audio signal is a recording that is free from unwanted noise, distortion, and other audio artifacts, so that the sound is clear, crisp, and easily understood. Achieving a clean signal typically involves noise reduction to suppress background sounds, filtering to remove hum or hiss, and equalization to enhance the audio quality. Proper recording technique, such as using high-quality microphones and soundproofing, is also essential to capture clean audio from the start.
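As a rough illustration of the filtering step mentioned above, the sketch below removes a constant mains hum with a second-order notch filter. It is not part of the dataset entry: the function names (`notch_coefficients`, `remove_hum`, `amplitude_at`), the 50 Hz hum frequency, and the quality factor are assumptions chosen for the example, and the coefficients follow the commonly used Audio EQ Cookbook notch form.

```python
import math


def notch_coefficients(hum_hz, sample_rate, q=10.0):
    """Return normalized biquad notch coefficients (b0, b1, b2, a1, a2).

    Hypothetical helper: hum_hz is the frequency to suppress, and q
    controls how narrow the notch is (higher q gives a narrower notch).
    """
    w0 = 2.0 * math.pi * hum_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    return (
        1.0 / a0,             # b0
        -2.0 * cos_w0 / a0,   # b1
        1.0 / a0,             # b2
        -2.0 * cos_w0 / a0,   # a1
        (1.0 - alpha) / a0,   # a2
    )


def remove_hum(samples, hum_hz, sample_rate, q=10.0):
    """Apply the notch filter to a sequence of float samples."""
    b0, b1, b2, a1, a2 = notch_coefficients(hum_hz, sample_rate, q)
    x1 = x2 = y1 = y2 = 0.0  # previous inputs and outputs
    out = []
    for x in samples:
        # Direct form I difference equation of the biquad.
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def amplitude_at(samples, freq_hz, sample_rate):
    """Estimate the amplitude of one sinusoidal component by correlation."""
    n = len(samples)
    step = 2.0 * math.pi * freq_hz / sample_rate
    re = sum(s * math.cos(step * i) for i, s in enumerate(samples))
    im = sum(s * math.sin(step * i) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n
```

A narrow notch targets only the hum frequency, so content elsewhere in the spectrum is left largely intact. Broadband noise and hiss need other techniques, such as spectral noise reduction or low-pass filtering.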

This dataset contains audio recordings and transcriptions of toxic speech taken from Indonesian conversations in YouTube videos in which scammers are confronted. The dataset captures two separate interactions that escalate into toxic exchanges. Each interaction has been verified by native Indonesian speakers and labeled with one of two classes: toxic or non-toxic. The dataset includes both the original and preprocessed versions of the speech and text data. The original speech files total 136 MB, while the preprocessed speech files total 111.7 MB.