Abstract

Description

Forest environmental sound classification is one use case of environmental sound classification (ESC) that has been widely explored as a way to identify illegal activities inside forests. Because no public dataset specific to forest sounds is available, a benchmark forest environment sound dataset is needed. With this motivation, FSC22 was created as a public benchmark dataset, using audio samples collected from FreeSound.org.

This dataset includes 2025 labeled sound clips, each 5 s long. The audio samples are distributed among six major parent-level classes: Mechanical Sounds, Animal Sounds, Environmental Sounds, Vehicle Sounds, Forest Threat Sounds, and Human Sounds. Each parent class is further divided into subclasses that capture specific sounds falling under that category. Overall, the dataset taxonomy consists of 34 classes, as shown below. In the first phase of dataset creation, 75 audio samples were collected for each of 27 classes (27 × 75 = 2025 clips).

We expect this dataset to support research communities working on forest acoustic monitoring and classification.

Sources

This dataset contains 2025 audio clips originating from the online audio database FreeSound (https://freesound.org/). FreeSound is a free platform that hosts thousands of audio recordings.

Collection methodology

After the taxonomy of the dataset was finalized, data was collected from the FreeSound platform for each class. For each selected class label, we queried for audio samples containing the label in the title or the description, using the API endpoint for text search. This was done with an automated Python script. All gathered data was then filtered and validated manually. First, the query results were checked and irrelevant records were removed. Next, each remaining audio clip was downloaded and listened to in order to select the most accurate samples. From these, 75 audio clips with a duration of nearly 5 seconds were selected per class. Finally, the filtered and validated data were normalized, yielding 75 audio clips with a fixed length of 5 s per class.
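The original collection script is not included here, so the following is only a minimal sketch of what the query step might look like. It assumes the Freesound APIv2 text-search endpoint and its `query`, `filter`, `fields`, `page_size`, and `token` parameters. The helper name `build_search_url`, the 4 to 6 second duration window, the requested fields, and the label "chainsaw" are illustrative assumptions, not details taken from the dataset authors. The sketch only builds the request URL and makes no network call.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Freesound APIv2 text-search endpoint (assumed; verify against the current API docs).
SEARCH_URL = "https://freesound.org/apiv2/search/text/"


def build_search_url(label, token, min_s=4.0, max_s=6.0, page_size=50):
    """Build a text-search URL for one class label (hypothetical helper)."""
    params = {
        "query": label,
        # Assumed duration window around the 5 s target length.
        "filter": f"duration:[{min_s} TO {max_s}]",
        # Fields useful for the later manual relevance check.
        "fields": "id,name,description,duration,license",
        "page_size": page_size,
        "token": token,  # personal API key, placeholder below
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


url = build_search_url("chainsaw", token="YOUR_API_KEY")
print(url)
```

The returned records would still need the manual filtering and listening steps described above. A script like this can only narrow down the candidates.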

Instructions

The dataset contains 27 classes, each with 75 audio clips related to the given class name.

In the folder structure of the FSC22 dataset, users can navigate to the Audios folder to access the audio files.

The names of the audio files are derived as follows:

UniqueClassIndex_UniqueAudioID.wav (e.g., 1_10101.wav)
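As a small illustration of this naming scheme, the sketch below splits a file name into its class index and audio ID. The function name `parse_fsc22_name` is hypothetical, and the code assumes every file name contains exactly the pattern shown above.

```python
from pathlib import Path


def parse_fsc22_name(filename):
    """Split 'UniqueClassIndex_UniqueAudioID.wav' into (class_index, audio_id)."""
    stem = Path(filename).stem  # drops any folder prefix and the '.wav' suffix
    class_index, audio_id = stem.split("_", 1)
    return int(class_index), audio_id


print(parse_fsc22_name("1_10101.wav"))  # (1, '10101')
```

The audio ID is kept as a string so that it can be matched directly against the metadata file.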

To look up audio-level details, users should use either of the following files, located inside the Metadata folder:

Metadata V1.0 FSC22.csv
Metadata V1.0 FSC22_.xlsx

For each audio file, the Metadata file provides:

Source File Name - ID of the original audio sample, used to extract the corresponding audio.
Dataset File Name - ID of the audio in the context of FSC22.
Class ID - Class identification index (an integer in the range 1 to 27).
Class Name - Name of the class to which the audio is assigned.
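The fields above can be read with Python's standard `csv` module. The sketch below counts clips per class. It assumes that the CSV column headers match the field names listed above exactly, which should be checked against the actual file. The rows in `sample` are made up for illustration and are not real dataset entries.

```python
import csv
import io
from collections import Counter


def count_clips_per_class(csv_file):
    """Count metadata rows per Class ID (column names are assumed)."""
    reader = csv.DictReader(csv_file)
    return Counter(int(row["Class ID"]) for row in reader)


# Made-up rows standing in for the real metadata file.
sample = io.StringIO(
    "Source File Name,Dataset File Name,Class ID,Class Name\n"
    "src_a.wav,1_10101.wav,1,ExampleClassA\n"
    "src_b.wav,1_10102.wav,1,ExampleClassA\n"
    "src_c.wav,2_10201.wav,2,ExampleClassB\n"
)
counts = count_clips_per_class(sample)
print(dict(counts))
```

With the real metadata file, the same function can be called on a file opened with `open(path, newline="")`. Every one of the 27 classes should then report 75 clips.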
