We introduced the task of acoustic question answering (AQA) in https://arxiv.org/abs/1811.10561.
This dataset aims to promote research in the acoustic reasoning area.
It comprises acoustic scenes and multiple question/answer pairs for each of them.
Each question is accompanied by a functional program which describes the reasoning steps needed to answer it.
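
To make the idea of a functional program concrete, here is a minimal sketch of how such a program could be represented and evaluated against a scene. Everything in it is an assumption made for illustration: the step format (`function`, `inputs`, `args`), the operation names (`scene`, `filter_instrument`, `count`), and the sound attributes (`instrument`, `note`) are hypothetical and are not taken from the dataset's actual schema.

```python
from typing import Any, Callable, Dict, List

# Hypothetical representation: a scene is a list of sounds, each a dict of attributes.
Scene = List[Dict[str, Any]]


def filter_instrument(sounds: Scene, instrument: str) -> Scene:
    """Keep only the sounds played by the given instrument."""
    return [s for s in sounds if s["instrument"] == instrument]


def count(sounds: Scene) -> int:
    """Return how many sounds are in the list."""
    return len(sounds)


# Hypothetical operation table; the real dataset may define different functions.
OPERATIONS: Dict[str, Callable[..., Any]] = {
    "filter_instrument": filter_instrument,
    "count": count,
}


def run_program(program: List[Dict[str, Any]], scene: Scene) -> Any:
    """Evaluate the steps in order; each step may consume earlier results by index."""
    results: List[Any] = []
    for step in program:
        name = step["function"]
        if name == "scene":
            # The "scene" step simply yields the whole scene as a starting point.
            results.append(scene)
            continue
        inputs = [results[i] for i in step.get("inputs", [])]
        args = step.get("args", [])
        results.append(OPERATIONS[name](*inputs, *args))
    # The output of the last step is the answer to the question.
    return results[-1]


# Illustrative question: "How many sounds are played by a cello?"
example_scene: Scene = [
    {"instrument": "cello", "note": "A"},
    {"instrument": "flute", "note": "C"},
    {"instrument": "cello", "note": "E"},
]
example_program = [
    {"function": "scene"},
    {"function": "filter_instrument", "inputs": [0], "args": ["cello"]},
    {"function": "count", "inputs": [1]},
]
answer = run_program(example_program, example_scene)
```

Chaining steps by index in this way keeps each reasoning step explicit, so the intermediate results can be inspected as well as the final answer.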
The dataset is separated into 3 sets:
Each scene is an assembly of 10 elementary sounds. The scenes are persisted as JSON blobs. They contain the following attributes:
Elementary sounds are recordings of instruments playing a single note. The elementary sound bank contains 56 unique recordings spread across 5 instrument families. Each recording has the following attributes: