ARImulti-mic: real-world speech recordings on a humanoid robot (ARI)

Citation Author(s):
Bar-Ilan University
Submitted by:
Renana Opochinsky
Last updated:
Thu, 09/07/2023 - 02:07



This dataset contains real-world speech recordings made on a humanoid robot. The recording campaign was held in the acoustic laboratory at Bar-Ilan University, a 6 × 6 × 2.4 m room whose reverberation time is controlled by 60 interchangeable panels covering the room facets.

In our experiments, the reverberation time was set (by changing the panel arrangement) to either 350 ms, typical of a meeting room, or 600 ms, typical of a lecture hall. ARI, by PAL Robotics, is a humanoid robot with an Intel i9 processor and an NVIDIA Orin GPU. ARI is equipped with a ReSpeaker 4-Microphone Array v2.0, installed inside the robot's compartment, 80 cm above its base.


In our experimental setup, ARI was positioned at the center of the acoustic lab, facing a set of loudspeakers arranged on two semi-circles of approximately 1 m and 2 m radius. Our experiments used only the inner semi-circle, with five loudspeakers positioned at [-65°, -30°, 0°, 30°, 65°]. To generate a sample, we randomly selected two loudspeakers and played speech utterances randomly drawn from the LibriSpeech test set. Each sample is provided in two formats: real mix and handy mix. In the real mix, the two utterances are played and recorded simultaneously. In the handy mix, the utterances are recorded separately by ARI and then mixed manually, which enables SI-SDR calculation against the clean single-speaker recordings. The overlap between the speakers was randomly set in the range [25%, 50%]. No external noise was added to the recordings; hence, only sensor and low-level ambient noise are present. Overall, 200 samples were generated at each reverberation level.
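Since the handy-mix format exists specifically to allow SI-SDR evaluation, a minimal sketch of the standard scale-invariant SDR metric may be useful. This is not code shipped with the dataset; it assumes the separated estimate and the reference recording are time-aligned, equal-length 1-D NumPy arrays.

```python
import numpy as np

def si_sdr(estimate: np.ndarray, reference: np.ndarray) -> float:
    """Scale-invariant signal-to-distortion ratio in dB.

    Both inputs are assumed to be time-aligned 1-D arrays of equal length.
    """
    # Remove DC offset, as is conventional for SI-SDR.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Optimal scaling of the reference onto the estimate.
    alpha = np.dot(estimate, reference) / np.dot(reference, reference)
    target = alpha * reference          # scaled target component
    noise = estimate - target           # residual distortion
    return 10.0 * np.log10(np.sum(target**2) / np.sum(noise**2))
```

Because of the scale-invariant projection, rescaling the estimate by any nonzero constant leaves the score unchanged, which is why the manually mixed handy-mix references are sufficient for evaluation even without calibrated recording levels.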


Detailed instructions are in the readme file.