Signal Processing

VAIS-1000: A Vietnamese Speech Synthesis Corpus

This corpus consists of 1000 studio-quality audio recordings and their transcriptions in the northern Vietnamese accent.
Each utterance is 14-18 words long and is spoken by a single speaker.
The corpus can be used to build a Vietnamese speech synthesis system. A tutorial is also available at https://vais.vn/vi/tai-ve/hts_for_vietnamese.

Categories: Signal Processing

4720 Views

VideoSet

A new methodology to measure coded image/video quality using the just-noticeable-difference (JND) idea was proposed in [1]. Several small JND-based image/video quality datasets were released by the Media Communications Lab at the University of Southern California in [2, 3]. In this work, we present an effort to build a large-scale JND-based coded video quality dataset. The dataset consists of 220 5-second sequences in four resolutions (i.e., 1920x1080, 1280x720, 960x540, and 640x360).

Categories: Signal Processing

12201 Views

DCASE2016: Sound event detection in real life audio

This task evaluates the performance of sound event detection systems in multisource conditions similar to everyday life, where sound sources are rarely heard in isolation. Unlike in task 2, there is no control over the number of overlapping sound events at any given time, in either the training or the testing audio data.

Signal Processing

Analog signal processing

Submitted On:

Fri, 11/11/2016 - 13:54

Last Updated On:

Tue, 01/10/2017 - 15:56

Citation Author(s):

Annamaria Mesaros, Toni Heittola, and Tuomas Virtanen

TST Intake Monitoring dataset v2

The dataset contains depth frames collected using Microsoft Kinect v1 during the execution of food and drink intake movements.

Categories: Biomedical and Health Sciences
Sensors
Signal Processing

411 Views

TST Intake Monitoring dataset v1

The dataset contains depth frames collected using Microsoft Kinect v1 during the execution of food and drink intake movements.

Categories: Biomedical and Health Sciences
Sensors
Signal Processing

326 Views

TST TUG dataset

The dataset contains depth frames and skeleton joints collected using Microsoft Kinect v2 and acceleration samples provided by an IMU during the execution of the timed up and go test.

Categories: Biomedical and Health Sciences
Sensors
Signal Processing

1837 Views

TST Fall detection dataset v2

The dataset contains depth frames and skeleton joints collected using Microsoft Kinect v2 and acceleration samples provided by an IMU during the simulation of ADLs and falls.

Categories: Biomedical and Health Sciences
Sensors
Signal Processing

16187 Views

TST Fall detection dataset v1

The dataset contains depth frames collected using Microsoft Kinect v1 in top-view configuration and can be used for fall detection.

Categories: Biomedical and Health Sciences
Sensors
Signal Processing
Digital signal processing

3039 Views

ENF Power Frequency Data for Location Forensics (IEEE SP Cup 2016 competition)

At the intersection of signal processing and information forensics, the Signal Processing Cup 2016 global competition explored a time-varying, location-dependent signature of power grids that can be intrinsically captured in media recordings. This signature is called the Electric Network Frequency (ENF) signal. Throughout the SP Cup 2016 competition, participants were provided with multiple training, practice, and testing datasets that consisted of recordings made in different grids and containing ENF traces.
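As a rough illustration of what an ENF trace is, the sketch below estimates the dominant frequency near the nominal mains frequency in a short audio segment by scanning a fine grid of candidate frequencies. It is only a minimal example and not the method used in the competition: the function name `estimate_enf`, its parameters, the 50 Hz nominal value, and the synthetic 50.02 Hz hum are all assumptions made for illustration.

```python
import cmath
import math


def estimate_enf(samples, sample_rate, nominal_hz=50.0, search_hz=0.5, step_hz=0.01):
    """Return the candidate frequency near nominal_hz with the largest DFT magnitude."""
    best_freq, best_mag = nominal_hz, -1.0
    steps = int(round(2 * search_hz / step_hz))
    for k in range(steps + 1):
        freq = nominal_hz - search_hz + k * step_hz
        # Single-frequency DFT: correlate the signal with a complex exponential.
        w = -2j * math.pi * freq / sample_rate
        mag = abs(sum(x * cmath.exp(w * n) for n, x in enumerate(samples)))
        if mag > best_mag:
            best_freq, best_mag = freq, mag
    return best_freq


# Synthetic stand-in for a mains-hum recording (hypothetical, not competition data).
fs = 1000
hum = [math.sin(2 * math.pi * 50.02 * n / fs) for n in range(2 * fs)]
enf_estimate = estimate_enf(hum, fs)
```

Repeating this estimate over consecutive segments of a recording would give a frequency-versus-time curve, which is the kind of trace the description above refers to.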

Categories: Signal Processing

1350 Views

The ACE Challenge 2015

Several established parameters and metrics have been used to characterize the acoustics of a room. The most important are the Direct-To-Reverberant Ratio (DRR), the Reverberation Time (T60) and the reflection coefficient. The acoustic characteristics of a room based on such parameters can be used to predict the quality and intelligibility of speech signals in that room.
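To make two of these parameters concrete, the sketch below computes them from a room impulse response using common textbook definitions: DRR as the energy in a short window around the strongest peak divided by the energy after that window, and T60 from a line fit to the Schroeder backward-integrated energy decay curve. This is a minimal example under stated assumptions, not the evaluation procedure of the challenge: the function names, the 2.5 ms direct-path window, the -5 dB to -25 dB fit range, and the synthetic impulse responses are all illustrative choices.

```python
import math


def direct_to_reverberant_ratio(h, fs, half_window_s=0.0025):
    """DRR in dB: energy around the strongest peak vs. energy after that window."""
    peak = max(range(len(h)), key=lambda n: abs(h[n]))
    half = int(round(half_window_s * fs))
    start, stop = max(0, peak - half), min(len(h), peak + half + 1)
    direct = sum(x * x for x in h[start:stop])
    reverberant = sum(x * x for x in h[stop:])
    return 10.0 * math.log10(direct / reverberant)


def t60_from_schroeder(h, fs, start_db=-5.0, end_db=-25.0):
    """Estimate T60 by fitting a line to the Schroeder energy decay curve."""
    # Backward integration: remaining energy from each sample to the end.
    edc, total = [], 0.0
    for x in reversed(h):
        total += x * x
        edc.append(total)
    edc.reverse()
    edc_db = [10.0 * math.log10(v / edc[0]) for v in edc if v > 0]
    # Keep only the part of the decay between start_db and end_db.
    pts = [(n / fs, d) for n, d in enumerate(edc_db) if end_db <= d <= start_db]
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_d = sum(d for _, d in pts) / len(pts)
    slope = (sum((t - mean_t) * (d - mean_d) for t, d in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))
    # The slope is in dB per second; T60 is the time needed to decay by 60 dB.
    return -60.0 / slope


fs = 8000

# Hypothetical impulse response: a unit direct path, a gap, then a flat tail.
h_drr = [1.0] + [0.0] * 99 + [0.01] * 1000
drr_db = direct_to_reverberant_ratio(h_drr, fs)

# Hypothetical exponentially decaying tail built to have a T60 of 0.5 s.
decay = 3.0 * math.log(10.0) / 0.5
h_t60 = [math.exp(-decay * n / fs) for n in range(fs)]
t60_s = t60_from_schroeder(h_t60, fs)
```

With these synthetic inputs the direct energy is 1.0 and the tail energy is 0.1, so the DRR comes out at about 10 dB, and the fitted T60 comes out close to the 0.5 s used to build the tail. Measured impulse responses contain noise, so the fit range and window length would need to be chosen with care.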

Categories: Signal Processing
Analog signal processing

2502 Views
