Dataset Search

Walsh Spectrum Analysis on Sampling Distributions

The dataset stores a random sampling distribution whose support has cardinality 4,294,967,296 (i.e., two raised to the power of thirty-two). The source generator is fixed as a symmetric-key cryptographic function with a 64-bit input and a 32-bit output. A total of 17,179,869,184 (i.e., two raised to the power of thirty-four) randomly chosen inputs are used to produce the sampling distribution. The integer-valued sampling distribution is stored as 4,294,967,296 entries, each occupying one byte.
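The sizes quoted above imply an average of four samples per output value and 4 GiB of storage. The following is a minimal sketch, not the dataset authors' code: it checks that arithmetic and builds a scaled-down histogram whose Walsh spectrum is computed with a fast Walsh-Hadamard transform. The names `fwht`, `sample_histogram`, and `toy_generator` are hypothetical, and `toy_generator` is an 8-bit XOR-folding stand-in, not the cryptographic function used for the dataset.

```python
import random

# Sizes stated in the dataset description.
SUPPORT = 2 ** 32                  # number of distinct 32-bit outputs
SAMPLES = 2 ** 34                  # number of randomly chosen inputs
MEAN_COUNT = SAMPLES // SUPPORT    # average samples per output value
STORAGE_BYTES = SUPPORT            # one byte per entry


def fwht(values):
    """Return the unnormalized Walsh-Hadamard transform of a list.

    The list length must be a power of two.
    """
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def toy_generator(x):
    """Hypothetical stand-in: fold a 64-bit input to 8 bits by XOR."""
    x ^= x >> 32
    x ^= x >> 16
    x ^= x >> 8
    return x & 0xFF


def sample_histogram(generator, output_bits, num_samples, rng):
    """Count how often each output value occurs over random 64-bit inputs."""
    counts = [0] * (1 << output_bits)
    for _ in range(num_samples):
        counts[generator(rng.getrandbits(64))] += 1
    return counts


# Scaled-down experiment: 2^8 output values, 2^10 samples (mean count 4).
rng = random.Random(0)
counts = sample_histogram(toy_generator, 8, 1024, rng)
spectrum = fwht(counts)
```

In this sketch `spectrum[0]` equals the total number of samples, and the remaining coefficients measure how far the histogram deviates from uniform along each linear mask.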

Categories:

VAIS-1000: A Vietnamese Speech Synthesis Corpus

This corpus consists of 1,000 studio-quality audio recordings and their transcriptions in the northern Vietnamese accent. Each utterance is 14-18 words long and is spoken by a single speaker. The corpus can be used to create a Vietnamese speech synthesis system. A tutorial is also available at https://vais.vn/vi/tai-ve/hts_for_vietnamese.

Categories:

Signal Processing

US Annual Retail Trade Survey—2014

The Annual Retail Trade Survey (ARTS) produces national estimates of total annual sales, e-commerce sales, end-of-year inventories, inventory-to-sales ratios, purchases, total operating expenses, inventories held outside the United States, gross margins, and end-of-year accounts receivable for retail businesses and annual sales and e-commerce sales for accommodation and food service firms located in the U.S.

License: U.S. Government Work

Categories:

HazeRD: an outdoor dataset for dehazing algorithms

HazeRD is an outdoor scene dataset for benchmarking dehazing algorithms. HazeRD contains 10 different scenes based on the architectural biometrics project. For each scene, the ground-truth RGB images, depth maps, and hazy images synthesized according to atmospheric optics are provided; the hazy images come at five different haze levels using real-life physical parameters. The main features that distinguish HazeRD from other dehazing datasets are that it focuses on outdoor scenes, whereas other datasets provide indoor scenes, and that its synthesis is based on real-life parameters.
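One common way to synthesize haze from a clear image and a depth map is the atmospheric scattering model I = J*t + A*(1 - t), with transmission t = exp(-beta*d). The sketch below applies it to a single pixel. It is an assumption that HazeRD's synthesis follows this exact form, and the function names, the airlight value, and the scattering coefficients are illustrative rather than the dataset's published parameters.

```python
import math


def transmission(depth_m, beta_per_m):
    """Fraction of scene radiance that survives over depth_m metres."""
    return math.exp(-beta_per_m * depth_m)


def add_haze(pixel_rgb, depth_m, beta_per_m, airlight=(0.8, 0.8, 0.8)):
    """Blend a clear pixel with the airlight according to its depth."""
    t = transmission(depth_m, beta_per_m)
    return tuple(c * t + a * (1.0 - t) for c, a in zip(pixel_rgb, airlight))


# Hypothetical values: one clear pixel at 100 m and five haze levels.
clear_pixel = (0.2, 0.5, 0.3)
hypothetical_betas = [0.004, 0.008, 0.016, 0.032, 0.064]
hazy_versions = [add_haze(clear_pixel, 100.0, b) for b in hypothetical_betas]
```

Larger scattering coefficients or larger depths push the pixel toward the airlight color, which is why a depth map is needed alongside each clear image.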

Categories:

Digital signal processing

Web Data Commons - Hyperlink Graphs

The graphs have been extracted from the 2012 and 2014 versions of the Common Crawl web corpora. The 2012 graph covers 3.5 billion web pages and 128 billion hyperlinks between these pages. To the best of our knowledge, the graph is the largest hyperlink graph that is available to the public outside companies such as Google, Yahoo, and Microsoft. The 2014 graph covers 1.7 billion web pages connected by 64 billion hyperlinks.

Categories:

Other

VideoSet

A new methodology to measure coded image/video quality using the just-noticeable-difference (JND) idea was proposed in [1]. Several small JND-based image/video quality datasets were released by the Media Communications Lab at the University of Southern California in [2, 3]. In this work, we present an effort to build a large-scale JND-based coded video quality dataset. The dataset consists of 220 5-second sequences in four resolutions (i.e., 1920x1080, 1280x720, 960x540, and 640x360).

Categories:

Signal Processing

Open Street Map Data

The files found here are regularly updated, complete copies of the OpenStreetMap.org database. Those published before 12 September 2012 are distributed under a Creative Commons Attribution-ShareAlike 2.0 license; those published after are licensed under the Open Data Commons Open Database License 1.0.

Categories:

Climate Change/Environmental

Population Sample Artificially Generated

Test synthetic population produced with WEKA 3.8.

Categories:

Demographic

Measured scattering parameters for the coupling of stochastic electromagnetic fields to shielded cables above a ground plane in a reverberation chamber

This data set contains measurements of the statistical electromagnetic field coupling to several shielded coaxial cables. The cables are aligned parallel to a wall of a reverberation chamber. With a vector network analyzer, the coupled voltage between the inner conductor and the cable shield is measured for different stirrer positions over a wide frequency range. For comparison, the coupled current on the cable shield is calculated based on transmission line theory. From the ratio between the inner voltage and the shield current, a coupling resistance can be calculated.

Categories:

Other

Dataset malware/benign permissions Android

This dataset is a result of my research into machine learning for Android security. Each analyzed application is mapped to a binary vector of the permissions it uses (1 = used, 0 = not used). In addition, the samples are labeled by "Type": 1 for malware and 0 for non-malware.

When I did my research, datasets of malware and benign Android applications were not available, so I am sharing part of my research results with the community for future work.
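The encoding described above can be sketched as follows. This is a minimal illustration, not the author's extraction pipeline: the four-permission vocabulary, the `encode_app` helper, and the sample applications are hypothetical.

```python
# Hypothetical permission vocabulary; the real dataset's columns may differ.
PERMISSIONS = [
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.CAMERA",
]


def encode_app(requested_permissions, is_malware):
    """Map an app's permissions to a binary vector plus a Type label."""
    requested = set(requested_permissions)
    vector = [1 if p in requested else 0 for p in PERMISSIONS]
    return {"permissions": vector, "Type": 1 if is_malware else 0}


# Two made-up applications, one labeled malware and one benign.
rows = [
    encode_app(
        ["android.permission.INTERNET", "android.permission.SEND_SMS"],
        is_malware=True,
    ),
    encode_app(["android.permission.CAMERA"], is_malware=False),
]
```

Each row then has a fixed length regardless of how many permissions an application requests, which is what most classifiers expect as input.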

Categories:

TST Intake Monitoring dataset v2

The dataset contains depth frames collected using Microsoft Kinect v1 during the execution of food and drink intake movements.

Categories:

TST Intake Monitoring dataset v1

The dataset contains depth frames collected using Microsoft Kinect v1 during the execution of food and drink intake movements.

Categories:

TST TUG dataset

The dataset contains depth frames and skeleton joints collected using Microsoft Kinect v2 and acceleration samples provided by an IMU during the execution of the timed up and go test.

Categories:

TST Fall detection dataset v2

The dataset contains depth frames and skeleton joints collected using Microsoft Kinect v2 and acceleration samples provided by an IMU during the simulation of ADLs and falls.

Categories:

TST Fall detection dataset v1

The dataset contains depth frames collected using Microsoft Kinect v1 in top-view configuration and can be used for fall detection.

Categories:

Physician and Other Supplier Data CY 2014

As part of the Obama Administration’s efforts to make our healthcare system more transparent, affordable, and accountable, the Centers for Medicare & Medicaid Services (CMS) has prepared a public data set, the Medicare Provider Utilization and Payment Data: Physician and Other Supplier Public Use File (Physician and Other Supplier PUF), with information on services and procedures provided to Medicare beneficiaries by physicians and other healthcare professionals. The Physician and Other Supplier PUF contains information on utilization, payment (allowed amount and Medicare payment)

Categories:

Demo 2 Zika Virus - 092216

Demo 2 Zika Virus Data 092216 Abstract

Categories:

Biomedical and Health Sciences

ENF Power Frequency Data for Location Forensics (IEEE SP Cup 2016 competition)

At the intersection of signal processing and information forensics, the Signal Processing Cup 2016 global competition explored a time-varying, location-dependent signature of power grids that can be intrinsically captured in media recordings. This signature is called the Electric Network Frequency (ENF) signal. Throughout the SP Cup 2016 competition, participants were provided with multiple training, practice, and testing datasets that consisted of recordings made in different grids and containing ENF traces.

Categories:

Signal Processing

Zika Virus Dataset Demo 081016

Zika Demo Abstract

Categories:

Biomedical and Health Sciences

The Ace Challenge 2015

Several established parameters and metrics have been used to characterize the acoustics of a room. The most important are the Direct-To-Reverberant Ratio (DRR), the Reverberation Time (T60) and the reflection coefficient. The acoustic characteristics of a room based on such parameters can be used to predict the quality and intelligibility of speech signals in that room.

Categories:

Ocean Ship Logbooks (1750-1850)

This data comes from the Climatological Database for the World's Oceans 1750-1850. The data includes observational records of ship location, weather data, and other associated data.

Categories:

Weather

Zika Virus Data

The Pan American Health Organization / World Health Organization is publishing weekly counts of suspected and confirmed cases, by country and territory, as reported by each country. The data portal includes a few important notes: "The suspected cases in Brazil are unofficial (media monitoring)" and "Data is shared in an effort to transparently disseminate available information reported by Member States."

Categories:

Biomedical and Health Sciences

Million Song Dataset

The Million Song Dataset is a freely available collection of audio features and metadata for a million contemporary popular music tracks. Its purposes are:

To encourage research on algorithms that scale to commercial sizes
To provide a reference dataset for evaluating research
As a shortcut alternative to creating a large dataset with APIs (e.g. The Echo Nest's)
To help new researchers get started in the MIR field

Categories:

Cause of Death in the United States

Every year the CDC releases the country’s most detailed report on death in the United States under the National Vital Statistics Systems. This mortality dataset is a record of every death in the country for the year 2014, which includes detailed information about causes of death and the demographic background of the deceased. It's been said that "statistics are human beings with the tears wiped off." This is especially true with this dataset.

Categories:
