Dataset Search

We conducted a semi-structured search of DBLP and MEDLINE using grouped search terms designed for maximal retrieval of relevant studies. The DBLP search used the terms “survey” and “questionnaire data” and identified a total of 437 records. The MEDLINE search used three MeSH terms (“questionnaires,” “epidemiology,” and “epidemiologic methods”), with filters set to exclude clinical trials and reviews. The results were limited to articles written in English and published between 2001 and 2016.

Categories:

The rawdata.csv file describes mobility patterns at the level of traffic analysis zones. We extract human trips from call detail record (CDR) data. Using a traffic analysis zone dataset, we map each trip record to its origin zone and destination zone. The resulting dataset stores the hourly number of departure and arrival trips in each traffic analysis zone.
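
As a minimal sketch of the aggregation described above, the following Python counts hourly departures and arrivals per zone. The trip records, zone labels, and timestamp format are hypothetical and are not taken from rawdata.csv; trips are assumed to be already mapped to zones.

```python
from collections import Counter

# Hypothetical trip records, already mapped to zones:
# (departure time, origin zone, arrival time, destination zone).
trips = [
    ("2017-03-01 08:15", "Z1", "2017-03-01 08:40", "Z2"),
    ("2017-03-01 08:50", "Z1", "2017-03-01 09:10", "Z3"),
    ("2017-03-01 09:05", "Z2", "2017-03-01 09:30", "Z1"),
]


def hour_bucket(timestamp):
    """Truncate a 'YYYY-MM-DD HH:MM' string to its date and hour."""
    return timestamp[:13]


def hourly_counts(trips):
    """Count departures and arrivals per (zone, hour) pair."""
    departures = Counter()
    arrivals = Counter()
    for dep_time, origin, arr_time, destination in trips:
        departures[(origin, hour_bucket(dep_time))] += 1
        arrivals[(destination, hour_bucket(arr_time))] += 1
    return departures, arrivals


departures, arrivals = hourly_counts(trips)
```

A departure is counted in the hour the trip starts and an arrival in the hour it ends, so a trip that crosses an hour boundary contributes to two different hourly buckets.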

The POI-importance.csv file gives the term frequency-inverse document frequency (TF-IDF) of each point-of-interest (POI) category in each traffic analysis zone.
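
One way such a score could be computed is sketched below, treating each zone as a "document" and each POI category as a "term". The exact TF-IDF variant used for POI-importance.csv is not stated, so the formula (relative frequency times the natural log of zones over zones containing the category) and the sample counts are assumptions.

```python
import math

# Hypothetical POI counts per zone and category.
poi_counts = {
    "Z1": {"restaurant": 6, "school": 2},
    "Z2": {"restaurant": 3, "hospital": 1},
    "Z3": {"restaurant": 1, "school": 1, "park": 2},
}


def tf_idf(poi_counts):
    """Return {zone: {category: tf * idf}} using an assumed plain TF-IDF."""
    n_zones = len(poi_counts)
    # Number of zones in which each category appears at least once.
    zone_freq = {}
    for counts in poi_counts.values():
        for category in counts:
            zone_freq[category] = zone_freq.get(category, 0) + 1
    scores = {}
    for zone, counts in poi_counts.items():
        total = sum(counts.values())
        scores[zone] = {
            category: (count / total) * math.log(n_zones / zone_freq[category])
            for category, count in counts.items()
        }
    return scores


scores = tf_idf(poi_counts)
```

Under this variant, a category present in every zone (here "restaurant") scores zero everywhere, while a category concentrated in one zone scores highest there.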

Categories:

This dataset provides digital images and videos of surface ice conditions collected from two Alberta rivers, the North Saskatchewan River and the Peace River, during the 2016-2017 winter season.

Images from the North Saskatchewan River were collected using both Reconyx PC800 Hyperfire Professional game cameras mounted on two bridges in Edmonton and a Blade Chroma UAV equipped with a CGO3 4K camera at the Genesee boat launch.

Data for the Peace River was collected using only the UAV at the Dunvegan Bridge boat launch and Shaftesbury Ferry crossing.

Categories:
Accurate short-term load forecasting (STLF) plays an increasingly important role in reliable and economical power system operations. This dataset contains load data for 13 buildings on the University of Texas at Dallas (UTD) campus, together with 20 weather and calendar features. The dataset spans from 01/01/2014 to 12/31/2015 at an hourly resolution. It is useful for research on topics such as STLF.
Categories:

We introduce a benchmark of distributed algorithm execution over big data. The datasets consist of metrics on the computational impact (resource usage) of eleven well-known machine learning techniques on a real computational cluster, expressed through system-resource-agnostic indicators: CPU consumption, memory usage, operating system process load, network traffic, and I/O operations. The metrics were collected every five seconds for each algorithm at five different data volume scales, totaling 275 distinct datasets.
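
The stated total is consistent with one dataset per combination of algorithm, indicator, and data volume scale (11 × 5 × 5 = 275). The sketch below enumerates that grid; this reading of the count is an assumption, and the algorithm and scale names are placeholders because the description does not list them.

```python
from itertools import product

# Placeholder names: the eleven techniques and five scales are not named in the description.
algorithms = [f"algorithm_{i:02d}" for i in range(1, 12)]
scales = [f"scale_{i}" for i in range(1, 6)]

# The five indicators listed in the description.
indicators = ["cpu", "memory", "os_process_load", "net_traffic", "io_operations"]

# One dataset per (algorithm, indicator, scale) combination (assumed layout).
datasets = list(product(algorithms, indicators, scales))
```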

Categories:

We propose a new dataset, HazeRD, for benchmarking dehazing algorithms under realistic haze conditions. Unlike prior datasets, which relied on synthetically generated images or on indoor images with unrealistic haze simulation parameters, our outdoor dataset allows more realistic haze simulation, with parameters that are physically realistic and justified by scattering theory.

Categories:

SDTwittC consists of 200 authors evenly balanced by gender (100 each). We identified the gender of the tweeters from their names and profile pictures. Because tweets and retweets are potentially copy-and-paste text, both are discarded at the outset; only replies are compiled. The number of replies per author varies from hundreds to thousands. Male authors produced 233,926 replies, whereas female authors produced 219,740.

Categories:

These files contain the antenna simulation and measurement data.
All simulation data were obtained using FEKO and then imported and visualized using MATLAB.
The scattering parameters of the antenna were measured using a Keysight E8362B vector network analyzer, while the gain patterns were measured in an anechoic chamber.

Categories:

Consumer complaints are added to this public database after the company has responded to the complaint, confirming a commercial relationship with the consumer, or after they've had the complaint for 15 calendar days, whichever comes first. We don’t verify all the facts alleged in complaints, but we do give companies the opportunity to publicly respond to complaints by selecting responses from a pre-populated list. Company-level information should be considered in the context of company size and/or market share.

Categories:

This dataset was created based on the paper 'Andras Hajdu, Gyorgy Terdik, Attila Tiba, and Henrietta Toman: A stochastic approach to handle knapsack problems in the creation of ensembles'. To summarize our experimental setup for UCI binary classification problems, we considered the base classifiers perceptron, decision tree, Levenberg-Marquardt feedforward neural network, random neural network, and discriminative restricted Boltzmann machine classifier for the 5 UCI datasets MAGIC Gamma Telescope, HIGGS, EEG EyeState, Musk (Version 2), and Spambase; datasets of large car

Categories:

The malicious traffic detection system monitors communication between industrial equipment and analyzes the protocol in real time. At the same time, we launch a variety of attacks on the industrial system, such as denial-of-service and man-in-the-middle attacks. These attacks are currently the major threats to industrial control systems (ICS). We then collect and classify the different kinds of attack flows, which are intercepted at multiple collection stations during different periods.

Categories:

The zipped file includes a description of the combined household model example and a Simscape/MATLAB file for the model.

Categories:

To study the application of machine learning to myoelectric data, machine learning methods were used for data mining and analysis in order to find correlated characteristics. More than 2,300 myoelectric examination records were collected over 10 months at the Sichuan Provincial Hospital of Traditional Chinese Medicine (TCM).

Categories:

The data we used to test the performance of different encryption methods.

Categories:

This dataset contains the results of an experiment in which an electronic nose implemented with six MOX sensors acquired samples of explosives in raw and combined states.

Samples were collected in random order to avoid any memory effect in the data that could influence the results. Raw TNT and gunpowder samples were taken in amounts of 0.1 g to 2 g. Soap and toothpaste were also mixed with the explosives. In the end, we took samples of the explosive substances in raw and combined states.

Categories:

Voltammetry-Based Sensing (VBS) methods are of great interest because of their high specificity in several biochemical applications. Several considerations apply when using this method to measure different analytes and to implement efficient, optimized electronic measurement platforms for point-of-care diagnostics in wearable, portable, or IoT systems. The dataset contains the data presented in [1], which demonstrates on real experimental data a method for defining the optimized setup for developing efficient electronic bio-sensing platforms.

Categories:

The New York City Energy and Water Performance Map maps benchmarking data across the city, allowing viewers to see data by individual buildings, building type, building size, and year built.

Categories: