As part of its preparedness efforts, it is important for Indonesia to prepare guidelines for dealing with COVID-19.

Instructions: 

This guideline is intended for health workers as a reference in preparing for COVID-19. It is provisional: it was prepared by adopting WHO interim guidelines and will be updated in accordance with disease developments and the current situation.

Categories:
1437 Views

Visible Light Positioning is an indoor localization technology that uses wireless transmission of visible light signals to obtain a location estimate of a mobile receiver. 

This dataset can be used to validate supervised machine learning approaches in the context of Received Signal Strength Based Visible Light Positioning. 

The dataset was acquired in an experimental setup that consists of 4 LED transmitter beacons and a photodiode receiver that can move in 2D.
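As an illustration of the kind of supervised approach such a dataset could validate, the following is a minimal sketch of k-nearest-neighbours fingerprinting that maps 4 received signal strength values to a 2D position. The LED coordinates, the height, and the inverse-square channel model are hypothetical stand-ins for real measurements; none of them are taken from the dataset.

```python
import math

# Hypothetical LED positions (metres); not taken from the dataset.
LEDS = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
HEIGHT = 1.0  # assumed vertical distance between the LEDs and the photodiode


def synthetic_rss(x, y):
    """Toy stand-in for measured RSS: power falls off with squared distance."""
    return tuple(
        1.0 / ((x - lx) ** 2 + (y - ly) ** 2 + HEIGHT ** 2) for lx, ly in LEDS
    )


def knn_predict(train, rss, k=3):
    """Average the positions of the k training samples closest in RSS space."""
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], rss))[:k]
    x = sum(pos[0] for _, pos in nearest) / len(nearest)
    y = sum(pos[1] for _, pos in nearest) / len(nearest)
    return (x, y)


# Training fingerprints: (rss_vector, (x, y)) pairs on a 0.1 m grid.
train = [
    (synthetic_rss(i / 10, j / 10), (i / 10, j / 10))
    for i in range(11)
    for j in range(11)
]

true_position = (0.42, 0.58)
estimate = knn_predict(train, synthetic_rss(*true_position))
error = math.dist(estimate, true_position)
```

With real data, the synthetic fingerprints would be replaced by the measured RSS vectors and their ground-truth receiver positions.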

Categories:
86 Views

A dataset from semiconductor assembly and testing processes is used to evaluate the model selection prediction method. The response variable refers to the throughput rate of a specific machine–product combination in one of the assembly and testing process steps based on historical data. This data set includes 1 response variable, 5 categorical machine and product attributes and 11 numerical attributes. The dataset contains 13186 observations.

Instructions: 

mixed_categorical_numerical_data.csv: the raw data.

mixed_categorical_numerical_dataDummy.csv: the transformed one-hot encoded data.

Full_Model.rds: the full model built from the whole dataset.

Fundamental_Model.rds: the fundamental model built from one fundamental dataset.

Partial_Model_1-11.rds: the models related to the fundamental model mentioned above.
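For readers unfamiliar with one-hot encoding, the following minimal sketch shows the general idea of turning categorical attributes into 0/1 indicator columns. The column names and values are hypothetical, and the exact transformation used to produce mixed_categorical_numerical_dataDummy.csv (for example, whether a reference level is dropped) is not stated here.

```python
def one_hot(rows, categorical):
    """Replace each categorical column with one 0/1 indicator column per level."""
    levels = {c: sorted({r[c] for r in rows}) for c in categorical}
    encoded = []
    for r in rows:
        # Keep numerical columns unchanged.
        out = {k: v for k, v in r.items() if k not in categorical}
        for c in categorical:
            for level in levels[c]:
                out[f"{c}_{level}"] = 1 if r[c] == level else 0
        encoded.append(out)
    return encoded


# Hypothetical rows; the real attribute names are not given in the description.
rows = [
    {"machine": "M1", "product": "P1", "lot_size": 120, "throughput": 3.4},
    {"machine": "M2", "product": "P1", "lot_size": 80, "throughput": 2.9},
]
encoded = one_hot(rows, ["machine", "product"])
```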

Categories:
79 Views

Dataset associated with a paper to appear in IEEE Transactions on Pattern Analysis and Machine Intelligence:

"The perils and pitfalls of block design for EEG classification experiments"

The paper has been accepted and is in production.

We will upload the dataset when the paper is published.

This is a placeholder so we can obtain a DOI to include in the paper.

Instructions: 

See the paper "The perils and pitfalls of block design for EEG classification experiments" on IEEE Xplore.

Code for analyzing the dataset is included in the online supplementary materials for the paper.

Categories:
92 Views

This file contains Python code to extract the single-diode model parameters from the Photovoltaic IV-curve. 
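For context, the standard single-diode model relates current I and voltage V through I = Iph - I0*(exp((V + I*Rs)/(n*Ns*Vt)) - 1) - (V + I*Rs)/Rsh. The sketch below is not the code from this file: it only evaluates that forward model by Newton iteration, which is the building block a parameter-extraction routine would fit against a measured curve. All parameter values are hypothetical.

```python
import math


def residual(i, v, iph, i0, n, ns, rs, rsh, vt=0.02585):
    """Single-diode equation written as f(I) = 0."""
    a = n * ns * vt  # modified ideality factor times thermal voltage
    return iph - i0 * (math.exp((v + i * rs) / a) - 1.0) - (v + i * rs) / rsh - i


def diode_current(v, iph, i0, n, ns, rs, rsh, vt=0.02585, iters=50):
    """Solve the implicit single-diode equation for I at a given V."""
    a = n * ns * vt
    i = iph  # the photocurrent is a reasonable starting guess
    for _ in range(iters):
        e = math.exp((v + i * rs) / a)
        f = iph - i0 * (e - 1.0) - (v + i * rs) / rsh - i
        df = -i0 * rs / a * e - rs / rsh - 1.0  # derivative of f with respect to I
        step = f / df
        i -= step
        if abs(step) < 1e-12:
            break
    return i


# Hypothetical parameters for a 60-cell module.
params = dict(iph=5.0, i0=1e-9, n=1.3, ns=60, rs=0.2, rsh=300.0)
i_sc = diode_current(0.0, **params)   # current at short circuit
i_30 = diode_current(30.0, **params)  # current at 30 V
```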

Categories:
97 Views

The dataset consists of reviews of hotels around the world. Its columns range from Location and Trip Type to individual review scores for various rating parameters. After preprocessing, the data can be used for review categorization, topic extraction, sentiment analysis, location-based quality calculation, and similar tasks. Trustworthy real-world data is valuable and hard to obtain, so this dataset should be a useful contribution for researchers and professionals alike.

 

Categories:
5621 Views

iSignDB: A biometric signature database created using smartphone

Suraiya Jabin, Sumaiya Ahmad, Sarthak Mishra, and Farhana Javed Zareen

Department of Computer Science, Jamia Millia Islamia, New Delhi-110025, India

iSignDB is a database of biometric signatures recorded using the sensors present in a smartphone. The dataset was created to implement a novel anti-spoof biometric signature authentication method for smartphone users.

Categories:
231 Views

 

 

Instructions: 

The .zip file contains 6 folders when unzipped. We provide the details of each folder below.

 

“Proteins” folder: Contains 20 protein targets organized into two folders (Benchmark and CASP) depending on the family each target belongs to. Data for each protein is provided in a subfolder named with its id. Each such subfolder contains the following 4 files.

  1. A .fasta file containing the amino-acid sequence of the protein.

  2. A .pdb file containing the native tertiary structure coordinates. The .pdb file format is documented at http://www.wwpdb.org/documentation/file-format

  3. A .frag3 file containing the fragments of length 3 for the protein sequence generated from http://old.robetta.org/

  4. A .frag9 file containing the fragments of length 9 for the protein sequence generated from http://old.robetta.org/

 

“Generation” folder: Contains the generated ensembles for the protein targets in 20 subfolders, one for each target, named with their ids. Each subfolder contains 5 files, each containing the generated ensemble for one run. Each such file contains 14 columns, and each row represents one generated structure. The first column provides the Rosetta score4 energy, the second column provides the lRMSD to the native structure, and each of the remaining 12 columns provides one USR feature for the structure.
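The 14-column row layout described above can be read with a short helper such as the following sketch. It assumes whitespace-separated numeric values, which the description does not state; the sample row is made up.

```python
from typing import NamedTuple, Tuple


class Structure(NamedTuple):
    energy: float            # Rosetta score4 energy (column 1)
    lrmsd: float             # lRMSD to the native structure (column 2)
    usr: Tuple[float, ...]   # 12 USR features (columns 3 to 14)


def parse_row(line: str) -> Structure:
    # Assumption: values are separated by whitespace.
    values = [float(v) for v in line.split()]
    if len(values) != 14:
        raise ValueError(f"expected 14 columns, got {len(values)}")
    return Structure(values[0], values[1], tuple(values[2:]))


# Made-up example row: energy, lRMSD, then 12 USR features.
sample = "-12.5 4.2 " + " ".join(str(i / 10) for i in range(12))
structure = parse_row(sample)
```

The same helper would apply to the 2-column files in the “Reduced” and “Truncation” folders if the column check were relaxed to 2 and the USR field left empty.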

 

“Reduced” folder: Contains the reduced ensembles for each clustering technique in separate folders. Each such folder contains 20 subfolders, one for each target, named with their ids. Each such subfolder contains 5 files, each containing the reduced ensemble for one run. Each such file contains 2 columns and each row represents one structure in the reduced ensemble. The first column provides the Rosetta score4 energy and the second column provides the lRMSD to the native structure.

 

“Truncation” folder: Contains the reduced ensembles via truncation for the protein targets in 20 subfolders, one for each target, named with their ids. Each such subfolder contains 5 files, each containing the reduced ensemble for one run. Each such file contains 2 columns and each row represents one structure in the reduced ensemble. The first column provides the Rosetta score4 energy and the second column provides the lRMSD to the native structure.

 

“Ks” folder: Contains 4 separate files, one for each clustering technique, containing the number of clusters for each run of each protein target. These files can be used to plot the distributions for the number of clusters.

 

“Bars” folder: Contains 3 separate subfolders containing the information needed to plot the bar charts for the minimum, average, and standard deviation of lRMSDs to the native structure for the CASP targets. Each subfolder contains 10 files, one for each target. Each file contains 6 rows that provide the lRMSD value for the original ensemble, the reduced ensemble for hierarchical clustering, the reduced ensemble for k-means clustering, the reduced ensemble for GMM clustering, the reduced ensemble for gmx-cluster clustering, and the reduced ensemble for truncation, respectively.
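The row order of a “Bars” file can be made explicit by pairing each row with its label, as in this sketch. It assumes each of the 6 rows holds a single numeric value, which the description implies but does not state; the sample values are made up.

```python
# Row order as listed in the folder description.
LABELS = [
    "original",
    "hierarchical",
    "k-means",
    "GMM",
    "gmx-cluster",
    "truncation",
]


def read_bars(text):
    """Map the 6 rows of a Bars file to their ensemble labels."""
    # Assumption: one numeric lRMSD value per non-empty row.
    values = [float(line) for line in text.splitlines() if line.strip()]
    if len(values) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} rows, got {len(values)}")
    return dict(zip(LABELS, values))


# Made-up file content for illustration.
bars = read_bars("3.1\n3.4\n3.3\n3.6\n3.5\n4.0\n")
```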

Categories:
77 Views

Wine has been popular with the public for centuries, and the market offers a wide variety of wines to choose from. Among wine regions, Bordeaux, France, is considered the most famous in the world. In this paper, we try to understand Bordeaux wines made in the 21st century through a Wineinformatics study. We developed and studied two datasets: the first covers all Bordeaux wines from 2000 to 2016, and the second covers all wines listed in a famous collection of Bordeaux wines, the 1855 Bordeaux Wine Official Classification, from 2000 to 2016.

Instructions: 

The dataset comes from Wine Spectator reviews of Bordeaux wines, written in natural language, from 2000 to 2016. A total of 14,349 wines have been collected: 4,263 wines scored 90/100 or above and 10,086 wines scored 89/100 or below. Detailed information is available in the paper. The reviews were processed by the Computational Wine Wheel to produce the uploaded dataset. The first attribute is the name of the wine, the second is its vintage, the third is the score given by Wine Spectator, and the fourth is its price. $NA indicates that the price was not available at the time the wine was reviewed. The remaining attributes are characteristics describing the wine, each with a true/false value.
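The attribute layout above can be turned into a typed record as in the following sketch. The input is a list of already-split fields; the dollar-prefixed price format and the lowercase "true"/"false" spelling are assumptions, and the wine name is made up. Only the "$NA" marker and the 90-point split come from the description.

```python
def parse_wine(fields):
    """Convert one row (name, vintage, score, price, flags...) into a dict."""
    name, vintage, score, price = fields[:4]
    return {
        "name": name,
        "vintage": int(vintage),
        "score": int(score),
        # "$NA" marks a price that was unavailable at review time.
        "price": None if price == "$NA" else float(price.lstrip("$")),
        # Assumption: characteristic flags are spelled "true" or "false".
        "attributes": [v.strip().lower() == "true" for v in fields[4:]],
        # Class label used in the description: 90 or above versus 89 or below.
        "is_90_plus": int(score) >= 90,
    }


# Made-up rows for illustration.
wine_a = parse_wine(["Chateau Example", "2010", "92", "$NA", "true", "false"])
wine_b = parse_wine(["Chateau Sample", "2005", "88", "$45", "false", "true"])
```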

 

For publications, please cite the following papers:

Dong, Zeqing, Xiaowan Guo, Syamala Rajana, and Bernard Chen. "Understanding 21st Century Bordeaux Wines from Wine Reviews Using Naïve Bayes Classifier." Beverages 6, no. 1 (2020): 5.

Chen, Bernard, Christopher Rhodes, Aaron Crawford, and Lorri Hambuchen. "Wineinformatics: applying data mining on wine sensory reviews processed by the computational wine wheel." In 2014 IEEE International Conference on Data Mining Workshop, pp. 142-149. IEEE, 2014.

Chen, Bernard, Christopher Rhodes, Alexander Yu, and Valentin Velchev. "The Computational Wine Wheel 2.0 and the TriMax Triclustering in Wineinformatics." In Industrial Conference on Data Mining, pp. 223-238. Springer, Cham, 2016.

Categories:
189 Views

 

To fill the existing gap in behavioral datasets that model users' interactions with individual and multiple devices in a Smart Office, with the aim of later authenticating those users continuously, we publish the following collection of datasets. It was generated by having five users interact with their personal computers and mobile devices for 60 days. Below you can find a brief description of each dataset.

 

Categories:
100 Views
