Machine Learning

The FFT-75 dataset contains randomly sampled, potentially overlapping file fragments from 75 popular file types (see details below). To the best of our knowledge, it is the most diverse and balanced dataset available. The dataset is labeled with class IDs and is ready for training supervised machine learning models. We distinguish 6 scenarios of differing granularity and provide variants with 512-byte and 4096-byte blocks. In each case, we sampled a balanced dataset and split the data as follows: 80% for training, 10% for testing, and 10% for validation.
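A balanced 80/10/10 split of this kind can be sketched as follows. This is only a minimal illustration, not the procedure the dataset authors used: the function name `balanced_split`, the toy label list, and the fixed random seed are all assumptions made for the example.

```python
import random
from collections import defaultdict


def balanced_split(labels, train_pct=80, test_pct=10, seed=0):
    """Split sample indices class by class so every subset stays balanced.

    Hypothetical helper: whatever is left after the training and testing
    shares of each class goes to the validation subset.
    """
    rng = random.Random(seed)

    # Group sample indices by their class ID.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    splits = {"train": [], "test": [], "val": []}
    for indices in by_class.values():
        rng.shuffle(indices)
        # Integer arithmetic avoids floating-point rounding surprises.
        n_train = len(indices) * train_pct // 100
        n_test = len(indices) * test_pct // 100
        splits["train"] += indices[:n_train]
        splits["test"] += indices[n_train:n_train + n_test]
        splits["val"] += indices[n_train + n_test:]
    return splits


# Toy example (assumed data): 75 classes with 10 fragments each.
labels = [class_id for class_id in range(75) for _ in range(10)]
splits = balanced_split(labels)
```

Splitting within each class, rather than over the whole pool, keeps the class proportions identical across the three subsets.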


Measurements collected from R1 for root cause analyses of the network service states defined from quality and service design perspectives.


We introduce a benchmark for the execution of distributed algorithms over big data. The datasets consist of metrics on the computational impact (resource usage) of eleven well-known machine learning techniques on a real computational cluster, expressed through system-agnostic resource indicators: CPU consumption, memory usage, operating system process load, network traffic, and I/O operations. The metrics were collected every five seconds for each algorithm at five different data volume scales, totaling 275 distinct datasets.


In an aging population, the demand for nurses to care for elders increases. Helping nurses work more efficiently will help improve elders' quality of life, as the nurses can focus their efforts on care activities instead of other activities such as documentation.
Activity recognition can be used for this goal. If we can recognize what activity a nurse is engaged in, we can, among other uses, partially automate the documentation process to reduce the time spent on this task and monitor care plan compliance to ensure that all care activities have been carried out for each elder.

Last Updated On: Fri, 12/06/2019 - 03:40

Malignant pleural effusions (MPEs) are a challenging public health problem, causing significant morbidity and often being the first presenting sign of cancer. Pleural fluid cytology is the most common method used to differentiate malignant from non-malignant effusions. However, its sensitivity reaches only 50-70% and depends on the experience of the cytologist, the tumor load, and the amount of fluid tested. Diagnostic inaccuracy and a high incidence of false negatives may therefore expose patients to clinical mistreatment and mismanagement.


Data and code for the journal paper "MmWave Vehicular Beam Training with Situational Awareness Using Machine Learning", submitted to IEEE Access.

The code assumes Python 3.


The pressure sensors are represented by black circles, which are located in the three zones of each foot. For the left foot, S1 and S2 cover the forefoot area; S3, S4, and S5 cover the midfoot area; and S6 and S7 cover the rearfoot or heel area. Similarly, for the right foot, S8 and S9 cover the forefoot area; S10, S11, and S12 cover the midfoot area; and S13 and S14 cover the heel area. The value of each sensor is read by the analog inputs of an Arduino Mega 2560.
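The sensor layout described above can be captured as a small lookup table. The sketch below is illustrative only: the names `SENSOR_ZONES` and `zone_means`, the dictionary-based reading format, and the sample values are assumptions, not part of the dataset or its acquisition code.

```python
# Sensor-to-zone layout as described in the text above.
SENSOR_ZONES = {
    "left": {
        "forefoot": ["S1", "S2"],
        "midfoot": ["S3", "S4", "S5"],
        "heel": ["S6", "S7"],
    },
    "right": {
        "forefoot": ["S8", "S9"],
        "midfoot": ["S10", "S11", "S12"],
        "heel": ["S13", "S14"],
    },
}


def zone_means(reading):
    """Average the raw sensor values of one sample per foot and zone.

    `reading` is a hypothetical dict mapping sensor names ("S1".."S14")
    to raw analog values; the real file layout may differ.
    """
    means = {}
    for foot, zones in SENSOR_ZONES.items():
        for zone, sensors in zones.items():
            values = [reading[name] for name in sensors]
            means[(foot, zone)] = sum(values) / len(values)
    return means


# Made-up sample: every sensor reads 100 except S6 and S7 (left heel).
sample = {f"S{i}": 100 for i in range(1, 15)}
sample["S6"] = 300
sample["S7"] = 500
means = zone_means(sample)
```

Aggregating by zone in this way gives six features per sample (three zones for each foot), which may be a convenient starting point before working with all 14 raw channels.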


This dataset collection contains eleven datasets used in Locally Linear Embedding and fMRI feature selection in psychiatric classification.

The datasets given in the Links section are reduced subsets of those contained in their respective tar files (a consequence of Mendeley Data's 10GB limitation).

The Linked datasets (not the tar files) contain just the MATLAB file and the resting state image (or block-design fMRI for the MRN dataset), where appropriate.


Device identification using network traffic analysis is being researched as a defense against cyber-attacks for both IoT and non-IoT devices. The idea is to define a unique, device-specific fingerprint by using solely the inter-arrival time (IAT) of packets as the feature to identify a device. Deep learning is applied to the IAT signature for device fingerprinting of 58 non-IoT devices. We observed a maximum recall of 97.9% and a maximum accuracy of 97.7% in identifying devices. A comparison with the related work GTID found that, using the defined IAT signature, models for device identification perform better than models for device type identification.
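As a rough illustration of the feature itself, the inter-arrival time is the gap between the arrival timestamps of consecutive packets. The helper name `inter_arrival_times` and the sample timestamps below are assumptions for this sketch; how the dataset authors extracted and windowed the IAT values is not stated here.

```python
def inter_arrival_times(timestamps):
    """Return the gaps between consecutive packet arrival times.

    `timestamps` is a hypothetical list of arrival times in seconds for
    the packets of one device; it is sorted first in case the capture
    is out of order.
    """
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Made-up arrival times (in seconds) for four packets.
arrivals = [0.0, 0.5, 1.5, 3.0]
iats = inter_arrival_times(arrivals)
```

A sequence of n packets yields n - 1 IAT values, so a fixed-length signature would need a fixed number of packets per sample.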


This dataset is used in the machine learning method of the paper "A distributed Front-end Edge node assessment model by using Fuzzy and a learning-to-rank method".

