Machine Learning

Benchmark Dataset for Generative AI on Edge Devices

The benchmarking dataset, GenAI on the Edge, contains performance metrics from evaluating Large Language Models (LLMs) on edge devices, utilizing a distributed testbed of Raspberry Pi devices orchestrated by Kubernetes (K3s). It includes performance data collected from multiple runs of prompt-based evaluations with various LLMs, leveraging Prometheus and the Llama.cpp framework. The dataset captures key metrics such as resource utilization, token generation rates/throughput, and detailed inference timing for stages such as Sample, Prefill, and Decode.
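As a rough illustration of how a per-stage token generation rate relates to inference timing, the sketch below divides a token count by the elapsed time for each stage. The field names and the numbers are hypothetical placeholders and are not taken from the dataset; the actual column layout may differ.

```python
def tokens_per_second(n_tokens: int, elapsed_ms: float) -> float:
    """Throughput of one inference stage, in tokens per second."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return n_tokens / (elapsed_ms / 1000.0)


# Hypothetical timings for a single run (illustrative values only).
run = {
    "prefill": {"tokens": 32, "ms": 4000.0},
    "decode": {"tokens": 64, "ms": 16000.0},
}

# Per-stage throughput keyed by stage name.
rates = {
    stage: tokens_per_second(v["tokens"], v["ms"])
    for stage, v in run.items()
}
```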

Categories:: Artificial Intelligence
Wireless Networking
Machine Learning
Cloud Computing
Communications

415 Views

Accelerations

This dataset includes acceleration data measured by 36 participants across 154 Tactons (i.e., Tactile Icons). We used three iOS smartphones (iPhone 13 mini, iPhone 14, and iPhone 11 Pro Max) by Apple Inc. to collect acceleration data as well as sensory and emotional ratings of Tactons on various consumer phones. These phones varied in size and mass: iPhone 13 mini (64.2 x 131.5 x 7.65 mm, 141 g), iPhone 14 (71.5 x 146.7 x 7.8 mm, 172 g), and iPhone 11 Pro Max (77.8 x 158.0 x 8.1 mm, 226 g).

Categories:: Machine Learning

20 Views

Wheat breeding phenotyping and yield dataset

This is a wheat breeding phenotyping and yield dataset, including canopy height (CH, m), canopy volume (CV, m³), and leaf area index (LAI) collected in the field; vegetation index (VI) generated by multispectral data acquired by UAV remote sensing; trial site weather (Weather); and yield (Yield, kg). The data comes from field trials.

Data acquisition and processing are described in the relevant part of the manuscript.

Categories:: Agriculture
Artificial Intelligence
Machine Learning
Image Processing
Geoscience and Remote Sensing
Remote Sensing

318 Views

Syphilis and Tourism and Geographic Location Dataset - SyphTGL

Tourism is increasing worldwide and has many benefits for countries and cities, such as creating jobs, increasing company revenue, and improving government tax collection. As such, tourism is an unstoppable trend, and countries and municipalities actively try to stimulate it. However, the unexpected impacts of this activity, which in principle generates wealth, must also be monitored.

Categories:: Artificial Intelligence
Machine Learning
Biomedical and Health Sciences
Demographic

203 Views

Sentinel-2 Wildfire Change Detection (S2-WCD)

Categories:: Machine Learning
Computer Vision
Geoscience and Remote Sensing

293 Views

Systematic Dataset Generation for Soil Texture Classification Based on the USDA Soil Classification Triangle

This study introduces a novel soil texture dataset designed to overcome geographic constraints and improve the generalization of classification models. Using the USDA soil classification triangle as a framework, the dataset is systematically generated by combining pure sand, silt, and clay in varying proportions to create diverse soil texture classes. The soil mixtures are captured using a multispectral sensor with seven bands, ensuring a rich representation of spectral information.
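One way to picture "combining pure sand, silt, and clay in varying proportions" is to enumerate every composition on a regular grid over the texture triangle. The sketch below does this under an assumed 10% step; the step size and the function name are illustrative assumptions, and the study's actual mixing scheme is not stated here.

```python
from itertools import product


def ternary_mixtures(step: int = 10) -> list[tuple[int, int, int]]:
    """Enumerate (sand, silt, clay) percentages that sum to 100.

    `step` is a hypothetical grid spacing in percent; it must divide 100.
    """
    if step <= 0 or 100 % step != 0:
        raise ValueError("step must be a positive divisor of 100")
    levels = range(0, 101, step)
    return [
        (sand, silt, 100 - sand - silt)  # clay takes the remainder
        for sand, silt in product(levels, repeat=2)
        if sand + silt <= 100
    ]


mixtures = ternary_mixtures(10)
```

Each tuple is one candidate mixture, including the three pure components; a finer step yields a denser sampling of the triangle.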

Categories:: Machine Learning
Sensors
Geoscience and Remote Sensing

154 Views

Sensor-Driven Data Collection System for Predicting Fan Behavior

This study identifies representative sensors for monitoring fan performance by analyzing vibration data collected from piezoelectric sensors during various operational modes. The dataset, which includes measurements at a rate of 300 samples/sec from 10 sensors, covers six modes of operation: Maximum Speed, Maximum Speed with Oscillation, Minimum Speed, Minimum Speed with Oscillation, Minimum to Maximum Speed, and a comprehensive dataset combining all modes.

Categories:: Machine Learning
Sensors

98 Views

DoQ+QUIC web traffic dataset

Moving away from plain-text DNS communications, users now have the option of using encrypted DNS protocols for domain name resolutions. DNS-over-QUIC (DoQ) employs QUIC, the latest transport protocol, for encrypted communications between users and their recursive DNS servers. QUIC is also poised to become the foundation of our daily web browsing experience by replacing TCP with HTTP/3, the latest version of the HTTP protocol. Traditional TCP-based web browsing is vulnerable to website

Categories:: Artificial Intelligence
Machine Learning
Security
Communications

197 Views

Migration

This dataset was collected at KAIST, Daejeon, by ISILAB to research seamless indoor-outdoor detection. The collecting device is a Raspberry Pi 4B+ with a touchscreen UI, connected to a Pmod Nav module and a PmodGPS. The collection spans roughly three months, which mitigates time-specific bias. In addition, the wiring was swapped during collection to simulate device bias. Dynamic calibration is not applied to the dataset; researchers may choose whether or not to apply it.

Categories:: Machine Learning

24 Views

HBD4DTI

This dataset integrates three publicly available sources of drug-target interaction data: the Human dataset, the Biosnap dataset, and the DrugBank dataset, combining them into a comprehensive resource for drug discovery and bioinformatics research. It includes a diverse set of human proteins identified as potential drug targets, along with a variety of corresponding drug molecules. Each drug-target pair is accompanied by an interaction label indicating whether the drug interacts with the protein target.

Categories:: Machine Learning
Biomedical and Health Sciences

56 Views
