Machine Learning

All the healthcare facilities in this dataset were collected from the MOH 2018 list of Uganda healthcare facilities (https://library.health.go.ug/sites/default/files/resources/National%20Health%20Facility%20MasterLlist%202017.pdf). Additional features were scraped using the Google Maps API and from the websites of some of the healthcare facilities themselves.

23 Views

This paper describes a dataset of droplet images captured using the sessile drop technique, intended for applications in wettability analysis, surface characterization, and machine learning model training. The dataset comprises both original and synthetically augmented images. The original, non-augmented portion consists of 420 images of sessile droplets. To increase the dataset size and variability, an augmentation process was applied, generating 1008 additional images.

37 Views

The Hindi Spam SMS Dataset comprises 3,894 messages, each labeled as either spam or ham. It was curated with contributions from students, who collected messages they encountered daily along with messages shared by friends and peers, giving a diverse and realistic sample of real-world SMS communication in Hindi. The messages are written primarily in Hindi, reflecting the linguistic and cultural context in which they were collected.

15 Views

Sarcasm detection involves predicting whether a given text is sarcastic, a challenging task in sentiment analysis. While significant research has been conducted for languages like English, Czech, and Italian, limited work exists for Indian languages such as Hindi, Tamil, and Bengali. Marathi, being the third most spoken language in India, has seen little progress in sarcasm detection, mainly due to the lack of suitable datasets.

14 Views

A dataset of simulated resistive drift series for an illustrative stochastic memristor.

Dataset Description

The memristor has an equilibrium resistance of approximately 500 kΩ.

5000 series are generated, with starting resistances sampled uniformly from the range [100 Ω, 750 kΩ].

Each series consists of 1001 datapoints: the first (zeroth) point is the initial resistance, and each later point is sampled at the next timestep.
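The layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the description does not give the drift model, so a mean-reverting random walk is swapped in as a stand-in, and the names `simulate_series`, `simulate_dataset`, `pull`, and `noise_ohms` are hypothetical. Only the equilibrium value, the sampling range, and the series dimensions come from the description.

```python
import random

EQUILIBRIUM_OHMS = 500e3               # from the description: roughly 500 kΩ
R_MIN_OHMS, R_MAX_OHMS = 100.0, 750e3  # stated range for starting resistances
N_SERIES, N_POINTS = 5000, 1001        # stated dataset dimensions


def simulate_series(r0, n_points=N_POINTS, pull=0.01, noise_ohms=2e3, rng=random):
    """Return one series; index 0 holds the starting resistance r0.

    ASSUMPTION: a mean-reverting random walk stands in for the real drift
    model, which the dataset description does not specify.
    """
    series = [r0]
    for _ in range(n_points - 1):
        r = series[-1]
        # Pull toward equilibrium plus Gaussian noise (both hypothetical).
        r_next = r + pull * (EQUILIBRIUM_OHMS - r) + rng.gauss(0.0, noise_ohms)
        series.append(max(r_next, 0.0))  # keep the resistance non-negative
    return series


def simulate_dataset(n_series=N_SERIES, seed=0):
    """Return n_series series with uniformly sampled starting resistances."""
    rng = random.Random(seed)
    starts = [rng.uniform(R_MIN_OHMS, R_MAX_OHMS) for _ in range(n_series)]
    return [simulate_series(r0, rng=rng) for r0 in starts]
```

Calling `simulate_dataset(n_series=10)` is enough to check the shape: each of the 10 series has 1001 points, and the first point of each lies in the stated starting range.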

Dataset Creation

17 Views

PPE Usage Dataset

This repository provides the Personal Protective Equipment (PPE) Usage Dataset, designed for training deep neural networks (DNNs). The dataset was collected using the EFR32MG24 microcontroller and the ICM-20689 inertial measurement unit, which features a 3-axis gyroscope and a 3-axis accelerometer.

The dataset includes data for four types of PPE: helmet, shirt, pants, and boots, categorized into three activity classes: carrying, still, and wearing.
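As a rough illustration of the label space this implies, the four PPE types and three activity classes give 12 possible (PPE, activity) pairs. The sketch below enumerates them and assigns each an integer index. Whether the dataset actually uses joint labels like this is an assumption, and `LABEL_TO_INDEX` and `encode` are hypothetical names.

```python
from itertools import product

PPE_TYPES = ("helmet", "shirt", "pants", "boots")
ACTIVITIES = ("carrying", "still", "wearing")

# Hypothetical joint label index: one integer per (PPE type, activity) pair,
# ordered by PPE type first and activity second.
LABEL_TO_INDEX = {
    pair: index for index, pair in enumerate(product(PPE_TYPES, ACTIVITIES))
}


def encode(ppe, activity):
    """Map a (PPE type, activity) pair to its integer class index."""
    return LABEL_TO_INDEX[(ppe, activity)]
```

A classifier could equally predict the PPE type and the activity with two separate outputs. The joint index is shown only because it makes the size of the label space explicit.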

23 Views

5G cellular technology has introduced advanced radio communication protocols and new frequency bands, enabling faster data exchange. These improvements increase network capacity and establish a foundation for high-bandwidth, low-latency services, supporting the development of applications such as the Internet of Things (IoT). However, information security remains a significant challenge, particularly with respect to attacks such as Fake Base Stations (FBS) and Stream Control Transmission Protocol (SCTP) Session Hijacking.

95 Views

The benchmarking dataset, GenAI on the Edge, contains performance metrics from evaluating Large Language Models (LLMs) on edge devices, utilizing a distributed testbed of Raspberry Pi devices orchestrated by Kubernetes (K3s). It includes performance data collected from multiple runs of prompt-based evaluations with various LLMs, leveraging Prometheus and the Llama.cpp framework. The dataset captures key metrics such as resource utilization, token generation rates/throughput, and detailed inference timing for stages such as Sample, Prefill, and Decode.
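To make the relationship between stage timings and throughput concrete, the sketch below derives tokens-per-second figures from per-stage times. The record layout and field names (`InferenceTiming`, `prefill_seconds`, and so on) are hypothetical and are not the dataset's actual schema. Summing the stages to get a total also assumes that the three stages do not overlap.

```python
from dataclasses import dataclass


@dataclass
class InferenceTiming:
    """Hypothetical per-run record; not the dataset's actual schema."""
    sample_seconds: float
    prefill_seconds: float
    decode_seconds: float
    prompt_tokens: int
    generated_tokens: int


def prefill_throughput(t: InferenceTiming) -> float:
    """Prompt tokens processed per second during the Prefill stage."""
    return t.prompt_tokens / t.prefill_seconds if t.prefill_seconds > 0 else 0.0


def decode_throughput(t: InferenceTiming) -> float:
    """Tokens generated per second during the Decode stage."""
    return t.generated_tokens / t.decode_seconds if t.decode_seconds > 0 else 0.0


def total_seconds(t: InferenceTiming) -> float:
    """Sum of the three stage times, assuming the stages do not overlap."""
    return t.sample_seconds + t.prefill_seconds + t.decode_seconds
```

For example, a run that decodes 40 tokens in 8 seconds has a decode throughput of 5 tokens per second under this definition.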

60 Views

This dataset includes acceleration data collected from 36 participants across 154 Tactons (i.e., Tactile Icons). We used three iOS smartphones by Apple Inc. (iPhone 13 mini, iPhone 14, and iPhone 11 Pro Max) to collect acceleration data as well as sensory and emotional ratings of Tactons on various consumer phones. These phones varied in size and mass: iPhone 13 mini (64.2 × 131.5 × 7.65 mm, 141 g), iPhone 14 (71.5 × 146.7 × 7.8 mm, 172 g), and iPhone 11 Pro Max (77.8 × 158.0 × 8.1 mm, 226 g).

9 Views

This is a wheat breeding phenotyping and yield dataset. It includes canopy height (CH, m), canopy volume (CV, m³), and leaf area index (LAI) collected in the field; vegetation index (VI) derived from multispectral data acquired by UAV remote sensing; trial site weather (Weather); and yield (Yield, kg). The data come from field trials.

Data acquisition and processing are described in the relevant part of the manuscript.

131 Views
