
Machine Learning

Recent developments in the Internet of Things (IoT) bring several advantages, including real-time condition monitoring, remote control and operation, and in some cases remote fault remediation. Despite these benefits, IoT-enabled systems inherently suffer from security and privacy issues. This is partly due to the use of insecure communication protocols such as the Open Charge Point Protocol (OCPP) 1.6, an application-layer communication protocol used for managing electric vehicle chargers.


This dataset provides measurements of cerebral blood flow using Radio Frequency (RF) sensors operating in the Ultra-Wideband (UWB) frequency range, enabling non-invasive monitoring of cerebral hemodynamics. It includes blood flow feature data from two arterial networks, Arterial Network A and Arterial Network B. Statistical features were manually extracted from the RF sensor data, while autonomous feature extraction was performed using a Stacked Autoencoder (SAE) with architectures such as 32-16-32, 64-32-16-32-64, and 128-64-32-16-32-64-128.
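The architecture labels above can be read as hidden-layer widths that narrow to a bottleneck and then mirror back out. The sketch below is only an illustration of that notation, not the dataset authors' code: the helper names and the `input_dim` value are hypothetical, and it assumes each hyphen-separated number is the width of one hidden layer.

```python
def parse_sae_spec(spec: str) -> list[int]:
    """Turn a label such as '64-32-16-32-64' into a list of hidden-layer widths."""
    widths = [int(part) for part in spec.split("-")]
    # Assumption: the encoder and decoder mirror each other around the bottleneck.
    if widths != widths[::-1]:
        raise ValueError(f"spec {spec!r} is not symmetric")
    return widths


def bottleneck(spec: str) -> int:
    """Width of the narrowest layer, i.e. the size of the learned feature vector."""
    return min(parse_sae_spec(spec))


def layer_shapes(input_dim: int, spec: str) -> list[tuple[int, int]]:
    """(fan_in, fan_out) for each dense layer, including input and output layers.

    input_dim is a hypothetical number of RF input features; the abstract
    does not state it.
    """
    dims = [input_dim] + parse_sae_spec(spec) + [input_dim]
    return list(zip(dims[:-1], dims[1:]))


if __name__ == "__main__":
    for spec in ("32-16-32", "64-32-16-32-64", "128-64-32-16-32-64-128"):
        print(spec, "bottleneck:", bottleneck(spec), "layers:", layer_shapes(100, spec))
```

Under this reading, all three listed architectures compress the input to a 16-dimensional code and differ only in how many layers surround that bottleneck.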


The Metaverse Gait Authentication Dataset (MGAD) is a large-scale gait dataset designed for biometric authentication in virtual environments. It contains gait data from 5,000 simulated users, generated in Unity 3D and processed using OpenPose and MediaPipe to extract 16 key features, including stride length, step frequency, joint angles, ground reaction forces, and gait symmetry index.


We curate and release a real-world clinical dataset, MedCD, for building generative artificial intelligence (AI) applications in clinical settings. MedCD is one of the outcomes of our longitudinal applied AI research and deployment in a tertiary care hospital in China. First, the dataset is real and comprehensive: it was sourced from real-world electronic health records (EHRs), clinical notes, lab examination reports, and more.


One of the leading causes of early health deterioration is the rising level of air pollution in major cities and, consequently, in indoor spaces. Effectively monitoring air quality in enclosed spaces such as educational institutes and hospitals can improve both the health and the quality of life of occupants. In this paper, we propose an efficient Indoor Air Quality (IAQ) monitoring and management system that combines cutting-edge technologies to monitor and predict major air pollutants such as CO2, PM2.5, and TVOCs, along with other factors such as temperature and humidity.


Overview

This dataset contains detailed experimental data from a series of tests conducted to evaluate the performance of a pulsed water jet ablation system. The experiments aim to investigate the effects of various parameters on the ablation process when cutting through material composites such as PLA/Bone Cement and Bone. The experiments involve layers of different materials, including metal, plastic, and bone cement. The primary objective is to understand the material differentiation.

Dataset Content


The Aerial Fire and Smoke Essential (AFSE) dataset is a new, small aerial flame dataset composed of screenshots from various YouTube wildfire videos and images from FLAME2. It includes two object categories: smoke and fire. The images were selected to mostly show aerial viewpoints. The dataset contains 282 images with no augmentations, covering images with only smoke, images with both fire and smoke, and images with neither fire nor smoke.


The Explainable Sentiment Analysis Dataset provides annotated sentiment classification data for Amazon Reviews and IMDB Movie Reviews, facilitating the evaluation of sentiment analysis models with a focus on explainability. It includes ground-truth sentiment labels, model-generated predictions, and fine-grained classification results obtained from various large language models (LLMs), including both proprietary (GPT-4o/GPT-4o-mini) and open-source models (DeepSeek-R1 full and distilled models).


The TripAdvisor online airline review dataset, spanning from 2016 to 2023, provides a comprehensive collection of passenger feedback on airline services during the COVID-19 pandemic. This dataset includes user-generated reviews that capture sentiments, preferences, and concerns, allowing for an in-depth analysis of shifting customer priorities in response to pandemic-related disruptions. By examining these reviews, the dataset facilitates the study of evolving passenger expectations, changes in service perceptions, and the airline industry's adaptive strategies.
