Machine Learning

Securing smart grids relies in part on the reliable integration of blockchain technologies for the automation of energy transactions. However, the presence of vulnerabilities in smart contracts poses a direct threat to the integrity and resilience of these critical systems. This work presents a unique and structured dataset of real-world vulnerabilities observed in smart contracts, intended for cybersecurity research applied to smart energy infrastructures.

This dataset comprises synchronized multi-modal physiological recordings—functional Near-Infrared Spectroscopy (fNIRS), Electroencephalography (EEG), Electrocardiography (ECG), and Electromyography (EMG)—collected from 16 participants exposed to emotion-eliciting video stimuli. It includes raw signals, event markers, and Python scripts for data import and preprocessing. Special emphasis is placed on fNIRS, which, though less common in affective computing, provides valuable hemodynamic insights that complement electrical signals from EEG, ECG, and EMG.

Sensitivity (Se) is the proportion of C&A with abnormal intelligence that the models correctly identify as abnormal. Specificity (Sp) is the proportion of C&A with normal intelligence that the models correctly identify as normal. Positive predictive value (PV+) is the proportion of C&A predicted to have abnormal intelligence who actually have it. Negative predictive value (PV–) is the proportion of C&A predicted to have normal intelligence who actually have it. The odds ratio (OR) represents the ability of the models to distinguish between C&A with normal and abnormal intelligence.
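
The metrics defined above can be computed from the four cells of a confusion matrix. The following is a minimal sketch, not the authors' implementation. It assumes that "abnormal intelligence" is the positive class and that OR refers to the diagnostic odds ratio, (TP x TN) / (FP x FN). The function name and the example counts are hypothetical.

```python
import math


def _safe_div(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def diagnostic_metrics(tp, fp, tn, fn):
    """Compute Se, Sp, PV+, PV- and OR from confusion-matrix counts.

    Assumption: the positive class is "abnormal intelligence".
    tp: abnormal, predicted abnormal    fn: abnormal, predicted normal
    tn: normal, predicted normal        fp: normal, predicted abnormal
    """
    return {
        "Se": _safe_div(tp, tp + fn),    # sensitivity
        "Sp": _safe_div(tn, tn + fp),    # specificity
        "PV+": _safe_div(tp, tp + fp),   # positive predictive value
        "PV-": _safe_div(tn, tn + fn),   # negative predictive value
        "OR": _safe_div(tp * tn, fp * fn),  # diagnostic odds ratio (assumed)
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; they are not from the dataset.
    for name, value in diagnostic_metrics(tp=40, fp=5, tn=45, fn=10).items():
        print(f"{name}: {value:.3f}")
```

An OR of 1 would mean the models do not separate the two groups at all; larger values indicate better discrimination. When FP or FN is zero, this sketch returns NaN for OR rather than applying a continuity correction.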

The Travel Recommendation Dataset is designed for building and evaluating conversational recommendation systems in the travel domain. It includes detailed information about users, destinations, and ratings, enabling researchers and developers to create personalized travel recommendation models. Supported use cases include personalizing travel recommendations, analyzing user behavior, and training machine learning models for recommendation tasks.

This dataset contains simulation results generated in OptiSystem for an 18-tupling optical communication system. The parameters include optical source settings, modulator configurations, and a range of power and signal quality metrics. Key performance indicators, such as total power penalty (TPP), signal power penalty (SPP), and RMS jitter, are provided for each set of simulation inputs. The data are intended to facilitate reproducible research and to enable further analysis of high-order frequency multiplication in optical networks.

This dataset supports research on temporal segmentation of the Timed Up and Go (TUG) test using a first-person wearable camera. The data collection includes a training set of 8 participants and a test set of 60 participants. The 8 training participants completed the test at both a normal walking pace and a simulated slower pace intended to mimic elderly movement patterns. The 60 test participants were randomly divided into two groups: one group completed the test at a normal walking pace, and the other simulated the slower pace.

This dataset is composed of a range of biomedical voice measurements from 31 people, 23 with Parkinson's disease (PD). Each column in the table is a particular voice measure, and each row corresponds to one of 195 voice recordings from these individuals ("name" column). The main aim of the data is to discriminate healthy people from those with PD, according to the "status" column which is set to 0 for healthy and 1 for PD.
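
As a rough illustration of the table layout described above, the sketch below tallies recordings by the "status" column using only the Python standard library. The sample rows, the `measure_a` column, and the function name are hypothetical placeholders; only the "name" and "status" columns and the 0/1 coding come from the description.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the described layout: one row per voice recording,
# a "name" column, a "status" column (0 = healthy, 1 = PD), and one
# placeholder voice-measure column. The values are made up.
SAMPLE_CSV = """name,status,measure_a
rec_01,1,0.0078
rec_02,1,0.0097
rec_03,0,0.0034
"""


def count_by_status(csv_text):
    """Count recordings labelled healthy (0) and PD (1)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(int(row["status"]) for row in reader)
    return {"healthy": counts[0], "pd": counts[1]}


if __name__ == "__main__":
    print(count_by_status(SAMPLE_CSV))
```

Because several recordings come from the same person, a classifier evaluated on this kind of table is usually split by individual rather than by row, so that recordings from one person do not appear in both the training and test sets.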

This dataset integrates textual, financial, and macroeconomic indicators to support research on bank failure prediction and financial distress forecasting in Vietnam. It includes financial news from the BKAI News Corpus Dataset (2009–2023) and financial crisis data from "A Dataset for the Vietnamese Banking System (2002–2021)" (Tu Le et al., 2022), covering crisis-related events such as restructuring, special control, mergers, and acquisitions.

The AMD3IR dataset is a large-scale collection of Shortwave Infrared (SWIR) and Longwave Infrared (LWIR) images, designed to advance research on drone detection and tracking. It targets key challenges such as detecting and distinguishing small airborne objects, differentiating drones from background clutter, and overcoming the visibility limitations of conventional imaging. The dataset comprises 20,865 SWIR images with 24,994 annotated drones and 8,696 LWIR images with 10,400 annotated drones, featuring various UAV models.

A significant challenge in racing-related research is the lack of publicly available datasets containing raw images with corresponding annotations for the downstream task. In this paper, we introduce RoRaTrack, a novel dataset that contains annotated multi-camera image data from racing scenarios for track detection. The data was collected on a Dallara AV-21 at a racing circuit in Indiana, in collaboration with the Indy Autonomous Challenge (IAC).
