Machine Learning

App Usage Behavior Modeling and Prediction

The Tsinghua App Usage Dataset is a large-scale mobile application usage dataset collected over one week in one of China’s largest cities. It contains anonymized app usage logs from 1,000 users, capturing detailed information on 2,000 identified apps across 9,800 base stations. Each record includes user ID, timestamp, base station location, app ID, and traffic consumption, allowing for comprehensive analysis of individual and regional mobile usage patterns.

Categories:: Machine Learning

57 Views

Ammonia, Acetone, Formaldehyde Dataset

This data set is related to gas sensing, time-series classification. It has three columns: Time, I1, and Target. The Target column refers to the type of gas measured with the following classification:

•1 for Ammonia (NH₃)

•2 for Acetone (C₃H₆O) & Formaldehyde (CH₂O)

The data has been processed in a number of stages for accuracy and consistency. Raw data was first gathered using a Metal Oxide Semiconductor (MOS) sensor. After acquisition, the data were cleaned and processed, such as Savitzky-Golay filtering to remove noise or artifacts and smooth the signal.

Categories:: Machine Learning

163 Views

Nurturing the Future of English Language Education: Between Technology, Tradition and Gaps across Indonesian Education Levels

New technology solutions including tablets and advanced applications exist in modern classrooms across Indonesia but typically they miss the core educational objectives. The process of elementary school children memorizing the emoticon "happy" represents a lack of comprehension while high school students find themselves overwhelmed by IoT data and teachers work to adapt to AI-based educational requirements.

Categories:: Artificial Intelligence
Education and Learning Technologies
IoT
Machine Learning

132 Views

Stroke Prognosis Dataset Taizhou and Fuyong

The Fuyong dataset records 134 stroke patients who received treatment in the Shenzhen Fuyong People's Hospital between March 1, 2022, and September 31, 2024. Besides, medical records of 435 stroke patients treated in the Affiliated Taizhou People's Hospital of Nanjing Medical University between January 1, 2020, and December 31, 2023, are included in the Taizhou dataset. These two datasets use the pre- and post-thrombolysis of the NIHSS scores as a metric for evaluating the immediate efficacy of the thrombolytic intervention.

Categories:: Artificial Intelligence
Machine Learning
Brain
Health

212 Views

Shape Predictor 68 Face Landmarks Model for Eye-Tracking-Based Cursor Control

This dataset supports the LookCursor AI project, which implements eye-tracking-based cursor control using OpenCV and Dlib. The primary file included is shape_predictor_68_face_landmarks.dat, a pre-trained model used to detect and map 68 facial landmarks essential for tracking eye movements. The dataset enables accurate facial feature detection, which is critical for cursor movement based on eye gaze.

Categories:: Artificial Intelligence
Machine Learning
Computer Vision

111 Views

AMD3IR

The AMD3IR dataset is a large-scale collection of Shortwave Infrared (SWIR) and Longwave Infrared (LWIR) images, designed to advance the ongoing research in the field of drone detection and tracking. It efficiently addresses key challenges such as detecting and distinguishing small airborne objects, differentiating drones from background clutter, and overcoming visibility limitations present in conventional imaging. The dataset comprises 20,865 SWIR images with 24,994 annotated drones and 8,696 LWIR images with 10,400 annotated drones, featuring various UAV models.

Categories:: Artificial Intelligence
Machine Learning
Image Processing
Computer Vision
Remote Sensing
Security

556 Views

RoRaTrack

A significant challenge in racing-related research is the lack of publicly available datasets containing raw images with corresponding annotations for the downstream task. In this paper, we introduce RoRaTrack, a novel dataset that contains annotated multi-camera image data from racing scenarios for track detection. The data is collected on a Dallara AV-21 at a racing circuit in Indiana, in collaboration with the Indy Autonomous Challenge (IAC).

Categories:: Artificial Intelligence
Machine Learning

27 Views

Mixed Signals V2X Dataset

Vehicle-to-everything (V2X) collaborative perception has emerged as a promising solution to address the limitations of single-vehicle perception systems.

Categories:: Artificial Intelligence
Machine Learning
Transportation
Computer Vision

237 Views

NMT-5k

EEG signals in the NMT-5k dataset were captured using a 21-channel setup following the 10/20 electrode placement system, maintaining a sampling rate of 200 Hz. Data collection was conducted using the KT88-2400 system from Contec Medical Systems.

Categories:: Machine Learning
Neuroscience
Brain

99 Views

SQT-ITD3 Experimental Dataset

The proposed method is rigorously evaluated against several state-of-the-art algorithms, including ISAC, ITD3, IPPO, and IDDPG, to ensure a comprehensive performance analysis. The experimental data, which is publicly available [here], provides detailed insights into the training and evaluation processes of each algorithm.

Categories:: Artificial Intelligence
Machine Learning

150 Views

Machine Learning

Machine Learning

Pages