Machine Learning

This dataset contains sentence segments from customer reviews of mobile phones, drawn from sources such as Amazon, Flipkart, and Twitter, as well as some existing datasets. It contains more than 1000 records, each tagged with one of five aspect categories: battery, camera, display, price, and processor. Each sentence segment is also tagged as subjective or objective (whether it carries a sentiment expression) and assigned a sentiment orientation (positive, negative, or neutral). Whether the aspect is present explicitly or implicitly is also recorded.

155 Views

This work presents a specialized dataset designed to advance autonomous navigation in hiking trail and off-road natural environments. The dataset comprises over 1,250 images (640x360 pixels) captured using a camera mounted on a tele-operated robot on hiking trails. Images are manually labeled into eight terrain classes: grass, rock, trail, root, structure, tree trunk, vegetation, and rough trail. The dataset is provided in its original form without augmentations or resizing, allowing end-users flexibility in preprocessing.

523 Views

The dataset provides detailed information for wheat crop monitoring in the Karnal District, India, spanning the period from 2010 to 2022. It is divided into four main components. The first component, Remote Sensing Data, includes Sentinel-2 (10 m resolution) satellite data averaged over village boundaries, specifically over a wheat crop mask. This folder contains two Excel files: one for NDVI (Normalized Difference Vegetation Index) and another for NDWI (Normalized Difference Water Index), both providing fortnightly data during the Rabi season across a 10-year period.
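As a rough illustration of the two indices named above, the sketch below computes NDVI and NDWI from band reflectances. NDVI is the normalized difference of near-infrared and red reflectance. For NDWI this sketch assumes the green/near-infrared formulation; the dataset may use a different one (for example, a near-infrared/shortwave-infrared variant). The function names and the sample reflectance values are hypothetical and are not taken from the dataset.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denom = a + b
    if denom == 0:
        return 0.0
    return (a - b) / denom


def ndvi(nir: float, red: float) -> float:
    """NDVI = (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def ndwi(green: float, nir: float) -> float:
    """NDWI, assumed here to be (Green - NIR) / (Green + NIR)."""
    return normalized_difference(green, nir)


# Hypothetical surface reflectances for a single village-level average.
sample = {"nir": 0.45, "red": 0.05, "green": 0.08}

ndvi_value = ndvi(sample["nir"], sample["red"])
ndwi_value = ndwi(sample["green"], sample["nir"])
```

Both indices fall in the range [-1, 1]; dense green vegetation typically gives a high NDVI, which is why the index is commonly used for crop monitoring.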

446 Views

This dataset, titled "Synthetic Sand Boil Dataset for Levee Monitoring: Generated Using DreamBooth Diffusion Models," provides a comprehensive collection of synthetic images designed to facilitate the study and development of semantic segmentation models for sand boil detection in levee systems. Sand boils, a critical factor in levee integrity, pose significant risks during floods, necessitating accurate and efficient monitoring solutions.

288 Views

We collected and organized two years' worth of complete fault work orders from a wind farm, and structured these work orders into a fault diagnosis event knowledge graph using the proposed algorithm. This graph includes fault modes, fault impacts, fault symptoms, inspection schemes, root cause identification, and maintenance strategies, covering all potential fault information and handling methods for wind turbines. The dataset stores this information as (head entity, relation, tail entity) triples in JSON format.
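A minimal sketch of how triples of this kind might be loaded and indexed by head entity is shown here. The field names (`head`, `relation`, `tail`), the relation labels, and the example entities are all assumptions made for illustration; the actual JSON schema of the dataset is not described in this listing.

```python
import json
from collections import defaultdict

# Hypothetical triples; keys and values are illustrative, not from the dataset.
raw = """
[
  {"head": "gearbox overheating", "relation": "has_symptom",
   "tail": "high oil temperature"},
  {"head": "gearbox overheating", "relation": "has_maintenance_strategy",
   "tail": "replace oil cooler"},
  {"head": "pitch system fault", "relation": "has_symptom",
   "tail": "blade angle deviation"}
]
"""


def build_index(triples):
    """Group (relation, tail) pairs under each head entity."""
    index = defaultdict(list)
    for triple in triples:
        index[triple["head"]].append((triple["relation"], triple["tail"]))
    return dict(index)


triples = json.loads(raw)
index = build_index(triples)

# Look up everything recorded about one (hypothetical) fault mode.
gearbox_facts = index["gearbox overheating"]
```

Indexing by head entity makes it cheap to retrieve the symptoms, causes, and maintenance strategies attached to a given fault mode.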

820 Views

Surface electromyography (EMG) can be used to interact with and control robots via intent recognition. However, most machine learning algorithms used to decode EMG signals have been trained on small datasets with limited subjects, impacting their generalization across different users and tasks. Here we developed EMGNet, a large-scale dataset for EMG neural decoding of human movements. EMGNet combines 7 open-source datasets with processed EMG signals for 132 healthy subjects (152 GB total size).

984 Views

This dataset presents real-world VPN encrypted traffic flows captured from five applications that belong to four service categories, which reflect typical usage patterns encountered by everyday users. 

Our methodology used a set of automated scripts to simulate real-world user interactions with these applications, keeping noise and irrelevant network traffic low.

The dataset consists of flow data collected from four service categories:

318 Views

The DALHOUSIE NIMS LAB BENIGN DATASET 2024-2 comprises data captured from Consumer IoT devices in three primary real-life states (Power-up, Idle, and Active) experienced by everyday users. Our setup focuses on capturing realistic data across these states, providing a comprehensive understanding of Consumer IoT devices.

The dataset comprises nine popular IoT devices, namely:

Amcrest Camera

Smarter Coffeemaker

Ring Doorbell

Amazon Echodot

Google Nestcam

Google Nestmini

Kasa Powerstrip

88 Views

Resource usage fuzzing samples and related data. Contains samples from Pythoin, random data, GPT-3.5, GPT-4, Gemini-1.0, Claude Instant, and Claude Opus. These samples are generated for 50 Python functions. Also included are resource measures for CPU time, instruction count, function calls, peak RAM usage, final RAM allocated, and coverage. These values were collected on an isolated system and account for interference from other processes.

73 Views
