Artificial Intelligence

A CSI Dataset for Wi-Fi-based Personnel Identity Recognition

Categories:

Satellite and ERA5-based Tropical Cyclone Dataset (SETCD)

SETCD (Satellite and ERA5-based Tropical Cyclone Dataset) is a comprehensive dataset encompassing satellite imagery and ERA5 data for all tropical cyclones (TCs) recorded between 1980 and 2022. The dataset is derived from two publicly available data sources: GridSat-B1 and ERA5. To capture the information associated with each TC, SETCD uses the latitude and longitude positions provided by IBTrACS as center points. The satellite data in SETCD consists of three channels from GridSat-B1: infrared, water vapor, and visible.
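The idea of centering data on a track position can be sketched as follows. This is only an illustration, not the SETCD processing code: the function name `crop_around_center`, the window size, and the NaN padding at the grid edges are all assumptions, and the listing does not state how SETCD handles grid boundaries or longitude wrap-around.

```python
import numpy as np

def crop_around_center(field, lats, lons, center_lat, center_lon, half_size):
    """Return a square patch of `field` centred on the grid cell nearest to
    (center_lat, center_lon). The patch has side length 2 * half_size + 1.

    field: 2-D array indexed [lat, lon]; lats and lons: 1-D coordinate arrays.
    Cells that would fall outside the grid are filled with NaN (an assumption).
    """
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    # Nearest grid indices to the requested centre point.
    i = int(np.abs(lats - center_lat).argmin())
    j = int(np.abs(lons - center_lon).argmin())
    # Pad with NaN so windows near the border keep a fixed shape.
    padded = np.pad(np.asarray(field, dtype=float), half_size,
                    mode="constant", constant_values=np.nan)
    # After padding, original cell (i, j) sits at (i + half_size, j + half_size).
    return padded[i:i + 2 * half_size + 1, j:j + 2 * half_size + 1]
```

A fixed patch shape keeps every sample the same size, which is convenient when stacking patches for many track points.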

Categories:

KPI prediction dataset

KPI prediction, a time series modeling task, is an important area of study for complex industrial processes. It focuses on forecasting the key performance indicators used to assess the operational efficiency and productivity of industrial operations. By leveraging historical data trends, KPI prediction helps optimize process control and decision-making, improving overall performance and competitiveness.

Categories:

Artificial Intelligence

wpt-dataset

The WPT dataset was created for "Web Page Tampering Detection Based on Dynamic Temporal Graph Pre-training" and encompasses over 200,000 regular web pages from 75 websites across the finance, healthcare, and education sectors, in addition to 1,541 tampered examples sourced from zone-h.org. The dataset organizes web pages as nodes and their links as edges within a discrete dynamic graph structure, capturing snapshots at various moments in time. Each node combines structural, textual, and statistical features of its page into a 148-dimensional feature vector.
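One way to picture a discrete dynamic graph of this kind is as a sequence of snapshots, each holding page nodes with 148-dimensional vectors and directed link edges. The sketch below is an assumption for illustration only: the `Snapshot` class, its fields, and `changed_edges` are hypothetical names and do not describe the dataset's actual file format.

```python
from dataclasses import dataclass, field
import numpy as np

FEATURE_DIM = 148  # vector size stated in the dataset description

@dataclass
class Snapshot:
    """One time step of the graph (hypothetical container)."""
    timestamp: str
    features: dict = field(default_factory=dict)  # page URL -> feature vector
    edges: set = field(default_factory=set)       # (source URL, target URL)

    def add_page(self, url, vector):
        vector = np.asarray(vector, dtype=float)
        if vector.shape != (FEATURE_DIM,):
            raise ValueError(f"expected {FEATURE_DIM} features, got {vector.shape}")
        self.features[url] = vector

    def add_link(self, src, dst):
        # Both endpoints must already be nodes in this snapshot.
        if src not in self.features or dst not in self.features:
            raise KeyError("both pages must be added before linking them")
        self.edges.add((src, dst))

def changed_edges(prev, curr):
    """Return (added, removed) link sets between two consecutive snapshots."""
    return curr.edges - prev.edges, prev.edges - curr.edges
```

Comparing consecutive snapshots in this way exposes the structural changes over time that a temporal model could learn from.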

Categories:

Weibo and Twitter

1) The Weibo dataset is derived from the Weibo social platform. The collection of true information in this dataset originates from authoritative Chinese sources, while fake information is acquired through the official Weibo rumor suppression system. Each data instance within this dataset comprises both a news text and a corresponding news image.

Categories:

Artificial Intelligence

AVDM Automated Vehicle Driver Monitoring Dataset

The JKU-ITS AVDM contains data from 17 participants performing different tasks with various levels of distraction.
The data collection was carried out in accordance with the relevant guidelines and regulations, and informed consent was obtained from all participants.
The dataset was collected using the JKU-ITS research vehicle with automated capabilities under different illumination and weather conditions along a secure test route within the

Categories:

Ship Routing Problem Dataset

This is a dataset about minimizing maritime passenger transfer in ship routing. It consists of data on the distance between ports, the number of passengers from the port of origin to the port of destination, ship speed, and the duration of berthing at ports.
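To show how these four quantities might fit together, here is a minimal sketch. It is not the dataset's formulation: the units (nautical miles, knots, hours), the rule that berthing applies only at intermediate ports, and the definition of a "direct" passenger as one whose origin precedes the destination on the same route are all assumptions, as are the function names.

```python
def voyage_hours(route, distance, speed_knots, berth_hours):
    """Total time for a route: sailing time plus berthing at intermediate ports.

    route: ordered list of port names.
    distance: nested dict, distance[a][b] in nautical miles (assumed unit).
    """
    legs = zip(route, route[1:])
    sailing = sum(distance[a][b] for a, b in legs) / speed_knots
    # Assumption: no berthing time is counted at the first and last port.
    berthing = berth_hours * max(len(route) - 2, 0)
    return sailing + berthing

def direct_passengers(route, demand):
    """Passengers who can travel without a transfer on this route.

    demand: dict mapping (origin, destination) to a passenger count.
    """
    position = {port: k for k, port in enumerate(route)}
    return sum(
        count
        for (origin, dest), count in demand.items()
        if origin in position and dest in position
        and position[origin] < position[dest]
    )
```

Under these assumptions, the passengers needing a transfer are the total demand minus the direct passengers, which is the quantity a routing model would try to keep small.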

Categories:

A collection of nine multi-label text classification datasets

This is a compressed package containing nine multi-label text classification datasets: AAPD, CitySearch, Heritage, Laptop, Ohsumed, RCV1, Restaurant, Reuters, and Sentihood.

Categories:

NASA lithium ion battery dataset

This data set was collected from a custom-built battery prognostics testbed at the NASA Ames Prognostics Center of Excellence (PCoE). Li-ion batteries were run through three different operational profiles (charge, discharge, and Electrochemical Impedance Spectroscopy) at different temperatures. Discharges were carried out at different current load levels until the battery voltage fell to preset voltage thresholds. Some of these thresholds were lower than the one recommended by the OEM (2.7 V) in order to induce deep discharge aging effects.
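The discharge-to-threshold procedure can be illustrated with a short sketch that finds where the voltage first reaches a cutoff and integrates the current up to that point. This is an assumption for illustration: the function name, the input arrays, and the trapezoidal integration are not taken from the dataset, whose actual file layout is not described in this listing.

```python
import numpy as np

def discharged_capacity_ah(time_s, current_a, voltage_v, cutoff_v):
    """Capacity (Ah) delivered until the voltage first reaches `cutoff_v`.

    time_s: sample times in seconds; current_a: load current in amperes;
    voltage_v: terminal voltage in volts. If the cutoff is never reached,
    the whole record is integrated.
    """
    time_s = np.asarray(time_s, dtype=float)
    current_a = np.abs(np.asarray(current_a, dtype=float))
    voltage_v = np.asarray(voltage_v, dtype=float)
    # Index of the first sample at or below the cutoff voltage.
    below = np.flatnonzero(voltage_v <= cutoff_v)
    end = int(below[0]) + 1 if below.size else len(voltage_v)
    t, i = time_s[:end], current_a[:end]
    # Trapezoidal integration of current over time, converted from A*s to Ah.
    amp_seconds = np.sum(0.5 * (i[1:] + i[:-1]) * np.diff(t))
    return float(amp_seconds / 3600.0)
```

Tracking this capacity across repeated cycles is a common way to quantify aging, since a lower cutoff lets the cell discharge further on each cycle.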

Categories:

Artificial Intelligence

English as a Second Language TTS (ESLTTS) dataset

With recent progress in speaker-adaptive TTS, advanced approaches have shown a remarkable capacity to reproduce the speaker’s voice in the commonly used TTS datasets. However, mimicking voices characterized by substantial accents, such as those of non-native English speakers, is still challenging. The absence of a dedicated TTS dataset for speakers with substantial accents inhibits the research and evaluation of speaker-adaptive TTS models under such conditions. To address this gap, we developed a corpus of non-native speakers' English utterances.

Categories: