Artificial Intelligence

NCBI; BC5CDR; i2b2 2010; HPRD50; AIMed; MedNLI

NCBI: The NCBI dataset is a biomedical corpus containing 793 PubMed abstracts, each manually annotated to include disease mentions and their corresponding concepts, providing a high-quality gold standard for disease name recognition and normalization research.

Categories:: Artificial Intelligence

93 Views

RF-Based Noise Data for Randomness Applications with Artificial Intelligence

This dataset comprises audio recordings of ultra-high-frequency ambient noise stored in the lossless waveform format (WAW). The recordings were sampled at a frequency sample rate of 2.048 MHz and then provided at a downsampled audio rate of 48 kHz for compatibility and practical usage. The total length of the dataset is 01:30:29, consisting of approximately 260 million data points. (2024-03-30)

Categories:: Artificial Intelligence
Education and Learning Technologies
Wireless Networking
Discrete-time signal processing
Machine Learning
Security
Computational Intelligence

20 Views

Dataset for Spectrum Coexistence in Passive Sensing and Wireless Communication

In our ever-expanding world of advanced satellite and communications systems, there's a growing challenge for passive radiometer sensors used in the Earth observation like 5G. These passive sensors are challenged by risks from radio frequency interference (RFI) caused by anthropogenic signals. To address this, we urgently need effective methods to quantify the impacts of 5G on Earth observing radiometers. Unfortunately, the lack of substantial datasets in the radio frequency (RF) domain, especially for active/passive coexistence, hinders progress.

Categories:: Artificial Intelligence
Machine Learning
Sensors
Geoscience and Remote Sensing
Communications

436 Views

ThangkaDatasets_82688

In Tibetan culture, Thangka art holds a significant status and is known as the "Encyclopedia of the Tibetan culture". This unique art form of painting encompasses multiple aspects of Tibetan history, politics, culture, and social life, and serves as precious material for studying Tibetan culture. Considering the lack of publicly available Thangka dataset, We constructed a thangka image super-resolution dataset in our research.

Categories:: Artificial Intelligence

16 Views

Unified Spacecraft Anomaly Detection Benchmark Dataset

Anomaly detection plays a crucial role in various domains, including but not limited to cybersecurity, space science, finance, and healthcare. However, the lack of standardized benchmark datasets hinders the comparative evaluation of anomaly detection algorithms. In this work, we address this gap by presenting a curated collection of preprocessed datasets for spacecraft anomalies sourced from multiple sources. These datasets cover a diverse range of anomalies and real-world scenarios for the spacecrafts.

Categories:: Artificial Intelligence
Signal Processing
Digital signal processing
Machine Learning
Reliability
Sensors
Standards Research Data
Communications

753 Views

A Labelled Dataset for Sentiment Analysis of Videos on YouTube, TikTok, and other Sources about the 2024 Outbreak of Measles

To access this dataset without purchasing an IEEE Dataport subscription, please visit: https://zenodo.org/doi/10.5281/zenodo.11711229

Please cite the following paper when using this dataset:

Categories:: Artificial Intelligence
Machine Learning
Other
Standards Research Data
Biomedical and Health Sciences
Communications
Computational Intelligence
COVID-19
Demographic
Health

801 Views

Garbage Image Dataset

The Garbage Image Dataset consists of images of garbage items collected from nearby localities using smartphones. The dataset is categorized into five different classes. Each category represents a specific type of garbage item commonly found in everyday waste. The purpose of the Garbage Image Dataset is to provide a collection of labelled images of garbage items from different categories. The dataset can be used to train and evaluate deep learning models for garbage classification tasks.

Categories:: Artificial Intelligence

715 Views

LIAR2

The LIAR dataset has been widely followed by fake news detection researchers since its release, and along with a great deal of research, the community has provided a variety of feedback on the dataset to improve it. We adopted these feedbacks and released the LIAR2 dataset, a new benchmark dataset of ~23k manually labeled by professional fact-checkers for fake news detection tasks.

Categories:: Artificial Intelligence

177 Views

Maldeb Dataset

Image representation of Malware-benign dataset. The Dataset were compiled from various sources malware repositories: The Malware-Repo, TheZoo,Malware Bazar, Malware Database, TekDefense. Meanwhile benign samples were sourced from system application of Microsoft 10 and 11, as well as open source software repository such as Sourceforge, PortableFreeware, CNET, FileForum. The samples were validated by scanning them using Virustotal Malware scanning services. The Samples were pre-processed by transforming the malware binary into grayscale images following rules from Nataraj (2011).

Categories:: Artificial Intelligence
Machine Learning
Security

332 Views

MANUU: Handwritten Urdu OCR Dataset

The "MANUU: Handwritten Urdu OCR Dataset" is an extensive and meticulously curated collection to advance OCR (Optical Character Recognition) for handwritten Urdu letters, digits, and words. The compilation of the dataset has been conducted methodically, ensuring that it encompasses a wide variety of handwritten instances. This comprehensive collection enables the construction and assessment of strong models for Optical Character Recognition (OCR) systems specifically designed for the complexities of the Urdu script.

Categories:: Artificial Intelligence
Image Processing

543 Views

Artificial Intelligence

Artificial Intelligence

Pages