Artificial Intelligence
Please cite the following paper when using this dataset:
Vanessa Su and Nirmalya Thakur, “COVID-19 on YouTube: A Data-Driven Analysis of Sentiment, Toxicity, and Content Recommendations”, Proceedings of the IEEE 15th Annual Computing and Communication Workshop and Conference 2025, Las Vegas, USA, Jan 06-08, 2025 (Paper accepted for publication, Preprint: https://arxiv.org/abs/2412.17180).
Abstract:
- Categories:
Dataset for the paper "SynEL: A Synthetic Benchmark for Entity Linking". The dataset integrates structured information from two primary sources: DBpedia for English, representing a high-resource language environment, and the Russian Public Company Register, a challenging low-resource dataset. Each dataset includes extensive annotations and structured entity links, ensuring high relevance for real-world applications in diverse industries.
Action recognition has advanced rapidly thanks to multiple large open-source datasets. However, we noticed a lack of annotated data on civil aircraft pilots, whose actions can be distributed very differently from everyday casual activities. After discussions with experienced pilots and experts and a close look at standard operating procedures, we present the Airline-Pilot-Action (APA) benchmark, containing 5090 RGB and depth images together with corresponding flight computer data.
M. Kacmajor and J.D. Kelleher, "ExTra: Evaluation of Automatically Generated Source Code Using Execution Traces" (submitted to IEEE TSE)
The Ogbn-Arxiv dataset (Arxiv for short) represents an academic citation network. In this network, papers serve as nodes, citations between papers form edges, and paper abstracts constitute the textual attributes. The primary task is to predict the subject of each paper. We use the publicly available partitions, ground truth labels, and textual data from OGB.
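The structure described above can be pictured with a small sketch. It is only an illustration of a text-attributed citation graph, not the OGB loader or the actual data: the `Paper` class, the toy abstracts, the subject labels, and the split are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Paper:
    abstract: str  # textual attribute attached to the node
    subject: str   # ground-truth label the task predicts


# Hypothetical toy nodes; the real abstracts and labels come from OGB.
papers = {
    0: Paper("A study of message passing on graphs.", "cs.LG"),
    1: Paper("Parsing sentences with neural models.", "cs.CL"),
    2: Paper("Graph learning applied to text corpora.", "cs.LG"),
}

# Each edge (citing, cited) records one citation between two papers.
citations = [(2, 0), (2, 1), (1, 0)]

# Adjacency list: which papers does each paper cite?
cites = defaultdict(list)
for citing, cited in citations:
    cites[citing].append(cited)

# Hypothetical partition of node ids into train/validation/test sets.
split = {"train": [0], "valid": [1], "test": [2]}
```

A model for this task would read the abstract of each node, optionally aggregate information along the citation edges, and be evaluated on the subjects of the held-out nodes.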
This dataset represents a user interaction network from Reddit, where individual users are represented as nodes. The network connections (edges) are established when users interact through replies. Each node contains features derived from the user's subreddit posting history. The classification goal is to identify users within the top 50% popularity bracket, based on their subreddit score averages.
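As a rough illustration of the labeling rule described above, the sketch below marks a user as popular when their average score falls in the top half of all users. The user names, the scores, and the `label_popular_users` helper are hypothetical. Using the median as the cutoff, with users exactly at the median labeled as not popular, is an assumption rather than something the dataset description specifies.

```python
from statistics import mean, median


def label_popular_users(scores_by_user):
    """Return {user: 1 if in the top half by average score, else 0}.

    Assumption: the cutoff is the median of the per-user averages, and a
    user exactly at the median is labeled 0.
    """
    averages = {user: mean(scores) for user, scores in scores_by_user.items()}
    cutoff = median(averages.values())
    return {user: int(avg > cutoff) for user, avg in averages.items()}


# Hypothetical toy data: per-user scores across subreddits.
toy_scores = {
    "user_a": [10, 30],   # average 20
    "user_b": [1, 3],     # average 2
    "user_c": [50],       # average 50
    "user_d": [5, 5, 5],  # average 5
}
labels = label_popular_users(toy_scores)
```

With these toy values the cutoff is 12.5, so `user_a` and `user_c` receive label 1 and the other two receive label 0.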
This dataset comprises a social media network in which user accounts function as nodes and follower relationships constitute edges. The initial dataset is from [1], and this work adds the graph structure information through Instagram's public API. The main objective is to distinguish between commercial and regular user accounts.
Jamming devices present a significant threat by disrupting signals from the global navigation satellite system (GNSS), compromising the robustness of accurate positioning. The detection of anomalies within frequency snapshots is crucial to counteract these interferences effectively. A critical preliminary step is the reliable classification of interferences, along with the characterization and localization of the jamming devices.