Artificial Intelligence

A dataset for detecting knowledge conflicts.
The dataset contains 90 groups of natural-language sentences with contradictions and 10 groups without. Each group contains 5 sentences, usually 3 identical questions and 2 declarative sentences. An agent evaluated on the dataset should accurately detect the contradictory statements.
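As a rough illustration of how detection on such groups might be scored, the sketch below computes plain accuracy over labelled groups. The `SentenceGroup` type, its field names, and the `always_flag` baseline are hypothetical and are not part of the dataset's published format; the placeholder sentences stand in for real data.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SentenceGroup:
    # Hypothetical container: five sentences per group, per the description.
    sentences: List[str]
    # Gold label: True if the group contains a contradiction.
    has_contradiction: bool


def detection_accuracy(
    groups: List[SentenceGroup],
    detector: Callable[[List[str]], bool],
) -> float:
    """Fraction of groups whose predicted label matches the gold label."""
    if not groups:
        raise ValueError("no groups to evaluate")
    correct = sum(detector(g.sentences) == g.has_contradiction for g in groups)
    return correct / len(groups)


def always_flag(sentences: List[str]) -> bool:
    """Trivial baseline that reports a contradiction for every group."""
    return True


# Placeholder groups mirroring the stated 90 / 10 split (not real data).
toy_groups = (
    [SentenceGroup(["placeholder"] * 5, True) for _ in range(90)]
    + [SentenceGroup(["placeholder"] * 5, False) for _ in range(10)]
)
baseline_accuracy = detection_accuracy(toy_groups, always_flag)
```

Because of the 90 / 10 split, a detector that flags every group already scores 0.9 on this measure, so per-class results are likely more informative than overall accuracy.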


Please cite the following paper when using this dataset:

Vanessa Su and Nirmalya Thakur, “COVID-19 on YouTube: A Data-Driven Analysis of Sentiment, Toxicity, and Content Recommendations”, Proceedings of the IEEE 15th Annual Computing and Communication Workshop and Conference 2025, Las Vegas, USA, Jan 06-08, 2025 (Paper accepted for publication, Preprint: https://arxiv.org/abs/2412.17180).


Dataset for the paper "SynEL: A Synthetic Benchmark for Entity Linking". The dataset integrates structured information from two primary sources: DBpedia for English, representing a high-resource language environment, and the Russian Public Company Register, a challenging low-resource dataset. Each dataset includes extensive annotations and structured entity links, ensuring high relevance for real-world applications in diverse industries.


The availability of multiple large open-source datasets has driven rapid progress in action recognition. However, annotated data for civil aircraft pilots is lacking, and the distribution of pilots' actions can differ substantially from that of everyday activities. After discussions with experienced pilots and experts, and a close look at standard operating procedures, we present the Airline-Pilot-Action (APA) benchmark, which contains 5090 RGB and depth images together with the corresponding flight computer data.


M. Kacmajor and J.D. Kelleher, "ExTra: Evaluation of Automatically Generated Source Code Using Execution Traces" (submitted to IEEE TSE)



The Ogbn-Arxiv dataset (Arxiv for short) represents an academic citation network. In this network, papers serve as nodes, citations between papers form edges, and paper abstracts constitute the textual attributes. The primary task is subject prediction for papers. We use the publicly available partitions, ground-truth labels, and textual data from OGB.


This dataset represents a user interaction network from Reddit, where individual users are represented as nodes. The network connections (edges) are established when users interact through replies. Each node contains features derived from the user's subreddit posting history. The classification goal is to identify users within the top 50% popularity bracket, based on their subreddit score averages.
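The construction described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the dataset's actual preprocessing: the function names, the use of the median as the top-50% cutoff, the choice of undirected edges, and the sample values are all hypothetical.

```python
from statistics import median
from typing import Dict, Iterable, Set, Tuple


def label_popular_users(avg_scores: Dict[str, float]) -> Dict[str, int]:
    """Label a user 1 if their average score is above the median, else 0.

    Assumption: "top 50%" is approximated by a strict median cutoff.
    """
    cutoff = median(avg_scores.values())
    return {user: int(score > cutoff) for user, score in avg_scores.items()}


def build_reply_edges(replies: Iterable[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Collapse (replier, author) pairs into undirected edges.

    Self-replies are dropped and each pair is stored once in sorted order.
    """
    return {tuple(sorted(pair)) for pair in replies if pair[0] != pair[1]}


# Toy values for illustration only (not taken from the dataset).
avg_scores = {"u1": 12.0, "u2": 3.5, "u3": 7.0, "u4": 1.0}
replies = [("u1", "u2"), ("u2", "u1"), ("u3", "u4"), ("u4", "u4")]

labels = label_popular_users(avg_scores)
edges = build_reply_edges(replies)
```

With a median cutoff, the two classes are balanced by construction whenever the scores are distinct, which matches the "top 50%" framing of the task.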


This dataset comprises a social media network in which user accounts are nodes and follower relationships are edges. The initial dataset is from [1]; this work adds graph structure information obtained through Instagram's public API. The main objective is to distinguish between commercial and regular user accounts.

