
Artificial Intelligence

Large Vision-Language Models (LVLMs) struggle with distractions, particularly in the presence of irrelevant visual or textual inputs. This paper introduces the Irrelevance Robust Visual Question Answering (IR-VQA) benchmark to systematically evaluate and mitigate this "multimodal distractibility". IR-VQA targets three key paradigms: irrelevant visual contexts in image-independent questions, irrelevant textual contexts in image-dependent questions, and text-only distractions.


Agriculture is the backbone of Mizoram’s state economy, as the majority of the people depend on agriculture and its allied sectors for their livelihood. According to the 2011 census, more than 50% of the people are still engaged in agriculture and its related activities. Jhum cultivation, or shifting cultivation, is the primary farming pattern in the state. However, this traditional farming method is no longer effective or productive, for reasons such as resource limitations caused by population pressure and a jhum cycle shortened to 3-4 years (the ideal cycle is 14-18 years).


The dataset consists of two primary files: dataset.json and analysis_script.ipynb. The dataset.json file contains structured records of AI-assisted psychological therapy sessions, including emotion recognition, NLP techniques, cognitive behavioral therapy (CBT) patterns, hypnotherapy data, user feedback, and therapy outcomes. The analysis_script.ipynb Jupyter Notebook provides data preprocessing, visualization, and statistical analysis of therapy session outcomes.


The collection includes six publicly available datasets for dynamic graph analysis: UCI captures social interactions among UC Irvine students; Digg records user interactions on the news-sharing website; Email-Eu-core details email communications in a European research institution; ia-contacts-dublin tracks human contacts in Dublin; sx-mathoverflow and sx-askubuntu are two temporal network datasets formed from user activities on the Stack Exchange sites MathOverflow and Ask Ubuntu.
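As a rough illustration of how temporal networks like these are often prepared for dynamic graph analysis, the sketch below splits timestamped edges into equal-width time windows. The function names `parse_edge_lines` and `build_snapshots`, and the whitespace-separated `source target timestamp` line format, are assumptions made for this example and are not taken from the dataset description; the actual files may be organized differently.

```python
def parse_edge_lines(lines):
    """Parse lines of the (assumed) form 'source target timestamp'."""
    edges = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        src, dst, ts = parts
        edges.append((int(src), int(dst), int(ts)))
    return edges


def build_snapshots(edges, num_snapshots):
    """Split (src, dst, timestamp) edges into equal-width time windows."""
    if not edges:
        return []
    t_min = min(t for _, _, t in edges)
    t_max = max(t for _, _, t in edges)
    # Fall back to a width of 1 when all timestamps are identical.
    width = (t_max - t_min) / num_snapshots or 1
    snapshots = [[] for _ in range(num_snapshots)]
    for src, dst, t in edges:
        # Clamp so the latest edge lands in the final window.
        idx = min(int((t - t_min) / width), num_snapshots - 1)
        snapshots[idx].append((src, dst))
    return snapshots


if __name__ == "__main__":
    sample = ["1 2 0", "2 3 5", "3 4 10"]
    for i, snap in enumerate(build_snapshots(parse_edge_lines(sample), 2)):
        print(i, snap)
```

Equal-width windows are only one possible choice; splitting by equal edge counts per snapshot is a common alternative when activity is bursty.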


This work presents a dataset based on multiple network and service metrics (KPIs and KQIs), the latter capturing the end-to-end (E2E) conditions of a video-on-demand service. The dataset also includes an attack scenario in which an attacker injects traffic into the network. In total, there are 3600 samples, collected from 60-second sessions under different configurations of Physical Resource Blocks and cell gain.


Following the setup of previous works [8, 16], we conducted experiments on various bit image restoration tasks. We utilized a dataset of 2000 16-bit images, with training data sourced from SINTEL [37] and FIVE-K [38]. SINTEL is an animated short film dataset containing over 20,000 16-bit lossless images with a resolution of 436 × 1024 pixels. From FIVE-K, we randomly selected images from its 5,000 16-bit natural images for the experiment. The test set includes 8 images randomly chosen from the SINTEL dataset (referred to as


The COVID-19 Vaccine Misinformation Aspects Dataset contains 3,822 English tweets discussing COVID-19 vaccine misinformation, collected from Twitter/X between December 31, 2020, and July 8, 2021. Each tweet is manually annotated and categorized into four distinct misinformation aspects: (1) Vaccine Constituent, (2) Adverse Effects, (3) Agenda-Driven Narratives, and (4) Efficacy and Clinical Trials.
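The four aspects listed above could be represented as a small label enumeration when tallying annotations. This is only a sketch: the integer codes 1 to 4 mirror the numbering in the description, but how the released files actually encode the labels is an assumption, and the names `Aspect` and `aspect_distribution` are hypothetical.

```python
from collections import Counter
from enum import Enum


class Aspect(Enum):
    """Misinformation aspects, numbered as in the dataset description."""
    VACCINE_CONSTITUENT = 1
    ADVERSE_EFFECTS = 2
    AGENDA_DRIVEN_NARRATIVES = 3
    EFFICACY_AND_CLINICAL_TRIALS = 4


def aspect_distribution(labels):
    """Count how many tweets carry each aspect label (assumed integer codes)."""
    counts = Counter(Aspect(label) for label in labels)
    # Report every aspect, including those with zero tweets.
    return {aspect.name: counts.get(aspect, 0) for aspect in Aspect}


if __name__ == "__main__":
    # Made-up label codes for illustration only, not real dataset values.
    print(aspect_distribution([1, 1, 4]))
```

Reporting zero counts explicitly makes class imbalance visible at a glance, which matters when splitting a few thousand tweets across four classes.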


The medical biometric dataset comprises 10,000 records collected from 23 patients, spanning different demographics, biometric profiles, and temporal variations between 2022 and 2023. It was compiled from various hospitals, digital health records, and biometric-enabled healthcare security systems. The dataset includes real-world biometric authentication and clinical profiling scenarios while ensuring compliance with standard medical and biometric data regulations.


Binary classification is the most suitable task considering the common use cases of microcontroller units (MCUs). Numerous datasets for image classification have been proposed. The Visual Wake Words (VWW) dataset, which is derived from the COCO dataset, distinguishes between ‘w/ person’ and ‘w/o person’ and is designed for object detection on MCUs. Binary datasets therefore exist for classification and object detection. However, no binary dataset has been proposed for the semantic segmentation task.
