
Computer Vision

The visual sensor captures images of the crane loading operation scene while the motion control commands are simultaneously collected from the crane's operational control end. A neural network model is trained in an end-to-end manner to predict the crane's motion control commands from the captured images.
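The end-to-end mapping described above (scene image in, control commands out) can be sketched as a tiny forward pass. This is a minimal illustration only, not the authors' model: the frame size (64x64), hidden width, number of commands, and the randomly initialised weights standing in for a trained network are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 64x64 grayscale frame mapped to 3 control
# commands (e.g. hoist, trolley, and slew velocities).
IMG_PIXELS, HIDDEN, N_COMMANDS = 64 * 64, 32, 3

# Randomly initialised weights stand in for a trained network.
W1 = rng.normal(0.0, 0.01, (IMG_PIXELS, HIDDEN))
W2 = rng.normal(0.0, 0.01, (HIDDEN, N_COMMANDS))

def predict_commands(frame: np.ndarray) -> np.ndarray:
    """Forward pass: flattened frame -> ReLU hidden layer -> command vector."""
    x = frame.reshape(-1) / 255.0   # normalise pixel intensities to [0, 1]
    h = np.maximum(0.0, x @ W1)     # ReLU hidden layer
    return h @ W2                   # raw command outputs

frame = rng.integers(0, 256, (64, 64)).astype(np.float64)
commands = predict_commands(frame)
print(commands.shape)  # (3,)
```

In practice the mapping would be learned by regressing the network outputs against the logged control commands; here the weights are random purely to keep the sketch runnable.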


We release a large-scale endoscopic video dataset covering seven types of intraoperative adverse events (iAEs) across heterogeneous surgical domains. Source domain: Cholec80 is re-annotated for iAE detection from laparoscopic cholecystectomy videos. Target domain: dViAEs comprises robot-assisted colorectal and HPB surgery videos.


ITDAV-25 (Indian Thermal Dataset for Autonomous Vehicles) is a thermal image dataset specifically curated to advance research in Advanced Driver Assistance Systems (ADAS), particularly for environments characterized by low visibility, night-time conditions, and inclement weather. The dataset comprises 13,688 raw thermal images, collected without any synthetic augmentation techniques.


The use of technology in cricket has increased significantly in recent years, leading to overlapping computer vision-based research efforts. This study aims to extract front pitch view shots from cricket broadcasts using deep learning. Front pitch view (FPV) shots capture the ball delivery by the bowler and the stroke played by the batter. FPV shots are valuable for highlight generation, automatic commentary generation, and analysis of bowling and batting techniques. We classify each broadcast video frame as FPV or non-FPV using deep-learning models.


Electrocardiogram (ECG) interpretation is critical for diagnosing a wide range of cardiovascular conditions. To streamline and accelerate the development of deep learning models in this domain, we present a novel, image-based version of the PTB Diagnostic ECG Database tailored for use with convolutional neural networks (CNNs), vision transformers (ViTs), and other image classification architectures. This enhanced dataset consists of 516 grayscale .png images, each representing a 12-lead ECG signal arranged as a 2D matrix (12 × T, where T is the number of time steps).
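The 12 × T image arrangement described above can be sketched with numpy: each of the 12 leads becomes one row of pixels, with amplitudes rescaled to 8-bit grayscale. This is a generic illustration under assumed inputs (a synthetic 12-lead signal of 500 samples), not the dataset's actual conversion pipeline.

```python
import numpy as np

def ecg_to_image(signals: np.ndarray) -> np.ndarray:
    """Scale a (12, T) multi-lead ECG array to an 8-bit grayscale matrix."""
    lo, hi = signals.min(), signals.max()
    scaled = (signals - lo) / (hi - lo)       # normalise amplitudes to [0, 1]
    return (scaled * 255).astype(np.uint8)    # one pixel row per lead

t = np.linspace(0.0, 2.0 * np.pi, 500)
# Phase-shifted sinusoids stand in for real lead recordings.
leads = np.stack([np.sin(t + k) for k in range(12)])
img = ecg_to_image(leads)
print(img.shape, img.dtype)  # (12, 500) uint8
```

Saving such a matrix as a .png gives an image a CNN or ViT can consume directly, which is the convenience the dataset provides.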

We evaluate the performance of our proposed method using four benchmark datasets: MNIST, CIFAR-10, Traffic-sign Recognition (TSR), and Room-occupancy Detection (ROD). Each dataset is divided into training and test sets, with specific proportions as described below. MNIST consists of grayscale images of handwritten digits in 10 distinct classes; it includes 60,000 training images and 10,000 test images, each formatted as a 28x28 pixel grayscale map. CIFAR-10, unlike MNIST, is a dataset of color images.
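The train/test partitioning described above can be sketched as a shuffled index split. The function name and the fixed seed are illustrative assumptions; the MNIST proportions (60,000 train / 10,000 test) come from the text.

```python
import numpy as np

def train_test_split(n_samples: int, n_test: int, seed: int = 0):
    """Shuffle sample indices and hold out the last n_test as the test set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    return idx[:-n_test], idx[-n_test:]

# MNIST-style split: 70,000 images into 60,000 training and 10,000 test.
train_idx, test_idx = train_test_split(70_000, 10_000)
print(len(train_idx), len(test_idx))  # 60000 10000
```

Because the permutation covers every index exactly once, the two subsets are disjoint and exhaustive, which is what a held-out evaluation requires.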

The CIFAR-10 and CIFAR-100 datasets comprise 32x32-pixel color images in 10 and 100 categories, respectively.

The Tiny-ImageNet dataset consists of 200 categories with 120,000 samples in total: each class contains 500 training images, 50 validation images, and 50 test images, with each image sized at 64x64.


The Tiny-ImageNet dataset contains 200 categories and approximately 120,000 samples. The CIFAR-10 and CIFAR-100 datasets contain 10 and 100 categories, respectively.

All experiments were conducted on a server equipped with two NVIDIA A100 GPUs (each with 80GB of memory), running Ubuntu 20.04 with CUDA 11.8 and the PyTorch 1.8 framework. The server has 256GB of memory and is powered by a 64-core Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz.


To promote the development of camouflaged object detection technology, a visible-infrared artificial camouflage dataset (VIAC) is constructed. To simulate and replicate real-world scenarios, we customize and procure a set of metal models and camouflage materials to construct artificial camouflage environments. Using DJI drones equipped with a dual-mode (visible and infrared) imaging system, we conduct coordinated aerial photography in complex outdoor settings, acquiring 1,500 pairs of high-quality visible and infrared images.


HPGEN is a synthetic image dataset produced with a generative model that provides explicit control over head pose. It offers a promising solution to dataset bias in head pose estimation, as current benchmarks suffer from a limited number of images, imbalanced data distributions, high annotation costs, and ethical concerns.
