Image Processing

The AMD3IR dataset is a large-scale collection of Shortwave Infrared (SWIR) and Longwave Infrared (LWIR) images, designed to advance research in drone detection and tracking. It addresses key challenges such as detecting and distinguishing small airborne objects, separating drones from background clutter, and overcoming the visibility limitations of conventional imaging. The dataset comprises 20,865 SWIR images with 24,994 annotated drones and 8,696 LWIR images with 10,400 annotated drones, covering various UAV models.

DLSF is the first dedicated dataset for Text-Image Synchronization Forgery (TISF) in multimodal media. The source data for this dataset is scraped from the Chinese news aggregation platform, Toutiao. This dataset includes extensive text, image, and audio-video data from news articles involving politicians and celebrities, featuring samples of both entity-level and attribute-level TISF. It provides comprehensive annotations, including labels for text-image authenticity, types of TISF, image forgery regions, and text forgery tokens.

The randomness of moiré pattern generation limits and complicates the training of deep learning models. To address this, we constructed the SynMoiré dataset by synthesizing moiré images. The construction process resamples the original images into an RGB sub-pixel format, then applies random projective transformations, radial distortions, and Gaussian filtering to simulate camera effects.
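
As a rough illustration of two of the steps named above (sub-pixel resampling and radial distortion), the following is a minimal sketch only, not the SynMoiré authors' pipeline. The function names, the 3x3 vertical-stripe layout, the one-parameter distortion model with coefficient `k1`, and nearest-neighbour sampling are all assumptions made for this example; the projective transformation and Gaussian filtering steps are omitted.

```python
import math


def to_subpixel_grid(image):
    """Expand each RGB pixel into a 3x3 block of vertical R, G, B stripes.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    The output is three times larger in both directions, and every
    column carries only one colour channel (an assumed LCD-like layout).
    """
    out = []
    for row in image:
        expanded = []
        for r, g, b in row:
            expanded.extend([(r, 0, 0), (0, g, 0), (0, 0, b)])
        # Repeat the expanded row three times to keep the aspect ratio.
        out.extend([list(expanded) for _ in range(3)])
    return out


def radial_distort(image, k1):
    """Resample `image` with an assumed one-parameter radial model.

    Each output pixel reads from the source position scaled by
    (1 + k1 * r^2), where r is the normalised distance from the centre.
    Samples falling outside the image are filled with black.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    norm = math.hypot(cx, cy) or 1.0  # avoid division by zero for 1x1 input
    out = []
    for y in range(h):
        new_row = []
        for x in range(w):
            dx, dy = (x - cx) / norm, (y - cy) / norm
            scale = 1.0 + k1 * (dx * dx + dy * dy)
            sx = int(round(cx + dx * scale * norm))
            sy = int(round(cy + dy * scale * norm))
            if 0 <= sx < w and 0 <= sy < h:
                new_row.append(image[sy][sx])
            else:
                new_row.append((0, 0, 0))
        out.append(new_row)
    return out


if __name__ == "__main__":
    tiny = [[(200, 100, 50), (10, 20, 30)]]
    grid = to_subpixel_grid(tiny)        # 3 rows x 6 columns
    warped = radial_distort(grid, 0.2)   # same size, warped sampling
    print(len(grid), len(grid[0]), len(warped), len(warped[0]))
```

In a sketch like this, the moiré effect would come from later resampling the striped grid at a resolution that does not align with the stripe period; that step depends on the camera simulation and is not shown.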

Doppler time-of-flight (Do-ToF) imaging has recently attracted significant attention due to its high-resolution capabilities for measuring radial velocity. However, a challenge arises when the back-reflected signal received by pixels switches between a moving object and a stationary background, leading to the appearance of edge artifacts in the velocity images. To address this issue, we propose a per-pixel gradient-based method for identifying and correcting these artifacts.
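
The abstract does not describe the per-pixel gradient-based method in detail, so the following is only a hypothetical sketch of what gradient-based artifact handling on a velocity image could look like. It assumes an artifact shows up as an isolated spike, meaning a pixel whose backward and forward differences are both large and of opposite sign, and it replaces flagged pixels with the median of unflagged neighbours. The function names, the spike criterion, the `threshold` parameter, and the median correction are illustrative assumptions, not the authors' algorithm.

```python
from statistics import median


def is_spike(prev, cur, nxt, threshold):
    """True when `cur` jumps away from both neighbours and back again."""
    back, fwd = cur - prev, nxt - cur
    return abs(back) > threshold and abs(fwd) > threshold and back * fwd < 0


def flag_artifacts(velocity, threshold):
    """Return a boolean mask marking spike pixels along rows or columns."""
    h, w = len(velocity), len(velocity[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            v = velocity[y][x]
            horiz = 0 < x < w - 1 and is_spike(
                velocity[y][x - 1], v, velocity[y][x + 1], threshold)
            vert = 0 < y < h - 1 and is_spike(
                velocity[y - 1][x], v, velocity[y + 1][x], threshold)
            mask[y][x] = horiz or vert
    return mask


def correct_artifacts(velocity, mask):
    """Replace flagged pixels with the median of unflagged 3x3 neighbours."""
    h, w = len(velocity), len(velocity[0])
    out = [row[:] for row in velocity]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            good = [velocity[j][i]
                    for j in range(max(y - 1, 0), min(y + 2, h))
                    for i in range(max(x - 1, 0), min(x + 2, w))
                    if not mask[j][i]]
            if good:  # leave the pixel unchanged if no clean neighbour exists
                out[y][x] = median(good)
    return out


if __name__ == "__main__":
    # Stationary background (0.0), object moving at 1.0, spike at the edge.
    row = [0.0, 0.0, 0.0, 9.0, 1.0, 1.0, 1.0]
    velocity = [row[:] for _ in range(3)]
    mask = flag_artifacts(velocity, threshold=2.0)
    print(correct_artifacts(velocity, mask)[1])
```

A true step edge between background and object is left untouched by this criterion, because only one of its two one-sided differences is large.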

To evaluate SARNet’s generalization, we captured a real-world stereo dataset in Guangzhou using a binocular camera. The dataset includes diverse urban and natural scenes to assess SARNet’s performance beyond synthetic and benchmark datasets. Fig. 7 illustrates SARNet’s predictions on real-world scenes, KITTI 2012, and KITTI 2015. Experimental results demonstrate that SARNet generates clear and consistent disparity maps across both smooth and complex regions, highlighting its robustness in real-world depth estimation tasks.

The Aerial Fire and Smoke Essential (AFSE) dataset is a new, small aerial flame dataset composed of screenshots from various YouTube wildfire videos and images from FLAME2. It includes two object categories: smoke and fire. Most of the images are taken from aerial viewpoints. The dataset contains 282 images in total, with no augmentations, and combines images showing only smoke, both fire and smoke, and neither fire nor smoke.

The Ancient Handwritten Devanagari Documents dataset is a curated collection of historical manuscripts written in the Devanagari script. It comprises digitized images of handwritten texts from various periods, covering diverse calligraphic styles, degradations, and linguistic variations. The dataset is designed for research in optical character recognition (OCR), handwritten text recognition (HTR), word spotting in historical documents, and linguistic analysis.

Augmented reality (AR) is a rapidly evolving field, yet research has predominantly focused on indoor applications, leaving outdoor environments relatively underexplored. Metric Depth Estimation (MDE) plays a pivotal role in AR, enabling essential functionalities such as object placement and occlusion handling by extracting depth and perspective information from single 2D images.  
