Image Processing
DLSF is the first dataset dedicated to Text-Image Synchronization Forgery (TISF) in multimodal media. Its source data was scraped from the Chinese news aggregation platform Toutiao. The dataset includes extensive text, image, and audio-video data from news articles about politicians and celebrities, with samples of both entity-level and attribute-level TISF. Its annotations cover text-image authenticity labels, TISF types, image forgery regions, and text forgery tokens.
Randomness in moiré pattern generation limits and complicates the training of deep learning models. To address this, we constructed the SynMoiré dataset by generating moiré images synthetically. The construction process resamples the original images into an RGB sub-pixel format, then applies random projective transformations, radial distortion, and Gaussian filtering to simulate camera effects.
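The construction steps named above can be illustrated with a minimal NumPy sketch. It is not the SynMoiré authors' pipeline: the function names, the 3x3 sub-pixel cell layout, the nearest-neighbour sampling, and all parameter ranges are assumptions chosen only to make the sequence of operations concrete.

```python
import numpy as np

def to_subpixel_grid(img):
    """Resample an H x W x 3 image onto a 3H x 3W RGB sub-pixel mosaic (assumed layout)."""
    h, w, _ = img.shape
    out = np.zeros((3 * h, 3 * w, 3), dtype=np.float64)
    for c in range(3):
        # Each source pixel becomes a 3x3 cell; channel c lights column c of the cell.
        out[:, c::3, c] = np.repeat(img[:, :, c], 3, axis=0)
    return out

def warp(img, H, k1):
    """Inverse-map pixels through radial distortion k1 and homography H (nearest neighbour)."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    x = (xs - w / 2) / (w / 2)  # normalised coordinates, roughly in [-1, 1]
    y = (ys - h / 2) / (h / 2)
    scale = 1 + k1 * (x**2 + y**2)  # single-coefficient radial model
    pts = np.stack([x * scale, y * scale, np.ones_like(x)], axis=-1) @ H.T
    u = pts[..., 0] / pts[..., 2]
    v = pts[..., 1] / pts[..., 2]
    src_x = np.clip(np.rint(u * w / 2 + w / 2), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(v * h / 2 + h / 2), 0, h - 1).astype(int)
    return img[src_y, src_x]

def gaussian_blur(img, sigma):
    """Separable Gaussian filter; assumes each image side is longer than the kernel."""
    radius = max(1, int(3 * sigma))
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t**2 / (2 * sigma**2))
    k /= k.sum()
    for axis in (0, 1):
        img = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), axis, img)
    return img

def synth_moire(img, rng, sigma=1.0):
    """Chain the three steps with small random perturbations (ranges are assumptions)."""
    grid = to_subpixel_grid(img)
    H = np.eye(3) + rng.uniform(-0.05, 0.05, size=(3, 3))
    H[2, 2] = 1.0
    k1 = rng.uniform(-0.1, 0.1)
    return gaussian_blur(warp(grid, H, k1), sigma)
```

Calling `synth_moire(img, np.random.default_rng(0))` on a float image with values in [0, 1] returns an array three times larger along each spatial axis. Any further resampling to a camera sensor grid is left out, since the description does not specify it.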
Doppler time-of-flight (Do-ToF) imaging has recently attracted significant attention for its ability to measure radial velocity at high resolution. However, when the back-reflected signal received by a pixel switches between a moving object and a stationary background, edge artifacts appear in the velocity images. To address this issue, we propose a per-pixel gradient-based method for identifying and correcting these artifacts.
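As a rough illustration of what a per-pixel, gradient-based detect-and-correct step can look like, the sketch below flags pixels whose velocity differs sharply from a direct neighbour and replaces them with a local median. This is a generic stand-in, not the method proposed in the abstract: the four-neighbour difference, the 3x3 median, the function name `correct_edge_artifacts`, and the `threshold` parameter are all assumptions.

```python
import numpy as np

def correct_edge_artifacts(velocity, threshold):
    """Flag high-gradient pixels in a 2D velocity map and replace them with a 3x3 median."""
    h, w = velocity.shape
    padded = np.pad(velocity, 1, mode="edge")
    centre = padded[1:-1, 1:-1]
    # Per-pixel gradient proxy: largest absolute difference to the four direct neighbours.
    diffs = [
        np.abs(centre - padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w])
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
    ]
    mask = np.max(diffs, axis=0) > threshold
    # A 3x3 median removes isolated spikes while keeping straight step edges in place.
    windows = [padded[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    median = np.median(windows, axis=0)
    return np.where(mask, median, velocity), mask
```

The median is used here because it leaves a clean boundary between a moving object and the background unchanged while suppressing outlier values along that boundary. A real Do-ToF pipeline would likely need a correction tailored to the sensor's signal model.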
To evaluate SARNet’s generalization, we captured a real-world stereo dataset in Guangzhou using a binocular camera. The dataset includes diverse urban and natural scenes to assess SARNet’s performance beyond synthetic and benchmark datasets. Fig. 7 illustrates SARNet’s predictions on real-world scenes, KITTI 2012, and KITTI 2015. Experimental results demonstrate that SARNet generates clear and consistent disparity maps across both smooth and complex regions, highlighting its robustness in real-world depth estimation tasks.
The Aerial Fire and Smoke Essential (AFSE) dataset is a new, small aerial flame dataset composed of screenshots from various YouTube wildfire videos and images from FLAME2. It has two object categories: smoke and fire. Most of the images are taken from aerial viewpoints. The dataset contains 282 images without augmentation, covering images with only smoke, images with both fire and smoke, and images with neither.
The Ancient Handwritten Devanagari Documents dataset is a curated collection of historical manuscripts written in the Devanagari script. It comprises digitized images of handwritten texts from various periods, containing diverse calligraphic styles, degradations, and linguistic variations. The dataset is designed for research in optical character recognition (OCR), handwritten text recognition (HTR), word spotting in historical documents, and linguistic analysis.
Augmented reality (AR) is a rapidly evolving field, yet research has predominantly focused on indoor applications, leaving outdoor environments relatively underexplored. Metric Depth Estimation (MDE) plays a pivotal role in AR, enabling essential functionalities such as object placement and occlusion handling by extracting depth and perspective information from single 2D images.
This dataset comprises 33,800 images of underwater signals captured in aquatic environments. Each signal is presented against three types of backgrounds: pool, marine, and plain white. Additionally, the dataset includes three water tones: clear, blue, and green. A total of 12 different signals are included, each available in all six possible background-tone combinations.