Image Processing

The steel tube dataset comprises comprehensive information on various attributes related to steel tubes, encompassing dimensions, material composition, manufacturing processes, and performance characteristics. This dataset facilitates in-depth analysis of steel tube properties, aiding researchers, engineers, and industry professionals in optimizing designs, ensuring structural integrity, and advancing materials science in the context of steel tube applications.


The ITM-HDR-VQA dataset is a video quality assessment dataset for inversely tone-mapped videos. It contains 200 HDR10 videos together with their mean opinion scores (MOS).
We captured videos of 20 typical HDR scenes, including daylight scenes containing both sunlit areas and deep shadows, and night scenes lit by artificial lights. The contents of these scenes fall roughly into two categories: man-made architecture and natural scenery.


(Under Construction) The IAMCV Dataset was acquired as part of the FWF Austrian Science Fund-funded "Interaction of Autonomous and Manually-Controlled Vehicles" project. It is primarily centered on inter-vehicle interactions and captures a wide range of road scenes in different locations across Germany, including roundabouts, intersections, and highways. These locations were carefully selected to encompass various traffic scenarios, representative of both urban and rural environments.


Automatic white balance (AWB) is an important module for color constancy in cameras. Classifying normal images versus color-distorted images is critical to realizing intelligent AWB. One tenth of ImageNet is utilized as the normal-image dataset for training, validation, and testing. The distorted dataset is constructed with the proposed theory for generating color distortion. To generate a variety of color distortions, histogram shifting and histogram matching are proposed to randomly adjust the position or shape of the histogram.
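The description does not specify the exact shifting procedure, but the idea of histogram shifting can be sketched as adding a random per-channel offset, which slides that channel's histogram along the intensity axis and produces a global color cast. The function and parameter names below are illustrative, not from the dataset's code:

```python
import numpy as np

def shift_channel_histogram(img, channel, shift):
    """Shift one color channel's histogram by a constant offset.

    Moving the histogram along the intensity axis produces a global
    color cast, a simple form of color distortion. (Illustrative
    sketch; the dataset's exact generation procedure may differ.)
    """
    out = img.astype(np.int16)
    out[..., channel] += shift
    return np.clip(out, 0, 255).astype(np.uint8)

# Example: add a red cast by shifting the R channel of a neutral gray image.
rng = np.random.default_rng(0)
img = np.full((4, 4, 3), 128, dtype=np.uint8)          # neutral gray patch
distorted = shift_channel_histogram(img, channel=0,
                                    shift=int(rng.integers(20, 60)))
```

Histogram matching (reshaping a channel's histogram toward a random target distribution) would change the histogram's shape rather than just its position, covering a complementary class of distortions.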


The STP dataset is a dataset for Arabic text detection on traffic panels in the wild. It was collected in "Sfax", Tunisia, the second-largest Tunisian city after the capital. A total of 506 images were gathered manually, one by one, with each image presenting the challenges of Arabic text detection in natural scene images, reflecting the real complexity of 15 different routes as well as ring roads, roundabouts, intersections, an airport, and highways.


This synthetic dataset, or phantom, consists of three JPG-format databases in the two-dimensional (2-D) domain, identified as follows:


DB1: Ground Truth


DB2: Speckle noise with zero mean and 0.005 standard deviation


DB3: Speckle noise with zero mean and 0.05 standard deviation
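DB2 and DB3 differ only in the strength of the multiplicative speckle noise applied to the DB1 ground truth. A minimal sketch of that generation step, assuming the common model out = img + img * n with n drawn from a Gaussian (the source gives standard deviations of 0.005 and 0.05; note that some toolboxes parameterize speckle by variance instead):

```python
import numpy as np

def add_speckle(img, mean=0.0, std=0.005, seed=0):
    """Add multiplicative speckle noise: out = img + img * n, n ~ N(mean, std).

    std=0.005 and std=0.05 correspond to DB2- and DB3-style noise levels.
    Input is assumed normalized to [0, 1].
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(mean, std, img.shape)
    return np.clip(img + img * noise, 0.0, 1.0)

phantom = np.full((8, 8), 0.5)             # stand-in for a DB1 ground-truth image
db2_like = add_speckle(phantom, std=0.005)
db3_like = add_speckle(phantom, std=0.05)
```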



This work presents a large-scale, three-fold-annotated, low-cost microscopy image dataset of potato tubers for plant cell analysis within a deep learning (DL) framework, with huge potential for advancing plant cell biology research. Indeed, low-cost microscopes coupled with new-generation smartphones could open new avenues in DL-based microscopy image analysis, offering several benefits including portability, ease of use, and low maintenance.


FaceEngine is a face recognition database for use in CCTV-based video surveillance systems. This dataset contains high-resolution face images of around 500 celebrities, as well as images captured by CCTV cameras. Each person's folder contains more than 10 images of that person. Face features can be extracted from this database, and test videos are included for evaluating a complete system. Each unique ID provides high-resolution images that can be used to test a CCTV surveillance system or to train a face detection model.


Low-light images and video footage often exhibit issues due to the interplay of parameters such as aperture, shutter speed, and ISO settings. These interactions can lead to distortions, especially in extreme lighting conditions. The distortion is primarily caused by the inverse relationship between light intensity and relative photon noise: as light decreases, the noise becomes more prominent and is further amplified by higher sensor gain. Additionally, secondary characteristics like white balance and color rendition can be adversely affected and may require post-processing correction.
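The relationship between photon count, sensor gain, and noise can be illustrated with a simple shot-noise model (an assumption for illustration, not the authors' method): photon arrivals are Poisson-distributed, so the relative noise grows as the number of captured photons falls, and any gain applied afterward amplifies signal and noise alike.

```python
import numpy as np

def simulate_capture(scene, photons_at_full_scale, gain=1.0, seed=0):
    """Simulate photon (shot) noise for a normalized scene in [0, 1].

    Fewer photons at full scale models a darker capture; Poisson
    statistics then give a lower signal-to-noise ratio, and `gain`
    (an ISO-like boost) scales signal and noise together.
    """
    rng = np.random.default_rng(seed)
    photons = rng.poisson(scene * photons_at_full_scale)
    return np.clip(photons * gain / photons_at_full_scale, 0.0, 1.0)

scene = np.full((64, 64), 0.5)                       # flat mid-gray scene
bright = simulate_capture(scene, photons_at_full_scale=10000)
dark = simulate_capture(scene, photons_at_full_scale=100)
# The dark capture is visibly noisier: its pixel standard deviation is
# roughly 10x that of the bright capture (SNR scales with sqrt(photons)).
```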


The dataset consists of six .mat files containing three surveillance video test sequences, Hall_qcif_330 (Hall, 330 frames), PETS2009_S1L1-View_001 (PETS, 100 frames), and Crosswalk (CW, 270 frames), and the corresponding background image for each of the three videos (only each video's gray-channel component is provided). Hall is shot indoors and disturbed by noise, PETS is shot outdoors with less noise, and CW is shot outdoors with heavy noise interference. Hall and PETS are two foreground-sparse videos with small objects. CW is a foreground-dense video with dramatic changes in sparsity. All the video
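Since each sequence ships with a background image, a minimal use of the data is frame-differencing foreground detection. The description does not give the variable names inside the .mat files, so this sketch operates on synthetic gray-channel arrays standing in for a frame and its background:

```python
import numpy as np

def foreground_mask(frame, background, threshold=25):
    """Detect foreground by thresholding |frame - background| per pixel.

    Operates on single-channel (gray) frames, matching the dataset,
    which provides only each video's gray-channel component.
    """
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold

background = np.full((72, 88), 100, dtype=np.uint8)   # stand-in background image
frame = background.copy()
frame[30:40, 40:50] = 200                             # a small bright object
mask = foreground_mask(frame, background)             # True only on the object
```

Frame differencing is the simplest baseline; the foreground-sparse (Hall, PETS) versus foreground-dense (CW) split makes the sequences useful for testing more robust low-rank/sparse background-subtraction methods as well.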