Standard Dataset
Document image binarization result of manuscript "Binarization of Unevenly Illuminated Document Images Based on Cloth Simulation Filter"
- Submitted by: Shuhang Zhang
- Last updated: Wed, 11/13/2024 - 22:22
- DOI: 10.21227/b9f7-4s92
Abstract
Photographs have become a cost-effective solution for digitizing real-world data, especially documents. The rapid expansion of multimedia information has significantly increased the demand for efficient document image processing. A binarization method that is robust to uneven illumination is crucial for enhancing information retrieval, reducing file size, and improving input quality for downstream applications such as OCR and artificial intelligence under challenging lighting conditions. By treating a document image with uneven illumination as a terrain point cloud in 3D space, we transform the problem of handling an unevenly lit background into a ground filtering problem from the field of point cloud processing. We propose a method based on the cloth simulation filter (CSF), a point cloud processing method, for binarizing unevenly illuminated document images.
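The idea of viewing an image as a terrain point cloud can be illustrated with a minimal sketch. It assumes a grayscale image held in a NumPy array and maps each pixel to a point whose height is its intensity. The function name `image_to_point_cloud` and the choice of raw intensity as the z-coordinate are illustrative assumptions, not the manuscript's implementation, and the cloth simulation step itself is not shown.

```python
import numpy as np


def image_to_point_cloud(gray):
    """Map a 2D grayscale array to an (H*W, 3) array of (x, y, z) points.

    Assumption for illustration: z is simply the pixel intensity, so the
    slowly varying illumination forms the "terrain" of the surface.
    """
    gray = np.asarray(gray, dtype=np.float64)
    if gray.ndim != 2:
        raise ValueError("expected a 2D grayscale image")
    h, w = gray.shape
    # Row indices become y, column indices become x.
    ys, xs = np.mgrid[0:h, 0:w]
    return np.column_stack([xs.ravel(), ys.ravel(), gray.ravel()])
```

A ground filtering method would then operate on these points to estimate the background surface; how that estimate is turned into a binary image is described in the manuscript, not here.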
This dataset contains the binarization results reported in the manuscript "Binarization of Unevenly Illuminated Document Images Based on Cloth Simulation Filter". The compared methods are Bradley, ISauvola, Su, T.R. Singh, Wolf, Calvo-Z, DE-GAN, ColDBin, DocDiff, GDB, visibility detection (Kligler et al. 2018), and CSF (proposed). All methods were applied to the WEZUT OCR dataset and the DocEng 2021 dataset.
Results from each method are stored separately in individual sub-folders.
Dataset Files
- binarization_results_WEZUT_OCR.zip (136.54 MB)
- binarization_results_DocEng_2021.zip (142.51 MB)
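Since the results are organized in per-method sub-folders, a quick way to inspect a downloaded archive is to count its files per folder. The sketch below uses only the Python standard library; the helper name `count_files_per_folder` is hypothetical, and the internal folder names are an assumption because the archive layout is not documented here.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_files_per_folder(zip_path):
    """Return {folder: number_of_files} for every folder in the archive."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # skip explicit directory entries
            # Zip member names always use forward slashes.
            parent = PurePosixPath(info.filename).parent
            counts[str(parent)] += 1
    return dict(counts)
```

For example, `count_files_per_folder("binarization_results_WEZUT_OCR.zip")` would report how many result images each sub-folder holds, assuming the file has been downloaded to the working directory.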