Standard Dataset
TDC560
- Submitted by: YAN LI
- Last updated: Fri, 04/01/2022 - 22:01
- DOI: 10.21227/9jbk-kx89
Abstract
The TDC560 dataset contains 560 difficult images. Some of them are selected from the testing sets of CTW1500 and TD500; the others were generated by ourselves with text-line annotations. In the selection process, we sort images by the spatial distances between characters and words and pick those with extreme distances. Additionally, to bring hard text-line detection closer to the real world, we enrich the existing image sources with our own data, which has two significant merits: (1) we add many images containing Chinese text, which is relatively scarce in previous benchmarks such as CTW1500; (2) we collect various types of images, such as scene text, design text, and some difficult stylized text. Some visualization results are shown in the paper.
This dataset is intended for arbitrary-shaped text detection. The images are in the TDC560_image folder, and the corresponding annotations are in the TDC560_label_circum folder. The annotations were labeled with the LabelMe tool, with polygon points listed in clockwise order.
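As a rough illustration of how the annotations might be read, the sketch below parses one label file and checks that each polygon's points run clockwise. It assumes the files in TDC560_label_circum are standard LabelMe JSON (a top-level `shapes` list whose entries carry a `points` list of `[x, y]` pairs); that layout is an assumption based on the tool named above, not something confirmed by the dataset page. The function names are hypothetical helpers introduced only for this example.

```python
import json
from pathlib import Path


def signed_area(points):
    """Shoelace formula. In image coordinates (y grows downward),
    a positive result means the points run clockwise on screen."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_clockwise(points):
    """True if the polygon is clockwise in image coordinates."""
    return signed_area(points) > 0


def polygons_from_labelme(data):
    """Extract polygons from a parsed LabelMe-style dict.
    Assumes data["shapes"][i]["points"] is a list of [x, y] pairs."""
    return [
        [(float(x), float(y)) for x, y in shape["points"]]
        for shape in data.get("shapes", [])
    ]


def load_polygons(label_path):
    """Read one annotation file and return its polygons."""
    with Path(label_path).open(encoding="utf-8") as f:
        return polygons_from_labelme(json.load(f))
```

A typical call would be `load_polygons("TDC560_label_circum/<name>.json")`, where `<name>` stands for whichever file names the archive actually uses; any polygon for which `is_clockwise` returns `False` could then be reversed before training or evaluation.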
Dataset Files
- Text images: TDC560_image.zip (70.40 MB)
- Ground-truth label annotations: TDC560_label_circum.zip (218.33 kB)