TDC560

Citation Author(s):
Yan Li
University of International Relations
Submitted by:
YAN LI
Last updated:
Fri, 04/01/2022 - 22:01
DOI:
10.21227/9jbk-kx89

Abstract 

The TDC560 dataset contains 560 difficult images. Some of them are selected from the testing sets of CTW1500 and TD500; the others were created by us with text-line annotations. During selection, we pick out images with extreme spatial distances between characters and words. Additionally, to bridge hard text-line detection to the real world, we enrich the existing image sources with our own data, which has two significant merits: (1) we add many images containing Chinese text, which is relatively scarce in previous benchmarks such as CTW1500; (2) we collect various types of images, such as scene text, design text, and some hard stylized text. Some visualization results are shown in the paper.

Instructions: 

This dataset is intended for arbitrary-shaped text detection. The images are in the TDC560_image folder, and the corresponding annotations are in the TDC560_label_circum folder. The annotations were labeled with the LabelMe tool, with the points of each text region listed in clockwise order.
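The following is a minimal sketch of how the annotations might be read, not an official loader. It assumes that each label file is a standard LabelMe JSON file (a top-level "shapes" list whose entries carry "label" and "points") and that the files use the .json extension; neither detail is stated on this page. The helper names load_polygons, signed_area, and is_clockwise are hypothetical. The orientation check uses the shoelace formula, which gives a positive area for clockwise polygons in image coordinates, where y grows downward.

```python
import json
from pathlib import Path


def signed_area(points):
    """Shoelace formula. Positive means clockwise in image coordinates (y grows downward)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2.0


def is_clockwise(points):
    """Return True if the polygon points run clockwise as seen on the image."""
    return signed_area(points) > 0


def load_polygons(label_path):
    """Read one LabelMe-style JSON file (assumed format) and return (label, points) pairs."""
    with open(label_path, encoding="utf-8") as f:
        data = json.load(f)
    polygons = []
    for shape in data.get("shapes", []):
        points = [(float(x), float(y)) for x, y in shape["points"]]
        polygons.append((shape.get("label", ""), points))
    return polygons


if __name__ == "__main__":
    # Folder name taken from the instructions; the *.json pattern is an assumption.
    label_dir = Path("TDC560_label_circum")
    for label_file in sorted(label_dir.glob("*.json")):
        for label, points in load_polygons(label_file):
            print(label_file.name, label, len(points), is_clockwise(points))
```

Running the script from the dataset root prints one line per annotated text region, with the number of polygon points and whether they are in clockwise order, which offers a quick sanity check before feeding the labels to a detector.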