TDC560

Citation Author(s):
Yan Li (University of International Relations)
Submitted by:
YAN LI
Last updated:
DOI:
10.21227/9jbk-kx89

Abstract

The TDC560 dataset contains 560 difficult images. Some are selected from the test sets of CTW1500 and TD500; the others were produced by ourselves and carry text-line annotations. In the selection process, we sort images by how extreme the spatial distances between characters and words are. Additionally, to bridge hard text-line detection to the real world, we enrich the existing image sources with our own data, which has two significant merits: (1) we add many images containing Chinese text, which is relatively lacking in previous benchmarks such as CTW1500; (2) we collect various types of images, such as scene text, design text, and some hard stylized text. Some visualization results are shown in the paper.

Instructions:

This dataset is intended for arbitrary-shaped text detection. The images are in the TDC560_image folder, and the corresponding annotations are in the TDC560_label_circum folder. The annotations were labeled with the LabelMe tool, with polygon points listed in clockwise order.
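
As a rough illustration of how the annotations might be read, the sketch below assumes each label file is a standard LabelMe JSON file with a top-level "shapes" list whose entries hold a "points" array of [x, y] pairs. The ".json" extension and the helper names (load_polygons, is_clockwise, iter_labels) are assumptions for illustration, not part of the dataset description. The orientation check uses the shoelace formula in image coordinates, where y grows downward.

```python
import json
from pathlib import Path


def load_polygons(label_path):
    """Read one LabelMe-style JSON file; return polygons as lists of (x, y) tuples."""
    with open(label_path, encoding="utf-8") as f:
        data = json.load(f)
    # Assumed layout: {"shapes": [{"label": ..., "points": [[x, y], ...]}, ...]}
    return [
        [(float(x), float(y)) for x, y in shape["points"]]
        for shape in data.get("shapes", [])
    ]


def is_clockwise(points):
    """Shoelace sign test in image coordinates (origin top-left, y downward)."""
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    # With y pointing down, a positive sum means the points run clockwise on screen.
    return twice_area > 0


def iter_labels(label_dir="TDC560_label_circum"):
    """Yield (file name, polygons) for every JSON file in the label folder."""
    for path in sorted(Path(label_dir).glob("*.json")):
        yield path.name, load_polygons(path)
```

Calling iter_labels() from the directory that contains TDC560_label_circum would yield one list of polygons per annotation file; is_clockwise can then be used to confirm the point order before feeding the polygons to a detector.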