Abstract

The TDC560 dataset contains 560 difficult images, part of which are selected from the testing sets of CTW1500 and TD500, while the others are generated by ourselves with text-line annotations. In the selection process, we sort images according to how extreme the spatial distances between characters and words are. Additionally, to bring hard text-line detection closer to the real world, we enrich the existing diverse image sources with our own data, which has two significant merits: (1) we add many images containing Chinese text, which is relatively lacking in previous benchmarks such as CTW1500; (2) we collect various types of images, such as scene text, design text, and some hard stylized text. Some visualization results are shown in the paper.

Instructions:

This dataset is intended for arbitrary-shaped text detection. The images are in the TDC560_image folder, and the corresponding annotations are in the TDC560_label_circum folder. The annotations were labeled with the labelme tool, with the points of each annotation listed in clockwise order.
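As a rough illustration of how the annotations might be read, the sketch below parses one labelme-style JSON file and checks that each annotation's points run clockwise. It assumes the files in TDC560_label_circum are standard labelme JSON files with a "shapes" list whose entries carry "label" and "points"; the .json extension and the helper names load_labelme_polygons, signed_area, and is_clockwise are assumptions made for this example, not part of the dataset.

```python
import json
from pathlib import Path


def load_labelme_polygons(json_path):
    """Return (label, points) pairs from one labelme-style JSON file.

    Assumes the usual labelme layout: {"shapes": [{"label": ..., "points": [[x, y], ...]}]}.
    """
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    polygons = []
    for shape in data.get("shapes", []):
        points = [(float(x), float(y)) for x, y in shape["points"]]
        polygons.append((shape.get("label", ""), points))
    return polygons


def signed_area(points):
    """Shoelace formula over a closed polygon.

    In image coordinates (y grows downward), a positive value means the
    points run clockwise as seen on screen.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_clockwise(points):
    """True if the points are ordered clockwise in image coordinates."""
    return signed_area(points) > 0


if __name__ == "__main__":
    # Hypothetical layout: one .json file per image in the label folder.
    for path in sorted(Path("TDC560_label_circum").glob("*.json")):
        for label, points in load_labelme_polygons(path):
            print(path.name, label, len(points), is_clockwise(points))
```

If the label files turn out to use a different extension or schema, only load_labelme_polygons needs to change; the orientation check works on any list of (x, y) points.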

Dataset Files

Text images: TDC560_image.zip (70.40 MB)
Ground truth (label annotations): TDC560_label_circum.zip (218.33 kB)
