Although the vertical Chinese text recognition dataset presented by Yu is public, it is derived from the PosterErase dataset, which was collected from an e-commerce platform for the poster text erasing task, and so does not reflect the challenges of real application scenarios. We therefore establish a benchmark dataset, the Vertical and Horizontal Text Recognition Dataset (WHU-VHTR), to promote in-depth research on STR. WHU-VHTR contains 23,674 images annotated with line-level transcriptions, collected from Google Street View and real urban scene images in China.