Corpus Linguistics

LATIC: A Non-native Pre-labelled Mandarin Chinese Validation Corpus for Automatic Speech Scoring and Evaluation Task

LATIC is an annotated non-native speech database for Mandarin Chinese that focuses on non-native learners. It is fully open-source and available online for any purpose. Application areas include automatic speech scoring and evaluation, as well as derived uses such as second-language (L2) teaching and the teaching of Chinese as a Foreign Language. We aim to provide a relatively small-scale, highly efficient dataset for training and validation. To this end, four selected non-native speakers of Chinese participated in this project; their native languages (L1s) are Russian, Korean, French, and Arabic.
