Thai Deaf Corpus
- Submitted by: Supachan Traitr...
- Last updated: Fri, 07/26/2024 - 12:06
- DOI: 10.21227/53w2-1k42
Abstract
The Thai Deaf Corpus (TDC) was constructed from a writing activity in which deaf students randomly selected picture words using an image picker wheel and then wrote sentences corresponding to those words on a writing sheet. The sentences were then transcribed and manually corrected to create the TDC.
- It contains 22,719 sentences written by deaf students, with their corresponding corrections separated by "|||".
- A sentence may have more than one possible correction. For example, if sentence x has the corrections sentence y1, sentence y2, and so on, it yields one sentence pair per correction: sentence x ||| sentence y1, sentence x ||| sentence y2, ...
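The one-to-many expansion described above can be sketched in Python. This is a minimal illustration, not the authors' tooling: the function name `to_pair_lines` is hypothetical, and the single space on each side of "|||" is an assumption taken from the example pairs in the abstract.

```python
SEPARATOR = "|||"  # separator named in the dataset description


def to_pair_lines(original, corrections):
    """Return one 'original ||| correction' line per correction.

    Hypothetical helper: the spacing around the separator is assumed
    from the example pairs, not confirmed by the dataset files.
    """
    return [f"{original} {SEPARATOR} {correction}" for correction in corrections]


# Placeholder strings from the abstract, not real corpus sentences.
pair_lines = to_pair_lines("sentence x", ["sentence y1", "sentence y2"])
for line in pair_lines:
    print(line)
```

A sentence with n corrections therefore occupies n lines, each repeating the original sentence on the left-hand side.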
Instructions:
In TDC, each line contains (1) an original sentence written by a deaf student, (2) the "|||" separator, and (3) its correction by native speakers.
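A line in this format can be read with a few lines of Python. The sketch below is only one possible reader: the names `parse_line` and `group_corrections` are hypothetical, and it assumes that whitespace around "|||" is not significant and that blank lines can be skipped. When reading the actual file, the encoding is likely UTF-8, but that is also an assumption.

```python
from collections import defaultdict

SEPARATOR = "|||"  # separator named in the dataset instructions


def parse_line(line):
    """Split one line into (original, correction).

    Assumes whitespace around the separator is not significant.
    """
    original, sep, correction = line.rstrip("\n").partition(SEPARATOR)
    if not sep:
        raise ValueError(f"missing {SEPARATOR!r} separator: {line!r}")
    return original.strip(), correction.strip()


def group_corrections(lines):
    """Map each original sentence to the list of its corrections."""
    grouped = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines carry no data
        original, correction = parse_line(line)
        grouped[original].append(correction)
    return dict(grouped)


# Placeholder lines in the documented format, not real corpus sentences.
sample_lines = [
    "sentence x ||| sentence y1\n",
    "sentence x ||| sentence y2\n",
]
grouped = group_corrections(sample_lines)
print(grouped)
```

Grouping by the original sentence recovers the one-to-many structure, which can be useful when an evaluation needs all reference corrections for a given input at once.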
Funding Agency:
Foundation for the Deaf under the Royal Patronage of Her Majesty the Queen