WANG LI

Datasets & Competitions

QAmultilabelEURLEXsamples

The dataset is a sample drawn from EURLEX57k, built for a multi-answer question answering task with EUROVOC labels. Each legal document in the EURLEX57k dataset is assigned several labels from the European Vocabulary (EUROVOC), which maintains thousands of concepts such as "export industry" and "organic acid". Before the data were built, the sample size was determined with a Z-score-based online sample size calculator, using a 95% confidence level and a 5% margin of error. The calculation yields a sample of 381 documents out of the 45,000 documents in the training set.
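The sample size can be reproduced with a short calculation. The sketch below assumes the online calculator uses Cochran's formula with a finite population correction and the most conservative response proportion p = 0.5; the description does not name the calculator or its formula, so both are assumptions, and the function name `sample_size` is illustrative only.

```python
import math
from statistics import NormalDist


def sample_size(population, confidence=0.95, margin=0.05, p=0.5):
    """Sample size via Cochran's formula with finite population correction.

    Assumption: p = 0.5 (maximum variance), the usual default of
    online calculators; the dataset description does not state it.
    """
    # Two-sided z-score for the confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Correct for the finite population, then round up.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


print(sample_size(45_000))
```

Under these assumptions, a population of 45,000 training documents gives 381, which matches the sample size stated above.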

Categories: Other
