Ubaid Rehman

Urdu Nastalique

The performance of most classification models depends on the data used for training. The data must be reliable, robust, and meticulously labelled. To build such a dataset, a systematic approach was designed. The dataset was collected from a well-known source, the Center for Language Engineering, available at http://www.cle.org.pk. The corpus available on the website and used for prediction contains Urdu Naskh data comprising 4,325 lines and 122,284 words. The corpus consists of three text files.
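
The line and word counts reported above could be checked with a short script along the following lines. This is only a minimal sketch: the function names are hypothetical, the files are assumed to be UTF-8 encoded, and words are assumed to be separated by whitespace, which may not match the tokenization used when the corpus was prepared.

```python
from pathlib import Path


def count_lines_and_words(text):
    """Return (line_count, word_count) for one text.

    Assumption: words are whitespace-separated tokens. Blank lines
    count as lines but contribute no words.
    """
    lines = text.splitlines()
    words = sum(len(line.split()) for line in lines)
    return len(lines), words


def corpus_stats(paths, encoding="utf-8"):
    """Sum line and word counts over several text files.

    Assumption: every file is plain text in the given encoding.
    """
    total_lines = 0
    total_words = 0
    for path in paths:
        text = Path(path).read_text(encoding=encoding)
        n_lines, n_words = count_lines_and_words(text)
        total_lines += n_lines
        total_words += n_words
    return total_lines, total_words
```

Calling `corpus_stats` with the paths of the three downloaded text files would return the totals to compare against the figures in the description.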

Categories:

Other
