OCR (Optical Character Recognition); Pattern Recognition; Handwritten Recognition; Public Data

Odia is a classical and popular language of the Indian subcontinent, used by more than 50 million people. Despite its rich history, popularity, and usefulness, little research effort has gone into achieving high accuracy in Odia OCR. To address the paucity of benchmark Odia datasets, new handwritten alphanumeric character and numeral datasets for Odia have been created by our research group@iitbbs and are reported here.


The "MANUU: Handwritten Urdu OCR Dataset" is an extensive, curated collection intended to advance Optical Character Recognition (OCR) for handwritten Urdu letters, digits, and words. The dataset was compiled methodically to cover a wide variety of handwritten instances. It supports the construction and assessment of OCR models designed for the complexities of the Urdu script.


The Urdu Handwritten Ligature Dataset (UHLD) is the first unconstrained handwritten Urdu dataset developed for various handwritten Urdu recognition tasks and OCR research problems. The UHLD is written independently of paper color, paper type (blank or ruled), ink color, and pen type. It consists of around six thousand handwritten Urdu text lines written by 200 different writers. The UHLD covers six- and seven-character ligatures, whereas previous datasets such as UNHD covered ligatures of only up to five characters.


A CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a test designed so that humans can complete it but current computer systems cannot. It is used in many applications to distinguish human users from machines. Text-based CAPTCHAs are the most common type used on websites. Because most of the letters in these CAPTCHAs are in English, the test is challenging for rural residents who speak only their native languages.


This paper presents a digital image dataset of historical handwritten birth records stored in the archives of several parishes across Sweden, together with the corresponding metadata that supports the evaluation of document analysis algorithms’
