Meitei Mayek Handwritten Character Dataset (37 classes)

Citation Author(s): DEENA HIJAM (Tezpur University)
Submitted by: DEENA HIJAM
Last updated: Mon, 10/28/2019 - 05:08
DOI: 10.21227/pjtc-bw69
Data Format: .tif, .zip
Links: Convolutional Neural Network Based Meitei Mayek Handwritten Character Recogniti…

Meitei Mayek Handwritten Character (MMHC) Dataset

Keywords: Handwritten character recognition, Meitei Mayek, Optical character recognition, Dataset, Manipuri

Abstract

The dataset consists of 60,285 character image files, which have been randomly divided into a training set of 54,239 images (90%) and a test set of 6,046 images (10%). The data samples were collected in two phases. In the first phase, a tabular form was distributed and people were asked to write each character five times. Filled-in forms were collected from around 200 different individuals aged 12 to 23 years. In the second phase, handwritten sheets such as answer sheets and classroom notes were collected from students in the same age group. A total of 279 such pages, written by 279 different individuals, were collected. This age group was chosen because older individuals generally do not know how to write the script, as Bangla was the script in use in their time. Restricting the age range therefore captures natural handwriting rather than the drawing of characters. The samples were collected from schools and colleges in different parts of Imphal. The forms and pages were scanned in grayscale at 300 dpi using a Canon flatbed scanner and saved in TIF format.
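
The split figures quoted in the abstract can be checked with a few lines of Python, and a comparable random 90/10 split is easy to sketch. The abstract does not describe how the split was actually produced, so the helper `random_split`, the seed, and the placeholder file names below are illustrative assumptions, not the authors' procedure. The stated counts are close to, but not exactly, 90% and 10% of the total.

```python
import random

# Counts as stated in the abstract.
TOTAL, TRAIN, TEST = 60285, 54239, 6046


def random_split(items, test_fraction=0.10, seed=0):
    """Shuffle a copy of `items` and hold out `test_fraction` of it as a test set.

    Hypothetical helper: the dataset page does not say how its split was made.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder file names; the real archive layout is not described on the page.
files = [f"img_{i:05d}.tif" for i in range(TOTAL)]
train_files, test_files = random_split(files)

# The two stated subsets account for every image.
counts_consistent = (TRAIN + TEST == TOTAL)

# Actual proportions implied by the stated counts.
train_share = TRAIN / TOTAL
test_share = TEST / TOTAL

print(counts_consistent)
print(f"train share: {train_share:.2%}, test share: {test_share:.2%}")
print(len(train_files), len(test_files))
```

A plain shuffle like this does not guarantee that every one of the 37 classes appears in the same proportion in both subsets; a per-class (stratified) split would be needed for that.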

Instructions:

The dataset may be used by citing the paper below:

Hijam D., Saharia S. (2018) Convolutional Neural Network Based Meitei Mayek Handwritten Character Recognition. In: Tiwary U. (eds) Intelligent Human Computer Interaction. IHCI 2018. Lecture Notes in Computer Science, vol 11278. Springer, Cham