Script identification

Real-world images often encompass embedded texts that adhere to disparate disciplines like business, education, and amusement, to name a few. Such images are graphically rich in terms of font attributes, color distribution, foreground-background similarity, and component organization. This aggravates the difficulty of recognizing texts from these images. Such characteristics are very prominent in the case of movie posters. One of the first pieces of information on movie posters is the title.



Videos contain a high volume of texts and are broadcasted via different sources, such as television, the internet, etc. Since optical character recognition (OCR) engines are script-dependent, script identification is the precursor for them. Depending on the video sources, identification of video scripts is not trivial since we have difficult issues, such as low resolution, complex background, noise, blur effects, etc. In this work, a deep learning-based system named as LWSINet: LightWeight Script Identification Network (6-layered CNN) is proposed to identify the video scripts.


Without publicly available dataset, specifically in handwritten document recognition (HDR), we cannot make a fair and/or reliable comparison between the methods. Considering HDR, Indic script’s document recognition is still in its early stage compared to others such as Roman and Arabic. In this paper, we present a page-level handwritten document image dataset (PHDIndic_11), of 11 official Indic scripts: Bangla, Devanagari, Roman, Urdu, Oriya, Gurumukhi, Gujarati, Tamil, Telugu, Malayalam and Kannada.


Wide varieties of scripts are used in writing languages throughout the world. In a multiscript and multi-language environment, it is necessary to know the different scripts used in every part of a document to apply the appropriate document analysis algorithm. Consequently, several approaches for automatic script identification have been proposed in the literature, and can be broadly classified under two categories of techniques: those that are structure and visual appearance-based and those that are deep learning-based.