Text localization

Real-world images often encompass embedded texts that adhere to disparate disciplines like business, education, and amusement, to name a few. Such images are graphically rich in terms of font attributes, color distribution, foreground-background similarity, and component organization. This aggravates the difficulty of recognizing texts from these images. Such characteristics are very prominent in the case of movie posters. One of the first pieces of information on movie posters is the title.



Videos contain a high volume of texts and are broadcasted via different sources, such as television, the internet, etc. Since optical character recognition (OCR) engines are script-dependent, script identification is the precursor for them. Depending on the video sources, identification of video scripts is not trivial since we have difficult issues, such as low resolution, complex background, noise, blur effects, etc. In this work, a deep learning-based system named as LWSINet: LightWeight Script Identification Network (6-layered CNN) is proposed to identify the video scripts.