Image Processing

Point cloud streaming has recently attracted research attention as it has the potential to provide six degrees of freedom movement, which is essential for truly immersive media. The transmission of point clouds requires high-bandwidth connections, and adaptive streaming is a promising solution to cope with fluctuating bandwidth conditions. Thus, understanding the impact of different factors in adaptive streaming on the Quality of Experience (QoE) becomes fundamental. Point clouds have been evaluated in Virtual Reality (VR), where viewers are completely immersed in a virtual environment.


The "Paddy Field Dataset Captured in Palakkad District, Kerala, India" is a comprehensive collection of geospatial and attribute data specifically focused on paddy cultivation within the Palakkad district of the state of Kerala, India. This dataset encompasses a wide range of information related to paddy fields, including their spatial distribution, size, crop varieties cultivated, land management practices, and relevant contextual factors. Geographic Information System (GIS) technology was used to capture accurate geospatial coordinates, enabling precise mapping and analysis.


The "Paddy Disease Dataset" represents a comprehensive collection of data related to various diseases commonly found in paddy crops. Paddy, or rice, is a staple crop crucial for global food security. However, paddy crops are susceptible to a range of diseases that can significantly impact yield and quality. This dataset encompasses a diverse array of disease-related information, including disease types, symptoms, geographical distribution, severity levels, and potential management strategies.


We used Sentinel-2 images to create the dataset in order to estimate carbon sequestered in above-ground forest biomass. Moreover, fieldwork was completed to gather related forest biomass volume data. The clipped image has a size of 1115 × 955 pixels and consists of bands 3, 4, and 8, which correspond to green, red, and near-infrared, respectively.
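As a brief illustration of how such a clip is typically used, the red and near-infrared bands can be combined into the NDVI, a vegetation index commonly employed as an input for above-ground biomass estimation. The reflectance values below are invented for demonstration and are not taken from the dataset.

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + eps)

# Illustrative reflectance values (hypothetical, not from the dataset).
red_band = np.array([[0.10, 0.20], [0.30, 0.40]])
nir_band = np.array([[0.50, 0.60], [0.30, 0.80]])
print(ndvi(nir_band, red_band))
```

NDVI ranges from -1 to 1, with dense vegetation producing values close to 1, which is why it correlates with biomass volume measured in the field.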


Real-world images often encompass embedded texts that adhere to disparate disciplines like business, education, and amusement, to name a few. Such images are graphically rich in terms of font attributes, color distribution, foreground-background similarity, and component organization. This aggravates the difficulty of recognizing texts from these images. Such characteristics are very prominent in the case of movie posters. One of the first pieces of information on movie posters is the title.



Videos contain a high volume of text and are broadcast via different sources, such as television, the internet, etc. Since optical character recognition (OCR) engines are script-dependent, script identification is a precursor for them. Depending on the video source, identification of video scripts is not trivial, as it poses difficult issues such as low resolution, complex backgrounds, noise, and blur effects. In this work, a deep learning-based system named LWSINet: LightWeight Script Identification Network (a 6-layered CNN) is proposed to identify video scripts.


This dataset collects samples of different types of surface defects on aircraft fuselages to facilitate the identification and localization of aircraft fuselage defects by computer vision and machine learning algorithms. The dataset consists of 5,601 images of four types of aircraft fuselage defects. A camera was used to photograph different parts of the aircraft fuselage under different lighting conditions.


"LaneVisionIITR: A Comprehensive High-Resolution Dataset for Lane Detection Recorded at IIT Roorkee" is a newly built high-resolution dataset for developing lane detection methods for advanced driver assistance systems.

This folder contains the following files for each image:

1. The image captured in .jpg format.

2. Annotations (.json) containing the left and center lane-line coordinates, represented as "L" and "C" respectively.
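A minimal sketch of reading one such annotation, assuming the JSON maps the keys "L" and "C" to lists of (x, y) pixel coordinates; the inline annotation content and coordinate values here are hypothetical, not taken from the dataset.

```python
import json

# Hypothetical annotation content matching the described "L"/"C" layout.
annotation_text = '{"L": [[10, 200], [12, 180]], "C": [[320, 200], [322, 180]]}'

# For a real file one would use json.load() on the image's .json companion.
ann = json.loads(annotation_text)
left_line = ann["L"]      # left lane-line coordinates
center_line = ann["C"]    # center lane-line coordinates
print(len(left_line), len(center_line))
```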


Blade damage inspection without stopping the normal operation of wind turbines has significant economic value. This study proposes an AI-based method AQUADA-Seg to segment the images of blades from complex backgrounds by fusing optical and thermal videos taken from normal operating wind turbines. The method follows an encoder-decoder architecture and uses both optical and thermal videos to overcome the challenges associated with field application.
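The paper's exact fusion mechanism is not described here; as an illustrative assumption, one simple early-fusion strategy stacks aligned optical (RGB) and thermal (single-channel) frames along the channel axis before passing them to an encoder-decoder segmentation network. The frame sizes below are invented for demonstration.

```python
import numpy as np

# Hypothetical aligned frames: 240x320 optical (3 channels) and thermal (1 channel).
optical = np.random.rand(240, 320, 3).astype(np.float32)
thermal = np.random.rand(240, 320, 1).astype(np.float32)

# Early fusion: concatenate along the channel axis -> a 4-channel network input.
fused = np.concatenate([optical, thermal], axis=-1)
print(fused.shape)  # (240, 320, 4)
```

Early fusion like this lets a single encoder learn joint optical-thermal features; late fusion (separate encoders merged at the bottleneck) is a common alternative.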


With the development and implementation of convolutional neural networks in pattern recognition, a large number of parameters must be computed and stored, which makes such algorithms hard to run on common computers.
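To make the storage concern concrete: a convolutional layer needs (k·k·C_in + 1)·C_out parameters, while a fully connected layer needs (N_in + 1)·N_out. The layer sizes in this comparison are made up for illustration.

```python
def conv_params(k, c_in, c_out):
    """k x k convolution: k*k*c_in weights per filter plus one bias per output channel."""
    return (k * k * c_in + 1) * c_out

def dense_params(n_in, n_out):
    """Fully connected layer: one weight per input-output pair plus one bias per output."""
    return (n_in + 1) * n_out

# A single 3x3 conv from 64 to 128 channels vs. a dense layer mapping a
# flattened 56x56x64 feature map to 128 units (hypothetical sizes).
print(conv_params(3, 64, 128))          # 73,856 parameters
print(dense_params(56 * 56 * 64, 128))  # 25,690,240 parameters
```

The roughly 350-fold gap illustrates why the dense layers and overall parameter budget, not the convolutions alone, dominate the storage cost on a common computer.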