Computer Vision

Ear biting is a welfare challenge in commercial pig farming. Bitten pigs sustain injuries at the bite site, paving the way for bacterial infections. Early detection and management of this behaviour are important to enhance animal health and welfare and to increase productivity while minimising medication use. Managing pigs by physical observation is not practical at the scale of modern pig production systems, and the same applies to manual analysis of video captured in pig houses. An automated detection method is therefore desirable.


The deployment of unmanned aerial vehicles (UAVs) for logistics and other civil purposes increasingly challenges airspace security. At the same time, there is a scarcity of robust datasets for developing real-time systems that can counter the use of UAVs for criminal or terrorist activities. VisioDECT is a robust vision-based drone dataset for classifying, detecting, and countering unauthorized drone deployment using visual and electro-optical infrared detection technologies.



In this study, we present advances in the development of proactive control for online individual user adaptation in a welfare robot guidance scenario, integrating three main modules: navigation control, visual human detection, and temporal error correlation-based neural learning. The proposed control approach can drive a mobile robot to autonomously navigate relevant indoor environments. At the same time, it can predict a person's walking speed from visual information without prior knowledge of their personality or preferences (e.g., preferred walking speed).


This paper presents a digital image dataset of historical handwritten birth records stored in the archives of several parishes across Sweden, together with corresponding metadata that supports the evaluation of document analysis algorithms.


The Paddy Doctor dataset contains 16,225 labeled paddy leaf images across 13 classes (12 paddy diseases and healthy leaves). It is the largest expert-annotated image dataset for experimenting with and benchmarking computer vision algorithms on paddy leaf disease identification. The images were collected from real paddy fields using a high-resolution (1,080 x 1,440 pixels) smartphone camera, then carefully cleaned and annotated with the help of an agronomist.
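For benchmarking classifiers on a dataset like this, a class-stratified train/validation split keeps the 13-class balance intact in both partitions. The sketch below assumes a hypothetical class-per-folder layout and uses synthetic file names; the class names and per-class counts are illustrative stand-ins, not the dataset's actual labels.

```python
import random
from collections import defaultdict

# Hypothetical layout: one folder per class, e.g. "disease_01/img_0001.jpg".
# 13 classes as described (12 diseases + healthy); names are placeholders.
CLASSES = ["healthy"] + [f"disease_{i:02d}" for i in range(1, 13)]

def stratified_split(files_by_class, val_frac=0.2, seed=42):
    """Split each class's file list separately so class balance is preserved."""
    rng = random.Random(seed)
    train, val = [], []
    for cls in sorted(files_by_class):
        files = sorted(files_by_class[cls])
        rng.shuffle(files)
        n_val = max(1, int(len(files) * val_frac))
        val += [(f, cls) for f in files[:n_val]]
        train += [(f, cls) for f in files[n_val:]]
    return train, val

# Synthetic stand-in file lists (50 images per class, illustrative only).
files_by_class = {c: [f"{c}/img_{i:04d}.jpg" for i in range(50)] for c in CLASSES}
train, val = stratified_split(files_by_class)
print(len(train), len(val))  # 520 130
```

Seeding the shuffle makes the split reproducible across benchmarking runs, which matters when comparing algorithms on the same partitions.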


The problem of effective disposal of the trash generated by people has rightfully attracted major interest from various sections of society in recent times. Deep learning solutions have recently been proposed to automate waste segregation, but most datasets used for this purpose are inadequate. In this paper, we introduce TrashBox, a new dataset containing 17,785 images across seven classes, including medical-waste and e-waste classes that no existing dataset covers.
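Multi-class waste datasets are rarely perfectly balanced, so training often uses inverse-frequency class weights. The sketch below assumes hypothetical per-class counts for the seven classes (the names and numbers are illustrative, chosen only to sum to the stated 17,785 images).

```python
# Hypothetical per-class image counts (illustrative, not TrashBox's real split).
counts = {"cardboard": 2500, "glass": 2500, "metal": 2500, "paper": 2500,
          "plastic": 2785, "medical": 2500, "e-waste": 2500}

total = sum(counts.values())       # 17785
n_classes = len(counts)            # 7

# Inverse-frequency weights, normalised so the count-weighted mean is 1:
# rarer classes get larger weights, the majority class a smaller one.
weights = {c: total / (n_classes * n) for c, n in counts.items()}
```

Such weights can be passed to a weighted loss (or a weighted sampler) so that the over-represented class does not dominate training.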


Face-based biometric management is a challenging task and requires a dedicated dataset that captures variations in pose, expression, and occlusion. The current work aims to deliver a dataset for training and testing purposes. The SJB Face dataset is one such Indian face image dataset that can be used for face recognition. It contains face images collected with a digital camera under varying conditions: different poses, different expressions, partially occluded faces, and a uniform attire.


Sign languages are the most common mode of communication with and between hearing-impaired individuals. In the Arab world, Arabic sign language is used with different dialects, each supporting a distinct set of rules for the gestures used. As research on natural language processing advances, models have been developed to translate sign language to spoken language and vice versa. However, Arabic sign language has rarely been studied due to the scarcity of datasets covering it.


The dataset contains short video clips of four shoulder exercises.

  1. Arm flexion and extension
  2. Arm abduction and adduction
  3. Arm lateral and medial rotation
  4. Arm circumduction


The videos are labeled as either correct or incorrect.
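With binary correct/incorrect labels per clip, a natural evaluation is per-exercise accuracy, since a model may judge one exercise well and another poorly. A minimal sketch, assuming hypothetical exercise identifiers and toy predictions (none of these records come from the actual dataset):

```python
from collections import defaultdict

# Placeholder identifiers for the four exercises listed above.
EXERCISES = ["flexion_extension", "abduction_adduction",
             "lateral_medial_rotation", "circumduction"]

def per_exercise_accuracy(records):
    """records: iterable of (exercise, true_label, predicted_label),
    with labels 'correct' or 'incorrect'. Returns accuracy per exercise."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for exercise, true_label, pred_label in records:
        totals[exercise] += 1
        hits[exercise] += (true_label == pred_label)
    return {ex: hits[ex] / totals[ex] for ex in totals}

# Toy predictions for illustration only.
records = [
    ("flexion_extension", "correct", "correct"),
    ("flexion_extension", "incorrect", "correct"),
    ("circumduction", "incorrect", "incorrect"),
]
print(per_exercise_accuracy(records))
# {'flexion_extension': 0.5, 'circumduction': 1.0}
```

Reporting accuracy per exercise rather than a single pooled number makes it visible when a model's form-assessment quality varies across movement types.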



Deep video representation learning has recently attained state-of-the-art performance in video action recognition (VAR). However, when applied to video clips captured from varied viewpoints, the performance of these models degrades significantly. Existing VAR models frequently entangle view information with action attributes, making it difficult to learn a view-invariant representation.