Computer Vision

This dataset provides RGB and depth images of 10 cerebral palsy patients, acquired with a Kinect v2. For each subject (0001, 0002, etc.) there are 12 folders:

- 5 folders containing 5 left full gait cycles (L_01, L_02, etc.)

- 5 folders containing 5 right full gait cycles (R_01, R_02, etc.)

- 1 folder containing one static lateral view (left side) of the subject while standing upright (L_s)

- 1 folder containing one static lateral view (right side) of the subject while standing upright (R_s)

In each folder (dynamic and static) there are two subfolders: one containing the RGB images and one containing the depth images.
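Given these naming conventions, the expected per-subject layout can be enumerated programmatically; a minimal Python sketch (folder names are taken from the description above, the helper name is illustrative):

```python
# Enumerate the 12 expected folders for one subject, following the naming
# convention described above (L_01..L_05, R_01..R_05, plus the static views).
def expected_subject_folders():
    dynamic = [f"{side}_{i:02d}" for side in ("L", "R") for i in range(1, 6)]
    static = ["L_s", "R_s"]
    return dynamic + static
```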


We present the RQMD dataset, a comprehensive collection of diverse material samples aimed at advancing computer vision and machine learning algorithms for terrain classification. The dataset contains RGB images of five terrains: Asphalt, Brick, Grass, Gravel, and Tiles, captured from a top-view perspective with an 8-megapixel Raspberry Pi camera. Notably, images were taken at different times of the day, introducing variations in lighting conditions and environmental factors.
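For classification experiments, the five terrain names map naturally onto integer labels; a minimal sketch (the label ordering is an arbitrary assumption, not prescribed by the dataset):

```python
# Map the five terrain classes to integer labels for training.
# The alphabetical ordering here is an arbitrary choice, not part of the dataset.
TERRAIN_CLASSES = ["Asphalt", "Brick", "Grass", "Gravel", "Tiles"]
LABEL_OF = {name: idx for idx, name in enumerate(TERRAIN_CLASSES)}
```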


The Lettuce Farm SLAM Dataset (LFSD) is a VSLAM dataset based on RGB and depth images captured by the VegeBot robot in a lettuce farm. The dataset consists of RGB and depth images, IMU data, and RTK-GPS sensor data. Detections and tracks of lettuce plants in the images are annotated in the standard Multiple Object Tracking (MOT) format. The dataset aims to accelerate the development of algorithms for localization and mapping in agricultural fields, as well as for crop detection and tracking.
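The standard MOT annotation format stores one comma-separated detection per line: frame number, track id, bounding-box left/top/width/height, confidence, and three world coordinates (often -1 when unused). A minimal parser sketch (function name is illustrative):

```python
def parse_mot_line(line: str) -> dict:
    """Parse one MOT-format line: frame,id,bb_left,bb_top,bb_width,bb_height,conf,x,y,z."""
    frame, track_id, left, top, w, h, conf, x, y, z = line.strip().split(",")
    return {
        "frame": int(frame),
        "id": int(track_id),
        "bbox": (float(left), float(top), float(w), float(h)),
        "conf": float(conf),
    }
```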


Dronescape presents a dataset comprising 25 drone videos showcasing vast areas filled with trees, rivers, and mountains. The dataset includes two subsets: 25 videos with tree segmentation and 25 videos without, offering perspectives both with and without segmented tree regions. Tree-containing regions are highlighted using SAM (the Segment Anything Model) and the Track Anything library, with video object tracking and segmentation techniques used to follow the tree regions throughout each video.
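Independently of how the masks were produced, highlighting segmented tree regions amounts to alpha-blending an overlay color onto the masked pixels of each frame; a minimal NumPy sketch (the function name and overlay color are illustrative, not part of the dataset's tooling):

```python
import numpy as np

def highlight_mask(frame: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend a green overlay onto the pixels of an HxWx3 frame where the boolean mask is True."""
    out = frame.astype(np.float32).copy()
    green = np.array([0.0, 255.0, 0.0], dtype=np.float32)
    out[mask] = (1.0 - alpha) * out[mask] + alpha * green
    return out.astype(np.uint8)
```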


The morphological characteristics of skeletal muscles, such as fascicle orientation, fascicle length, and muscle thickness, contain valuable mechanical information that aids in understanding muscle contractility and excitation due to commands from the central nervous system. Ultrasound (US) imaging, a non-invasive measurement technique, has been employed in clinical research to provide visualized images that capture these morphological characteristics. However, accurately and efficiently detecting fascicles in US images remains challenging.


Occlusion detection is an active research topic, and many related datasets exist. However, because the scenarios and definitions of occlusion differ from task to task, these datasets differ significantly from one another, and existing datasets are difficult to apply to the video shot occlusion detection task. To this end, we contribute the first large-scale video shot occlusion detection dataset, namely VSOD, which serves as a benchmark for evaluating the performance of shot occlusion detection methods.


The HQA1K dataset was developed for assessing the quality of Computer Generated Holography (CGH) image renderings based on direct human input.
HQA1K comprises 1,000 pairs of natural images matched to simulated CGH renderings of various quality levels, yielding a diverse set of data for evaluating image quality algorithms and models.
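Full-reference objective metrics such as PSNR are a common baseline against which human quality ratings on such image pairs are compared; a minimal sketch (PSNR is used here purely as an illustrative metric, not as part of the dataset itself):

```python
import numpy as np

def psnr(reference: np.ndarray, rendering: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio between a reference image and a rendering, in dB."""
    mse = np.mean((reference.astype(np.float64) - rendering.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```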


This dataset supports a non-invasive blood group prediction approach using deep learning. Rapid and accurate prediction of blood type is a critical step in a medical emergency before administering red blood cell, platelet, or plasma transfusions; even a small mistake during blood transfer can cause death. In conventional pathological assessment, the blood test is conducted with an automated blood analyser, but this is a time-consuming process.


This dataset was acquired during the dissertation entitled "Optical Camera Communications and Machine Learning for Indoor Visible Light Positioning". This work was carried out in the academic year 2020/2021 at the Instituto de Telecomunicações in Aveiro in the scope of the Integrated Master in Electronics and Telecommunications Engineering at the Department of Electronics, Telecommunication and Informatics of the University of Aveiro.


The Sign Language Correctness Discrimination (SLCD) dataset is collected for sign language teaching. Unlike general sign language recognition datasets, the SLCD dataset carries two kinds of labels at the same time: a sign language category and a standardization category. The standardization category describes the correctness of the same sign language action as performed by students. The videos were recorded with a camera; 76 students were recruited to perform the sign language actions.
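The dual-label structure can be represented as a simple record pairing each video's sign category with its standardization (correctness) label; a minimal sketch (field names, the file path, and the category value are illustrative, not taken from the dataset):

```python
from dataclasses import dataclass

@dataclass
class SLCDSample:
    video_path: str      # path to the recorded video (illustrative field)
    sign_category: str   # which sign is performed
    is_standard: bool    # standardization label: is the student's action correct?

# Hypothetical example: a non-standard performance of the sign "hello".
sample = SLCDSample("videos/0001.mp4", "hello", False)
```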