Computer Vision

The Numerical Latin Letters (DNLL) dataset consists of Latin numeric letters organized into 26 letter classes, one for each letter of the Latin alphabet. Each class contains multiple letter forms. The letters vary in color, size, writing style, thickness, background, orientation, luminosity, and other attributes, which gives the dataset broad coverage of letter appearance.

Our video action dataset is generated using a 3D simulation program developed in Unity. Each data sample consists of a video capturing a human performing various actions. Our initial set of actions comprises a total of 10 different yoga poses: camel, chair, child's pose, lord of the dance, lotus, thunderbolt, triangle, upward dog, warrior II, and warrior III. Within each of these 10 yoga poses, there are four variations, some exhibiting more pronounced differences than others. This results in a total of 40 action types within our dataset.
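The count of 40 action types follows from pairing each of the 10 poses with its four variations. A minimal sketch of that enumeration is shown here; the label format and the numbering of variations from 1 to 4 are assumptions made for illustration and are not taken from the dataset's own files.

```python
from itertools import product

# Pose names as listed in the dataset description.
POSES = [
    "camel", "chair", "child's pose", "lord of the dance", "lotus",
    "thunderbolt", "triangle", "upward dog", "warrior II", "warrior III",
]

# Hypothetical numbering: the description only says each pose has four variations.
VARIATIONS = range(1, 5)


def action_labels():
    """Return one label per (pose, variation) pair."""
    return [f"{pose} / variation {v}" for pose, v in product(POSES, VARIATIONS)]


if __name__ == "__main__":
    labels = action_labels()
    print(len(labels))
```

Keeping the pose and the variation as separate fields, rather than a single flat class index, would let a classifier be evaluated either on the 10 poses or on all 40 action types.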

This dataset contains synthetic American Sign Language (ASL) digits covering the numbers 0 through 9. The signs were created in Unity as dynamic 3-D scenes set against diverse backgrounds. The dataset includes three distinct subjects, which adds variety to the ASL digit gestures and makes it a useful resource for researchers working on ASL digit recognition and gesture analysis.

Quantifying performance of methods for tracking and mapping tissue in endoscopic environments is essential for enabling image guidance and automation of medical interventions and surgery. Datasets developed so far either use rigid environments, visible markers, or require annotators to label salient points in videos after collection. These are respectively: not general, visible to algorithms, or costly and error-prone. We introduce a novel labeling methodology along with a dataset that uses said methodology, Surgical Tattoos in Infrared (STIR).

This dataset was used to support our work and is provided to the reviewers for reference.

Recognizing and categorizing banknotes is a crucial task, especially for individuals with visual impairments. It plays a vital role in assisting them with everyday financial transactions, such as making purchases or accessing their workplaces or educational institutions. The primary objectives for creating this dataset were as follows:

This dataset contains video clips of five volunteers performing daily-life activities. Each clip is recorded with a far-infrared (FIR) camera and comes with an associated file that contains the three-dimensional and two-dimensional coordinates of the main body joints in each frame of the clip. This makes it possible to train human pose estimation networks on FIR imagery.

SYPHAXAR is a dataset for Arabic text detection in the wild. It was collected in the city of Sfax, Tunisia, the country's second-largest city after the capital. A total of 3078 images were collected manually, one by one, with each image presenting text detection challenges that reflect the real complexity of 15 different routes, including ring roads, intersections, and roundabouts. The annotated images contain more than 31000 objects, each enclosed within a bounding box.

Accurately classifying defects in hot-rolled steel strip is important because defect detection is closely tied to the quality of the final product. The lack of real hot-rolled strip defect datasets currently limits further research on the classification of hot-rolled strip defects to some extent. In real production, convolutional neural network (CNN)-based algorithms face some difficulties; for example, they are not particularly accurate in classifying some uncommon defects.

Inspecting blades for damage without stopping the normal operation of wind turbines has significant economic value. This study proposes an AI-based method, AQUADA-Seg, that segments images of blades from complex backgrounds by fusing optical and thermal videos taken from wind turbines in normal operation. The method follows an encoder-decoder architecture and uses both optical and thermal videos to overcome the challenges associated with field application.
