Computer Vision

The ICRA conference is celebrating its 40th anniversary in Rotterdam in September 2024, with as highlight the Happy Birthday ICRA Party at the iconic Holland America Line Cruise Terminal. One month later the IROS conference will take place, which will include the Earth Rover Challenge. In this challenge open-world autonomous navigation models are studied truly open-world settings.

Categories:
140 Views

Smart focal-plane and in-chip image processing has emerged as a crucial technology for vision-enabled embedded systems with energy efficiency and privacy. However, the lack of special datasets providing examples of the data that these neuromorphic sensors compute to convey visual information has hindered the adoption of these promising technologies.

Categories:
122 Views

Grasping and manipulating objects is an important human skill. Since hand-object contact is fundamental to grasping, capturing it can lead to important insights. However, observing contact through external sensors is challenging because of occlusion and the complexity of the human hand. We present ContactDB, a novel dataset of contact maps for household objects that captures the rich hand-object contact that occurs during grasping, enabled by use of a thermal camera. Participants in our study grasped 3D printed objects with a post-grasp functional intent.

Categories:
92 Views

Grasping is natural for humans. However, it involves complex hand configurations and soft tissue deformation that can result in complicated regions of contact between the hand and the object. Understanding and modeling this contact can potentially improve hand models, AR/VR experiences, and robotic grasping. Yet, we currently lack datasets of hand-object contact paired with other data modalities, which is crucial for developing and evaluating contact modeling techniques. We introduce ContactPose, the first dataset of hand-object contact paired with hand pose, object pose, and RGB-D images.

Categories:
365 Views

Infrared dim-small target detection has gained increasing importance in both military and civilian applications due to its ability to detect thermal radiation, operate effectively at night, passively sense radiation, and offer strong concealment with high resistance to interference. These capabilities make it ideal for systems such as aircraft and bird surveillance, missile guidance, and maritime rescue operations. In these applications, the need for mid- to long-range observations often results in small targets that appear dim and are difficult to detect.

Categories:
299 Views

This dataset, titled "Synthetic Sand Boil Dataset for Levee Monitoring: Generated Using DreamBooth Diffusion Models," provides a comprehensive collection of synthetic images designed to facilitate the study and development of semantic segmentation models for sand boil detection in levee systems. Sand boils, a critical factor in levee integrity, pose significant risks during floods, necessitating accurate and efficient monitoring solutions.

Categories:
288 Views

This dataset presents the recognition of handwritten hieroglyphic alphabets. In this dataset we use consisting of 18 distinct classes of hieroglyphs alphabets. The dataset is designed to facilitate research in the field of ancient script recognition, particularly focusing on handwriting variability and pattern recognition. Each class represents a unique hieroglyph, with samples collected to ensure a diverse range of writing styles. To create this dataset, 25 students each handwrote samples for all 18 classes of hieroglyphs. Afterward, we carefully photographed each image.

Categories:
353 Views

The SynoClip dataset is a standard dataset specifically prepared for the video synopsis task, featuring videos that represent real-world surveillance conditions. It includes six videos, with durations ranging from 8 to 45 minutes, captured from outdoor-mounted surveillance cameras. Each video comes with manually annotated tracking information, enabling direct comparisons between different video synopsis models.

Categories:
63 Views

3D-COCO is a dataset composed of MS COCO images with 3D models aligned on each instance. 3D-COCO was designed to achieve computer vision tasks such as 3D reconstruction or image detection configurable with textual, 2D image, and 3D CAD model queries.

3D-COCO is an extension of the original MS-COCO dataset providing 3D models and 2D-3D alignment annotations. We complete the existing MS-COCO dataset with 28K 3D models collected on ShapeNet and Objaverse. By using an IoU-based method, we match each MS-COCO annotation with the best 3D models to provide a 2D-3D alignment.

Categories:
580 Views

There is growing widespread adoption of augmented reality in tech-driven industries and sectors of society, such as medicine, gaming, flight simulation, education, interior design and modelling, entertainment, construction, tourism, repair and maintenance, public safety, agriculture, and quantum computing. However, ensuring smooth and intuitive interactions with augmented objects is challenging, requiring practical performance evaluation and optimisation models to assess and improve users' experiences as they engage with AR-enhanced devices or systems.

Categories:
90 Views

Pages