Multi-Modal Robust Geometry Primitive Shape Scene Abstraction for Grasp Detection

Citation Author(s):
Hamed Hosseini, MohammadHosein Koosheshi, Mehdi Tale Masouleh, Ahmad Kalhor
Submitted by:
Mehdi Tale Masouleh
Last updated:
Mon, 09/16/2024 - 07:18
DOI:
10.21227/jdp3-xh70

Abstract 

Scene understanding is essential for a wide range of robotic tasks, such as grasping. Abstracting the scene into predefined primitive forms helps the robot perform such tasks more reliably, especially in unknown environments. This paper proposes a combination of simulation-based and real-world datasets for domain adaptation and grasping in practical settings. To compensate for the weakness of depth images, reported in previous studies, in clearly representing object boundaries, the RGB image is also fed as input in the RGB and RGB-D input modalities. The implemented architecture is based on the Mask R-CNN network with a ResNet-101 backbone. Using RGB and RGB-D inputs, the proposed approach improves the segmentation Dice score for primitive shape abstraction by 3.73% and 6.19%, respectively. Moreover, to improve and evaluate the robustness of the model to occlusion and to the variety of primitive shapes and colors that may occur in a scene, different versions of simulation-based datasets are generated with the CoppeliaSim simulator. Additionally, a real-world primitive shape abstraction dataset is created to make the model more robust to more complex objects and real-world experiments. To further generalize the model to a wider range of objects, new primitive shapes, such as cones, and both filled and hollow variants of each primitive shape are considered. Subsequently, the point clouds of the segmented parts are generated, and the ICP algorithm is used to derive the 6-DOF grasp parameters from reference primitive shapes and their predefined grasps. Simulation experiments yield a 95% grasp success rate in the CoppeliaSim environment on unseen objects. A Delta parallel robot and a fabricated 2-fingered gripper are used for the practical experiments, which yield a 98% grasp success rate on common objects used in baseline evaluations, outscoring the state of the art by 2%. Real-world tests also include scenes with multiple objects and cluttered scenes.
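
A minimal sketch of the grasp-transfer step described above: the point cloud of a segmented part is registered to a reference primitive shape with ICP, and the reference's predefined grasp pose is mapped onto the observed object through the resulting transform. Open3D is used here purely for illustration (the paper does not specify a library), and the names reference_cloud, reference_grasp, and the correspondence threshold are assumptions, not the authors' actual implementation.

import numpy as np
import open3d as o3d

def derive_grasp_pose(segment_cloud: o3d.geometry.PointCloud,
                      reference_cloud: o3d.geometry.PointCloud,
                      reference_grasp: np.ndarray,
                      threshold: float = 0.01) -> np.ndarray:
    """Align a reference primitive to an observed segment via ICP and map the
    primitive's predefined 6-DOF grasp (a 4x4 pose matrix) into the scene frame."""
    # Initial guess: translate the reference onto the segment's centroid.
    init = np.eye(4)
    init[:3, 3] = segment_cloud.get_center() - reference_cloud.get_center()

    # Point-to-point ICP refines the reference -> segment alignment.
    result = o3d.pipelines.registration.registration_icp(
        reference_cloud, segment_cloud, threshold, init,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())

    # The same rigid transform carries the predefined grasp onto the object.
    return result.transformation @ reference_grasp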

Instructions: 

This dataset consists of synthetic, simulation-based data generated using the CoppeliaSim simulator, along with a real-world dataset collected by a Kinect camera.
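
For quick inspection, one RGB-D pair from the dataset can be read and back-projected to a point cloud as sketched below. The directory layout and file names are hypothetical placeholders (adjust them to the actual archive structure), and the depth scale, truncation distance, and Kinect-like PrimeSense default intrinsics are assumptions.

import open3d as o3d

rgb = o3d.io.read_image("sim/rgb/0001.png")      # hypothetical path
depth = o3d.io.read_image("sim/depth/0001.png")  # hypothetical path

# Fuse color and depth; the scale and truncation values are assumptions.
rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
    rgb, depth, depth_scale=1000.0, depth_trunc=2.0,
    convert_rgb_to_intensity=False)

# Back-project to a point cloud with Open3D's PrimeSense default intrinsics,
# which approximate a Kinect-style sensor.
intrinsics = o3d.camera.PinholeCameraIntrinsic(
    o3d.camera.PinholeCameraIntrinsicParameters.PrimeSenseDefault)
cloud = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intrinsics)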