Equidistant and Uniform Data Augmentation for 3D Objects

Citation Author(s):
Alexander Morozov (Skoltech)
Davide Zgyatti (Skoltech)
Petr Popov (Skoltech)
Submitted by:
Alexander Morozov
Last updated:
Wed, 01/05/2022 - 20:01
DOI:
10.21227/ya9z-zk95

Abstract 

Data augmentation is commonly used to increase the size and diversity of datasets in machine learning. It is also particularly important for evaluating the robustness of existing machine learning methods. With progress in geometrical and 3D machine learning, many methods exist to augment a 3D object, from the generation of random orientations to exploring different perspectives of an object. In high-precision applications, the machine learning model must be robust with respect to small perturbations of the input object. Therefore, there is a need for 3D data augmentation tools that consider the distribution of distance metrics between the original and augmented objects. Here we present Eurecon, the first 3D data augmentation approach with spatial control over the augmented samples. It generates objects uniformly distributed over a sphere of user-defined radius, where the radius is the distance to the original object. Eurecon is applicable to both point cloud and polygon mesh representations of 3D objects, as demonstrated on the ModelNet dataset. The method is particularly useful in assessing and improving a machine learning model’s robustness with respect to transformations of small magnitude. We demonstrated the superior performance of a point cloud-based model (PointNet++) and a mesh-based model (MeshNet) when trained on datasets augmented with Eurecon, compared to the same models trained on non-augmented and randomly augmented datasets.
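
The distance between an original object and an augmented copy can be illustrated with a root-mean-square deviation (RMSD), the metric named in the dataset folder descriptions. The following is a minimal sketch, not Eurecon's implementation: the function name rmsd and the toy cube are assumptions made for illustration, and the points of the two objects are assumed to be in one-to-one correspondence.

```python
import math


def rmsd(original, augmented):
    """RMSD between two point sets given as sequences of (x, y, z) tuples.

    Assumes point i of `augmented` corresponds to point i of `original`.
    """
    if len(original) != len(augmented) or len(original) == 0:
        raise ValueError("point sets must be non-empty and of equal length")
    total = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(original, augmented):
        # Squared Euclidean distance between corresponding points.
        total += (x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2
    return math.sqrt(total / len(original))


# Toy object (hypothetical): the 8 corners of a unit cube.
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]

# A rigid shift of 0.1 along x moves every point by 0.1,
# so the RMSD is approximately 0.1 (up to floating-point error).
shifted = [(x + 0.1, y, z) for (x, y, z) in cube]
print(rmsd(cube, shifted))
```

Under this reading, a folder such as RMSD01 would hold copies whose RMSD to the corresponding original is 0.1.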

Instructions: 

The uploaded eurecon_dataset.zip archive contains 2 model-specific augmented versions of the Princeton ModelNet40 dataset, one for the MeshNet architecture and one for PointNet++. Each version contains 5 subfolders, corresponding to the dataset augmentation types: Original (no augmentation), Random (augmentation by random sample orientation), RMSD001 (controlled spatial augmentation with the RMSD parameter set to 0.01), RMSD01 (controlled spatial augmentation with the RMSD parameter set to 0.1) and RMSD02 (controlled spatial augmentation with the RMSD parameter set to 0.2).

Each of these folders contains 2 more subfolders, "processed" and "raw". The "raw" folder keeps the original dataset's class folder subdivision (40 folders). Each class folder contains the original ModelNet40 samples with the .off extension and 20 Eurecon-augmented samples per original, 258 531 files in total.

The "processed" folder contains samples preprocessed for a specific machine learning architecture. The MN folder keeps the original dataset's class folder subdivision (40 folders), each class folder containing the original ModelNet40 samples with the .npz extension and 20 Eurecon-augmented samples per original, 258 531 files in total. The exception is the MN Random folder, which contains 258 510 files because the resampling tool could not correctly process a single original dataset sample. The PN folder contains 258 531 .pt files, placed directly into the "processed" folder.
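
The file counts quoted above can be checked after unpacking the archive. The sketch below is an assumption-laden helper, not a tool shipped with the dataset: the function names are hypothetical, the path in the usage comment is a placeholder, and the figure of 12 311 original ModelNet40 models is not stated on this page (it is the commonly cited size of ModelNet40).

```python
from pathlib import Path

# Assumption: ModelNet40 has 12 311 original models (not stated on this page).
MODELNET40_SAMPLES = 12_311
# From the instructions: 20 Eurecon-augmented samples per original sample.
AUGMENTED_PER_SAMPLE = 20


def expected_file_count(n_original=MODELNET40_SAMPLES,
                        n_augmented=AUGMENTED_PER_SAMPLE):
    """Each original sample contributes itself plus its augmented copies."""
    return n_original * (1 + n_augmented)


def count_files(folder, suffix):
    """Recursively count regular files under `folder` ending with `suffix`."""
    return sum(1 for p in Path(folder).rglob(f"*{suffix}") if p.is_file())


print(expected_file_count())        # 258531
print(expected_file_count(12_310))  # 258510, one original sample missing

# Usage (placeholder path, adjust to where the archive was unpacked):
# count_files("path/to/RMSD01/raw", ".off")
```

With these assumptions, 12 311 x 21 = 258 531 matches the stated totals, and losing one original sample together with its 20 copies gives 258 510, the count reported for the MN Random folder.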