Simulated and phantom colon data for 6-dof camera pose estimation

Citation Author(s): Min Tan, Wentao Jin, Gaosheng Xie, Zeyang Xia, Jing Xiong
Submitted by: Min Tan
Last updated: Mon, 12/11/2023 - 22:15
DOI: 10.21227/af5p-mp40

Abstract 

Due to the complex and unstructured nature of the intestine, 3D reconstruction and visual navigation are imperative for clinical endoscopists performing skill-intensive colonoscopy. Unsupervised 3D reconstruction methods, a mainstream paradigm in autonomous-driving scenarios, exploit a warping loss to jointly predict 6-DOF pose and depth. However, owing to illumination inconsistency, repeated texture regions, and non-Lambertian reflection, the geometric warping constraint cannot be applied effectively to the colonic environment. Therefore, we propose a novel feature point-based method for pose estimation in the colonic scenario. Specifically, PoseNet incorporates an attention module that learns feature-point embeddings, enabling the network to concentrate on distinguishable tissue regions. In addition, to complete the visual reconstruction, we propose a DepthNet that adopts a multi-scale structure and a scale-invariant loss to alleviate global scale ambiguity. We also introduce a clinically oriented colon dataset containing simulated and phantom data to facilitate the study of transfer learning and prospective clinical applications. To validate our framework’s generalization ability, we train our model on a public dataset and test it on the proposed datasets without any fine-tuning. The results reveal its superior performance. All datasets are publicly available for quantitative benchmarking.

Instructions: 

Detailed information about the phantom is available at the following website:
https://www.gtsimulators.com/collections/gastrointestinal-tract-simulato....
For data acquisition, we assembled a self-developed flexible endoscope and imaging device, and placed the Aurora electromagnetic tracking system on top of the phantom.
The collected data consist of color RGB images and electromagnetic tracking data (position and pose). The RGB images have a resolution of 640 × 480 and include black borders. We also provide the data pre-processing scripts. To let the endoscope enter the phantom smoothly, we applied lubricating oil while capturing the images, so the images show slight water-like reflections similar to those in clinical images. We recorded the data seven times, covering both forward and backward motion of the endoscope; details are as follows:
Record01 (545 frames): backward from the tip of the phantom at a very fast speed.
Record02 (1951 frames): backward from the tip of the phantom at a very slow speed.
Record03 (1950 frames): backward from the tip of the phantom at a very slow speed.
Record04 (838 frames): backward from the tip of the phantom at a very slow speed.
Record05 (1952 frames): forward from the start of the phantom at a very slow speed.
Record06 (1564 frames): forward from the start of the phantom at a very slow speed.
Record07 (1952 frames): backward from the start of the phantom at a very slow speed.
Note: Record03 and Record04 together form one complete trajectory (part 1 and part 2), as do Record05 and Record06.
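
For scripted use, the recording list and the note above can be captured in a small lookup table. The following Python sketch is only an illustration: the frame counts and motion labels are copied from the list above, while the class name, the trajectory labels ("backward_full", "forward_full"), and the helper function are hypothetical and are not part of the dataset or its pre-processing scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    name: str
    frames: int
    direction: str  # "forward" or "backward", as listed above
    speed: str      # "fast" or "slow", as listed above


# Values copied from the recording list above.
RECORDINGS = [
    Recording("Record01", 545, "backward", "fast"),
    Recording("Record02", 1951, "backward", "slow"),
    Recording("Record03", 1950, "backward", "slow"),
    Recording("Record04", 838, "backward", "slow"),
    Recording("Record05", 1952, "forward", "slow"),
    Recording("Record06", 1564, "forward", "slow"),
    Recording("Record07", 1952, "backward", "slow"),
]

# Hypothetical labels: recordings that the note says form one
# complete trajectory, in part1/part2 order.
TRAJECTORIES = {
    "backward_full": ["Record03", "Record04"],
    "forward_full": ["Record05", "Record06"],
}


def trajectory_frames(parts):
    """Return the total number of frames over the named recordings."""
    by_name = {r.name: r for r in RECORDINGS}
    return sum(by_name[p].frames for p in parts)


if __name__ == "__main__":
    for label, parts in TRAJECTORIES.items():
        print(label, parts, trajectory_frames(parts))
    print("total frames:", sum(r.frames for r in RECORDINGS))
```

Under these assumptions, the two-part trajectories contain 2788 frames (Record03 + Record04) and 3516 frames (Record05 + Record06), and the seven recordings contain 10752 frames in total.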