Datasets
Standard Dataset
X-ScanRefer
- Citation Author(s):
- Submitted by: Yiwei Ma
- Last updated: Sat, 08/26/2023 - 05:04
- DOI: 10.21227/5hhz-kx15
- License:
- Categories:
- Keywords:
Abstract
ScanRefer provides a clear correspondence between expressions and instances in 3D point cloud scenes, enabling effective identification of target objects. However, the explicit mention of the target object in the expression creates a shortcut: the model can filter out negative samples by object name alone, which eases learning. To mitigate overreliance on this shortcut, we manually processed the ScanRefer dataset. Specifically, we replaced the name of the referred object with the term "object" while preserving the names of other objects. For example, consider the expression "The trash can is to the left of the bookshelf. It is behind the chair." After processing, "trash can" is replaced with "object" while "bookshelf" is kept unchanged, resulting in the sentence "The object is to the left of the bookshelf. It is behind the chair." With the explicit mention of the target object removed, the model is compelled to rely on other information in the expression, such as attributes and positional relationships, to identify the target instance. In this paper, we use the term X-ScanRefer to refer to the processed dataset.
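The replacement described in the abstract was done manually. Purely as an illustration of the transformation, a minimal Python sketch might look like the following. The function name `mask_target` and its parameters are hypothetical and not part of the dataset or any released tooling, and the sketch assumes the target object's name is already known for each expression.

```python
import re


def mask_target(expression: str, target_name: str, placeholder: str = "object") -> str:
    """Replace whole-word mentions of `target_name` with `placeholder`.

    Hypothetical helper for illustration only; the dataset itself was
    processed by hand. Matching is case-insensitive and limited to exact
    whole-word matches, so other object names are left untouched.
    """
    pattern = re.compile(r"\b" + re.escape(target_name) + r"\b", flags=re.IGNORECASE)
    return pattern.sub(placeholder, expression)


original = "The trash can is to the left of the bookshelf. It is behind the chair."
masked = mask_target(original, "trash can")
print(masked)
# The object is to the left of the bookshelf. It is behind the chair.
```

A simple pattern like this would miss plurals, synonyms, and paraphrased mentions of the target (for example "bin" for "trash can"), which is one reason a manual pass is the more reliable choice.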