Semi-supervised video object segmentation uses the ground-truth object masks provided for the first frame to segment the same objects, at the pixel level, in the remaining frames of a video sequence. OVOS is a dataset for evaluating video object segmentation performance under occlusion.
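
As a rough illustration of the semi-supervised setting described above, the sketch below shows only the input/output contract of the task: a list of frames plus a first-frame mask goes in, and one mask per frame comes out. The function name `propagate_first_frame_mask`, the `Mask` and `Frame` type aliases, and the toy data are hypothetical and not part of OVOS or any published method. The "model" here simply reuses the first-frame mask, which is a placeholder and would fail precisely in the occlusion cases the dataset targets.

```python
from typing import List

# Assumed toy representations (not an OVOS format):
Mask = List[List[int]]   # H x W grid of object ids; 0 = background
Frame = List[List[int]]  # H x W grid of pixel intensities (grayscale for brevity)


def propagate_first_frame_mask(frames: List[Frame], first_mask: Mask) -> List[Mask]:
    """Naive placeholder: reuse the first-frame mask for every frame.

    A real method would update the mask as the object moves or is occluded;
    this only illustrates what goes in and what comes out.
    """
    if not frames:
        return []
    height, width = len(frames[0]), len(frames[0][0])
    if len(first_mask) != height or any(len(row) != width for row in first_mask):
        raise ValueError("first_mask must match the frame resolution")
    # Copy the rows so each output mask can be edited independently.
    return [[row[:] for row in first_mask] for _ in frames]


# Three 2x3 frames and a first-frame mask marking one object (id 1).
frames = [[[0, 0, 0], [0, 0, 0]] for _ in range(3)]
first_mask = [[0, 1, 1], [0, 0, 1]]
masks = propagate_first_frame_mask(frames, first_mask)
```

Any actual segmentation method would replace the body of the function while keeping the same contract: the first-frame annotation is the only supervision available at test time.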