Scene Perception Datasets
Cityscapes is a large-scale dataset that contains a diverse set of stereo video sequences recorded in street scenes from 50 different cities, with high-quality pixel-level annotations of 5 000 frames in addition to a larger set of 20 000 weakly annotated frames. The dataset is thus an order of magnitude larger than similar previous attempts. Details on the annotated classes and examples of the annotations are available at https://www.cityscapes-dataset.com/dataset-overview/#features.
The Cambridge-driving Labeled Video Database (CamVid) is the first collection of videos with object class semantic labels, complete with metadata. The database provides ground truth labels that associate each pixel with one of 32 semantic classes. Details are available at http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/.
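Pixel-level ground truth of this kind is often stored as color-coded label images, where each RGB color stands for one semantic class. The following is a minimal sketch of turning such an image into integer class ids, assuming a toy image held as nested lists. The palette, the class ids, and the `VOID_ID` value are illustrative assumptions, not the official label definitions of either dataset; consult each dataset's own label files for the real mapping.

```python
from collections import Counter

# Hypothetical palette for illustration only: RGB color -> (class id, name).
# The real color-to-class mapping ships with each dataset.
PALETTE = {
    (128, 128, 128): (0, "sky"),
    (128, 64, 128): (1, "road"),
    (64, 0, 128): (2, "car"),
}
VOID_ID = 255  # assumed id for any color missing from the palette


def colors_to_class_ids(label_image):
    """Convert rows of RGB tuples into rows of integer class ids."""
    return [
        [PALETTE.get(pixel, (VOID_ID, "void"))[0] for pixel in row]
        for row in label_image
    ]


def class_histogram(class_ids):
    """Count how many pixels carry each class id."""
    return Counter(cid for row in class_ids for cid in row)


# Toy 2x3 label image standing in for a decoded annotation file.
label_image = [
    [(128, 128, 128), (128, 128, 128), (64, 0, 128)],
    [(128, 64, 128), (128, 64, 128), (0, 0, 0)],
]

class_ids = colors_to_class_ids(label_image)
print(class_ids)
print(dict(class_histogram(class_ids)))
```

In practice the annotation would be decoded from an image file and the lookup vectorized with an array library; plain lists are used here only to keep the sketch dependency-free.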
The README and various scripts for inspection, preparation, and evaluation can be found in the git repository: