Solving the external perception problem for autonomous vehicles and driver-assistance systems requires accurate and robust driving scene perception in both regularly occurring driving scenarios (termed “common cases”) and rare outlier driving scenarios (termed “edge cases”). To develop and evaluate driving scene perception models at scale, and, more importantly, to cover potential edge cases from the real world, we take advantage of the MIT-AVT Clustered Driving Scene Dataset and build a subset for the semantic scene segmentation task.

Instructions: 

 

The MIT DriveSeg (Semi-auto) Dataset is a large-scale set of forward-facing driving videos with frame-by-frame, pixel-level semantic labels, coarsely annotated through a novel semi-automatic annotation approach. The videos are captured from moving vehicles driving in a range of real-world scenarios drawn from MIT Advanced Vehicle Technology (AVT) Consortium data.

 

Technical Summary

Video data - Sixty-seven 10-second 720p (1280×720) 30 fps videos (20,100 frames)

Class definitions (12) - vehicle, pedestrian, road, sidewalk, bicycle, motorcycle, building, terrain (horizontal vegetation), vegetation (vertical vegetation), pole, traffic light, and traffic sign
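The class list above implies a simple integer label map. As a minimal sketch, assuming the annotations are distributed as per-pixel class-index images with indices assigned in the order listed (an assumption, not a documented specification), per-class pixel frequencies can be tallied as follows:

    import numpy as np
    from PIL import Image

    # Hypothetical index order; take the real assignment from the dataset docs.
    CLASS_NAMES = [
        "vehicle", "pedestrian", "road", "sidewalk", "bicycle", "motorcycle",
        "building", "terrain", "vegetation", "pole", "traffic light", "traffic sign",
    ]

    def class_pixel_counts(label_png_path):
        # Count pixels per class in one annotation frame.
        labels = np.array(Image.open(label_png_path))
        counts = np.bincount(labels.ravel(), minlength=len(CLASS_NAMES))
        return dict(zip(CLASS_NAMES, counts[:len(CLASS_NAMES)]))

Per-class statistics like these are a quick way to check class imbalance (road and vehicle pixels typically dominate a driving scene) before training a segmentation model.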

 

Technical Specifications, Open Source Licensing and Citation Information

Ding, L., Glazer, M., Terwilliger, J., Reimer, B. & Fridman, L. (2020). MIT DriveSeg (Semi-auto) Dataset: Large-scale Semi-automated Annotation of Semantic Driving Scenes. Massachusetts Institute of Technology AgeLab Technical Report 2020-2, Cambridge, MA.

Ding, L., Glazer, M., Terwilliger, J., Reimer, B. & Fridman, L. (2020). MIT DriveSeg (Semi-auto) Dataset. IEEE Dataport. DOI: 10.21227/nb3n-kk46.

 

Attribution and Contact Information

This work was done in collaboration with the Toyota Collaborative Safety Research Center (CSRC).

For any questions related to this dataset, or to request removal of identifying information, please contact driveseg@mit.edu.

 


Semantic scene segmentation has primarily been addressed by forming representations of single images, with both supervised and unsupervised methods. The problem of semantic segmentation in dynamic scenes has recently begun to receive attention through video object segmentation approaches. What is not known is how much extra information the temporal dynamics of the visual scene carry that is complementary to the information available in the individual frames of the video.
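One lightweight way to probe that question is to measure how stable a single-image model's predictions are across consecutive frames: low frame-to-frame agreement suggests temporal structure the per-frame model is not exploiting. The sketch below computes a generic mean per-class IoU between consecutive predicted label maps; it is an illustrative metric, not the evaluation protocol of any particular paper.

    import numpy as np

    def consecutive_frame_iou(pred_t, pred_t1, num_classes=12):
        # Mean per-class IoU between per-pixel class predictions for
        # frames t and t+1 (2-D integer arrays of equal shape).
        ious = []
        for c in range(num_classes):
            a, b = pred_t == c, pred_t1 == c
            union = np.logical_or(a, b).sum()
            if union == 0:
                continue  # class absent in both frames
            ious.append(np.logical_and(a, b).sum() / union)
        return float(np.mean(ious)) if ious else float("nan")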

Instructions: 

 

The MIT DriveSeg (Manual) Dataset is a forward-facing video dataset with frame-by-frame, pixel-level semantic labels, captured from a moving vehicle during continuous daylight driving through a crowded city street.

The dataset can be downloaded from IEEE DataPort or previewed as a demo video.

 

Technical Summary

Video data - 2 minutes 47 seconds (5,000 frames) of 1080p (1920×1080) 30 fps video (see the decoding sketch below)

Class definitions (12) - vehicle, pedestrian, road, sidewalk, bicycle, motorcycle, building, terrain (horizontal vegetation), vegetation (vertical vegetation), pole, traffic light, and traffic sign
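To illustrate working with the video data summarized above, here is a minimal decoding sketch using OpenCV; the file name driveseg_manual.mp4 is a placeholder, not the released file name.

    import cv2

    # Placeholder path; substitute the actual file downloaded from IEEE DataPort.
    cap = cv2.VideoCapture("driveseg_manual.mp4")
    frame_idx = 0
    while True:
        ok, frame = cap.read()  # BGR frame, expected to be 1920x1080
        if not ok:
            break
        # Pair frame_idx with the corresponding annotation frame here.
        frame_idx += 1
    cap.release()
    print(f"decoded {frame_idx} frames")  # roughly 5,000 at 30 fps over 2:47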

 

Technical Specifications, Open Source Licensing and Citation Information

Ding, L., Terwilliger, J., Sherony, R., Reimer, B. & Fridman, L. (2020). MIT DriveSeg (Manual) Dataset for Dynamic Driving Scene Segmentation. Massachusetts Institute of Technology AgeLab Technical Report 2020-1, Cambridge, MA.

Ding, L., Terwilliger, J., Sherony, R., Reimer, B. & Fridman, L. (2020). MIT DriveSeg (Manual) Dataset. IEEE Dataport. DOI: 10.21227/mmke-dv03.

 

Related Research

Ding, L., Terwilliger, J., Sherony, R., Reimer, B. & Fridman, L. (2019). Value of Temporal Dynamics Information in Driving Scene Segmentation. arXiv preprint arXiv:1904.00758.

 

Attribution and Contact Information

This work was done in collaboration with the Toyota Collaborative Safety Research Center (CSRC).

For any questions related to this dataset, or to request removal of identifying information, please contact driveseg@mit.edu.

 
