2020 IEEE GRSS Data Fusion Contest
The 2020 Data Fusion Contest, organized by the Image Analysis and Data Fusion Technical Committee (IADF TC) of the IEEE Geoscience and Remote Sensing Society (GRSS) and the Technical University of Munich, aims to promote research in large-scale land cover mapping based on weakly supervised learning from globally available multimodal satellite data. The task is to train a machine learning model for global land cover mapping based on weakly annotated samples. For this task, the SEN12MS dataset will be used, which contains corresponding triplets of Sentinel-1 SAR images, Sentinel-2 multi-spectral images, and MODIS-derived land cover maps.
The Contest: Goals and Organization
The 2020 Data Fusion Contest, organized by the Image Analysis and Data Fusion Technical Committee (IADF TC) of the IEEE Geoscience and Remote Sensing Society (GRSS) and the Technical University of Munich, aims to promote research in large-scale land cover mapping from globally available multimodal satellite data.
The task is to train a machine learning model for global land cover mapping based on weakly annotated samples. The 2020 Data Fusion Contest will consist of two challenge tracks:
- Track 1: Land cover classification with low-resolution labels
- Track 2: Land cover classification with low- and high-resolution labels
Scientific papers describing the best entries will be included in the Technical Program of IGARSS 2020, presented in an invited session “IEEE GRSS Data Fusion Contest”, and published in the IGARSS 2020 Proceedings.
The contest aims to promote innovation in automatic large-scale land cover mapping, as well as to provide objective and fair comparisons among methods. The ranking is based on quantitative accuracy metrics computed against undisclosed test samples. Participants will have a limited time to submit their land cover maps once the competition has started. The contest will consist of three phases:
- Phase 1: Participants are provided with the SEN12MS dataset for training and additional validation images (without any corresponding high-resolution labels) to train and validate their algorithms. Participants can submit prediction results for the validation set to the Codalab competition website (https://competitions.codalab.org/competitions/22289) to get feedback on the performance (the website will be open on January 13th, 2020). The performance of the last submission from each account will be displayed on the leaderboard. In parallel, participants submit a short description of the approach used to be eligible to enter Phases 2 and 3.
- Phase 2 (Track 1): Participants receive the test data set (without any corresponding high-resolution labels) and submit their land cover maps within five days from the release of the test data set.
- Phase 3 (Track 2): Participants receive semi-manually generated high-resolution labels for the validation set and the problem setting changes to Track 2. Participants submit their land cover maps within two weeks. In parallel, they submit a short description of the approach used. After evaluation of the results, seven winners from the two tracks are announced. The winners then have one month to write the manuscript that will be included in the IGARSS proceedings. Manuscripts are 4 pages long and follow the IEEE format. Each manuscript describes the addressed problem, the proposed method, and the experimental results.
- December 13th, 2019: Contest opening: release of training and validation data
- January 13th, 2020: Validation server with public leaderboard is open
- February 28th, 2020: Short description of the approach for Track 1 is sent to email@example.com (using IGARSS paper template)
- March 1st, 2020: Release of test data; test server for Track 1 is open
- March 6th, 2020: Submission deadline for Track 1: the submission server for Track 1 is closed; Release of high-resolution labels for the validation set; test server for Track 2 is open
- March 20th, 2020: Submission deadline for Track 2: the submission server for Track 2 is closed
- March 25th, 2020: Short description of the approach for Track 2 is sent to firstname.lastname@example.org (using IGARSS paper template)
- March 27th, 2020: Winner announcement
In the contest, we use the SEN12MS dataset (Schmitt et al., 2019; https://mediatum.ub.tum.de/1474000) for training the land cover prediction models. This publicly available dataset includes more than 180,000 triplets of corresponding Sentinel-1 SAR data, Sentinel-2 multispectral imagery, and MODIS-derived land cover maps sampled across the entire globe. While all data are provided at a ground sample distance (GSD) of 10m, the Sentinel images have a native resolution of about 10-20m per pixel, whereas the MODIS-derived land cover has a native resolution of 500m per pixel. The main challenge, therefore, is to train powerful models for high-resolution land cover prediction (target resolution: 10m) from noisy, low-resolution annotations. For validation and testing, semi-manually derived high-resolution land cover maps are used. They cover scenes with undisclosed geolocation that are not contained in the SEN12MS dataset. The validation and test datasets will be distributed step-by-step at IEEE DataPort (https://ieee-dataport.org/competitions/2020-ieee-grss-data-fusion-contest).
Sentinel-1 and Sentinel-2 satellite data
180,662 patch pairs of corresponding Sentinel-1 dual-pol SAR data and Sentinel-2 multispectral images. In detail:
- Sentinel-1 SAR: 2 channels corresponding to sigma nought backscatter values in dB scale for VV and VH polarization.
- Sentinel-2 Multispectral: 13 channels corresponding to the 13 spectral bands (B1, B2, B3, B4, B5, B6, B7, B8, B8a, B9, B10, B11, B12).
The patches comprising the dataset are distributed across the land masses of the Earth and spread over all four meteorological seasons. This is reflected by the dataset structure. All patches are provided in the form of 16-bit GeoTiffs with a GSD of 10m.
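As a rough illustration of the channel layout described above, the following sketch indexes the 13 Sentinel-2 bands by name and converts a Sentinel-1 backscatter value from dB to linear scale. The channel-first nested-list layout, the helper names `select_bands` and `db_to_linear`, and the assumption that channels are stored in the listed band order are illustrative only and are not taken from the contest data loader.

```python
# Channel names as listed above; storing them in this order is an assumption.
S1_CHANNELS = ["VV", "VH"]
S2_BANDS = ["B1", "B2", "B3", "B4", "B5", "B6", "B7",
            "B8", "B8a", "B9", "B10", "B11", "B12"]

# Map each Sentinel-2 band name to its (assumed) channel index.
S2_INDEX = {name: i for i, name in enumerate(S2_BANDS)}


def select_bands(patch, names):
    """Return the requested channels of a channel-first patch.

    `patch` is a list of 2-D channels (one per band), for example as
    read from a GeoTIFF by any raster library (hypothetical layout).
    """
    return [patch[S2_INDEX[name]] for name in names]


def db_to_linear(value_db):
    """Convert a backscatter value from dB to linear power scale."""
    return 10.0 ** (value_db / 10.0)


# Tiny synthetic 13-channel patch of 2x2 pixels: channel i is filled with i.
patch = [[[i, i], [i, i]] for i in range(len(S2_BANDS))]

# Pick the red, green, and blue bands for a true-color composite.
rgb = select_bands(patch, ["B4", "B3", "B2"])
```

Selecting bands by name rather than by hard-coded index keeps the code readable when only a subset of the 13 bands is fed to a model.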
MODIS-derived land cover labels
Routinely, the data of the MODIS instruments onboard the Terra and Aqua satellites of NASA are used to derive land cover maps at yearly intervals using six different classification schemes. The final land cover maps undergo additional post-processing including prior knowledge and ancillary information and come at a resolution of 500m per pixel.
By default, SEN12MS contains, for every Sentinel-1/Sentinel-2 patch pair, a 4-layer image of MODIS-derived land cover maps upsampled to a pixel spacing of 10m. The layers follow these classification schemes: IGBP (Loveland and Belward, 1997), and LCCS Land Cover, Land Use, and Surface Hydrology (Di Gregorio, 2005).
For the contest, a simplified version of the IGBP classification scheme is used. More details can be found at http://www.grss-ieee.org/community/technical-committees/data-fusion/
To load the SEN12MS training data in a form compatible with the contest (i.e., Sentinel-1, Sentinel-2, and 500m-resolution land cover labels following the simplified IGBP scheme), a Python-based data loader is provided at IEEE DataPort.
The validation images are provided in the same format that this data loader produces, but the semi-manually created land cover labels have a GSD of 10 m.
Note again: For the sake of convenience, ALL data follow a 10m-per-pixel sampling, regardless of their resolution.
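To make the difference between sampling and resolution concrete, the sketch below resamples a coarse label grid to a 10m pixel spacing. It assumes nearest-neighbor replication, which may differ from the resampling actually used to build SEN12MS; the function name `upsample_nearest` and the toy labels are hypothetical.

```python
MODIS_RES_M = 500   # native resolution of the MODIS-derived labels
GSD_M = 10          # pixel spacing at which all data are provided
FACTOR = MODIS_RES_M // GSD_M  # each coarse pixel spans FACTOR x FACTOR fine pixels


def upsample_nearest(labels, factor):
    """Replicate every label `factor` times along both axes."""
    out = []
    for row in labels:
        wide = [value for value in row for _ in range(factor)]
        # Emit independent copies so rows can be edited separately later.
        out.extend(list(wide) for _ in range(factor))
    return out


# A 2x2 grid of coarse labels (1km x 1km on the ground).
coarse = [[1, 2], [3, 4]]
fine = upsample_nearest(coarse, FACTOR)

# The fine grid has many more pixels but carries no additional information.
distinct_labels = {value for row in fine for value in row}
```

Under this assumption, a single 500m label is stretched over a 50 x 50 block of 10m pixels, which is why the training labels count as weak annotations for the 10m target resolution.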
References:
- Schmitt, M., Hughes, L. H., Qiu, C., and Zhu, X. X. (2019). SEN12MS – A curated dataset of georeferenced multi-spectral Sentinel-1/2 imagery for deep learning and data fusion. ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., IV-2/W7: 153–160.
- Loveland, T. R. and Belward, A. S. (1997). The international geosphere biosphere programme data and information system global land cover data set (DISCover). Acta Astronautica, 41(4): 681–689.
- Di Gregorio, A. (2005). Land cover classification system: classification concepts and user manual, LCCS. Food & Agriculture Org.
High-resolution land cover mapping with a global model
For validation and testing, patches of corresponding Sentinel-1 and Sentinel-2 data with all channels sampled to a pixel spacing of 10m are provided for undisclosed locations somewhere around the world. The objective is to predict a land cover map following the simplified IGBP scheme at a resolution of 10m. Participants are required to submit their land cover maps in raster format (similar to the tif files of the training and validation sets). Performance is assessed using average accuracy (AA). To promote innovation in two practical scenarios, the 2020 Data Fusion Contest consists of two challenge tracks:
Track 1: Land cover classification with low-resolution labels
Semi-manually derived high-resolution land cover maps for the validation set are kept undisclosed. The objective is to predict land cover labels using only low-resolution labels for training.
Track 2: Land cover classification with low- and high-resolution labels
Participants receive high-resolution labels for the validation set, and the goal is to train models for land cover mapping using both low-resolution labels and a limited number of high-resolution labels.
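The average accuracy (AA) used for assessment is commonly computed as the mean of the per-class accuracies, so that rare classes weigh as much as frequent ones. The sketch below follows that common definition; the exact evaluation code of the contest server is not given here, so the handling of classes (only those present in the reference are averaged) and the function name `average_accuracy` are assumptions.

```python
def average_accuracy(reference, prediction):
    """Return (AA, per-class accuracy) for two flat label sequences.

    Only classes that occur in `reference` contribute to the mean
    (an assumption about how absent classes are treated).
    """
    correct = {}
    total = {}
    for ref, pred in zip(reference, prediction):
        total[ref] = total.get(ref, 0) + 1
        if ref == pred:
            correct[ref] = correct.get(ref, 0) + 1
    per_class = {c: correct.get(c, 0) / total[c] for c in total}
    return sum(per_class.values()) / len(per_class), per_class


# Flattened toy maps: class 1 is frequent, class 2 is rare.
reference = [1, 1, 1, 1, 1, 1, 2, 2]
prediction = [1, 1, 1, 1, 1, 1, 1, 2]

aa, per_class = average_accuracy(reference, prediction)

# Overall accuracy for comparison: fraction of correctly labeled pixels.
overall = sum(r == p for r, p in zip(reference, prediction)) / len(reference)
```

In this toy case the overall accuracy is higher than the AA because the one error falls on the rare class, which illustrates why AA penalizes models that ignore infrequent land cover types.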
Results, Awards, and Prizes:
The following seven teams will be declared as winners:
- The first, second, third, and fourth ranked teams in track 1
- The first, second, and third ranked teams in track 2
The authors of the seven winning submissions will:
- Present their manuscripts in an invited session dedicated to the Contest at IGARSS 2020
- Publish their manuscripts in the Proceedings of IGARSS 2020
The first ranked teams in both tracks will:
- Receive a special prize at IGARSS 2020
- Co-author a journal paper (limited to 3 co-authors per team), which will summarize the outcome of the Contest and will be submitted to IEEE JSTARS. To maximize impact and promote research in global land cover mapping with weak supervision, the open-access option will be used for this journal submission.
Top-ranked teams will receive their awards at IGARSS 2020 in Waikoloa, Hawaii, USA, in July 2020. The costs for open-access publication will be supported by the GRSS. The winning team prize is kindly sponsored by the IGARSS 2020 team.
The rules of the game:
- The SEN12MS dataset can be openly downloaded at https://mediatum.ub.tum.de/1474000
- Validation and test data can be requested by registering for the Contest at IEEE DataPort.
- To enter the contest, participants must read and accept the Contest Terms and Conditions.
- Participants are expected to submit land cover maps following the simplified IGBP scheme in raster format (similar to the tif files of the training and validation sets).
- For the sake of visual comparability of the results, all land cover maps shown in figures or illustrations should follow the color palette of the simplified IGBP scheme detailed in the class table above.
- The classification results will be submitted to the Codalab competition website (https://competitions.codalab.org/competitions/22289) for evaluation.
- Ranking between the participants will be based on average accuracy (AA) of all classes.
- The maximum number of trials of one team for each classification challenge is ten in the test phase.
- Submission server for track 1 will be open from March 1, 2020. Deadline of classification result submission for track 1 is March 6, 2020, 23:59 UTC – 12 hours (e.g., March 6, 2020, 7:59 in New York City, 13:59 in Paris, or 19:59 in Beijing).
- Submission server for track 2 will be open from March 6, 2020. Deadline of classification result submission for track 2 is March 20, 2020, 23:59 UTC – 12 hours (e.g., March 20, 2020, 7:59 in New York City, 13:59 in Paris, or 19:59 in Beijing).
- Each team needs to submit a short paper of 1–2 pages by February 28, 2020, describing the approach used, the team members, their Codalab accounts, and the one Codalab account to be used for the test phase. Please send the paper to email@example.com using the IGARSS paper template.
- For the seven winners, the internal deadline for full paper submission is April 24, 2020, 23:59 UTC – 12 hours (e.g., April 24, 2020, 7:59 in New York City, 13:59 in Paris, or 19:59 in Beijing). The IGARSS full paper submission deadline is May 29, 2020.
- By submitting a classification result, each team acknowledges that, should the result be among the winners, at least one team member will participate in the invited session at IGARSS 2020.
Failure to follow any of these rules will automatically make the submission invalid, resulting in the manuscript not being evaluated and disqualification from prize award.
Participants in the Contest are requested not to submit an extended abstract to IGARSS 2020 by the corresponding conference deadline in January 2020. Only contest winners (participants corresponding to the seven best-ranking submissions) will submit a 4-page paper describing their approach to the Contest by April 24, 2020. The received manuscripts will be reviewed by the Award Committee of the Contest, and the reviews will be sent to the winners. The winners will then submit the final version of the 4-page full paper to the IGARSS Data Fusion Contest Invited Session by May 29, 2020, for inclusion in the IGARSS Technical Program and Proceedings.
The contest is being organized in collaboration with the research group for Signal Processing in Earth Observation (SiPEO) at the Technical University of Munich (TUM). The IADF TC chairs would like to thank TUM-SiPEO for providing the data used in the competition, and the IEEE GRSS for continuously supporting the annual Data Fusion Contest through funding and resources. Original Copernicus Sentinel Data 2019 are available from the European Space Agency (https://sentinel.esa.int/). MODIS-derived land cover maps are available from the Land Processes Distributed Active Archive Center (https://lpdaac.usgs.gov/).
Contest Terms and Conditions
The data are provided for the purpose of participation in the 2020 Data Fusion Contest. Participants acknowledge that they have read and agree to the following Contest Terms and Conditions:
- The data can be used in scientific publications subject to approval by the IEEE GRSS Image Analysis and Data Fusion Technical Committee and by the Technical University of Munich on a case-by-case basis. To submit a scientific publication for approval, the publication shall be sent as an attachment to an e-mail addressed to firstname.lastname@example.org.
- In any scientific publication using the data, the data shall be referenced as follows: “[REF. NO.] 2020 IEEE GRSS Data Fusion Contest. Online: http://www.grss-ieee.org/community/technical-committees/data-fusion”.
- Any scientific publication using the data shall include a section “Acknowledgement”. This section shall include the following sentence: “The authors would like to thank the research group for Signal Processing in Earth Observation at the Technical University of Munich for providing the data used in this study, and the IEEE GRSS Image Analysis and Data Fusion Technical Committee for organizing the Data Fusion Contest.”
- Any scientific publication using the data shall refer to the following paper:
- [Schmitt et al., 2019] M. Schmitt, L. H. Hughes, C. Qiu, and X. X. Zhu, “SEN12MS – A curated dataset of georeferenced multi-spectral sentinel-1/2 imagery for deep learning and data fusion,” in ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci. IV-2/W7, 2019, pp. 153–160.