Wide-area Sentinel-1 SAR mosaics patches over Finland for semantic segmentation


This dataset was used in the manuscript "Wide-Area Land Cover Mapping with Sentinel-1 Imagery using Deep Learning Semantic Segmentation Models" by Scepanovic et al. (https://doi.org/10.1109/JSTARS.2021.3116094). It contains preprocessed SAR backscatter digital numbers as 7000 GeoTIFF image patches of 512x512 pixels (about 10 km x 10 km) sampled from several wide-area SAR mosaics compiled from all Sentinel-1A images acquired over Finland in the summer of 2018. The patches were sampled from the part of Finland south of 66.0° N latitude, which is nearly the whole country excluding Finnish Lapland. Southern Finland is primarily covered by boreal forests with lakes, marshes, open bogs, agricultural areas, and urban settlements. The SAR data are accompanied by a reference ("ground truth") dataset representing 5 basic land cover classes, derived from the Finnish version of the CORINE Land Cover map, 2018.


The original SAR images were downloaded as Level-1 ground range detected (GRD) products. They represent focused SAR data that have been detected, multilooked, and projected to ground range using an Earth ellipsoid model. The images were orthorectified using in-house software of VTT Technical Research Centre of Finland, employing the local digital terrain model (20 m resolution) available from the National Land Survey of Finland. The pixel spacing of the orthorectified scenes was set to 20 m. Orthorectification included terrain flattening to obtain the backscatter signal in gamma-naught format.

The scenes were further reprojected to the ETRS89 / ETRS-TM35FIN projection (EPSG:3067) and resampled to a final pixel size of 20 m. The Sentinel-1 images were then mosaicked into homogeneous SAR mosaics, each covering the whole territory of Finland. Each mosaic was compiled from approximately 90 Sentinel-1 IW images (both ascending and descending paths), and it took approximately 12 days to collect enough imagery to cover the whole country. Altogether, seven SAR mosaics were produced during summer 2018. The training, development, and testing images were sampled from these mosaics.

The sampling approach is illustrated in the Figure. Image patches from spatial area C2 were set aside for testing (2800 images), while image patches from areas C1 and C3 (4200 images) were used in model training and validation.

The stored digital numbers (DN) represent backscatter amplitude. Each GeoTIFF file stores two bands (VH- and VV-polarization data). Backscatter values in dB can be obtained as gamma0_dB = 20*log10(DN/536).
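
The dB conversion above can be sketched in Python with NumPy as follows. This is a minimal illustration, not the authors' code: the function name dn_to_gamma0_db is hypothetical, and treating non-positive DN values as no-data (returned as NaN) is an assumption, since the description does not say how such pixels are encoded.

```python
import numpy as np

# Scaling constant taken from the formula gamma0_dB = 20*log10(DN/536).
CAL_DIVISOR = 536.0


def dn_to_gamma0_db(dn):
    """Convert backscatter amplitude DN to gamma-naught in dB.

    Hypothetical helper. Non-positive DN values are assumed to be
    no-data and are returned as NaN, because log10 is undefined there.
    """
    dn = np.asarray(dn, dtype=np.float64)
    out = np.full(dn.shape, np.nan)
    valid = dn > 0
    out[valid] = 20.0 * np.log10(dn[valid] / CAL_DIVISOR)
    return out


# A DN equal to the divisor maps to 0 dB; a tenfold smaller DN maps to -20 dB.
example = dn_to_gamma0_db(np.array([536.0, 53.6, 0.0]))
```

The same function can be applied to either band, since the formula does not distinguish between VH and VV.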

However, amplitude DNs without dB-conversion were used in the manuscript by Scepanovic et al. https://doi.org/10.1109/JSTARS.2021.3116094

For further use in the Semantic Segmentation Suite, an additional third band, the VH-to-VV ratio, needs to be computed. Further, all bands need to be normalized as described in the paper, so that input values in the range 0-255 are fed to the deep learning models. After calculating the ratio band, scaling coefficients such as [ 255/186; 255/354; 255/1.1 ] can be applied, for example.
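
A minimal sketch of this step in Python with NumPy is given below. Several details are assumptions rather than facts from the description: that the three coefficients apply to VH, VV, and the ratio band in that order, that the ratio is set to 0 where VV is 0, and that scaled values are clipped to 0-255. The function name to_three_band_input is hypothetical, and the inputs are taken to be 2-D arrays of amplitude DN already read from a patch.

```python
import numpy as np

# Example coefficients from the text; the band order (VH, VV, VH/VV)
# is an assumption.
SCALE = np.array([255.0 / 186.0, 255.0 / 354.0, 255.0 / 1.1])


def to_three_band_input(vh, vv):
    """Stack VH, VV and the VH-to-VV ratio, then scale towards 0-255.

    Hypothetical helper. vh and vv are 2-D arrays of amplitude DN.
    Returns a float array of shape (3, rows, cols).
    """
    vh = np.asarray(vh, dtype=np.float64)
    vv = np.asarray(vv, dtype=np.float64)
    # Assumption: pixels with VV == 0 get a ratio of 0 instead of inf/NaN.
    ratio = np.divide(vh, vv, out=np.zeros_like(vh), where=vv > 0)
    stacked = np.stack([vh, vv, ratio], axis=0)
    scaled = stacked * SCALE[:, None, None]
    # Assumption: values above the nominal maximum are clipped to 255.
    return np.clip(scaled, 0.0, 255.0)


vh = np.array([[93.0, 186.0], [372.0, 10.0]])
vv = np.array([[177.0, 0.0], [354.0, 20.0]])
patch = to_three_band_input(vh, vv)
```

The result is kept as floating point here; whether to cast to 8-bit integers before feeding a model depends on the training pipeline and is not specified in the description.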

In principle, other pre-processing options are possible, including computing dB-values for VH and VV bands before the normalization.

Dataset Files



readme.pdf (99.49 KB)