GeoNRW

Citation Author(s):
Gerald Baier (RIKEN AIP)
Antonin Deschemps (INRIA)
Michael Schmitt (Munich University of Applied Sciences)
Naoto Yokoya (The University of Tokyo & RIKEN AIP)
Submitted by: Naoto Yokoya
Last updated: Mon, 11/23/2020 - 08:27
DOI: 10.21227/s5xq-b822

Abstract 

This dataset consists of orthorectified aerial photographs, LiDAR-derived digital elevation models, and segmentation maps with 10 classes, acquired through the open data program of the German state of North Rhine-Westphalia (https://www.opengeodata.nrw.de/produkte/) and refined with OpenStreetMap. Please check the license information (http://www.govdata.de/dl-de/by-2-0). Preprocessing consists of resampling the 0.1 m resolution photographs to 1 m, taking the first LiDAR return and averaging within each 1 m² cell to arrive at the same resolution as the photographs, and rasterizing the vector files of the land cover data. In total, the dataset consists of 7783 triplets of size 1000x1000 pixels.
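The averaging step for the elevation data can be illustrated with a minimal Python sketch. It assumes the first returns are already available as (x, y, z) tuples in metric coordinates. The function name grid_mean_height and its parameters are hypothetical, and this is not the code used to build the dataset (that code is in the repository mentioned under Instructions).

```python
import math


def grid_mean_height(points, x_min, y_min, n_cols, n_rows, cell=1.0):
    """Average the z value of all points falling into each grid cell.

    points: iterable of (x, y, z) tuples (assumed: first LiDAR returns).
    x_min, y_min: coordinates of the lower-left corner of the grid.
    cell: cell edge length in the same unit as x and y (1.0 for 1 m cells).
    Returns a list of rows; row 0 is the row that starts at y_min.
    Cells that contain no point are set to NaN.
    """
    sums = [[0.0] * n_cols for _ in range(n_rows)]
    counts = [[0] * n_cols for _ in range(n_rows)]

    for x, y, z in points:
        col = math.floor((x - x_min) / cell)
        row = math.floor((y - y_min) / cell)
        # Ignore points that fall outside the requested grid.
        if 0 <= col < n_cols and 0 <= row < n_rows:
            sums[row][col] += z
            counts[row][col] += 1

    return [
        [s / c if c else math.nan for s, c in zip(sum_row, count_row)]
        for sum_row, count_row in zip(sums, counts)
    ]


# Three illustrative points on a 2 x 2 grid of 1 m cells.
grid = grid_mean_height(
    [(0.2, 0.1, 10.0), (0.7, 0.9, 20.0), (1.5, 0.5, 30.0)],
    x_min=0.0, y_min=0.0, n_cols=2, n_rows=2,
)
```

Note that row 0 of the result is the southernmost row here, whereas raster files are usually stored with north at the top, so the rows would need to be reversed before writing such a grid to a GeoTIFF.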

Instructions: 

Dataset description

The data was mostly acquired over urban areas in North Rhine-Westphalia, Germany. Since the acquisition dates for the aerial photographs and LiDAR do not match exactly, the two can differ in what they show and in the season they capture, e.g., trees change or lose their leaves in autumn. In our experience, these differences are not drastic but should be kept in mind.

We have included two Python scripts: plot_examples.py creates the example image used on this website, and calc_and_plot_stats.py calculates and plots the class statistics. Furthermore, we published the code to create the dataset at https://github.com/gbaier/geonrw, which makes it easy to extend the dataset with other areas in North Rhine-Westphalia. The repository also contains a PyTorch data loader.

This multimodal dataset should be useful for a variety of tasks, such as image segmentation using multiple inputs, height estimation from the aerial photographs, or semantic image synthesis.

Organization

Similar to the original source of the data (https://www.opengeodata.nrw.de/produkte/geobasis/lbi/dop/dop_jp2_f10_paketiert/), we organize all samples by the city they were acquired over. Their filenames, e.g., 345_5668_rgb.jp2, consist of the UTM zone 32N coordinates and the data type (RGB, DEM, or seg for land cover).
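As a minimal sketch of how such a filename could be split into its parts, the following Python helper assumes the pattern <easting>_<northing>_<type>.<extension> seen in the example above. The names TileName and parse_tile_name are hypothetical and not part of the accompanying scripts. The code makes no assumption about the unit of the two coordinate values.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TileName:
    easting: int   # first coordinate value in the filename (UTM zone 32N)
    northing: int  # second coordinate value in the filename (UTM zone 32N)
    kind: str      # data type token, e.g. "rgb"


def parse_tile_name(filename: str) -> TileName:
    """Split a name such as '345_5668_rgb.jp2' into coordinates and type."""
    stem = Path(filename).stem  # drops any directory part and the extension
    parts = stem.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected filename pattern: {filename!r}")
    easting, northing, kind = parts
    return TileName(int(easting), int(northing), kind.lower())


tile = parse_tile_name("345_5668_rgb.jp2")
```

Because the three files of a triplet share the same coordinates, the (easting, northing) pair can serve as a key for grouping the photograph, elevation model, and land cover map of one tile.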

File formats

All data is geocoded and can be opened using QGIS (https://www.qgis.org/). The aerial photographs are stored as JPEG2000 files, the land cover maps and digital elevation models both as GeoTIFFs. The accompanying scripts show how to read the data into Python.

Comments

Can this dataset be used for water body segmentation?

Submitted by Swati Gautam on Sun, 12/20/2020 - 05:22