SeaIceWeather
- Submitted by: Nabil Panchi
- Last updated: Fri, 04/12/2024 - 21:32
- DOI: 10.21227/q3v5-3348
Abstract
SeaIceWeather Dataset
This is the SeaIceWeather dataset, collected for training and evaluating deep learning-based de-weathering models. To the best of our knowledge, this is the first such publicly available dataset for the sea ice domain. This dataset accompanies our paper, "Deep Learning Strategies for Analysis of Weather-Degraded Optical Sea Ice Images", which can be accessed at: https://doi.org/10.1109/jsen.2024.3376518
Abstract of the paper:
Ship-based sea ice analysis algorithms rely on optical images captured in optimal weather conditions with high visibility. However, Arctic imagery is often affected by weather-related degradation due to haze, snow, and rain, impacting the efficacy of deep learning tasks for sea ice analysis, such as segmentation and classification. This article introduces and evaluates two strategies to address weather-induced degradation in optical sea ice images (RGB). Strategy 1 employs a two-step pipeline: first, removal of weather degradation using deep learning-based de-weathering algorithms, and then, analysis of images as a part of sea ice segmentation/classification tasks. Strategy 2 proposes a “weather as augmentation” training approach to create all-in-one weather-resilient segmentation and classification models. Furthermore, we introduce the first open-source ice image dataset (SeaIceWeather) with paired images—one clean and one weather-degraded. Such a dataset allows for training and validation of supervised deep learning-based de-weathering algorithms. Using this dataset, we show that the proposed strategies are effective against weather-degraded images, achieving parity with the segmentation and classification performance on clean images. In addition, we demonstrate de-weathering models capable of removing degradations due to four different weather conditions, including rain, haze, snow, and raindrops on a camera lens with a single set of weights. The presented strategies lay the foundation for robust shipborne sea ice analysis systems resilient to adverse weather conditions. Furthermore, we hope that the dataset introduced in this study ignites further interest in the analysis of weather-degraded sea ice images.
Data collection methodology
This dataset was collected on the GoNorth 2023 cruise. A setup with two GoPro Hero 11 cameras was used to collect paired clean and raindrop-degraded sea ice imagery. More information about the data collection procedure can be found in Section IV of our paper.
Description of the data
The main dataset folder (SeaIceWeather) is organized as follows:
SeaIceWeather/
    -Clean/
        -train_0.JPG
        -train_1.JPG
        ...
        -train_1699.JPG
        -valid_0.JPG
        -valid_1.JPG
        ...
        -valid_1599.JPG
    -Haze/
        -train_0.JPG
        -train_1.JPG
        ...
        -train_1699.JPG
        -valid_0.JPG
        -valid_1.JPG
        ...
        -valid_1599.JPG
    ...
    -License.txt
    -Readme.md
    -valid_camera_to_screen_dist.csv
Each subfolder contains a total of 3300 images (512 pixels wide, 384 pixels high): 1700 training images and 1600 validation images. This is the same split used in our paper mentioned above. The training images are marked with the train_ prefix, i.e., train_0.JPG, train_1.JPG, ..., train_1699.JPG. The validation images are marked with the valid_ prefix, i.e., valid_0.JPG, valid_1.JPG, ..., valid_1599.JPG.
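The naming scheme above makes it straightforward to check a downloaded subfolder for completeness. The following is a minimal sketch, not part of the dataset itself: the helper names `expected_names` and `missing_files` are hypothetical, and the counts are derived only from the file name ranges stated above.

```python
from pathlib import Path

# Counts implied by the stated ranges:
# train_0.JPG ... train_1699.JPG and valid_0.JPG ... valid_1599.JPG
N_TRAIN = 1700
N_VALID = 1600


def expected_names(split: str) -> list[str]:
    """Return the expected file names for one split ('train' or 'valid')."""
    count = {"train": N_TRAIN, "valid": N_VALID}[split]
    return [f"{split}_{i}.JPG" for i in range(count)]


def missing_files(subfolder: Path) -> list[str]:
    """List expected image names that are absent from a subfolder (hypothetical helper)."""
    present = {p.name for p in subfolder.iterdir()} if subfolder.is_dir() else set()
    wanted = expected_names("train") + expected_names("valid")
    return [name for name in wanted if name not in present]
```

For example, `missing_files(Path("SeaIceWeather") / "Clean")` would return an empty list for a complete copy of the Clean folder.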
For each clean image in the Clean folder, seven degraded variations of the same image were created, covering four types of weather conditions: snow, rain, haze, and raindrops. Three types of snow degradation were created: Snow-S, Snow-M, and Snow-L, where S, M, and L indicate the size of the snow particles, i.e., small, medium, and large. Two types of raindrop degradation were created: Raindrop-real, collected on the GoNorth 2023 cruise, and Raindrop-sim, with simulated raindrops. Rain and haze have one variation each. Details about the creation and collection of our dataset can be found in Section IV of our paper.
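Because each degraded image shares its file name with the corresponding clean image, clean/degraded pairs can be built by name. The sketch below illustrates this under stated assumptions: the function name `paired_paths` is hypothetical, and only the `Clean` and `Haze` folder names are confirmed by the folder listing; names for the other degradation folders should be checked against the downloaded data.

```python
from pathlib import Path
from typing import Iterator


def paired_paths(
    root: Path, degradation: str, split: str = "train"
) -> Iterator[tuple[Path, Path]]:
    """Yield (clean, degraded) path pairs that share the same file name.

    `degradation` is the name of a degraded subfolder, e.g. "Haze".
    Other folder names are an assumption and may differ in the real data.
    """
    clean_dir = root / "Clean"
    degraded_dir = root / degradation
    # Sort by the numeric index so that train_2 comes before train_10.
    clean_files = sorted(
        clean_dir.glob(f"{split}_*.JPG"),
        key=lambda p: int(p.stem.split("_")[1]),
    )
    for clean in clean_files:
        degraded = degraded_dir / clean.name
        if degraded.exists():  # skip clean images without a counterpart
            yield clean, degraded
```

A training loop for a supervised de-weathering model could then iterate over `paired_paths(Path("SeaIceWeather"), "Haze", "train")` and load each pair with an image library of choice.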
For the validation images, the approximate distance between the camera and the screen with droplets was also recorded. It is provided in the 'valid_camera_to_screen_dist.csv' file, which is organized as follows:
Image Name,Camera to Screen Distance [cm]
valid_0.JPG,0
valid_1.JPG,1
valid_2.JPG,2
valid_3.JPG,3
...
valid_1599.JPG,6
The first column presents the image name, and the second column provides the rough distance between the camera and the screen with water droplets. Note that this is only relevant to the Raindrop-real degradation.
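The distance file can be read with Python's standard `csv` module. This is a minimal sketch that assumes the header row matches the excerpt above exactly and that every distance is numeric; the function name `load_distances` is hypothetical.

```python
import csv
from pathlib import Path

NAME_COL = "Image Name"
DIST_COL = "Camera to Screen Distance [cm]"


def load_distances(csv_path: Path) -> dict[str, float]:
    """Map each validation image name to its camera-to-screen distance in cm."""
    # utf-8-sig also reads plain UTF-8 and strips a byte-order mark if present.
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        reader = csv.DictReader(f)
        return {row[NAME_COL]: float(row[DIST_COL]) for row in reader}
```

The resulting dictionary can be used, for instance, to group Raindrop-real validation results by distance.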
File formats and preprocessing
The images are in .JPG format. The images with real raindrops were semi-manually aligned with the clean images. Information about pre-processing is provided in Section IV of our paper.
Authors
- Nabil Panchi (nabilpa@ntnu.no, panchinabil@gmail.com)
- Ekaterina Kim (ekaterina.kim@ntnu.no)
Ownership
This dataset was created as part of Nabil Panchi's PhD work and is owned by the Norwegian University of Science and Technology (NTNU).
License
This dataset is licensed under the CC BY-NC-SA 4.0 license. See the license terms in the License.txt file. These terms were taken from https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode.en, accessed on 12.04.24.
Acknowledgments
The authors wish to thank:
- Mr. Stian Eine for his help in creating the data collection module.
- The GoNorth cruise managers for allowing us to participate in the GoNorth 2023 cruise.
The computations were performed on resources provided by UNINETT Sigma2 - the National Infrastructure for High Performance Computing and Data Storage in Norway.