Ground-based Pixel-level Cloud Dataset (GPCD)
- Submitted by: fy hu
- Last updated: Mon, 11/11/2024 - 10:54
- DOI: 10.21227/fdef-gx52
Abstract
Using PVIFS-02 whole-sky imagers, we collected 500,000 independent cloud images from 2021 to 2023 in one southern city and one northern city in China. The images collected in southern China are clear, with obvious cloud edges. In contrast, the images from northern China appear relatively blurred. We attribute this difference to the geography of northern China, where sand and dust frequently affect the region and blur the images to some degree. This blurring makes cloud detection and classification more challenging.
To train and test algorithms for pixel-level cloud detection and classification, we selected 714 images containing various cloud types, normalized them to a resolution of 1024 × 1024, and manually annotated them at the pixel level. The annotated dataset, referred to as the Ground-based Pixel-level Cloud Dataset (GPCD), is shown in Fig. \ref{fig3}. It contains two types of annotation files: one with a binarized cloud-sky segmentation, and one that classifies clouds into eight categories at the pixel level according to the cloud genera definitions of the World Meteorological Organization (WMO), the approximate appearance of the clouds, and sky conditions in practice. Table \ref{tab1} presents the cloud genera and descriptions for each category in GPCD.
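The relationship between the two annotation types can be illustrated with a minimal sketch. It assumes, hypothetically, that a category annotation is available as a 2-D array of integer labels and that one label (`sky_label`, set to 0 here) marks clear sky; the actual file format and label encoding of GPCD are not specified on this page, and the function names are illustrative only.

```python
import numpy as np


def to_binary_mask(class_mask, sky_label=0):
    """Collapse a per-pixel category mask into a cloud/sky mask.

    ASSUMPTION: `sky_label` is the integer used for clear sky; the real
    GPCD encoding may differ and should be checked against the data.
    """
    class_mask = np.asarray(class_mask)
    # 1 where the pixel belongs to any cloud category, 0 where it is sky.
    return (class_mask != sky_label).astype(np.uint8)


def class_pixel_counts(class_mask, num_labels):
    """Count pixels per label, assuming labels are integers 0..num_labels-1."""
    class_mask = np.asarray(class_mask)
    return np.bincount(class_mask.ravel(), minlength=num_labels)


def cloud_fraction(binary_mask):
    """Fraction of pixels marked as cloud in a binary mask."""
    return float(np.mean(binary_mask))
```

For a real annotation, `class_mask` would be the 1024 × 1024 label array loaded from the category annotation file, and the derived binary mask could be compared against the provided cloud-sky annotation as a consistency check.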
To further enhance the robustness and applicability of the GPCD, the dataset was subdivided into two region-specific subsets: GPCD-North and GPCD-South. This subdivision was based on the geographical origin of the images, with GPCD-North comprising data collected from northern China and GPCD-South encompassing data from southern China. The rationale behind creating these subsets is to account for regional atmospheric differences that may influence cloud morphology and behavior. By conducting separate analyses on these subsets, we aim to evaluate the performance and robustness of cloud detection and classification algorithms in the context of regional variations. This approach not only allows for a more nuanced understanding of algorithm performance across diverse climatic conditions but also facilitates the testing of transfer learning capabilities.
This is an image dataset; use Python to read it.
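A minimal sketch of reading the images in Python follows. The directory layout, file extensions, and the convention that an image and its annotation share a file name stem are all assumptions made for illustration; adjust them to the layout of the released archive. Image decoding uses the third-party Pillow package, imported only when an image is actually loaded.

```python
from pathlib import Path

# ASSUMPTION: images and annotations are stored in common raster formats.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_images(root):
    """Return all image files under `root`, sorted by path."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def pair_by_stem(image_dir, mask_dir):
    """Pair each image with the annotation that has the same file stem.

    ASSUMPTION: "0001.jpg" is annotated by "0001.png" (hypothetical naming).
    Images without a matching annotation are skipped.
    """
    masks = {p.stem: p for p in list_images(mask_dir)}
    return [
        (img, masks[img.stem])
        for img in list_images(image_dir)
        if img.stem in masks
    ]


def load_image(path):
    """Load one image as an RGB Pillow image (requires `pip install pillow`)."""
    from PIL import Image

    with Image.open(path) as im:
        return im.convert("RGB")
```

With hypothetical folders named `images` and `masks`, `pair_by_stem("images", "masks")` yields `(image_path, mask_path)` tuples that can be passed to `load_image` one at a time, which avoids holding the whole dataset in memory.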
Comments
We will upload the complete dataset after the article is accepted.