<p>This is an image dataset for satellite image processing: a collection of thermal infrared and multispectral images.</p>


Dataset images
Thermal infrared and multispectral images
Image size: 512x512
File format: .h5
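Since the images are distributed as .h5 files, they can be read with the h5py package. The dataset key used below ("images") is an assumption for illustration; inspect the actual keys of a real file with list(f.keys()). The sketch writes a stand-in file first so that it is self-contained:

```python
# Sketch of loading a 512x512 image from an .h5 file with h5py.
# The key name "images" is an assumption; check the real files' keys.
import os
import tempfile

import h5py
import numpy as np

# Create a stand-in .h5 file so the sketch runs without the dataset.
path = os.path.join(tempfile.mkdtemp(), "sample.h5")
with h5py.File(path, "w") as f:
    f.create_dataset("images", data=np.zeros((512, 512), dtype=np.uint16))

# Loading pattern: open the file and read a dataset by key.
with h5py.File(path, "r") as f:
    print(list(f.keys()))  # inspect the available datasets
    img = f["images"][:]   # read the full array into memory

print(img.shape)  # (512, 512)
```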


In the field of 3D reconstruction, although there exist standard datasets for evaluating the segmentation results of close-up 3D models, these datasets cannot be used to evaluate the segmentation results of 3D models based on satellite images. To address this issue, we provide a standard dataset for evaluating the segmentation results of satellite images and their corresponding digital surface models (DSMs). In this dataset, the satellite images maintain an exact correspondence with the DSMs, so the segmentation results of both the satellite images and the DSMs can be evaluated with our proposed dataset.


The accompanying dataset for the CVSports 2021 paper: DeepDarts: Modeling Keypoints as Objects for Automatic Scoring in Darts using a Single Camera

Paper Abstract:


The recommended way to load the labels is to use the pandas Python package:

import pandas as pd

labels = pd.read_pickle("labels.pkl")

See the GitHub repository for more information: https://github.com/wmcnally/deep-darts
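The column layout of labels.pkl is not described here (it is defined in the repository), so a quick way to orient yourself is to load the pickle and inspect it. The stand-in frame below uses made-up column names purely so the sketch is self-contained and runnable:

```python
# Sketch of loading and inspecting the labels with pandas.
# The columns of the stand-in frame ("img_name", "xy") are made up;
# the real schema is defined in the deep-darts repository.
import os
import tempfile

import pandas as pd

# Stand-in pickle so the sketch runs without the dataset.
path = os.path.join(tempfile.mkdtemp(), "labels.pkl")
pd.DataFrame({"img_name": ["d1_0001.jpg"], "xy": [[[0.1, 0.2]]]}).to_pickle(path)

labels = pd.read_pickle(path)
print(labels.columns.tolist())  # see which fields are available
print(len(labels))              # number of labelled images
```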


We present a 4D Light Field (LF) video dataset collected with a camera matrix, to be used for designing and testing algorithms and systems for LF video coding and processing. To collect these videos, a 10x10 LF capture matrix composed of 100 cameras was designed and implemented; the resolution of each camera is 1920x1056. The videos are taken under real, varying illumination conditions. The dataset contains nine groups of LF videos in total, of which eight are collected with a fixed camera-matrix position and a fixed orientation.


The stream data of all LF videos are stored under the directory LF_video_data, and each video contains 100 stream files. The stream files are in 1920x1056, 24 frames per second MP4 format, without audio. There are 100 mat files in the Calibration_mat directory, which store the camera parameters of all 100 lenses in the matrix. The depth estimation results are stored under the directory Depth_evaluation_result and are for reference only.
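The calibration .mat files can be read in Python with scipy.io.loadmat. The variable name ("K") and file name used below are assumptions for illustration; inspect the actual keys of the files in Calibration_mat. The sketch writes a stand-in file first so that it runs on its own:

```python
# Sketch of reading one camera's calibration .mat file with SciPy.
# The variable name "K" and the file name are assumptions; check the
# real contents of the files in Calibration_mat.
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Stand-in .mat file so the sketch runs without the dataset.
path = os.path.join(tempfile.mkdtemp(), "cam_000.mat")
savemat(path, {"K": np.eye(3)})  # e.g. a 3x3 intrinsic matrix

params = loadmat(path)  # dict of variable name -> array
K = params["K"]
print(K.shape)  # (3, 3)
```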


The dermoscopic images considered in the paper "Dermoscopic Image Classification with Neural Style Transfer" are available for public download through the ISIC database (https://www.isic-archive.com/#!/topWithHeader/wideContentTop/main). These are 24-bit JPEG images with a typical resolution of 768 × 512 pixels. However, not all the images in the database are in satisfactory condition.


In a context of rapid urban evolution, there is a need for surveying cities. Predictive models based on machine learning require large amounts of data to be trained, hence the necessity of public datasets that allow urban evolution to be followed. While most changes occur along the vertical axis, there is as yet no public change detection dataset composed of 3D point clouds and directly annotated for change at the point level.


Urban Point Clouds simulator

We have developed a simulator to generate time series of point clouds (PCs) for urban datasets. Given a 3D model of a city, the simulator allows us to introduce random changes into the model and generates a synthetic aerial LiDAR survey (ALS) above the city. In practice, the 3D model is derived from a real city, e.g. with Level of Detail 2 (LoD2) precision. From this model, we extract each existing building as well as the ground. By adding or removing buildings in the model, we can simulate the construction or demolition of buildings. Note that, depending on the area, the ground is not necessarily flat. The simulator allows us to obtain as many 3D PCs over changed urban areas as needed. It is especially useful for supervised deep learning approaches that require large amounts of training data. Moreover, the created PCs are all directly annotated by the simulator according to the changes, so no time-consuming manual annotation is needed with this process.

For each obtained model, the ALS simulation is performed using a flight plan and ray tracing with the Visualisation ToolKit (VTK) Python library. The spacing between flight lines is computed according to predefined parameters such as resolution, overlap between swaths, and scanning angle. Following this computation, a flight plan is set with a random starting position and direction of flight in order to introduce more variability between two acquisitions. Moreover, Gaussian noise can be added to simulate errors and lack of precision in LiDAR range measurement and scan direction.
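The simulator's exact formulas are not given here, but the relation between scanning angle, flying height, swath overlap and line spacing is commonly derived with simple geometry. The function below is a sketch under that assumption, not the simulator's actual implementation:

```python
# A common geometric way to derive flight-line spacing from scanning
# angle, flying height and swath overlap. This is a sketch; the
# simulator's exact formulas are not specified in the text.
import math

def line_spacing(height_m, scan_angle_deg, overlap):
    """Spacing between adjacent flight lines.

    height_m: flying height above ground (m)
    scan_angle_deg: full scanning angle (degrees)
    overlap: fraction of the swath shared with the neighbouring line (0..1)
    """
    # Swath width on the ground for a symmetric scan about nadir.
    swath = 2 * height_m * math.tan(math.radians(scan_angle_deg) / 2)
    # Shift lines by the non-overlapping fraction of the swath.
    return swath * (1 - overlap)

# e.g. 500 m altitude, 40 degree scan angle, 20% overlap between swaths
print(round(line_spacing(500, 40, 0.2), 1))  # 291.2
```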

Dataset Description

To conduct fair qualitative and quantitative evaluation of PC change detection techniques, we have built datasets based on LoD2 models of the first and second districts of Lyon (https://geo.data.gouv.fr/datasets/0731989349742867f8e659b4d70b707612bece89), France. For each simulation, buildings have been added or removed to introduce changes into the model and to generate a large number of pairs of PCs. We also consider various initial states across simulations, and randomly update the set of buildings from the first date, through random addition or deletion of buildings, to create the second landscape. In addition, the flight starting position and direction are always set randomly. As a consequence, the acquisition patterns differ between generated PCs, so each acquisition may not have exactly the same visible or hidden parts.

This first version of the dataset is composed of point clouds at a challenging low resolution of around 0.5 points/meter².

Technical details

All PCs are available in PLY format. Each of the train, val, and test folders contains sub-folders with pairs of PCs: pointCloud0.ply and pointCloud1.ply for the first and second dates, respectively.

Each PLY file contains the X Y Z coordinates of each point and a label:

  • 0 for unchanged points
  • 1 for points on a new building
  • 2 for points on a demolished building.

The label is given in a scalar field named label_ch. Note that the first PC (pointCloud0.ply) also has a label field, though it is set to 0 for every point, because changes are defined relative to the previous date.
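Reading the label_ch field only requires parsing the PLY header to find the property's column. The minimal sketch below handles ASCII PLY with the standard library; real files may be binary PLY, in which case a library such as plyfile is more convenient. The tiny four-point cloud embedded here is made up for illustration:

```python
# Minimal sketch of reading the label_ch field from an ASCII PLY file.
# The sample point cloud below is invented for illustration only.
import io

SAMPLE_PLY = """\
ply
format ascii 1.0
element vertex 4
property float x
property float y
property float z
property int label_ch
end_header
0.0 0.0 0.0 0
1.0 0.0 0.0 0
0.0 1.0 5.0 1
0.0 1.0 0.0 2
"""

def read_labels(fp):
    # Walk the header, remembering which column holds label_ch.
    props = []
    for line in fp:
        line = line.strip()
        if line.startswith("property"):
            props.append(line.split()[-1])
        if line == "end_header":
            break
    col = props.index("label_ch")
    # Remaining lines are vertices; pull out the label column.
    return [int(float(row.split()[col])) for row in fp if row.strip()]

labels = read_labels(io.StringIO(SAMPLE_PLY))
print(labels)           # [0, 0, 1, 2]
print(labels.count(1))  # points on new buildings: 1
```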


If you use this dataset for your work, please use the following citation:

@inproceedings{degelis2021,
  title={Benchmarking change detection in urban 3D point clouds},
  author={de G\'elis, I. and Lef\`evre, S. and Corpetti, T. and Ristorcelli, T. and Th\'enoz, C. and Lassalle, P.},
  booktitle={IEEE International Geoscience and Remote Sensing Symposium (IGARSS)},
  year={2021}
}


Reverse transcription-polymerase chain reaction (RT-PCR) is currently the gold standard in COVID-19 diagnosis. It can, however, take days to provide a diagnosis, and its false negative rate is relatively high. Imaging, in particular chest computed tomography (CT), can assist with the diagnosis and assessment of this disease. Nevertheless, a standard-dose CT scan imposes a significant radiation burden on patients, especially those in need of multiple scans.



“Dataset-S1” contains two folders of COVID-19 and normal DICOM images, named “COVID-S1” and “Normal-S1”, respectively. Within the same folder, three CSV files are available. The first, named “Radiologist-S1.csv”, contains the labels assigned to the corresponding cases by three experienced radiologists. The second CSV file, “Clinical-S1.csv”, includes the clinical information as well as the result of the RT-PCR test, if available. The third file, named “LDCT-SL-Labels-S1.csv”, contains the slice-level labels for the COVID-19 cases; in other words, the slices demonstrating infection are specified in this file.

Each row in this CSV file corresponds to a specific case, and each column represents the slice number in the volumetric CT scan. Label 1 indicates a slice with the evidence of infection, while 0 is assigned to slices with no evidence of infection.

Note that slices in each case should be sorted based on the “Slice-Location” value to match with the provided labels in the CSV file. The Slice Location values are stored in DICOM files and accessible from the following DICOM tag: (0020,1041) – DS – Slice Location
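The steps above (sort by Slice Location, then pair slices with the 0/1 columns of the CSV) can be sketched as follows. The file names, locations, and label row below are made up; with real data the locations would come from the DICOM tag (0020,1041), e.g. via pydicom:

```python
# Sketch of aligning DICOM slices with the slice-level labels:
# sort slices by their Slice Location value, then pair them with one
# row of LDCT-SL-Labels-S1.csv. All values below are invented.
slices = [
    ("slice_a.dcm", 12.5),  # (file name, Slice Location value)
    ("slice_b.dcm", -2.5),
    ("slice_c.dcm", 5.0),
]
row_labels = [0, 1, 0]  # one CSV row: 0/1 label per sorted slice

# Sort by Slice Location so the slice order matches the CSV columns.
ordered = sorted(slices, key=lambda s: s[1])
paired = [(name, lab) for (name, _), lab in zip(ordered, row_labels)]
print(paired)
```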

“Dataset-S2” contains 100 COVID-19 positive cases, confirmed with the RT-PCR test. 68 cases have related imaging findings, whereas 32 do not reveal signs of infection. These two groups are placed in the two folders “PCP-Lung-Positive” and “PCP-Lung-Negative”. “Dataset-S2” also includes a CSV file, named “Clinical-S2.csv”, presenting the clinical information.



The AOLAH databases are contributions from the Aswan Faculty of Engineering to help researchers in the field of online handwriting recognition build a powerful system to recognize Arabic handwritten script. AOLAH stands for Aswan On-Line Arabic Handwritten, where “Aswan” is the small, beautiful city located in the south of Egypt, “On-Line” means that the databases are collected at the same time as they are written, “Arabic” because these databases are collected only for Arabic characters, and “Handwritten” because they are written by the natural human hand.


* There are two databases. The first is for Arabic characters; it consists of 2,520 sample files (90 writers × 28 characters) written by 90 writers using a simulation of a stylus pen and a touch screen. The second is for Arabic character strokes; it consists of 1,530 sample files covering 17 strokes and is extracted from the first database by extracting strokes from characters.
* Writers are volunteers from the Aswan Faculty of Engineering, aged 18 to 20 years.
* Natural writings with unrestricted writing styles.
* Each volunteer writes the 28 characters of Arabic script using the GUI.
* It can be used for online Arabic character recognition.
* The developed data collection tool is code that acts as a simulation of a stylus pen and a touch screen; pre-processed data samples of characters are also available to researchers.
* The database is available free of charge (for academic and research purposes) to the researchers.
* The databases available here are the training databases.


The images containing honey bees were extracted from a video recorded in the Botanic Garden of the University of Ljubljana, where a beehive with a colony of the Carniolan Grey, the native Slovene honey bee, is placed. We set the camera above the beehive entrance and recorded the honey bees on the shelf in front of the entrance as well as those entering and exiting the hive. With such a setup, we ensured non-invasive recording of the honey bees in their natural environment. The dataset contains 65 images of size 2688 x 1504 pixels.


The dataset consists of two classes: COVID-19 cases and Healthy cases.


Unzip the dataset