Four fully annotated marine image datasets. The annotations are given as train and test splits that can be used to evaluate machine learning methods.


The following classes of fauna were used for annotation:

  • anemone
  • coral
  • crustacean
  • ipnops fish
  • litter
  • ophiuroid
  • other fauna
  • sea cucumber
  • sponge
  • stalked crinoid

For a definition of the classes see [1].

Each dataset archive contains the following files:

  • annotations/test.csv: The BIIGLE CSV annotation report of the annotations of the test split of this dataset. These annotations are used to test the performance of the trained Mask R-CNN model.
  • annotations/train.csv: The BIIGLE CSV annotation report of the annotations of the train split of this dataset. These annotations are used to generate the annotation patches which are transformed with scale and style transfer to be used to train the Mask R-CNN model.
  • images/: Directory that contains all the original image files.
  • dataset.json: JSON file that contains information about the dataset.
    • name: The name of the dataset.
    • images_dir: Name of the directory that contains the original image files.
    • metadata_file: Path to the CSV file that contains image metadata.
    • test_annotations_file: Path to the CSV file that contains the test annotations.
    • train_annotations_file: Path to the CSV file that contains the train annotations.
    • annotation_patches_dir: Name of the directory that should contain the scale- and style-transferred annotation patches.
    • crop_dimension: Edge length of an annotation or style patch in pixels.
  • metadata.csv: A CSV file that contains metadata for each original image file. In this case the distance of the camera to the sea floor is given for each image.
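As a minimal sketch of working with this layout, the `dataset.json` fields listed above can be loaded and used to resolve the annotation files. The field names below are taken from the list above; the concrete values are placeholders, not from an actual dataset:

```python
import json
import os
import tempfile

# Hypothetical dataset.json matching the fields described above
# (values are placeholders, not taken from an actual dataset).
meta = {
    "name": "example-dataset",
    "images_dir": "images",
    "metadata_file": "metadata.csv",
    "test_annotations_file": "annotations/test.csv",
    "train_annotations_file": "annotations/train.csv",
    "annotation_patches_dir": "patches",
    "crop_dimension": 512,
}

with tempfile.TemporaryDirectory() as tmp:
    json_path = os.path.join(tmp, "dataset.json")
    with open(json_path, "w") as f:
        json.dump(meta, f)

    # Load the dataset description and pick out the paths it points to.
    with open(json_path) as f:
        ds = json.load(f)

    train_csv = ds["train_annotations_file"]
    crop = ds["crop_dimension"]
```

The resolved paths (`train_csv`, and analogously the test annotations and metadata file) can then be passed to whatever CSV reader is used for the BIIGLE annotation reports.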

CUPSNBOTTLES is an object data set, recorded by a mobile service robot. There are 10 object classes, each with a varying number of samples. Additionally, there is a clutter class, containing samples where the object detector failed.


Download and extract the ZIP file containing all files. Python code to load the data set easily is provided under 'scripts'. The data are stored as .jpg, .hdf, and .csv files, which are also straightforward to read from other programming languages. For convenient access with Python, a pickle dump file has been added; it contains no extra information compared to the .csv file.
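For illustration, a hypothetical excerpt of such a CSV label file can be read with the Python standard library and tallied per class. The column names ("filename", "label") and the class labels here are assumptions, not the data set's actual schema:

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt of the data set's CSV file; the column names
# ("filename", "label") are assumptions, not the actual schema.
sample = """filename,label
0001.jpg,cup_red
0002.jpg,bottle_water
0003.jpg,cup_red
0004.jpg,clutter
"""

# Count samples per class, mirroring the varying class sizes described above.
counts = Counter(row["label"] for row in csv.DictReader(io.StringIO(sample)))
```

Replacing the in-memory string with an `open(...)` call on the extracted .csv file gives the real per-class sample counts.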


Along with the increasing use of unmanned aerial vehicles (UAVs), large volumes of aerial videos have been produced. It is impractical for humans to screen such large amounts of data and understand their contents. Hence, methodological research on the automatic understanding of UAV videos is of paramount importance.


=================  Authors  ===========================

Lichao Mou, Yuansheng Hua, Pu Jin, Xiao Xiang Zhu


=================  Citation  ===========================

If you use this dataset for your work, please use the following citation:


@article{mou_era,
  title   = {{ERA: A dataset and deep learning benchmark for event recognition in aerial videos}},
  author  = {Mou, L. and Hua, Y. and Jin, P. and Zhu, X. X.},
  journal = {IEEE Geoscience and Remote Sensing Magazine},
  year    = {in press}
}


==================  Notice!  ===========================

This dataset is ONLY released for academic uses. Please do not further distribute the dataset on other public websites.


This dataset contains paired thermal-visual images collected over 1.5 years at different locations in Chitrakoot and Prayagraj, India. The images can be broadly classified into greenery, urban scenes, historical buildings, and crowd data.

The crowd data was collected from the Maha Kumbh Mela 2019, Prayagraj, which is the largest religious fair in the world and is held every 6 years.



The images are organized according to the thermal imager used to capture them.

The SONEL thermal images are inside register_sonel.

The FLIR images are in register_flir and register_flir_old. There are two image ZIP files because FLIR thermal imagers reuse image names after a certain limit.

The unregistered images are kept in unreg folders inside each base ZIP.


The work associated with this database, which details the registration method, the overall logic behind the creation of this database, the resizing factors, and the reason why there are unregistered images, is a work on thermal image colorization. It has been submitted to IEEE for consideration and is currently available as a preprint on arXiv.

We ask that you refer to this work when using this database for your work.

A Novel Registration & Colorization Technique for Thermal to Cross Domain Colorized Images 


If you find any problem with the data in this dataset (missing images, wrong names, superfluous python files etc), please let us know and we will try to correct the same.


The naming classification is as follows:

  • FLIR
    • Registered images are named <name>.jpg and <name>_color.png, with the PNG file being the registered optical image.
    • The raw files are named FLIR<#number>.jpg and FLIR<#number+1>.jpg, where the first file is the thermal image.
    • The unreg_flir folder contains only the raw files.
  • SONEL
    • Registered images are named <name>.jpg and <name>_color.png, with the PNG file being the registered optical image.
    • The raw files are named IRI_<name>.jpg and VIS_<name>.jpg, where the IRI file is the thermal image and the VIS file is the visual image.
    • The unreg folder contains only the raw files.
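Assuming the registered-image naming scheme above (`<name>.jpg` thermal, `<name>_color.png` optical), the pairs can be matched by file name with a short Python sketch; the file names used here are hypothetical:

```python
import os

# Minimal sketch pairing registered thermal (.jpg) and optical (_color.png)
# files according to the naming scheme above. File names are hypothetical.
def pair_registered(filenames):
    thermals = {os.path.splitext(f)[0]: f for f in filenames if f.endswith(".jpg")}
    pairs = []
    for f in filenames:
        if f.endswith("_color.png"):
            stem = f[: -len("_color.png")]
            if stem in thermals:
                pairs.append((thermals[stem], f))
    return sorted(pairs)

files = ["scene1.jpg", "scene1_color.png", "scene2.jpg",
         "scene2_color.png", "IRI_raw1.jpg"]
pairs = pair_registered(files)
```

Raw files (FLIR<#number>.jpg, IRI_/VIS_ prefixes) would need their own pairing rules, since their thermal/visual correspondence is encoded differently.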


As developers create or analyze an application, they often want to visualize the code through a graphical notation that aids their understanding of the code's structure or behavior. To support this, we developed an integrated debugger. The debugger first records a walkthrough of the application as assembly instructions in a dynamic way. A compression-mapping block then transforms this output into a three-dimensional linked-list structure, which is in turn transformed into a tree structure by an improved suffix-tree algorithm.


The zizania image dataset consists of a total of 4900 zizania samples: 2648 high-quality samples and 2252 defective samples.

There are four classes in the apple image dataset: apples with a diameter greater than 90 mm, between 80 mm and 90 mm, and less than 80 mm, and apples with diseases and insect pests. The quantity distribution across these categories is 3647 (51.19%), 2464 (34.59%), 558 (7.83%), and 455 (6.39%), respectively.
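As a quick consistency check, the percentages above follow directly from the class counts (the dictionary keys below are shorthand labels, not official class names):

```python
# Class counts as stated above (keys are shorthand, not official class names).
counts = {"d>90mm": 3647, "80-90mm": 2464, "d<80mm": 558, "pests": 455}
total = sum(counts.values())
percent = {k: round(100 * v / total, 2) for k, v in counts.items()}
```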


Beijing Building Dataset (BGB) is an elevation satellite image dataset that integrates satellite images and aerial photographs for building detection and identification. It contains 2000 images from the Google Earth History Map of five different areas of Beijing on November 24th, 2016; all images are 512×512 pixels with a ground resolution of 0.458 m. The dataset covers more than 100 km² of Beijing, in both suburban and urban areas.
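The coverage figure is consistent with the stated tile size and resolution: ignoring any overlap between images, 2000 tiles of 512 px at 0.458 m per pixel cover roughly 110 km². A back-of-the-envelope check:

```python
# Back-of-the-envelope coverage check using the stated numbers.
tiles = 2000                          # number of images
side_m = 512 * 0.458                  # tile edge length in metres
area_km2 = tiles * side_m ** 2 / 1e6  # total area in km^2, ignoring overlap
```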


In recent years, the utilization of biometric information has become more and more common for various forms of identity verification and user authentication. However, as a consequence of the widespread use and storage of biometric information, concerns regarding sensitive information leakage and the protection of users' privacy have been raised. Recent research efforts targeted these concerns by proposing the Semi-Adversarial Networks (SAN) framework for imparting gender privacy to face images.


A double-identity fingerprint is a fake fingerprint created by aligning two fingerprints for maximum ridge similarity and then joining them along an estimated cutline such that relevant features of both fingerprints are present on either side of the cutline. The fake fingerprint, containing the features of the criminal and his innocuous accomplice, can be enrolled with an electronic machine-readable travel document and later used to cross the automated


Semantic Segmentation Image