This dataset contains a collection of videos consisting of satellite imagery augmented with 3D ship models, accompanied by the ships' corresponding AIS data. The dataset is intended for detecting dark ships: sea vessels acting maliciously, often while spoofing their AIS data. Several existing datasets consist of satellite imagery of ships; this dataset has the advantage of also including each ship's corresponding AIS data. The simulated ships exhibit both normal and anomalous behavior, and the anomalous behavior may be either benign or malicious.


Efforts are underway to build computer-assisted tools for cancer diagnosis via image processing. Such tools require image capture, stain color normalization, segmentation of the cells of interest, and classification to count malignant versus healthy cells. This dataset is aimed at robust segmentation of cells, which is the first stage in building such a tool for the plasma cell cancer Multiple Myeloma (MM), a type of blood cancer. The images are provided after stain color normalization.



If you use this dataset, please cite the following publications:

  1. Anubha Gupta, Rahul Duggal, Shiv Gehlot, Ritu Gupta, Anvit Mangal, Lalit Kumar, Nisarg Thakkar, and Devprakash Satpathy, "GCTI-SN: Geometry-Inspired Chemical and Tissue Invariant Stain Normalization of Microscopic Medical Images," Medical Image Analysis, vol. 65, Oct 2020. DOI: (2020 IF: 11.148)
  2. Shiv Gehlot, Anubha Gupta and Ritu Gupta, "EDNFC-Net: Convolutional Neural Network with Nested Feature Concatenation for Nuclei-Instance Segmentation," ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 1389-1393.
  3. Anubha Gupta, Pramit Mallick, Ojaswa Sharma, Ritu Gupta, and Rahul Duggal, "PCSeg: Color model driven probabilistic multiphase level set based tool for plasma cell segmentation in multiple myeloma," PLoS ONE 13(12): e0207908, Dec 2018. DOI: 10.1371/journal.pone.0207908

The early detection of damaged (partially broken) outdoor insulators in primary distribution systems is of paramount importance for continuous electricity supply and public safety. In this dataset, we present images and videos for computer vision-based research. The dataset comprises images and videos taken from different sources, such as a drone, a DSLR camera, and a mobile phone camera.


Please see the attached file for a complete description.


The DREAM (Data Rang or EArth Monitoring): a multimode database including optics, radar, DEM and OSM labels for deep machine learning purposes.

DREAM is a multimodal remote sensing database developed from open-source data.

The database was created using the Google Earth Engine platform, the GDAL Python library, and the “pyosm” Python package developed by Alexandre Mayerowitz (Airbus, France). If you want to use this dataset in your study, please cite:


The two datasets are stored in two separate zip files: and

After unzipping, each directory contains subdirectories for the different areas. Each available tile is a 1024x1024 GeoTiff.

In France:

  • CoupleZZ_S2_date1_date2_XX_YY (Uint16 GeoTiff, UTM, RGB)
  • CoupleZZ_SRTM_V2_XX_YY (Int16 GeoTiff)
  • CoupleZZ_S1_date2_date1_XX_YY (Float32 GeoTiff, 2 bands, Red: VV, Green: HV)
  • CoupleZZ_S1moy_date2__dual_XX_YY (Float32 GeoTiff, 2 bands, Red: VV, Green: HV)
  • CoupleZZ_OSMraster_XX_YY (Uint8 GeoTiff, 3 bands, RGB)

In the USA, there are directories named zoneZ that include the following subdirectories:

  • optique: contains *_pauli_x***_y***_optique.tif
    • Ex: SanAnd_09018_18038_017_180730_L090_CX_01_pauli_x000_y002_optique.tif
  • radar: contains *_pauli_x***_y***.tif
    • Ex: SanAnd_09018_18038_017_180730_L090_CX_01_pauli_x000_y002.tif
  • S1: contains *_pauli_x***_y***_S1moy.tif
    • Ex: SanAnd_09018_18038_017_180730_L090_CX_01_pauli_x000_y002_S1moy.tif
  • S2: contains *_pauli_x***_y***_S2mosa.tif
    • Ex: SanAnd_09018_18038_017_180730_L090_CX_01_pauli_x000_y002_S2mosa.tif
  • SRTM: contains *__x***_y***_hgt.tif
    • Ex: SanAnd_09018_18038_017_180730_L090_CX_01__x000_y002_hgt.tif
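
The naming scheme of the USA tiles can be parsed to recover the tile position and the modality of each file. The following is a minimal sketch, assuming the pattern inferred from the example filenames above holds for every tile; the regular expression, the function name `parse_tile_name`, and the returned keys are illustrative and not part of the dataset.

```python
import re

# Pattern inferred from the example filenames; group names are hypothetical.
TILE_RE = re.compile(
    r"^(?P<scene>.+?)_(?:pauli)?_x(?P<x>\d{3})_y(?P<y>\d{3})"
    r"(?:_(?P<suffix>optique|S1moy|S2mosa|hgt))?\.tif$"
)

# Filename suffix -> subdirectory listed above (no suffix means radar).
SUFFIX_TO_DIR = {
    "optique": "optique",
    None: "radar",
    "S1moy": "S1",
    "S2mosa": "S2",
    "hgt": "SRTM",
}


def parse_tile_name(name):
    """Split a tile filename into scene id, tile indices, and modality."""
    match = TILE_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected tile name: {name}")
    return {
        "scene": match["scene"],
        "x": int(match["x"]),
        "y": int(match["y"]),
        "modality": SUFFIX_TO_DIR[match["suffix"]],
    }


if __name__ == "__main__":
    example = "SanAnd_09018_18038_017_180730_L090_CX_01_pauli_x000_y002_S1moy.tif"
    print(parse_tile_name(example))
```

Grouping the parsed results by (scene, x, y) would give the co-located optical, radar, S1, S2, and SRTM tiles for one location.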




Intracellular organelle networks such as the endoplasmic reticulum (ER) network and the mitochondrial network serve crucial physiological functions. The morphology of these networks plays a critical role in mediating their functions. Accurate image segmentation is required for analyzing the morphology of these networks in applications such as disease diagnosis and drug discovery. Deep learning models have shown remarkable advantages in accurate and robust segmentation of these complex network structures.


Basil (Tulsi) is cultivated in India for its spiritual significance and for use in essential oils and pharmaceuticals. Two types of Basil are cultivated in India: Krushna Tulsi (Black Tulsi) and Ram Tulsi (Green Tulsi).

Many investigators are working on disease detection in Basil leaves, in which the following diseases occur:

1) Gray Mold

2) Basal Root Rot, Damping Off

3) Fusarium Wilt and Crown Rot

4) Leaf Spot

5) Downy Mildew

Leaves can be labeled by quality (Healthy/Diseased) and classified based on their texture and color. For object detection, researchers use tools and methods such as YOLO, TensorFlow, OpenCV, and deep learning models such as CNNs.

I collected the dataset from the Amravati, Pune, and Nagpur regions of Maharashtra state. The images are in .jpg format.


India is the second-largest fruit and vegetable exporter in the world after China. It ranks first in the production of bananas, papayas, and mangoes. Public fruit datasets are available, but they are limited to general fruit classes and do not classify fruits by quality. To overcome this problem, we have created a dataset named FruitsGB (Fruits Good/Bad).


The dataset contains 12 classes of fruit, namely Bad Apple, Good Apple, Bad Banana, Good Banana, Bad Guava, Good Guava, Bad Lime, Good Lime, Bad Orange, Good Orange, Bad Pomegranate, and Good Pomegranate.
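
Each of the 12 class names combines a quality and a fruit type, so one label can serve either a 12-way, a 6-way (fruit), or a binary (good/bad) task. The sketch below illustrates that split; the function name `split_label` is hypothetical, and how the class names map to folders or files in the dataset is not specified here.

```python
# The 12 class names as listed in the description.
CLASSES = [
    "Bad Apple", "Good Apple", "Bad Banana", "Good Banana",
    "Bad Guava", "Good Guava", "Bad Lime", "Good Lime",
    "Bad Orange", "Good Orange", "Bad Pomegranate", "Good Pomegranate",
]


def split_label(label):
    """Return (fruit, is_good) for a class name such as 'Bad Apple'."""
    quality, fruit = label.split(" ", 1)
    if quality not in ("Good", "Bad"):
        raise ValueError(f"unexpected class name: {label}")
    return fruit, quality == "Good"


if __name__ == "__main__":
    fruits = sorted({split_label(c)[0] for c in CLASSES})
    print(fruits)
    print(split_label("Good Lime"))
```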


This is the data for the paper "Environmental Context Prediction for Lower Limb Prostheses with Uncertainty Quantification", published in IEEE Transactions on Automation Science and Engineering, 2020. DOI: 10.1109/TASE.2020.2993399. For more details, please refer to 


Seven able-bodied subjects and one transtibial amputee participated in this study. Subject_001 to Subject_007 are able-bodied participants and Subject_008 is a transtibial amputee.


Each folder in the file has one continuous session of data with the following items: 

1. folder named "rpi_frames": the frames collected from the lower limb camera. Frame rate: 10 frames per second. 

2. folder named "tobii_frames": the frames collected from the on-glasses camera. Frame rate: 10 frames per second. 

3. labels_fps10.mat: synchronized terrain labels, gaze from the eye-tracking glasses, GPS coordinates, and IMU signals. 

3.1 cam_time: the timestamps for the videos, GPS, gazes, and labeled terrains (unit: second). Sampling rate: 10 Hz.

3.2 imu_time: the timestamps for the IMU sensors (unit: second). Sampling rate: 40 Hz.

3.3 GPS: the GPS coordinates (latitude, longitude)

3.4 rpi_FrameIds, tobii_FrameIds: the frame IDs for the lower-limb and on-glasses cameras respectively. The IDs indicate the filenames in "rpi_frames" and "tobii_frames" respectively. 

3.5 rpi_IMUs, tobii_IMUs: the IMU signals from the two devices. Columns: (accel_x,accel_y,accel_z,gyro_x,gyro_y,gyro_z)

3.6 terrains: the type of terrain the subject is currently on. Six terrains: tile, brick, grass, cement, upstairs, downstairs. "undefined" and "unlabelled" can be regarded as the same kind of data, which should be discarded.
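
Because the camera-side signals are sampled at 10 Hz and the IMU signals at 40 Hz, a common first step is to pick, for each labeled frame, the IMU sample closest in time, and to drop frames whose terrain is "undefined" or "unlabelled". The following is a minimal sketch of that step on plain Python lists. It assumes the arrays have already been read from labels_fps10.mat (for example with scipy.io.loadmat, if the file is saved in a format that function supports); the function names are hypothetical, and the exact array shapes in the file are not documented here.

```python
from bisect import bisect_left

# Labels that the description says should be discarded.
IGNORED = {"undefined", "unlabelled"}


def nearest_index(sorted_times, t):
    """Index of the entry in sorted_times that is closest to t."""
    i = bisect_left(sorted_times, t)
    if i == 0:
        return 0
    if i == len(sorted_times):
        return len(sorted_times) - 1
    before, after = sorted_times[i - 1], sorted_times[i]
    return i if after - t < t - before else i - 1


def align_and_filter(cam_time, terrains, imu_time, imu_rows):
    """Pair each labeled camera timestamp with its nearest IMU row."""
    samples = []
    for t, terrain in zip(cam_time, terrains):
        if terrain in IGNORED:
            continue
        samples.append((t, terrain, imu_rows[nearest_index(imu_time, t)]))
    return samples


if __name__ == "__main__":
    # Synthetic stand-ins for cam_time (10 Hz) and imu_time (40 Hz).
    cam_time = [i / 10 for i in range(3)]
    imu_time = [i / 40 for i in range(9)]
    imu_rows = list(range(9))  # placeholder for 6-column IMU rows
    terrains = ["tile", "undefined", "grass"]
    print(align_and_filter(cam_time, terrains, imu_time, imu_rows))
```

Nearest-sample matching is only one option; averaging the four IMU samples around each frame, or keeping a short IMU window per frame, may suit some models better.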


The following sessions were collected during busy hours (many pedestrians were around):







The following sessions were collected during non-busy hours (few pedestrians were around):









The other sessions were collected without specific collecting hours (e.g. busy or non-busy). 

For the following sessions, the data collection devices were not optimized (e.g. non-optimal brightness balance). We therefore recommend using these sessions for training or validation, but not as testing data.








A composite dataset with eight videos (totaling the pronunciation of seventeen words, with intervals, sagittal plane, and gray scale), for experiments in computer vision, video processing, and articulation investigation of the vocal tract.


In this dataset:
- There is no audio.
- Sagittal image
- Grey scale


The Nextmed project is a software platform for the segmentation and visualization of medical images. It consists of a series of automatic segmentation algorithms for different anatomical structures and a platform for visualizing the results as 3D models.

This dataset contains the .obj and .nrrd files that correspond to the results of applying our automatic lung segmentation algorithm to the LIDC-IDRI dataset.

This dataset relates to 718 of the 1012 LIDC-IDRI scans.


The file consists of one folder per result, containing the .obj and .nrrd files generated by the Nextmed algorithms.
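
Given the one-folder-per-result layout, the results can be indexed by folder before loading any meshes or volumes. The sketch below assumes only that each result folder holds files with the .obj and .nrrd extensions; the function name `index_results` and the root path are hypothetical, and the actual file and folder names in the archive are not specified here.

```python
from pathlib import Path


def index_results(root):
    """Map each result folder name to its .obj and .nrrd file paths."""
    index = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue
        index[folder.name] = {
            "obj": sorted(folder.glob("*.obj")),
            "nrrd": sorted(folder.glob("*.nrrd")),
        }
    return index


if __name__ == "__main__":
    # "nextmed_results" is a placeholder for the unzipped dataset directory.
    root = Path("nextmed_results")
    if root.is_dir():
        for name, files in index_results(root).items():
            print(name, len(files["obj"]), len(files["nrrd"]))
```

Folders that come back with an empty "obj" or "nrrd" list are worth checking by hand before further processing.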