With the rapid development of augmented reality


This paper presents a dataset of 1,127 images in VOC12 (Pascal VOC 2012) format; each image is 3840×2160 pixels, along with the corresponding relation of file names.
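Since the dataset uses the VOC12 annotation format, reading one annotation file can be sketched as below. This is a minimal illustration: the XML content, file name, and object label are invented stand-ins, not taken from the actual dataset.

```python
import xml.etree.ElementTree as ET

# Illustrative Pascal VOC-style annotation; real files would be read
# from disk, e.g. ET.parse("Annotations/000001.xml").getroot().
VOC_XML = """<annotation>
  <filename>000001.jpg</filename>
  <size><width>3840</width><height>2160</height><depth>3</depth></size>
  <object>
    <name>target</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""

root = ET.fromstring(VOC_XML)
size = root.find("size")
width = int(size.find("width").text)
height = int(size.find("height").text)

# Collect (xmin, ymin, xmax, ymax) tuples for every annotated object.
boxes = [
    tuple(int(obj.find(f"bndbox/{k}").text) for k in ("xmin", "ymin", "xmax", "ymax"))
    for obj in root.iter("object")
]
```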


Semantic segmentation is a topic of great interest among deep learning researchers today. It has applications in many domains, including food recognition, where it separates the non-food background from the food portion of an image. No large public food dataset is available for training semantic segmentation models. We prepared a dataset named 'SEG-FOOD' [44] containing images from the Food101, PFID, and Pakistani Food datasets, and open-sourced the annotated dataset for future research. We annotated the images using the JS Segment Annotator.
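For illustration, removing the non-food background with a binary segmentation mask can be sketched as follows. The arrays here are synthetic stand-ins for a SEG-FOOD image and its annotation, not actual dataset contents.

```python
import numpy as np

def remove_background(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out non-food pixels using a binary segmentation mask.

    `image` is an (H, W, 3) RGB array; `mask` is an (H, W) array with
    1 for food pixels and 0 for background.
    """
    return image * mask[..., None]

# Dummy 4x4 RGB image and a mask marking a 2x2 "food" region.
image = np.full((4, 4, 3), 255, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1

result = remove_background(image, mask)
```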


* For detailed experimentation, please refer to our paper, which is under review. We will update the link later.

* For starter code, please refer to our GitHub repository: https://github.com/ghalib2021/SEGFOOD

* Note: This dataset contains images from Food101, PFID, and the Pakistani Food Dataset. Our main contributions are the manual annotation of the food images for background removal via semantic segmentation and the collection of the Pakistani food images. If you use the segmented dataset, please cite our work in addition to the original dataset collectors; otherwise, cite the original dataset collectors.




The detection of settlements without electricity challenge track (Track DSE) of the 2021 IEEE GRSS Data Fusion Contest, organized by the Image Analysis and Data Fusion Technical Committee (IADF TC) of the IEEE Geoscience and Remote Sensing Society (GRSS), Hewlett Packard Enterprise, SolarAid, and Data Science Experts, aims to promote research in automatic detection of human settlements deprived of access to electricity using multimodal and multitemporal remote sensing data.

Last Updated On: Sun, 02/28/2021 - 07:59
Citation Author(s): Colin Prieur, Hana Malha, Frederic Ciesielski, Paul Vandame, Giorgio Licciardi, Jocelyn Chanussot, Pedram Ghamisi, Ronny Hänsch, Naoto Yokoya

A medium-scale synthetic 4D Light Field video dataset for depth (disparity) estimation, derived from the open-source movie Sintel. The dataset consists of 24 synthetic 4D LFVs with 1,024×436 pixels, 9×9 views, and 20–50 frames, and provides ground-truth disparity values, so it can be used for training deep learning-based methods. Each scene was rendered with a clean pass after modifying the production file of Sintel with reference to the MPI Sintel dataset.



Light Field videos:
  • 24 synthetic scenes
  • 1,024×436 pixels
  • 9x9 views
  • 20–50 frames
Ground-truth disparity values:
  • Provides disparity values for all scenes, all views, all frames, and all pixels.
  • The disparity value was obtained by transforming the depth value computed in Blender.
    • Disparity is stored in units of [mm]; to convert to [px], multiply by 32 (as noted in a reported issue for the dataset).
Light Field setup:
  • Rendering with a “clean” pass using Blender (render25 branch).
  • The Light Field was captured by moving the camera to 9x9 viewpoints with a baseline of 0.01[m] towards a common focal plane while keeping the optical axes parallel. 
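The [mm]-to-[px] disparity conversion noted above can be sketched as follows. The factor of 32 comes from the dataset notes; the synthetic array below stands in for a disparity map that would normally be loaded with `np.load` from one of the `.npy` files.

```python
import numpy as np

def disparity_mm_to_px(disp_mm: np.ndarray) -> np.ndarray:
    """Convert disparity from the dataset's [mm] units to pixels.

    Per the dataset notes, multiplying by 32 converts [mm] to [px].
    """
    return disp_mm * 32.0

# Synthetic stand-in for e.g. np.load("ambushfight_1/04_04/000.npy"),
# using the 1,024x436 frame size stated above.
disp_mm = np.full((436, 1024), 0.5, dtype=np.float32)
disp_px = disparity_mm_to_px(disp_mm)
```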



Three types of datasets are provided on this page.

The three variants exist so that you only need to download the data you actually use.

All types include all scenes and all frames; they differ only in which RGB and disparity views are included.

1. Sintel_LFV_9x9_with_all_disp.zip
  • Includes RGB sequences for 9x9 views and disparity sequences for 9x9 views.
  • The unzipped data is 190 GiB.
  • It can be used for a variety of depth-estimation tasks, e.g. not only light field but also (multi-view) stereo, since it includes disparity for all views.
2. Sintel_LFV_9x9.zip
  • Includes RGB sequences for 9x9 views and disparity sequences for center view.
  • The unzipped data is 51.4 GiB.
  • It can be used for light field-based depth estimations using 9x9 views.
3. Sintel_LFV_cross-hair.zip
  • Includes RGB sequences for cross-hair views and disparity sequences for center view.
  • The unzipped data is 12.1 GiB.
  • It can be used for light field-based depth estimations using cross-hair views.
    • This is the data we used in our paper. (Note: We didn't use the scene named shaman_b_2 because it was not completed at that time.)

* The datasets contain RGB images as .png files and disparity maps as .npy files.


File structure.

The following is the case of Sintel_LFV_9x9_with_all_disp.

In the other variants, some view directories are absent or contain no disparity files.

The naming convention for the view directory is {viewpoint_y:02}_{viewpoint_x:02} with 00_00 being the upper left viewpoint.


  ┣━━ ambushfight_1/    ...    scene directory
  ┃          ┣━━ 00_00/ ...    view directory
  ┃          ┃         ┣━━ 000.png ...    RGB of frame 0
  ┃          ┃         ┣━━ 000.npy ...    disparity of frame 0
  ┃          ┃         ┣━━ 001.png ...    RGB of frame 1
  ┃          ┃         ┣━━ 001.npy ...    disparity of frame 1
  ┃          ┣━━ 04_04/ ...    center view directory 
  ┃          ┃         ┣━━ 000.png ...    RGB of frame 0
  ┃          ┃         ┣━━ 000.npy ...    disparity of frame 0
  ┃          ┗━━ .../
  ┣━━ ambushfight_2/
  ┣━━ ambushfight_3/
  ┗━━ .../
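Following the naming convention above, the 81 view-directory names can be generated with a short sketch:

```python
# Enumerate the 9x9 view-directory names following the
# {viewpoint_y:02}_{viewpoint_x:02} convention, row-major from the
# upper-left viewpoint "00_00".
view_dirs = [f"{y:02d}_{x:02d}" for y in range(9) for x in range(9)]

# The middle entry is the center view directory, "04_04".
center_view = view_dirs[len(view_dirs) // 2]
```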



The Retinal Fundus Multi-disease Image Dataset (RFMiD) consists of retinal fundus images covering a wide variety of pathological conditions.


Detailed instructions about this dataset are available on the challenge website: https://riadd.grand-challenge.org/.


Computer vision for animal monitoring has become an active research application in stables and other confined settings.

Detecting animals from a top view is challenging due to barn conditions.

This dataset, called ICV-TxLamb, provides images for monitoring lambs inside a barn.

The data is organized into two categories: the first, lamb, labels only the lamb itself; the second covers four lamb postures: eating, sleeping, lying down, and normal (standing or inactive).
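A possible label mapping for these two annotation categories is sketched below; the integer ids are illustrative assumptions, not part of the dataset specification.

```python
# Hypothetical class-id mapping for the ICV-TxLamb categories described
# above. The ids themselves are illustrative choices.
LAMB_CLASSES = {"lamb": 0}
POSTURE_CLASSES = {"eating": 1, "sleeping": 2, "lying_down": 3, "normal": 4}

def posture_id(name: str) -> int:
    """Look up a posture label's id, raising KeyError on unknown names."""
    return POSTURE_CLASSES[name]
```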