The LEDNet dataset consists of images of a field area captured with a mobile phone camera.

Each image shows an area in which a PCB carrying 6 LEDs is placed. The state of each LED represents one binary digit, with the ON state corresponding to binary 1 and the OFF state corresponding to binary 0. Read in sequence, the LEDs form a binary sequence that encodes an analog value.
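As a minimal sketch of the encoding described above, the following Python function turns a sequence of six ON/OFF states into an integer. The bit order is an assumption (the first LED is treated as the most significant bit); the dataset description does not state it, and the function name is hypothetical.

```python
from typing import Sequence


def leds_to_value(states: Sequence[bool]) -> int:
    """Interpret LED states (ON = 1, OFF = 0) as an unsigned integer.

    Assumption: the first LED in the sequence is the most significant bit.
    """
    value = 0
    for on in states:
        # Shift the bits collected so far and append the current LED state.
        value = (value << 1) | int(bool(on))
    return value


# Six LEDs give 2**6 = 64 distinct codes (0 to 63).
example = [True, False, True, True, False, False]  # reads as 101100
print(leds_to_value(example))  # 44
```

How the resulting integer maps back to the analog value (scale and offset) is not described here and would have to come from the dataset documentation.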


We built the SARD database for the task of detecting casualties and persons in search and rescue scenarios in drone images and videos. The actors in the footage simulated exhausted and injured persons as well as "classic" types of movement of people in nature, such as running, walking, standing, sitting, or lying down. Since different types of terrain and backgrounds determine possible events and scenarios in captured images and videos, the shots include persons on macadam roads, in quarries, low and high grass, forest shade, and the like.


Crowds express emotions as a collective individual, which is evident from the sounds that a crowd produces in particular events, e.g., collective booing, laughing or cheering in sports matches, movies, theaters, concerts, political demonstrations, and riots.


Extract the zip files locally and read the readme file.

Instructions for dataset usage are included in the open access paper: Franzoni, V., Biondi, G., Milani, A., Emotional sounds of crowds: spectrogram-based analysis using deep learning (2020) Multimedia Tools and Applications, 79 (47-48), pp. 36063-36075.

Files are released under the Creative Commons Attribution-ShareAlike 4.0 International License.


The early detection of damaged (partially broken) outdoor insulators in primary distribution systems is of paramount importance for continuous electricity supply and public safety. This dataset provides images and videos for computer vision-based research, taken with different devices: a drone, a DSLR camera, and a mobile phone camera.


Please see the attached file for a complete description.


Indirect hand measurement processes have been used to improve remote accessibility and non-contact acquisition methods. This is particularly helpful when developing custom products, such as prostheses or gloves, for a user. Indirect hand measurements, however, may be difficult to acquire because certain specifications must be met. When indirect measurements are determined from 3D scans, obstructions may affect the observed outcome. This is especially true when using low-cost 3D scanners that have not been optimized for medical use.


The data is provided in Excel (.xlsx) format. The workbook contains one sheet with all the hand measurements and the devices used to obtain them. Key measurements for each device are summarized in four .txt files, one per scanner. An .R file is also included and contains the R code used for the statistical analysis of the observed measurements.


Data for "A Framework for Recognizing and Estimating Human Concentration Levels"


Segmentation of tropical cyclone (TC) clouds in 2016. The segmentation was produced by an algorithm that takes a time series of brightness temperature images of TCs and uses image processing techniques to obtain a segmentation for each image in a semi-supervised manner.


2016 TC cloud segmentation animation


Histopathological characterization of colorectal polyps makes it possible to tailor patients' management and follow-up, with the ultimate aim of avoiding or promptly detecting an invasive carcinoma. Colorectal polyp characterization relies on the histological analysis of tissue samples to determine the polyp's malignancy and dysplasia grade. Deep neural networks achieve outstanding accuracy in medical pattern recognition; however, they require large sets of annotated training images.


To load the data, an example routine for PyTorch is provided below. The data are available at two different resolutions, 800 and 7000 um/px.

Within each resolution, we provide .csv files, containing all metadata information for all the included files, comprising:

  • image_id;
  • label (6 classes - HP, NORM, TA.HG, TA.LG, TVA.HG, TVA.LG);
  • type (4 classes - HP, NORM, HG, LG);
  • reference WSI;
  • reference region of interest in WSI (roi);
  • resolution (microns per pixel, mpp);
  • coordinates for the patch (x, y, w, h).
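As an illustration of how such a metadata file could be read, here is a minimal sketch using only the Python standard library. The column names (in particular `label`, `type` and `wsi`), the sample rows and the helper names are hypothetical; the actual headers in the released .csv files may differ. The label-to-type mapping is also an assumption, inferred from the two class lists above: the part after the dot is taken as the grade, while HP and NORM are kept as they are.

```python
import csv
import io
from collections import Counter
from typing import Dict, List

# Hypothetical rows: column names and values are illustrative only.
SAMPLE_CSV = """image_id,label,type,wsi,roi,mpp,x,y,w,h
a.png,TA.HG,HG,slide1,roi1,0.44,0,0,1812,1812
b.png,NORM,NORM,slide2,roi1,0.44,100,200,1812,1812
c.png,TVA.LG,LG,slide3,roi2,0.44,50,60,1812,1812
"""


def label_to_type(label: str) -> str:
    """Map a 6-class label to the 4-class type (assumed mapping, see lead-in)."""
    return label.split(".")[-1]


def load_metadata(text: str) -> List[Dict[str, object]]:
    """Parse metadata rows and convert the numeric fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        parsed: Dict[str, object] = dict(row)
        for key in ("x", "y", "w", "h"):
            parsed[key] = int(row[key])  # patch coordinates in pixels
        parsed["mpp"] = float(row["mpp"])  # microns per pixel
        rows.append(parsed)
    return rows


rows = load_metadata(SAMPLE_CSV)
type_counts = Counter(row["type"] for row in rows)
print(type_counts)
```

For a real file, the text passed to `load_metadata` would come from reading the .csv from disk rather than from the inline sample.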

Below you can find the dataloader class of UNITOPatho for PyTorch. More examples can be found here.

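import os

import cv2
import numpy as np
import torch
import torchvision


class UNITOPatho(torch.utils.data.Dataset):
    def __init__(self, df, T, path, target, subsample=-1, gray=False, mock=False):
        self.path = path            # root folder containing the images
        self.df = df                # metadata DataFrame, one row per image
        self.T = T                  # optional transform applied to each image
        self.target = target        # name of the column returned as the label
        self.subsample = subsample  # output side in pixels, -1 keeps the original size
        self.mock = mock            # if True, return random images (no disk access)
        self.gray = gray            # if True, return single-channel images

        allowed_target = ['type', 'grade', 'top_label']
        if target not in allowed_target:
            # Fail early instead of continuing with an invalid target
            raise ValueError(f'Target must be in {allowed_target}, got {target}')

        print(f'Loaded {len(self.df)} images')

    def __len__(self):
        return len(self.df)

    def __getitem__(self, index):
        entry = self.df.iloc[index]
        image_id = entry.image_id
        image_id = os.path.join(self.path, entry.top_label_name, image_id)

        img = None
        if self.mock:
            # Random image, useful for testing the pipeline without the data
            C = 1 if self.gray else 3
            img = np.random.randint(0, 255, (224, 224, C)).astype(np.uint8)
        else:
            img = cv2.imread(image_id)

            if self.subsample != -1:
                # Halve the image repeatedly, then resize to the exact target size
                w = img.shape[0]
                while w//2 > self.subsample:
                    img = cv2.resize(img, (w//2, w//2))
                    w = w//2
                img = cv2.resize(img, (self.subsample, self.subsample))

            if self.gray:
                img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                img = np.expand_dims(img, axis=2)
            else:
                # OpenCV loads images as BGR; convert to RGB
                img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

        if self.T is not None:
            img = self.T(img)

        return img, entry[self.target]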
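The `subsample` option in the class above shrinks an image by halving its side repeatedly and then resizing once to the exact target size. The sketch below reproduces only that size schedule in plain Python, without OpenCV, to make the loop easier to follow. The function name and the input width of 1812 pixels are hypothetical, and square images are assumed, as in the listing.

```python
from typing import List


def subsample_schedule(width: int, target: int) -> List[int]:
    """Return the successive side lengths visited when shrinking to `target`.

    Mirrors the loop in __getitem__: halve while half the current width
    is still larger than the target, then resize to the target itself.
    """
    sizes = []
    w = width
    while w // 2 > target:
        w //= 2
        sizes.append(w)   # one intermediate resize per halving
    sizes.append(target)  # final resize to the exact target size
    return sizes


print(subsample_schedule(1812, 224))  # [906, 453, 226, 224]
```

If the input is already at or below twice the target size, the schedule reduces to a single resize.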


This is a collection of 2D and 3D images used for grayscale image processing tests. It includes at least 8 images of each of the following sizes: