Open Access
UNITOPATHO
- Submitted by: Enzo Tartaglione
- Last updated: Tue, 05/04/2021 - 08:53
- DOI: 10.21227/9fsv-tm25
Abstract
Histopathological characterization of colorectal polyps allows clinicians to tailor patients' management and follow-up, with the ultimate aim of avoiding or promptly detecting an invasive carcinoma. Colorectal polyp characterization relies on the histological analysis of tissue samples to determine the malignancy and dysplasia grade of the polyps. Deep neural networks achieve outstanding accuracy in medical pattern recognition; however, they require large sets of annotated training images.
We introduce UniToPatho, an annotated dataset of 9536 hematoxylin and eosin stained patches extracted from 292 whole-slide images, meant for training deep neural networks for colorectal polyp classification and adenoma grading. The slides were acquired with a Hamamatsu Nanozoomer S210 scanner at 20× magnification (0.4415 μm/px). Each slide belongs to a different patient and is annotated by expert pathologists according to the following six classes:
- NORM - Normal tissue;
- HP - Hyperplastic Polyp;
- TA.HG - Tubular Adenoma, High-Grade dysplasia;
- TA.LG - Tubular Adenoma, Low-Grade dysplasia;
- TVA.HG - Tubulo-Villous Adenoma, High-Grade dysplasia;
- TVA.LG - Tubulo-Villous Adenoma, Low-Grade dysplasia.
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 825111, DeepHealth Project.
To load the data, we provide below an example routine for the PyTorch framework. We provide two different resolutions, 800 and 7000 μm/px.
Within each resolution, we provide .csv files containing the metadata for all the included files, comprising:
- image_id;
- label (6 classes: HP, NORM, TA.HG, TA.LG, TVA.HG, TVA.LG);
- type (4 classes: HP, NORM, HG, LG);
- reference WSI;
- reference region of interest in the WSI (roi);
- resolution (microns per pixel, mpp);
- patch coordinates (x, y, w, h).
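As an illustrative sketch of how this metadata can be inspected before building the dataset (the file name and folder layout used below, e.g. 800/train.csv, are assumptions and should be adapted to the extracted archive), the .csv files can be read directly with pandas:
# Minimal sketch: inspecting a per-resolution metadata CSV with pandas.
# The path 'unitopatho/800/train.csv' is an assumption; adapt it to your local copy.
import pandas as pd

df = pd.read_csv('unitopatho/800/train.csv')

print(df.columns.tolist())           # image_id, label, type, wsi/roi references, mpp, x, y, w, h
print(df['label'].value_counts())    # distribution over the 6 classes (HP, NORM, TA.HG, ...)
print(df['type'].value_counts())     # distribution over the 4 grouped classes (HP, NORM, HG, LG)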
Below you can find the UNITOPatho dataloader class for PyTorch. More examples can be found here.
import torch
import torchvision
import numpy as np
import cv2
import os


class UNITOPatho(torch.utils.data.Dataset):
    def __init__(self, df, T, path, target, subsample=-1, gray=False, mock=False):
        self.path = path            # root folder of the extracted dataset
        self.df = df                # metadata DataFrame loaded from the .csv file
        self.T = T                  # transform applied to each patch (e.g. torchvision transforms)
        self.target = target        # metadata column returned as the label
        self.subsample = subsample  # if > 0, progressively downsample patches to this size
        self.mock = mock            # if True, return random images instead of reading from disk
        self.gray = gray            # if True, convert patches to grayscale

        allowed_target = ['type', 'grade', 'top_label']
        if target not in allowed_target:
            print(f'Target must be in {allowed_target}, got {target}')
            exit(1)

        print(f'Loaded {len(self.df)} images')

    def __len__(self):
        return len(self.df)

    def __getitem__(self, index):
        entry = self.df.iloc[index]
        image_id = entry.image_id
        image_id = os.path.join(self.path, entry.top_label_name, image_id)

        img = None
        if self.mock:
            # Generate a random patch instead of reading from disk
            C = 1 if self.gray else 3
            img = np.random.randint(0, 255, (224, 224, C)).astype(np.uint8)
        else:
            img = cv2.imread(image_id)

            if self.subsample != -1:
                # Progressively halve the patch until it reaches the requested size
                w = img.shape[0]
                while w // 2 > self.subsample:
                    img = cv2.resize(img, (w // 2, w // 2))
                    w = w // 2
                img = cv2.resize(img, (self.subsample, self.subsample))

            if self.gray:
                img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                img = np.expand_dims(img, axis=2)
            else:
                # OpenCV loads images as BGR; convert to RGB
                img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

        if self.T is not None:
            img = self.T(img)

        return img, entry[self.target]
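For completeness, a hedged usage sketch follows: it reads the metadata with pandas, instantiates the class above, and wraps it in a standard PyTorch DataLoader. The root path, CSV file name, and transform choices below are assumptions, not part of the dataset release.
# Usage sketch (paths and transforms are assumptions; adapt to your local copy).
import pandas as pd
import torchvision.transforms as transforms

root = 'unitopatho/800'                             # assumed location of one resolution folder
df = pd.read_csv(os.path.join(root, 'train.csv'))   # assumed CSV file name

transform = transforms.Compose([
    transforms.ToPILImage(),   # patches are returned as numpy arrays (HWC, uint8)
    transforms.Resize(224),
    transforms.ToTensor(),
])

# 'top_label' must be one of the columns of the metadata CSV
dataset = UNITOPatho(df, T=transform, path=root, target='top_label')
loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True, num_workers=4)

images, labels = next(iter(loader))
print(images.shape)   # e.g. torch.Size([32, 3, 224, 224])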
Dataset Files
- UNITOPatho.zip (274.98 GB)