
This dataset includes the data used in our two research papers, GNN4TJ and GNN4IP.
- Categories:

Crowds express emotions as a collective individual, which is evident from the sounds that a crowd produces in particular events, e.g., collective booing, laughing or cheering in sports matches, movies, theaters, concerts, political demonstrations, and riots.
Extract the zip files locally and read the README file.
Instructions for dataset usage are included in the open access paper: Franzoni, V., Biondi, G., Milani, A., Emotional sounds of crowds: spectrogram-based analysis using deep learning (2020) Multimedia Tools and Applications, 79 (47-48), pp. 36063-36075. https://doi.org/10.1007/s11042-020-09428-x
Files are released under the Creative Commons Attribution-ShareAlike 4.0 International License.
- Categories:
The dataset was collected to investigate how brainwave signals can be used for industrial insider threat detection. It was recorded using the 5-channel Emotiv Insight device and contains data from 17 subjects who consented to participate in this data collection.
- Categories:
This dataset is part of my Master's research on malware detection and classification using the XGBoost library on an Nvidia GPU. It is a collection of 1.55 million samples with 1,000 API import features extracted from the JSONL format of the EMBER 2017 v2 and 2018 datasets. All data is pre-processed and duplicate records have been removed. The dataset contains 800,000 malware and 750,000 "goodware" samples.
* FEATURES *
- sha256 (string): SHA256 hash of the sample
- appeared (date, yyyy-mm format): date the sample appeared
- label (0 = "goodware", 1 = malware): class of the sample
- GetProcAddress (0 = not imported, 1 = imported): most imported function (1st)
- ...
- LookupAccountSidW (0 = not imported, 1 = imported): least imported function (1000th)
The full dataset features header can be downloaded at https://github.com/tvquynh/api_import_dataset/blob/main/full_dataset_fea...
All processing code will be uploaded to https://github.com/tvquynh/api_import_dataset/
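As a sketch of how the feature layout above can be consumed, the following loads a miniature, made-up CSV with the same column structure into pandas and splits it into a feature matrix and labels. The file contents and the two API columns shown are placeholders; the real header is in the linked GitHub file.

```python
import io
import pandas as pd

# Hypothetical miniature of the dataset layout: sha256, appeared, label,
# then 1,000 binary API-import columns (only two shown here).
csv_text = """sha256,appeared,label,GetProcAddress,LookupAccountSidW
aa11,2017-01,1,1,0
bb22,2018-03,0,1,1
cc33,2018-07,1,0,0
"""

df = pd.read_csv(io.StringIO(csv_text))
X = df.drop(columns=['sha256', 'appeared', 'label'])  # 0/1 import indicators
y = df['label']                                       # 0 = "goodware", 1 = malware
print(X.shape, int(y.sum()))  # (3, 2) 2
```

In practice one would read the released CSV directly and feed X and y to an XGBoost classifier, as done in the research.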
- Categories:
The early detection of damaged (partially broken) outdoor insulators in primary distribution systems is of paramount importance for continuous electricity supply and public safety. In this dataset, we present different images and videos for computer vision-based research. The dataset comprises images and videos taken from different sources such as a Drone, a DSLR camera, and a mobile phone camera.
Please see the attached file for a complete description.
- Categories:
This dataset is released with our research paper titled “Scene-graph Augmented Data-driven Risk Assessment of Autonomous Vehicle Decisions” (https://arxiv.org/abs/2009.06435). In this paper, we propose a novel data-driven approach that uses scene-graphs as intermediate representations for modeling the subjective risk of driving maneuvers. Our approach includes a Multi-Relation Graph Convolution Network, a Long Short-Term Memory network, and attention layers.
- Categories:
As an alternative to classical cryptography, Physical Layer Security (PhySec) provides primitives to achieve fundamental security goals such as confidentiality, authentication, and key derivation. Owing to its origins in the field of information theory, these primitives are rigorously analysed and their information-theoretic security is proven. Nevertheless, practical realizations of the different approaches take certain assumptions about the physical world for granted.
The data is provided as zipped NumPy arrays (.npz) with custom headers. Loading a file requires the NumPy package; its np.load routine allows straightforward loading of the datasets.
To load a file “file.npz” the following code is sufficient:
import numpy as np
measurement = np.load('file.npz', allow_pickle=False)
header, data = measurement['header'], measurement['data']
The dataset comes with a supplementary script example_script.py illustrating the basic usage of the dataset.
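For illustration, a hypothetical round trip shows the archive layout the snippet above assumes: an .npz file containing a 'header' entry and a 'data' entry. The header contents and values below are made up; the real custom headers are documented with the dataset and example_script.py.

```python
import os
import tempfile
import numpy as np

# Write a tiny stand-in archive with the assumed 'header'/'data' entries.
path = os.path.join(tempfile.mkdtemp(), 'file.npz')
np.savez(path,
         header=np.array(['node_a', '2.4GHz']),   # placeholder header fields
         data=np.arange(6).reshape(2, 3))         # placeholder measurements

# Load it back exactly as in the snippet above.
with np.load(path, allow_pickle=False) as measurement:
    header, data = measurement['header'], measurement['data']

print(header.tolist(), data.shape)  # ['node_a', '2.4GHz'] (2, 3)
```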
- Categories:
The Magnetic Resonance – Computed Tomography (MR-CT) Jordan University Hospital (JUH) dataset has been collected after receiving Institutional Review Board (IRB) approval of the hospital and consent forms have been obtained from all patients. All procedures followed are consistent with the ethics of handling patients’ data.
- Categories:
To load the data, we provide below an example routine for the PyTorch framework. We provide two different resolutions, 800 and 7000 µm/px.
Within each resolution, we provide .csv files, containing all metadata information for all the included files, comprising:
- image_id;
- label (6 classes - HP, NORM, TA.HG, TA.LG, TVA.HG, TVA.LG);
- type (4 classes - HP, NORM, HG, LG);
- reference WSI;
- reference region of interest in WSI (roi);
- resolution (micron per pixels, mpp);
- coordinates for the patch (x, y, w, h).
Below you can find the dataloader class of UNITOPatho for PyTorch. More examples can be found here.
import torch
import torchvision
import numpy as np
import cv2
import os

class UNITOPatho(torch.utils.data.Dataset):
    def __init__(self, df, T, path, target, subsample=-1, gray=False, mock=False):
        self.path = path
        self.df = df
        self.T = T
        self.target = target
        self.subsample = subsample
        self.mock = mock
        self.gray = gray

        allowed_target = ['type', 'grade', 'top_label']
        if target not in allowed_target:
            print(f'Target must be in {allowed_target}, got {target}')
            exit(1)

        print(f'Loaded {len(self.df)} images')

    def __len__(self):
        return len(self.df)

    def __getitem__(self, index):
        entry = self.df.iloc[index]
        image_id = entry.image_id
        image_id = os.path.join(self.path, entry.top_label_name, image_id)

        img = None
        if self.mock:
            # mock mode returns random noise instead of reading from disk
            C = 1 if self.gray else 3
            img = np.random.randint(0, 255, (224, 224, C)).astype(np.uint8)
        else:
            img = cv2.imread(image_id)

            if self.subsample != -1:
                # progressively halve the image, then resize to the target size
                w = img.shape[0]
                while w//2 > self.subsample:
                    img = cv2.resize(img, (w//2, w//2))
                    w = w//2
                img = cv2.resize(img, (self.subsample, self.subsample))

            if self.gray:
                img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                img = np.expand_dims(img, axis=2)
            else:
                # OpenCV loads BGR; convert to RGB
                img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

        if self.T is not None:
            img = self.T(img)

        return img, entry[self.target]
- Categories: