Machine Learning

The JKU-ITS AVDM contains data from 17 participants performing different tasks with various levels of distraction.
The data collection was carried out in accordance with the relevant guidelines and regulations and informed consent was obtained from all participants.
The dataset was collected using the JKU-ITS research vehicle with automated capabilities under different illumination and weather conditions along a secure test route within the


Nasal Cytology, or Rhinology, is the subfield of otolaryngology focused on the microscopic observation of samples of the nasal mucosa, with the aim of recognizing cells of different types and of spotting and diagnosing ongoing pathologies. This methodology can claim good accuracy in diagnosing rhinitis and infections, while being very cheap and accessible: it requires no instrument more complex than a microscope, even an optical one.


This database contains Synthetic High-Voltage Power Line Insulator Images.

There are two sets of images: one for image segmentation and another for image classification.

The first set contains images with different types of materials and landscapes, including the following landscape types: Mountains, Forest, Desert, City, Stream, Plantation. Each of the above-mentioned landscape types consists of 2,627 images per insulator type, which can be Ceramic, Polymeric or made of Glass, with a total of 47,286 distinct images.
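The stated total can be checked from the counts in the description. The short sketch below assumes that every landscape type is paired with every insulator type; the variable names are illustrative only.

```python
# Counts taken from the dataset description.
landscapes = ["Mountains", "Forest", "Desert", "City", "Stream", "Plantation"]
insulator_types = ["Ceramic", "Polymeric", "Glass"]
images_per_combination = 2627  # images per insulator type within each landscape type

# Assumption: every landscape type is combined with every insulator type.
total_images = len(landscapes) * len(insulator_types) * images_per_combination
print(total_images)  # 47286
```

The result, 6 x 3 x 2,627 = 47,286, matches the total given for the first set.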


To address the challenges faced by patients with neurodegenerative disorders, Brain-Computer Interface (BCI) solutions are being developed. However, many current datasets do not include languages spoken by patients, such as Telugu, which is spoken by over 90 million people in India. To bridge this gap, we have created a dataset comprising electroencephalography (EEG) signal samples of commonly used Telugu words. Using the OpenBCI Cyton device, EEG samples were captured from volunteers as they pronounced these words.


The popularity of smartphones has also popularized reading content on them. Reading on a smartphone differs considerably from reading on a desktop system. On desktop systems, the mouse and keyboard are the peripherals associated with reading, and studying how these devices are handled has made it possible to derive implicit feedback about the content being read. Comparable studies of implicit feedback on smartphones are still largely missing. Reading on a smartphone involves screen gestures such as pinch-to-zoom, tap, scroll, orientation change and screen capture.


The dataset consists of 4-channel EOG data recorded in two environments. The first category of data was recorded from 21 people using a driving simulator (1976 samples). The second category of data was recorded from 30 people in real-road conditions (390 samples).

All the signals were acquired with JINS MEME ES_R smart glasses equipped with a 3-point EOG sensor. The sampling frequency is 200 Hz.
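As a minimal sketch of what the 200 Hz sampling frequency implies, the helpers below convert sample indices to timestamps and a sample count to a duration. The function names and the example lengths are hypothetical; the dataset's actual file layout is not described here.

```python
SAMPLING_RATE_HZ = 200  # stated in the description
N_CHANNELS = 4          # stated in the description

def sample_times(n_samples, rate_hz=SAMPLING_RATE_HZ):
    """Return the time in seconds of each sample, starting at 0."""
    return [i / rate_hz for i in range(n_samples)]

def duration_seconds(n_samples, rate_hz=SAMPLING_RATE_HZ):
    """Return the duration in seconds covered by n_samples samples."""
    return n_samples / rate_hz

# Hypothetical example: a signal of 2000 time points per channel.
print(duration_seconds(2000))  # 10.0
print(sample_times(3))         # [0.0, 0.005, 0.01]
```

At this rate, consecutive samples are 5 ms apart, so a signal of 2000 time points per channel spans 10 seconds.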


The dataset involves two sets of participants: a group of twenty skilled drivers aged between 40 and 68, each having a minimum of ten years of driving experience (class 1), and another group consisting of ten novice drivers aged between 18 and 46, who were currently undergoing driving lessons at a driving school (class 2).

The data was recorded using JINS MEME ES_R smart glasses by JINS, Inc. (Tokyo, Japan).

Each file contains the signals from a single ride.


The data have 16 features and 1 target value.

Scope: Primarily focused on diabetes-related information.

Data Size: Contains a substantial volume of records.

Variables: Likely includes patient demographics, medical history, lab results, medications, treatments, and outcomes.

Temporal Range: Time span covered by the dataset may vary.

Privacy Measures: Anonymized to protect patient identities.

Ethical Considerations: Collected and shared adhering to ethical guidelines.


SeaIceWeather Dataset 

This is the SeaIceWeather dataset, collected for training and evaluating deep-learning-based de-weathering models. To the best of our knowledge, this is the first such publicly available dataset for the sea ice domain. This dataset is linked to our paper titled: Deep Learning Strategies for Analysis of Weather-Degraded Optical Sea Ice Images. The paper can be accessed at: 


QuaN is a collection of specially designed datasets for exploring the impact of noise on quantum machine learning and other applications. The presented work focuses on the transformation of clean datasets into noisy counterparts across diverse domains, including the MNIST handwritten digits dataset, Medical MNIST, the IRIS dataset and Mobile Health datasets. The datasets are created using noise from classical and quantum domains.
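The idea of turning a clean sample into a noisy counterpart can be sketched as follows. This is only an illustration using additive Gaussian noise, one common classical noise model; the description does not specify which noise models or parameters QuaN uses, and quantum noise channels are not covered here. The function name, `sigma`, `seed` and the clipping range are all assumptions.

```python
import random

def add_gaussian_noise(values, sigma, seed=0, lo=0.0, hi=1.0):
    """Return a noisy copy of `values` with zero-mean Gaussian noise added.

    Hypothetical helper: results are clipped to [lo, hi], as one might do
    for pixel intensities normalised to the unit interval.
    """
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    return [min(hi, max(lo, v + rng.gauss(0.0, sigma))) for v in values]

# Toy "clean" sample, e.g. five normalised pixel intensities.
clean = [0.0, 0.25, 0.5, 0.75, 1.0]
noisy = add_gaussian_noise(clean, sigma=0.1)
print(noisy)
```

Setting `sigma=0.0` returns the clean values unchanged, which is a convenient sanity check when comparing models trained on clean and noisy versions of the same data.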