Transformer Electrocardiogram Biometrics Dataset

Citation Author(s): Kai Jye Chee (School of Electrical and Electronic Engineering, USM Engineering Campus, Universiti Sains Malaysia, Nibong Tebal 14300, Malaysia)

Dzati Athiar Ramli (School of Electrical and Electronic Engineering, USM Engineering Campus, Universiti Sains Malaysia, Nibong Tebal 14300, Malaysia)
Submitted by: Kai Jye Chee
Last updated: Sun, 10/23/2022 - 22:39
DOI: 10.21227/syhd-3948
Data Format: tfrecord
Research Article Link: Electrocardiogram Biometrics Using Transformer’s Self-Attention Mechanism for S…
Links: Apnea-ECG Database

Long Term AF Database

MIT-BIH Arrhythmia Database

MIT-BIH Long-Term ECG Database

MIT-BIH Malignant Ventricular Ectopy Database

MIT-BIH Polysomnographic Database

MIT-BIH Supraventricular Arrhythmia Database

St Petersburg INCART 12-lead Arrhythmia Database

Fantasia Database

PTB-XL

Categories:

Artificial Intelligence

Keywords:

transformer

Electrocardiogram (ECG)

Deep Learning

Convolutional neural networks; Long Short-Term Memory; Attention Mechanism; Traffic Flow Prediction; Transportation Cyber-Physical Systems

Abstract

Most publicly available electrocardiogram (ECG) databases contain either a small number of subjects with long recordings or a larger number of subjects with short recordings. This makes it difficult to split a single database into training, testing, and, optionally, validation datasets. Some models seem to perform better with larger training sets, but a larger training set leaves less data for testing. Moreover, if the ECG is segmented by heartbeat, the number of segments is limited by the number of heartbeats in the recording. Combining multiple databases to enlarge the dataset is difficult because the differences between them must be reconciled, including different measuring devices, measuring conditions, sampling rates, and types of noise. This dataset was generated with a procedure that uses blind segmentation as a data augmentation technique to produce a large amount of training and validation data. The procedure is not limited by the number of heartbeats in the ECG recording. Multiple ECG databases are combined to increase the total number of subjects and to provide more ECG variation. A total of 10 databases were used to generate the training and validation datasets. The large amount of data, with its wide variation, was used to train a generalized model.
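The idea behind blind segmentation can be illustrated with a minimal sketch. It assumes that "blind" means cutting fixed-length windows without detecting heartbeats first, and that window offsets are drawn at random; the function name, the segment length, and the toy signal are all hypothetical and are not taken from the dataset's actual generation procedure.

```python
import random


def blind_segments(signal, segment_len, n_segments, seed=None):
    """Cut fixed-length windows at random offsets, ignoring heartbeat positions.

    Hypothetical helper: the offset strategy and parameters are assumptions,
    not the authors' documented procedure.
    """
    if segment_len > len(signal):
        raise ValueError("segment_len exceeds the recording length")
    rng = random.Random(seed)
    last_start = len(signal) - segment_len
    # Windows may overlap, so the number of segments is not tied to the
    # number of heartbeats in the recording.
    starts = [rng.randint(0, last_start) for _ in range(n_segments)]
    return [signal[s:s + segment_len] for s in starts]


# Toy stand-in for one ECG recording (1000 samples of a repeating ramp).
recording = [float(i % 50) for i in range(1000)]
segments = blind_segments(recording, segment_len=200, n_segments=5, seed=0)
```

Because the windows can start anywhere and may overlap, a single recording can yield far more segments than it has heartbeats, which is the property the abstract relies on.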

Instructions:

The .tfrecord files have filenames prefixed with "training" or "validation". Each example is a dict with the keys "label", "d0", and "d1". "label" contains the position of the identity that the query matches. "d0" contains the query ECG segment. "d1" contains the ECG segments of the classification scope.
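Since the instructions do not state the data type or length of each key, it may help to inspect one record before writing a parser. The sketch below assumes the records are serialized tf.train.Example protos stored without compression; both points are assumptions, and the helper name describe_first_example is hypothetical.

```python
import tensorflow as tf


def describe_first_example(path):
    """Return {key: (kind, number_of_values)} for the first record in a file.

    Assumes uncompressed records holding serialized tf.train.Example protos.
    """
    raw = next(iter(tf.data.TFRecordDataset(path)))
    example = tf.train.Example.FromString(raw.numpy())
    summary = {}
    for key, feature in example.features.feature.items():
        # kind is "bytes_list", "float_list", "int64_list", or None if unset.
        kind = feature.WhichOneof("kind")
        count = len(getattr(feature, kind).value) if kind else 0
        summary[key] = (kind, count)
    return summary
```

Calling describe_first_example on one of the downloaded "training" or "validation" files should list "label", "d0", and "d1" with their storage kinds and value counts, which can then be used to build a feature description for tf.io.parse_single_example.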

Funding Agency

Ministry of Higher Education Malaysia

Grant Number

FRGS/1/2020/ICT03/USM/02/1
