Preprocessed CHB-MIT Scalp EEG Database

Citation Author(s):
Deepa B, Karnataka State Akkamahadevi Women's University Vijayapura, India
Dr. Ramesh K, Karnataka State Akkamahadevi Women's University Vijayapura, India
Submitted by:
Mrs Deepa .B
Last updated:
Tue, 07/26/2022 - 10:54
DOI:
10.21227/awcw-mn88

Abstract 

Recent advances in the availability of computational power and in cloud computing have prompted extensive research in epileptic seizure detection and prediction. The EEG (electroencephalogram) datasets from the ‘Dept. of Epileptology, Univ. of Bonn’ and the ‘CHB-MIT Scalp EEG Database’ are publicly available and are the most sought after among researchers. The Bonn dataset is very small compared to CHB-MIT, yet researchers still prefer it because it comes in a simple '.txt' format. The dataset published here is a preprocessed form of CHB-MIT in '.csv' format, which makes machine learning and deep learning models easier to implement.

Instructions: 

If the dataset is helpful, please cite the open-access paper indicated below, which describes the procedure and results in detail.

Deepa, B., & Ramesh, K. (2022). Epileptic seizure detection using deep learning through min max scaler normalization. International Journal of Health Sciences, 6(S1), 10981–10996. https://doi.org/10.53730/ijhs.v6nS1.7801

Procedure in short:

  1. Preprocessing was done in a Jupyter Notebook (Anaconda) on an Intel 8th-gen i5 processor with 8 GB RAM.
  2. The dataset is prepared by extracting datapoints from the '.edf' files using the mne package in Python. Equal amounts of preictal and ictal data are extracted.
  3. A period of 4096 seconds (about 68 minutes) each of preictal and ictal data is extracted from the '.edf' files. All annotated ictal periods for the 24 patients are included in the dataset.
  4. Datapoints are loaded and preprocessed as dataframes using the pandas package in Python.
  5. The dataframes are large, so as much system RAM as possible should be free when working with them.
  6. The file chbmit_preprocessed_data.csv can be used as is for machine learning and deep learning models.
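
Steps 2 to 4 can be sketched roughly as follows. This is a minimal illustration and not the authors' actual notebook. It assumes the 256 Hz sampling rate of the original CHB-MIT recordings; the helper names seconds_to_samples and edf_segment_to_dataframe are hypothetical, and the file name and times in the usage comment are placeholders that would have to be taken from the PhysioNet annotation files.

```python
def seconds_to_samples(start_s, stop_s, sfreq=256.0):
    """Convert a [start, stop) window in seconds to sample indices.

    sfreq defaults to 256 Hz, the rate of the original CHB-MIT recordings
    (an assumption for the preprocessed files).
    """
    if stop_s <= start_s:
        raise ValueError("stop_s must be greater than start_s")
    return int(round(start_s * sfreq)), int(round(stop_s * sfreq))


def edf_segment_to_dataframe(edf_path, start_s, stop_s):
    """Read one time window of an .edf recording into a pandas DataFrame.

    mne is imported lazily so that seconds_to_samples works without it.
    """
    import mne  # third-party; to_data_frame() also requires pandas

    # preload=False keeps memory use low until the window is actually read.
    raw = mne.io.read_raw_edf(edf_path, preload=False, verbose="ERROR")
    raw.crop(tmin=start_s, tmax=stop_s, include_tmax=False)
    # One row per sample, one column per channel, plus a "time" column.
    return raw.to_data_frame()


# Hypothetical usage; the file name and times are placeholders:
# df = edf_segment_to_dataframe("chb01_03.edf", start_s=100, stop_s=140)
# df.to_csv("segment.csv", index=False)
```

If the 256 Hz rate was retained, the 4096 seconds per class mentioned in step 3 correspond to 4096 × 256 = 1,048,576 rows per class.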

Data Availability:

The dataset contains the following files.

  • chbmit_ictal_raw_data.csv: This file contains only ictal data from all 24 patients. The recorded channels vary widely and amount to 96 columns in this file.
  • chbmit_preictal_raw_data.csv: This file contains only preictal data from all 24 patients. The recorded channels vary widely and amount to 96 columns in this file.
  • chbmit_preictal_23channels_data.csv: This file contains only preictal data from all 24 patients. Only 23 channels are retained, giving 23 columns in this file.
  • chbmit_ictal_23channels_data.csv: This file contains only ictal data from all 24 patients. Only 23 channels are retained, giving 23 columns in this file.
  • chbmit_preprocessed_data.csv: This file contains balanced preictal and ictal data from all 24 patients. Only 23 channels are retained and an outcome column is added, giving 24 columns in this file. In the outcome column, '0' indicates preictal and '1' indicates ictal.
  • RECENTLY ADDED
  • 24 sheets (seizure info: patient and file number, start-stop times, datapoints)
  • 278 raw-data files (139 preictal + 139 ictal), named ptno_fileno_seizureORnoseizure.csv
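
The class balance described for chbmit_preprocessed_data.csv can be checked without loading the whole file into memory, which helps on machines with limited RAM. The sketch below streams the file with Python's standard csv module. The header name "outcome" is an assumption (the description only says that an outcome column is added), and count_outcomes is a hypothetical helper.

```python
import csv
from collections import Counter


def count_outcomes(csv_path, label_column="outcome"):
    """Stream a CSV file row by row and count each label value.

    label_column is an assumed header name; adjust it to the real file.
    """
    counts = Counter()
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        if label_column not in (reader.fieldnames or []):
            raise KeyError(f"column {label_column!r} not found in {csv_path}")
        for row in reader:
            # Labels may be stored as "0"/"1" or "0.0"/"1.0"; normalise to int.
            counts[int(float(row[label_column]))] += 1
    return counts


# Hypothetical usage:
# counts = count_outcomes("chbmit_preprocessed_data.csv")
# print(counts[0], "preictal rows,", counts[1], "ictal rows")
```

For a balanced file the two counts should be equal. With pandas, `pd.read_csv(path, chunksize=...)` offers a similar chunked read.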

This dataset was prepared using data reduction techniques. Data cleaning and data transformation still need to be done as appropriate for the application or model under development.
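
One common data transformation is min-max scaling, which the title of the paper cited above also names. The sketch below is a generic pure-Python illustration of the formula x' = lo + (x - min) · (hi - lo) / (max - min) for a single channel. It is not the authors' pipeline, and min_max_scale is a hypothetical helper.

```python
def min_max_scale(values, feature_range=(0.0, 1.0)):
    """Rescale a non-empty sequence of numbers linearly into feature_range."""
    lo, hi = feature_range
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # A constant channel has no range to stretch; map every value to lo.
        return [lo for _ in values]
    return [lo + (v - v_min) * (hi - lo) / span for v in values]


# Example with made-up values for one channel:
# min_max_scale([2.0, 4.0, 6.0])  ->  [0.0, 0.5, 1.0]
```

In practice, scikit-learn's MinMaxScaler applies the same transformation column by column. The minimum and maximum are usually computed on the training split only and then reused for the test split.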

The last two items in the list above can be used to access all raw data from the 24 patients.

Original Data:

The original raw dataset in '.edf' format is available at https://physionet.org/content/chbmit/1.0.0/ and should be cited as:

Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online], 101(23), e215–e220.

Comments

Please send me this dataset.

Submitted by RANTU BURAGOHAIN on Thu, 03/24/2022 - 05:58

The dataset is open access. You can download it by logging into your IEEE DataPort account.
Best regards, good luck :)

Submitted by Mrs Deepa .B on Tue, 07/26/2022 - 10:56

Is it possible to get sensor locations for use in mne-tools?

Submitted by Syed Shah on Thu, 09/15/2022 - 01:43

Unfortunately, no (see https://mne.tools/0.16/manual/io.html#importing-eeg-data). However, the 23 channel headers can be read and plotted using the standard 10-20 EEG placement system. Best regards,

Submitted by Mrs Deepa .B on Tue, 09/20/2022 - 11:48

Hi,

Can you provide any details about what kind of preprocessing you did on the dataset? Also, when you extracted the data using the mne package, did you perform any additional preprocessing, or did you just extract the data itself? -Thanks

Submitted by Sharmin Kibria on Tue, 01/24/2023 - 01:42

What is the preictal period that you have taken for preparing this dataset?

Submitted by Plaban Datta on Tue, 03/07/2023 - 02:19

Is it possible to get a dataset that is preprocessed, with features extracted and seizure and non-seizure points marked, which we can use directly for epileptic seizure detection with a deep learning model?

Submitted by Md Arshad on Thu, 03/09/2023 - 14:41

Hello, I would like the dataset about patients with epilepsy. Please provide me with it for use in my thesis.

Submitted by reza kazempoor on Sat, 12/16/2023 - 03:47

Hello, I could not download the dataset. Kindly send me the dataset.

Submitted by Beno J on Mon, 02/05/2024 - 05:06