Preprocessed EEG dataset with epileptic seizure from SNMC Bagalkot, India

Citation Author(s):
Deepa B
Ramesh K
Submitted by:
Mrs Deepa .B
Last updated:
Mon, 02/28/2022 - 01:23
DOI:
10.21227/q6y0-e695

Abstract 

Electroencephalography (EEG) records brain activity as electrical voltage. Prediction and detection of epileptic seizures are actively researched problems. This dataset contains EEG data from 11 patients; seizures are observed in the recordings of 2 of them.

The total seizure duration is 170 seconds. The recordings have 16 channels and are sampled at 256 Hz.

The final dataset files, in .csv format, each contain 87040 rows x 17 columns: 16 channel columns plus one outcome column marking seizure (1) or non-seizure (0).

The 87040 rows comprise 170 seconds x 256 samples per second = 43520 rows of seizure data and the same number of rows of non-seizure data.
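The row and column counts above can be checked with a few lines of Python. This is a minimal sketch, not the authors' tooling: it assumes each file has a header row and that the outcome is the last column, neither of which is stated on this page, and the function name `summarize_rows` is made up for illustration.

```python
import csv
from collections import Counter

SAMPLING_RATE_HZ = 256   # from the abstract
SEIZURE_SECONDS = 170    # from the abstract
N_CHANNELS = 16          # from the abstract

# Equal durations of seizure and non-seizure data.
EXPECTED_ROWS = 2 * SEIZURE_SECONDS * SAMPLING_RATE_HZ
EXPECTED_COLUMNS = N_CHANNELS + 1  # 16 channels + outcome


def summarize_rows(lines, has_header=True):
    """Count rows, columns and outcome labels in CSV text.

    `lines` is any iterable of CSV lines (an open file works).
    Assumption: the outcome (0 or 1) is stored in the last column.
    """
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the assumed header row
    n_rows = 0
    n_columns = 0
    labels = Counter()
    for row in reader:
        if not row:
            continue  # ignore blank lines
        n_rows += 1
        n_columns = len(row)
        labels[int(float(row[-1]))] += 1
    return n_rows, n_columns, dict(labels)
```

To use it, open one of the dataset files with `open(path, newline="")` and pass the file object to `summarize_rows`; a file matching the description should give `EXPECTED_ROWS` rows, `EXPECTED_COLUMNS` columns, and 43520 rows for each label.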

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~`

RawData_SNMC_EEG contains the final dataset with no filtering.

RawData_SNMC_EEG-1-70 contains the final dataset filtered to 1 Hz to 70 Hz.

RawData_SNMC_EEG-0-100 contains the final dataset filtered to 0 Hz to 100 Hz.

Annotations_SNMC contains details about the patients and their seizure activity.
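The page does not say how the filtered files were produced. As an illustration only, band limits like those named above could be applied to the unfiltered data with a zero-phase Butterworth filter from SciPy. The filter type, the order, and the function name `band_limit` are assumptions, not the authors' method.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 256.0  # sampling rate in Hz, from the abstract


def band_limit(data, low_hz, high_hz, fs=FS, order=4):
    """Zero-phase Butterworth filter along axis 0.

    data: array of shape (n_samples, n_channels).
    A lower edge of 0 Hz is treated as a plain low-pass filter.
    Filter type and order are assumptions made for this sketch.
    """
    data = np.asarray(data, dtype=float)
    if low_hz <= 0:
        sos = butter(order, high_hz, btype="lowpass", fs=fs, output="sos")
    else:
        sos = butter(order, [low_hz, high_hz], btype="bandpass",
                     fs=fs, output="sos")
    # Forward-backward filtering avoids phase distortion.
    return sosfiltfilt(sos, data, axis=0)
```

Calling `band_limit(raw, 1, 70)` or `band_limit(raw, 0, 100)` on a samples-by-channels array mirrors the two band descriptions; the result will not necessarily match the published filtered files sample for sample.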

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 https://www.snmcbgk.in/ S. NIJALINGAPPA MEDICAL COLLEGE, Bagalkot 587 102, India

I have not been able to work out how to upload zipped files here. More raw data for this dataset can be accessed at https://drive.google.com/drive/folders/1DYrh3D41-aAfz81iTfJocez00wwuhGKG...

I will upload all files to IEEE DataPort soon.


Comments

Many thanks for this dataset. I am pretty new to EEG analysis. Why do you have only 16 channels and not 19 (10-20 system)? And why are they measured in pairs? Thanks again!

Submitted by Matheus Guerrero on Thu, 02/24/2022 - 08:40

Very glad to hear from you.

Why do you have only 16 channels and not 19 (10-20 system)?
>>> The 10-20 system refers to the spacing between adjacent electrodes, which is 10% or 20% of the total distance across the skull. The number of channels can vary with the required resolution. This dataset has 16 channels, whereas CHB-MIT has 23.

And why are they measured in pairs?
>>> EEG records electrical activity from the scalp, and a voltage must be measured against a reference. A common choice is a 'bipolar montage', in which each channel is the difference between two electrodes and neighbouring channels share an electrode. For example, FP1-F3 is the voltage at FP1 with respect to F3.

Hope this helps. Thanks for reaching out. Keep us updated. :):)

Submitted by Mrs Deepa .B on Mon, 02/28/2022 - 01:29
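The bipolar derivation described in the reply above amounts to subtracting one electrode's signal from another's. The sketch below shows this with NumPy. The dataset already stores the paired channels, so this is purely illustrative; the helper name `bipolar` and the electrode values are made up, and only the FP1-F3 pair is taken from the thread.

```python
import numpy as np


def bipolar(referential, pairs):
    """Derive bipolar channels from referential recordings.

    referential: dict mapping electrode name -> 1-D sequence of
                 voltages measured against a common reference.
    pairs:       list of (electrode_a, electrode_b) tuples.
    Returns a dict mapping "A-B" -> voltage at A minus voltage at B.
    """
    return {
        f"{a}-{b}": np.asarray(referential[a], dtype=float)
        - np.asarray(referential[b], dtype=float)
        for a, b in pairs
    }


# Illustrative values only; F3 is shared by the two derived channels.
example = bipolar(
    {"FP1": [5.0, 6.0], "F3": [1.0, 2.0], "C3": [0.0, 1.0]},
    [("FP1", "F3"), ("F3", "C3")],
)
```

Here `example["FP1-F3"]` holds the FP1 samples minus the F3 samples, and the shared electrode F3 appears in both derived channels, which is why the channel names come in pairs.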

Thank you!

Submitted by Matheus Guerrero on Tue, 03/01/2022 - 02:26

Thank you for sharing this precious data.

I wonder how one can split the data into the 11 different subjects?

We have 43520 rows for seizure events (Outcome = 1). If 2 patients experienced seizures then, assuming the rows are in time order, the first 21760 rows of 1s should be subject "A" and the following 21760 rows of 1s should be subject "B". Is this correct? If yes, why don't we have 21760 x 9 = 195840 observations for the 9 non-seizure patients? Or is the data the average by channels for the different groups (seizure, non-seizure)?

I really appreciate any help you can provide.

Submitted by Matheus Guerrero on Sun, 02/27/2022 - 03:31

I wonder how one can split the data into the 11 different subjects?
>>> The data from the 11 patients is appended into one file. I have uploaded an annotations file, which might be helpful.

Or is the data the average by channels for the different groups (seizure, non-seizure)?
>>> The complete procedure is as follows.
The seizure time for the 2 patients is 145 + 25 = 170 seconds, which gives 145*256 = 37120 rows for Patient A and 25*256 = 6400 rows for Patient B at the end of the dataset.
To keep the dataset balanced and to allow cross-validation across all patients, non-seizure activity was collected from all 11 patients as "16 seconds * 10 patients + 10 seconds * 1 patient = 170 seconds".
I have shared a Google Drive link in the (updated) abstract, which might make this discussion clearer. Please have a look.

Thanks and regards, Good Luck again :):)

Submitted by Mrs Deepa .B on Mon, 02/28/2022 - 01:39

Many thanks for the clarification.

And just to be sure, these "16 seconds * 10 patients + 10 seconds * 1 patient = 170" non-seizure data are arranged patient by patient in the rows where Outcome = 0, right?
So the first 16*256 rows are Patient A, the next 16*256 rows are Patient B, and so on.

Cheers,

Submitted by Matheus Guerrero on Tue, 03/01/2022 - 02:50

The 10 seconds of non-seizure data come from the second-to-last patient (i.e. the 10th patient), because that patient's recording had seizures from the very start. All the remaining patients are appended serially, as you described.
:):) thanks and regards,

Submitted by Mrs Deepa .B on Wed, 03/02/2022 - 02:49
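Putting the replies in this thread together, the row layout can be sketched as a list of index ranges. This is a reading of the discussion, not an official specification. It assumes the non-seizure block comes first (the seizure rows are said to be at the end), that indices are 0-based and exclude any header row, and that patients are numbered 1 to 11 in file order. The labels "A" and "B" for the seizure patients are placeholders; which of the 11 patients they are should be taken from the annotations file.

```python
SAMPLING_RATE_HZ = 256  # from the abstract


def build_segments(fs=SAMPLING_RATE_HZ):
    """Return (label, outcome, start_row, stop_row) tuples, stop exclusive."""
    # 16 s from every patient except the 10th, which contributes 10 s.
    non_seizure_seconds = [16] * 9 + [10] + [16]
    # Seizure durations quoted in the thread: 145 s and 25 s.
    seizure_seconds = [("A", 145), ("B", 25)]

    segments = []
    start = 0
    for idx, seconds in enumerate(non_seizure_seconds, start=1):
        stop = start + seconds * fs
        segments.append((f"patient_{idx}", 0, start, stop))
        start = stop
    for name, seconds in seizure_seconds:
        stop = start + seconds * fs
        segments.append((f"seizure_{name}", 1, start, stop))
        start = stop
    return segments


for label, outcome, start, stop in build_segments():
    print(f"{label:12s} outcome={outcome} rows[{start}:{stop}]")
```

Each `(start, stop)` pair can be used to slice the loaded rows, for example to build per-patient folds for cross-validation. It is worth confirming the boundaries against the Outcome column and the annotations file before relying on them.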