Binary classification (schizophrenia vs. normal) of adolescent EEG signals, obtained from an open-access dataset.
In this project, several time- and frequency-domain feature extraction methods (DWT, STFT and CWT) are applied to the EEG signals in order to obtain better classification performance. A test accuracy of 94% is achieved both by the DWT-MLP method, which uses spectral features, and by the CWT-CNN method.
Oguzhan Memis, January 2025
- Files in this repository
- Dataset description
- Code Organization
- Considerations
- Reference
The code file is "eeg_schizophrenia.py" and the dataset file is "dataset_text.zip", which contains two folders. There is also a saved model file, "dwt_mlp_model_96.h5", which can be imported and used for the DWT method. The relevant instructions are noted in the later sections.
An EEG dataset containing 2 classes of EEG signals captured from adolescents.
-Classes: Normal (39 subjects) and Schizophrenia (45 subjects).
-Properties:
16 channels × 128 samples per second × 60 seconds of measurement for each subject.
Voltages are recorded in microvolts (µV, 10^-6 V),
so the signal amplitudes vary roughly from -2000 to +2000 µV.
-Orientation:
Signals are stacked vertically in the text files, ordered by channel number (1 to 16).
The length of one signal is 128 × 60 = 7680 samples,
so each text file contains 16 × 7680 = 122880 samples in a single column.
Source of the dataset: Moscow State University, 2005.
Original article of the dataset: Borisov et al., 2005, Physiology (Q4).
A recent article that uses this dataset: Bagherzadeh & Shalbaf, 2024, Cognitive Neuroscience (Q2).
The code is divided into separate cells with #%% markers.
RUN EACH CELL ONE BY ONE, CONSECUTIVELY.
The cells are as follows:
1) Importing the data
2) Filtering stage (includes time and frequency plots)
3:
3.1) Visualization of all the healthy EEG channels together
3.2) Visualization of all the patient EEG channels together
4) Feature examinations (many statistical features computed on the signals)
5) Further explorations: Correlation matrix, and Recurrence plot
6) Multi-level Decomposition by DWT (examination)
7:
7.1) DWT Feature Extraction and Data Transformation
7.2) SVM Grid-search
7.3) SVM cross-validation
7.4) MLP model
7.5) Optional part: save the best model
7.6) MLP k-fold cross-validation
7.7) Leave One Out CV on the MLP
8:
8.1) STFT-Feature extraction method
8.2) STFT-MLP
8.3) STFT-SVM (Grid-search)
9:
9.1) STFT Data Transformation
9.2) STFT - CNN
10:
10.1) CWT Data Transformation
10.2) CWT - CNN
10.3) CNN k-fold cross-validation
10.4) Leave One Out CV on the CNN
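The Leave-One-Out cross-validation steps (cells 7.7 and 10.4) evaluate one held-out subject per fold. A minimal sketch with scikit-learn, assuming the flattened DWT feature matrix of 84 subjects × 400 features described below (random data and an RBF SVM stand in for the repository's actual models):

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in data: 84 subjects x 400 flattened DWT features,
# labels 0 = Normal (39 subjects), 1 = Schizophrenia (45 subjects).
rng = np.random.default_rng(0)
X = rng.standard_normal((84, 400))
y = np.array([0] * 39 + [1] * 45)

# One fold per subject: train on 83, test on the held-out one.
scores = cross_val_score(SVC(kernel="rbf", C=1.0), X, y, cv=LeaveOneOut())
print(len(scores))   # 84 folds
```

With real features, `scores.mean()` gives the subject-level LOOCV accuracy.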
Before running the classification models, run the related data transformation / feature extraction cells,
and check the input size (for the deep learning models).
The DWT feature extraction method produces a dataset of size (84, 16, 25);
the data of every subject are then flattened into 16 × 25 = 400 features.
Use different wavelets for the SVM and MLP models, such as 'bior2.8' and 'bior3.3' for the SVM.
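The 25 values per channel suggest a few statistics per DWT sub-band. The repository uses PyWavelets biorthogonal wavelets; as a self-contained illustration, here is a hedged sketch with a hand-rolled Haar DWT and 5 statistics × 5 sub-bands (the actual wavelet, level count and statistics in cell 7.1 may differ):

```python
import numpy as np

def haar_dwt_features(sig, levels=4):
    """Sketch: 4-level Haar DWT -> 4 detail bands + 1 approximation,
    then 5 statistics per band = 25 features per channel."""
    bands = []
    a = sig.astype(float)
    for _ in range(levels):
        if len(a) % 2:                          # pad to even length
            a = np.append(a, a[-1])
        d = (a[0::2] - a[1::2]) / np.sqrt(2.0)  # detail coefficients
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)  # approximation
        bands.append(d)
    bands.append(a)                             # final approximation band
    feats = []
    for b in bands:
        feats += [b.mean(), b.std(), np.abs(b).max(),
                  np.mean(b ** 2), np.median(np.abs(b))]
    return np.array(feats)

sig = np.random.randn(7680)      # one 60 s EEG channel
f = haar_dwt_features(sig)
print(f.shape)                   # (25,)
```

Applying this to all 16 channels of all 84 subjects yields the (84, 16, 25) array described above.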
The first STFT feature extraction method produces a dataset of size (84, 16, 325).
It uses a downsampled and flattened STFT;
the data of every subject are then flattened into 16 × 325 = 5200 features.
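A magnitude STFT of one channel can be sketched with windowed FFT frames. This is an illustrative NumPy version with example window/hop sizes; the notebook's actual parameters and downsampling reduce each channel to the 325 values mentioned above:

```python
import numpy as np

def stft_mag(sig, nperseg=1024, hop=384):
    """Magnitude STFT via Hann-windowed FFT frames (no padding).
    Returns an array of shape (freq_bins, n_frames)."""
    win = np.hanning(nperseg)
    n_frames = (len(sig) - nperseg) // hop + 1
    frames = np.stack([sig[i * hop : i * hop + nperseg] * win
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

sig = np.random.randn(7680)      # one 60 s EEG channel
S = stft_mag(sig)
print(S.shape)                   # (513, 18) with these example parameters
```

Downsampling and flattening such a matrix per channel, then concatenating the 16 channels, gives the 5200-dimensional feature vector per subject.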
In the second STFT method, the spectrograms of the signals are not flattened,
and a dataset of size (84, 16, 513, 21) is obtained.
The CNN model takes 16-channel 513×21 matrices as input.
In the last, CWT method, scalograms of the signals (downsampled along one axis) are collected
into the resultant dataset, which has a size of (84, 16, 60, 1920).
The CNN model takes 16-channel 60×1920 matrices as input.
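The 60 × 1920 shape corresponds to 60 wavelet scales over a time axis downsampled by 4 (7680 → 1920). A hedged NumPy sketch of a Morlet scalogram with those dimensions (the notebook likely uses a wavelet library; the wavelet choice and normalization here are illustrative):

```python
import numpy as np

def morlet_scalogram(sig, scales, w0=6.0, downsample=4):
    """Crude Morlet CWT: convolve the signal with scaled wavelets,
    take magnitudes, and downsample the time axis."""
    out = np.empty((len(scales), len(sig) // downsample))
    for i, s in enumerate(scales):
        t = np.arange(-4 * s, 4 * s + 1)          # wavelet support
        wavelet = (np.exp(1j * w0 * t / s)        # complex oscillation
                   * np.exp(-0.5 * (t / s) ** 2)  # Gaussian envelope
                   / np.sqrt(s))
        coeffs = np.convolve(sig, wavelet, mode="same")
        out[i] = np.abs(coeffs)[::downsample]
    return out

sig = np.random.randn(7680)          # one 60 s EEG channel
scales = np.arange(1, 61)            # 60 scales, matching (84, 16, 60, 1920)
scalogram = morlet_scalogram(sig, scales)
print(scalogram.shape)               # (60, 1920)
```

Stacking the 16 channel scalograms per subject gives the (16, 60, 1920) CNN input.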
All the MLP models are built with Keras,
and all the CNN models are built with PyTorch (using the GPU).
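As a minimal sketch of such a PyTorch model (the repository's actual architectures in cells 9.2 and 10.2 may differ; the layer sizes here are illustrative), a CNN taking the 16-channel spectrogram stacks could look like:

```python
import torch
import torch.nn as nn

class EEGCNN(nn.Module):
    """Toy 2-class CNN for (batch, 16, H, W) spectrogram/scalogram stacks."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),   # collapse to (batch, 64, 1, 1)
            nn.Flatten(),
            nn.Linear(64, 2),          # logits for Normal / Schizophrenia
        )

    def forward(self, x):
        return self.net(x)

model = EEGCNN()
out = model(torch.randn(4, 16, 513, 21))   # a batch of 4 STFT stacks
print(out.shape)                           # torch.Size([4, 2])
```

The same module accepts the 60×1920 CWT inputs, since the adaptive pooling removes the spatial-size dependence.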
Please cite this repository with the name of its owner, Oğuzhan Memiş, and a link to the repository. Also, don't forget to cite the dataset owners, Borisov et al. (2005).
Contacts and suggestions are welcome: [email protected]