Introduction and Research Goal: Epilepsy is a common neurological disorder that severely impacts patients’ quality of life. Current diagnostic standards rely on the presence of seizures or interictal epileptiform discharges (IEDs) in the electroencephalogram (EEG). However, some patients who are ultimately diagnosed with epilepsy do not present with seizures or IEDs on their initial EEG, which delays their diagnosis and appropriate medical treatment. Previous studies (Thangavel et al., 2022; Mirwani, 2024) suggest that machine learning methods applied to IED-free EEGs can classify epilepsy. The aim of this study is to evaluate whether adding clinical characteristics and visual EEG interpretation to existing machine learning models based on quantitative EEG improves the performance of these models for the identification of epilepsy in IED-free EEGs. Additionally, the study explores model interpretability to promote clinical application.
Methods: We focus on subjects who presented at the emergency room following a first clinical seizure and were not diagnosed with epilepsy based on their initial EEG. Ten quantitative EEG feature sets based on various mathematical transforms were readily available from previous research (Mirwani, 2024). eXtreme Gradient Boosting (XGBoost) models were trained using Leave-One-Subject-Out Cross-Validation (LOSO CV) on these feature sets as benchmark models. Model performance was assessed using the Area Under the Curve (AUC). Clinical and EEG report features were added individually to each quantitative EEG feature set to evaluate their added value to model performance. The best-performing clinical and report features for each quantitative EEG feature set were determined using a two-fold grid search, and significance was tested via Welch’s t-test. Ensembles were created from the best-performing models of each quantitative EEG feature set, including their best-performing clinical and report features. SHapley Additive exPlanations (SHAP) and XGBoost feature importance were used for model interpretation, while Bayesian statistics were applied to gain insight into clinical implementation.
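As a minimal sketch of the benchmark pipeline described above: train a gradient-boosting classifier under Leave-One-Subject-Out Cross-Validation and score the pooled out-of-fold predictions with the AUC. scikit-learn's GradientBoostingClassifier stands in for XGBoost so the example is self-contained; the synthetic feature matrix, subject grouping, and hyperparameters are illustrative placeholders, not the study's actual data or settings.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
n_subjects, epochs_per_subject, n_features = 20, 5, 8

# Synthetic stand-in data: several EEG epochs per subject, one label per subject.
X = rng.normal(size=(n_subjects * epochs_per_subject, n_features))
subjects = np.repeat(np.arange(n_subjects), epochs_per_subject)
y = np.repeat(rng.integers(0, 2, size=n_subjects), epochs_per_subject)
X[y == 1, 0] += 1.5  # inject signal so the AUC is informative

# LOSO CV: each fold holds out all epochs of exactly one subject,
# so no subject contributes to both training and testing.
oof = np.zeros(len(y))
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
    clf = GradientBoostingClassifier(n_estimators=50, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    oof[test_idx] = clf.predict_proba(X[test_idx])[:, 1]

print(f"LOSO AUC: {roc_auc_score(y, oof):.3f}")
```

Pooling the out-of-fold probabilities before computing a single AUC (rather than averaging per-fold AUCs) is the natural choice here, since each LOSO fold contains only one subject and a per-fold AUC would be undefined.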
Results: The best-performing clinical and report features varied across EEG feature sets and did not consistently yield significant performance gains. Only the addition of EEG background to graph metrics of phase lock values showed a significant increase in model performance. SHAP analysis identified residual focal sharp activity as the primary contributor to this improvement. Combining individual models into ensembles substantially improved performance, achieving AUCs up to 0.870. To align model performance interpretation with the guidelines of the International League Against Epilepsy (ILAE), sensitivity at P(posterior) ≥ 0.6 was proposed as a key evaluation metric. The best XGBoost model with clinical and report features achieved a sensitivity of 0.48 at P(posterior) ≥ 0.6, while the best ensemble attained 0.81.
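The proposed metric can be made concrete with a short sketch: sensitivity at P(posterior) ≥ 0.6 is the fraction of true epilepsy patients whose model posterior reaches the 0.6 decision threshold. The labels and posterior values below are hypothetical, not study data, and the helper name is an illustrative assumption.

```python
import numpy as np

def sensitivity_at_posterior(y_true, posterior, threshold=0.6):
    """Fraction of true positives whose posterior probability of
    epilepsy reaches the decision threshold (0.6 per the proposed
    ILAE-aligned metric)."""
    y_true = np.asarray(y_true)
    posterior = np.asarray(posterior)
    positives = y_true == 1
    return float(np.mean(posterior[positives] >= threshold))

# Hypothetical per-subject labels and model posteriors.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 1])
posterior = np.array([0.72, 0.55, 0.81, 0.63, 0.40, 0.66, 0.20, 0.35])

print(sensitivity_at_posterior(y_true, posterior))  # → 0.6
```

Note that false positives do not enter this metric at all; it deliberately isolates how many epilepsy patients would clear a fixed posterior bar, which is why it complements rather than replaces the AUC.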
Conclusion: Incorporating clinical and report features into XGBoost models based on quantitative EEG data does not consistently improve the detection of epilepsy based on IED-free EEGs. The variability of the best-performing clinical and report features across quantitative EEG feature sets suggests that their contribution depends on the underlying quantitative EEG representation. Combining individual models into ensembles significantly enhances performance, achieving a sensitivity of 0.81 at P(posterior) ≥ 0.6. However, external validation is required to confirm these findings. Future studies should assess model performance using the sensitivity at P(posterior) ≥ 0.6 to comply with ILAE guidelines.