Background: Undetected Intellectual Disability (ID) can lead to chronic stress because society overestimates the person's abilities. Chronic stress can cause stress-related health issues such as hypertension, chronic fatigue and abdominal complaints. When a physician (General Practitioner (GP) or medical specialist) does not recognize that a patient has ID, the link with stress may go unnoticed. In that case, the complaint is often treated as a purely somatic problem, while the underlying cause (overestimation due to unrecognized ID) remains untreated. This can increase healthcare consumption and impair the patient’s quality of life. While physicians with ID expertise can recognize subtle signs of mild ID, physicians without extensive experience easily overlook it. To improve medical care for patients with ID, we aim to improve ID detection among physicians. Because it is not feasible to give every individual doctor an ‘ID-recognition training’, we study the possibility of using artificial intelligence (AI) to improve ID detection. In recent years, we have been working on an ‘ID Alert’ (IDA) based on machine learning (ML). In previous phases of the IDA project, structured Electronic Health Record (EHR) data were used to create an IDA. In the current study, we additionally investigate the use of unstructured EHR data (clinical text).
Methods: We analyzed unstructured correspondence files of 200 adults with ID and 200 adults without ID from Novicare, an organization that provides multidisciplinary care to clients with complex and chronic conditions in intra- and extramural settings. Because structured clinical data were unavailable, we used an automated pipeline of text extraction, de-identification and two types of feature extraction (bag-of-words and clinical concept extraction). Features were compared between adults with and without ID. Significant features that were unlikely to be intrinsically different between the two groups were excluded. The remaining significant features were used for the training and evaluation (10-fold stratified cross-validation) of two Gradient Boosting Classifiers, one per feature extraction method.
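As an illustration of the feature comparison step, the following minimal sketch builds binary bag-of-words features and scores each token by how differently it occurs in the two groups. The abstract does not name the statistical test, so the Pearson chi-square statistic on a 2x2 presence table is an assumption, as are all function names, the simple ASCII tokenizer (which would drop accented characters) and the toy snippets, which are invented and are not study data.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_frequency(docs):
    """Count, for each token, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))  # set(): presence, not raw counts
    return df


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def compare_groups(id_docs, non_id_docs):
    """Return {token: chi-square} comparing token presence between groups."""
    df_id = document_frequency(id_docs)
    df_non = document_frequency(non_id_docs)
    n_id, n_non = len(id_docs), len(non_id_docs)
    scores = {}
    for tok in set(df_id) | set(df_non):
        a, c = df_id[tok], df_non[tok]  # documents containing the token
        scores[tok] = chi_square_2x2(a, n_id - a, c, n_non - c)
    return scores


# Toy, invented snippets (assumption); not data from the study.
id_docs = ["patient has epilepsy, mother present", "epilepsy and agitation noted"]
non_id_docs = ["patient reports fatigue", "hypertension follow-up"]
scores = compare_groups(id_docs, non_id_docs)
```

A token with a high statistic is only a candidate feature: as the Methods describe, features that differ for reasons unrelated to ID itself would still need to be excluded by hand before model training.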
Results: Most features differed significantly between adults with and without ID because of confounders such as differences in age, type of care and the doctor’s word choice (which is inherent to the specialty and training of the doctor). Significant ‘unbiased’ features identified by both feature extraction methods were epilepsy, emotional disturbance (tension, arousal, agitation), visual or hearing problems and the presence of family members during the consultation. The developed ML models showed Areas Under the Curve (AUCs) of 0.98 and 0.89 for bag-of-words and clinical concepts, respectively.
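To make the reported metric concrete, the sketch below computes an AUC from labels and model scores, assuming the AUC refers to the receiver operating characteristic curve (the abstract does not say so explicitly). Under that assumption, the AUC equals the fraction of (positive, negative) pairs in which the positive case receives the higher score. The function name and the toy values are hypothetical and unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (assumption): 1 = ID, 0 = non-ID, scores are model outputs.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.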
Conclusion: This is the first study to investigate the use of unstructured correspondence files for developing an IDA. The developed models show very high performance. Despite efforts to mitigate the effect of confounders, limitations of the study may have affected the generalizability of the models. External validation of the proposed methods is therefore necessary in future research.