Exploring Hybrid Intelligence for Topic Interpretation in Colorectal Cancer Research: A Comparative Study of GPT-3.5 and Human Expertise

Abstract

Colorectal cancer is a widespread disease that significantly impacts the health of individuals worldwide. Understanding the needs and concerns of those affected by this disease is crucial for improving patient outcomes and enhancing the quality of care. Patient web forums have emerged as valuable platforms for individuals to openly share their experiences and thoughts related to colorectal cancer, providing unique insights into the social, physical, and emotional aspects of their patient journey. These forums offer a more comprehensive and authentic portrayal of patient experiences than traditional patient data collection methods, such as questionnaires and interviews, which may not capture the full scope of patients' experiences along the colorectal cancer care pathway.

However, analyzing the vast amount of unstructured data within these patient web forums presents a significant challenge. Traditional manual analysis by human experts is time-consuming, labor-intensive, and limited in scalability, making it impractical for the sheer volume of patient-generated content. This is where the application of natural language processing (NLP) techniques becomes crucial. NLP enables the automated processing and analysis of textual data, allowing for efficient extraction and interpretation of information from large volumes of patient forum posts.

Nevertheless, relying solely on machine intelligence, such as topic modeling and natural language generation, for interpreting patient forum data carries inherent risks, including the potential for disseminating misleading information. While these machine-driven techniques offer efficient and scalable ways to analyze and generate insights from large volumes of diverse, unstructured patient forum data, they may lack the contextual understanding and domain expertise needed to ensure that interpretations of colorectal cancer patient experiences are accurate, relevant, and ethically sound.

To close this gap between human experts and machine intelligence, this thesis explores the potential of hybrid intelligence (HI) for topic interpretation in colorectal cancer research. The main research question is: "How can topic modeling, GPT-3.5 language generation and human expertise be combined to explore the interpretation of patient web forums in colorectal cancer (CRC) research?"

To address the research question, three human studies were conducted. The first study employed NMF topic modeling to compare topic interpretations created independently by medical workers and GPT-3.5. This comparative analysis revealed distinctive differences between human-written and AI-generated interpretations of online patient stories. The second study investigated how medical researchers collaborate with GPT-3.5 to develop hybrid interpretations of patient experience topics generated by the BERTopic model. A Flask web application served as the interactive platform for combining their knowledge with the AI model. Finally, in the third study, professional human evaluators assessed the topic relevance of the interpretations generated by medical researchers and GPT-3.5 to determine whether the combination of GPT-3.5 and human expertise leads to improved topic interpretations compared to interpretations produced by either alone.

The proposed solution to the research problem is a hybrid workflow that compares, combines, and validates GPT-3.5 language generation and human expertise, aiming for enhanced interpretations of topics extracted from colorectal cancer patient forums. The three studies provide opportunities for researchers and medical professionals to integrate machine intelligence from topic models and GPT-3.5 into their field of work. The hybrid workflow demonstrated that human experts were able to compare and enhance the relevance of human and GPT-3.5 interpretations of colorectal cancer patient experience topics. This allowed human experts to efficiently reach a more comprehensive understanding of patient forum data, which is essential for research aimed at improving the health of colorectal cancer patients.