Exploring Stance Detection of Opinion Texts: Evaluating the Performance of a Large Language Model

Benchmarking the Performance of Stance Classification by GPT-3.5-Turbo

Abstract

In April 2020, a Dutch research team swiftly analyzed public opinions on COVID-19 lockdown relaxations. However, due to time constraints, only a small amount of opinion data could be processed. With the surge in popularity of Natural Language Processing (NLP) and the arrival of tools like ChatGPT, many such tasks can now be delegated to Large Language Models (LLMs). This study assesses the effectiveness of these LLMs on stance detection using this COVID-19 opinion corpus. The corpus is chunked and sampled to serve as input for OpenAI's GPT-3.5-Turbo LLM. The machine-generated stances are then evaluated with multiple binary classification metrics. We show that these models perform very well at stance detection, achieving an average F-score of 0.895. However, a significant number of misclassifications are observed in one dataset. We therefore conclude that while LLMs offer valuable guidance, it remains crucial to verify their outputs when dealing with complex or important public matters.
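The evaluation step described above can be sketched in a few lines. The snippet below is a minimal illustration, not the study's actual pipeline: the stance labels (`"support"`/`"oppose"`) and the example gold and predicted lists are invented for this sketch, and only the F-score is computed by hand from true/false positives and false negatives.

```python
def f_score(gold, predicted, positive="support"):
    """Compute precision, recall, and F1 for one stance label,
    treating `positive` as the positive class of a binary task."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, predicted) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, predicted) if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Invented example: human-annotated gold stances vs. LLM-generated stances.
gold = ["support", "oppose", "support", "support", "oppose"]
pred = ["support", "support", "support", "oppose", "oppose"]
precision, recall, f1 = f_score(gold, pred)
```

In practice one would compute such scores per stance class over the sampled chunks and average them, which is how a summary figure like the reported 0.895 F-score arises.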