Exploring Test Suite Coverage of Large Language Model–Enhanced Unit Test Generation
A Study on the Ability of Large Language Models to Improve the Understandability of Generated Unit Tests Without Compromising Coverage
Abstract
Automated software testing is a well-studied topic in the software engineering literature. Search-based software testing tools such as EvoSuite can generate test suites using genetic algorithms without developer input. Large Language Models (LLMs) have recently attracted significant attention in the software engineering domain for their potential to automate test generation. UTGen, a tool that integrates LLMs with EvoSuite, produces more understandable tests than EvoSuite alone; however, the generated tests suffer from a drop in coverage.
To streamline bug detection by developers, we propose UTGenCov, an approach that aims to improve the understandability of EvoSuite-generated tests without compromising coverage. UTGenCov builds on UTGen: we analyze the causes of UTGen's coverage decrease in depth and propose an alternative based on our findings.
Our investigation determined that the leading cause of coverage reduction in UTGen is LLM hallucination during the understandability-improvement phase. UTGenCov aims to address these hallucinations by providing the LLM with the source code of the methods used in the test. However, our experimental results indicate inconsistent performance and a further decrease in branch coverage of 0.74% compared to UTGen.
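To make the mitigation concrete, the sketch below shows one way such a prompt could be assembled: the source of every method the test invokes is embedded in the prompt so the model rewrites the test against real signatures rather than hallucinated ones. All names here are illustrative assumptions, not UTGenCov's actual API.

```java
import java.util.List;

// Hypothetical sketch of the prompt-augmentation idea described above;
// class and method names are assumptions for illustration only.
public class PromptBuilder {

    /**
     * Builds an understandability-improvement prompt that embeds the
     * source code of the methods used by the test, so the LLM has the
     * real method signatures and bodies available while rewriting.
     */
    public static String buildPrompt(String testSource, List<String> methodSources) {
        StringBuilder prompt = new StringBuilder();
        prompt.append("Improve the readability of the following unit test ");
        prompt.append("without changing its behavior or the methods it calls.\n\n");
        prompt.append("Source of the methods used by the test:\n");
        for (String method : methodSources) {
            prompt.append(method).append("\n\n");
        }
        prompt.append("Test to improve:\n").append(testSource);
        return prompt.toString();
    }
}
```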