Using LLM-Generated Summarizations to Improve the Understandability of Generated Unit Tests

Enhancing Unit Test Understandability: An Evaluation of LLM-Generated Summaries


Abstract

Since software testing is crucial, there has been considerable research on generating test cases automatically. The problem is that generated test cases can be hard to understand. Multiple factors influence understandability, one of which is test summarization: a summary gives an overview of what a test is actually testing and sometimes highlights its key functionality. Several tools already generate test summaries using template-based summarization techniques. Limitations of such summaries are that they can be lengthy and redundant, and that they work best in combination with well-defined test and variable names. UTGen is a tool that combines EvoSuite and Large Language Models (LLMs) to increase understandability, including by improving test and variable names, but it does not yet offer summarization functionality. In this research, we extend UTGen with LLM-generated summaries. We investigate to what extent LLM-generated test summaries influence the understandability of a test case in terms of context, conciseness, and naturalness. To this end, we conducted a user evaluation with 11 participants with a software testing background, who judged LLM-generated summaries and compared them to summaries from existing summarization tools. The LLM-generated summaries scored higher overall than the template-based summaries and were also viewed more favorably by the participants.
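To illustrate what a test summary looks like, the sketch below shows a small JUnit test annotated with two hypothetical summaries: a verbose template-based one and a shorter, more natural LLM-style one. The BankAccount class, the test, and both summary texts are illustrative assumptions, not output of UTGen or EvoSuite.

```java
// Illustrative sketch only: the class under test, the test, and both summary
// comments are hypothetical and are not produced by UTGen or EvoSuite.
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class BankAccountTest {

    // Minimal class under test, included so the example is self-contained.
    static class BankAccount {
        private int balance;
        void deposit(int amount) { balance += amount; }
        int getBalance() { return balance; }
    }

    /**
     * Template-based summary (style of existing template tools, assumed wording):
     * "Test method deposit(int) of class BankAccount. Calls deposit with the
     *  value 50. Asserts that getBalance() returns 50."
     *
     * LLM-generated summary (hypothetical, shorter and more natural):
     * "Verifies that depositing money into an empty account updates the balance."
     */
    @Test
    public void depositIncreasesBalance() {
        BankAccount account = new BankAccount();
        account.deposit(50);
        assertEquals(50, account.getBalance());
    }
}
```

The contrast between the two comments reflects the evaluation criteria named above: the template summary restates the test line by line, while the LLM-style summary aims for conciseness and naturalness.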

Files