Reducing LLM Hallucinations with Retrieval Prompt Engineering
Minimising the Need for Re-prompting in Automatic Understandable Test Generation
More Info
expand_more
Abstract
Automated test generation is the means to produce correct and usable code while maintaining an efficient and effective development process. UTGen is a tool that utilizes a Large Language Model (LLM) to improve the understandability of a test suite generated by a Search-Based Software Testing tool, namely EvoSuite. Often while the LLM attempts to improve a given test case, it generates code that is too far from the original, changing the test's purpose. Alternatively, it may generate code that does not compile. Such behaviour is called ``LLM Hallucination".
The current hallucination handling of UTGen is time-consuming and resource-expensive. To address this, we propose two alternative approaches that use information retrieval prompt engineering techniques to minimise hallucinations. Our respective techniques include incorporating the source code under test and the errors thrown by the latest generated test case to the LLM prompt. We assess our methods through a comparison study against the base UTGen version. We observe that source code retrieval enhances the generation of compilable test cases for complex classes. Error code retrieval shows similar hallucination performance to base UTGen, with a decrease in the number of re-prompts for classes with a high normalised Lack of Cohesion of Methods (*LCOM).
Index Terms - Automated Test Generation, Large Language Models (LLMs), LLM Hallucination, Prompt Engineering