A User Evaluation of UniXcoder Using Statement Completion in a Real-World Setting


Abstract

State-of-the-art machine-learning-based models provide automatic, intelligent code completion based on large pre-trained language models, with reported accuracies of up to 70% in offline evaluation. However, research on the practicality of these models is limited. This paper evaluates the usefulness of UniXcoder, a machine-learning-based cross-modal code completion model, in a real-world setting through a user evaluation. Such models incorporate the context surrounding the requested completion and return a code prediction based on that context. To accomplish this, two plugins were developed under the name 'Code4Me': one for Visual Studio Code and one for PyCharm. These plugins communicate with a remote API that requires a segment of 3966 characters of left and right context around the trigger point. The data collected consist of the inserted code completion, the verification of the code completion, the IDE used, the trigger point, and the inference time. The data are evaluated using the following metrics: BLEU-4, ROUGE-L, Exact Match, Edit Similarity, and METEOR. The results show that developers accept roughly one in every eight suggestions, with an Exact Match score of 62.5%, and that the user evaluations, albeit based on a limited number of responses, are favourable towards both the model and Code4Me. The accuracy of UniXcoder is lower in a real-world setting than in offline evaluation on existing source code; nevertheless, its usefulness as an auto-completion model is apparent.
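
As an illustration of the pipeline described above, the sketch below shows how a plugin might truncate the context around a trigger point to the 3966-character windows mentioned in the abstract, and how the Exact Match and Edit Similarity metrics are commonly computed for code completion. This is a minimal sketch, not code from Code4Me: the function names are illustrative assumptions, and Edit Similarity is assumed to be the normalized Levenshtein similarity.

```python
# Illustrative sketch only; names and details are assumptions, not the
# Code4Me implementation.

def extract_context(document: str, cursor: int, max_chars: int = 3966):
    """Take up to `max_chars` characters of left and right context
    around the trigger point, as the remote API is described to require."""
    left = document[max(0, cursor - max_chars):cursor]
    right = document[cursor:cursor + max_chars]
    return left, right


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution
            ))
        prev = curr
    return prev[-1]


def exact_match(prediction: str, ground_truth: str) -> bool:
    """True when the completion matches the accepted code verbatim."""
    return prediction.strip() == ground_truth.strip()


def edit_similarity(prediction: str, ground_truth: str) -> float:
    """1 - normalized Levenshtein distance; 1.0 means identical strings."""
    if not prediction and not ground_truth:
        return 1.0
    dist = levenshtein(prediction, ground_truth)
    return 1.0 - dist / max(len(prediction), len(ground_truth))


if __name__ == "__main__":
    pred = "return sorted(items, key=len)"
    truth = "return sorted(items, key=len)"
    print(exact_match(pred, truth))                        # True
    print(edit_similarity(pred, "return sorted(items)"))   # below 1.0
```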