User Evaluation of InCoder Based on Statement Completion


Abstract

Many models have been proposed to automatically complete code, with promising results when evaluated in isolation on test sets. This research evaluates how such models perform when used by developers during actual programming: are these models still useful in practice, and do developers even want this functionality? The model evaluated in this study is the InCoder model by Facebook, specifically its ability to complete code statements in the Python programming language. To evaluate this, a plugin called Code4Me was built for PyCharm and Visual Studio Code that shows code completion suggestions from the model when a keybind is pressed or a trigger point is encountered. Whenever a suggestion is shown, the plugin sends, after a delay, the line of code the developer actually wrote; this can be the suggestion itself if the user accepted it as correct. Once users have used the model sufficiently, they are asked to fill in a survey to gather their opinions on the functionality the model provides. The results show a 21.95% exact match, a 52.73% edit similarity, and a BLEU-4 score of 36.05 for the statement completion functionality of InCoder. All users who filled in the survey preferred the automatic suggestions on trigger points, although some indicated the keybind functionality was also useful. When a suggestion was good, users would accept it instead of typing the code themselves. Users indicated that the suggestions were somewhat or much better than the default suggestions and that using the plugin saved time while programming. Overall, all users were positive about the performance and thought the statement completion functionality provided useful suggestions.
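To make the reported numbers concrete, the two simpler metrics mentioned above can be sketched as follows. This is an illustrative implementation, not the code used in the study: exact match checks whether the predicted statement equals the ground-truth line, and edit similarity is commonly defined as one minus the normalized Levenshtein distance, expressed as a percentage.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance via classic dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,        # deletion
                            curr[j - 1] + 1,    # insertion
                            prev[j - 1] + cost  # substitution
                            ))
        prev = curr
    return prev[-1]

def exact_match(pred: str, truth: str) -> bool:
    """True if prediction and ground truth are identical (ignoring edge whitespace)."""
    return pred.strip() == truth.strip()

def edit_similarity(pred: str, truth: str) -> float:
    """1 - normalized edit distance, as a percentage in [0, 100]."""
    if not pred and not truth:
        return 100.0
    return 100.0 * (1 - levenshtein(pred, truth) / max(len(pred), len(truth)))
```

BLEU-4 additionally compares n-gram overlap (up to 4-grams) between the predicted and reference statements with a brevity penalty; libraries such as NLTK provide implementations.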