AI-assisted development tools use Machine Learning models to help developers perform tasks such as Method Name Generation, Code Captioning, and Smart Bug Finding, among others. A common practice among data scientists training these models is to omit inline code comments from the training data. We hypothesize that retaining inline comments in the training code provides additional information to the model and improves its performance on natural-language-related tasks, specifically Code Captioning. We modify one such model, code2seq, to retain inline comments during data processing, then train it and compare it against a commentless baseline. We find that including inline comments tends to improve the model's performance, making it faster and its output more verbose, and we reflect on these results to suggest directions for improving this body of research.
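The preprocessing change described above can be illustrated with a minimal sketch: a conventional pipeline strips inline comments before tokenization, while our variant also extracts the natural-language tokens those comments carry. The function names and the `//`-comment regex below are illustrative assumptions, not code2seq's actual pipeline.

```python
import re

def strip_inline_comments(source: str) -> str:
    """Remove //-style inline comments, the common preprocessing default."""
    return re.sub(r"//[^\n]*", "", source)

def comment_tokens(source: str) -> list:
    """Collect the natural-language words carried by inline comments,
    so they can be fed to the model alongside the code tokens."""
    words = []
    for comment in re.findall(r"//([^\n]*)", source):
        words.extend(re.findall(r"[A-Za-z]+", comment.lower()))
    return words

snippet = (
    "int max(int a, int b) { // return the larger argument\n"
    "  return a > b ? a : b;\n"
    "}"
)
print(strip_inline_comments(snippet))   # code with the comment removed
print(comment_tokens(snippet))          # ['return', 'the', 'larger', 'argument']
```

The intuition is that words like "return the larger argument" describe intent in the same vocabulary a caption uses, so discarding them throws away exactly the signal a Code Captioning model needs.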