Meet Your Onboarding Buddy

A Smart, Adaptive, and Conversational LLM Assistant to Smooth Your Software Onboarding Journey

Abstract

Effective onboarding in software engineering is critical yet challenging due to the rapid evolution of technologies, languages, frameworks, and tools. Traditional onboarding methods based on exploration, documentation, and workshops tend to be expensive and time-consuming, and they can become outdated quickly in large, complex projects.

In this thesis, we introduce a novel solution: the Onboarding Buddy system, which uses large language models (LLMs) and retrieval-augmented generation (RAG), enhanced by an automated chain-of-thought (CoT) approach, to improve onboarding for new and existing developers. Directly within the development environment, it combines natural-language explanations with relevant information, code explanations, and project-specific guidance. The system architecture is agent-centric, including contextualization, onboarding agents, instruction step processors, and message enhancement agents that cooperate to deliver comprehensive, customized support with minimal reliance on human mentors.

Although the system proved effective in supporting task completion and reducing onboarding-related stress, user feedback also revealed areas for improvement, including better context awareness, more explicit instructions, improved technical stability, and UX adjustments. Overall, Onboarding Buddy shows strong promise for smoothing the onboarding process and thereby increasing developer productivity and job satisfaction.

The experimental results demonstrated the system's effectiveness: participants spent an average of 175 minutes actively engaged in the IDE, completed tasks with nearly 100% accuracy, and gave high helpfulness ratings (3 out of 4). Task completion times averaged 50 minutes, with simpler tasks taking around 28 minutes and complex ones requiring 67 minutes. User feedback showed high satisfaction (3.35/4 for understanding, 3.15/4 for accuracy) and strong interest in such solutions (7.75/10). Interestingly, more experienced developers spent more time on tasks, suggesting a deeper exploration of the codebase. A strong positive correlation (0.70) between system usage frequency and perceived helpfulness indicated that increased engagement led to better outcomes.

In summary, although areas for improvement remain, such as context awareness and the handling of complex tasks, this research shows that LLM-based onboarding solutions are feasible and can have a significant positive impact on the software engineering onboarding process. It thereby lays a foundation for future progress in automated developer support and knowledge sharing in software development.
