Improving embodied LLM agents' capabilities through collaboration

Abstract

The emergence of agents based on Large Language Models (LLMs) represents a significant advancement in artificial intelligence (AI), offering new possibilities for complex problem-solving and interaction within virtual environments. Our work builds on Voyager [1], a state-of-the-art LLM-based agent for Minecraft. However, Voyager has significant limitations, notably its reliance on closed-source LLMs and its lack of social awareness. Current open-source LLMs often fail to match closed-source ones in the agent setting, leaving research reliant on third-party closed-source technology [1] [2]. This gap highlights the need for alternative strategies that enhance LLM performance without the high cost of fine-tuning. To address these challenges, we propose Collaborative Voyager, a new architecture designed to enable agent collaboration and social awareness using open-source LLMs. Inspired by the social intelligence hypothesis, which suggests that intelligence emerges from social interactions, we propose collaboration as an alternative learning paradigm for LLMs. This paradigm could supplement the expensive fine-tuning currently needed to bridge the performance gap between open-source and closed-source models in the agent setting [2]. Our approach is a framework that allows agents to communicate with, understand, and learn from each other, enabling them to correct errors and adapt to new tasks dynamically. Using a memory module, an agent remembers past interactions and learns from them, so that it can accomplish tasks it was previously unable to complete on its own. Through various experiments, we demonstrate that collaboration significantly enhances the performance of LLM agents in both task completion and adaptability, and helps address issues such as hallucinations.
This study provides insights into developing more sophisticated, adaptable AI systems capable of dynamic interactions and problem-solving. These findings have potential applications extending beyond Minecraft and virtual environments to fields such as robotics, where collaboration and social awareness are crucial.

Files

Baptiste_Thesis.pdf
(pdf | 9.39 MB)
Unknown license