Improving embodied LLM agents' capabilities through collaboration

Collé, B.J.

Master thesis (2024)

Authors

B.J. Collé Electrical Engineering, Mathematics and Computer Science

Contributors

Chirag Raman Pattern Recognition and Bioinformatics (mentor)

Marcel JT Reinders Pattern Recognition and Bioinformatics (graduation committee member)

Catherine Oertel Interactive Intelligence (graduation committee member)

Faculty

Electrical Engineering, Mathematics and Computer Science

Collaboration LLM Agent

To reference this document use:

http://resolver.tudelft.nl/uuid:53bdeeef-5ef1-40ad-88a7-edcc7526ac3a

Published Date

07-06-2024

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The emergence of Large Language Model (LLM)-based agents represents a significant advancement in artificial intelligence (AI), offering new possibilities for complex problem-solving and interaction within virtual environments. Our work builds on Voyager [1], a state-of-the-art LLM-based agent for Minecraft. However, that system suffers from significant limitations, such as its reliance on closed-source LLMs and its lack of social awareness. Indeed, current open-source LLMs often fail to match closed-source ones in the agent setting, leaving research reliant on third-party closed-source technology [1] [2]. This gap highlights the need for alternative strategies that enhance LLM performance without the high costs associated with fine-tuning. To address these challenges, we propose the Collaborative Voyager, a new architecture designed to enable agent collaboration and social awareness using open-source LLMs. Inspired by the social intelligence hypothesis, which suggests that intelligence emerges from social interactions, we propose collaboration as an alternative learning paradigm for LLMs. This paradigm could potentially supplement the expensive fine-tuning currently needed to bridge the performance gap between open-source and closed-source models in the agent setting [2]. Our approach involves developing a framework that allows agents to communicate with, understand, and learn from each other, enabling them to correct errors and adapt to new tasks dynamically. Using a memory module, our agent remembers interactions and learns from them, which allows it to accomplish a task that it was previously unable to complete on its own. Through various experiments, we demonstrate that collaboration significantly enhances the performance of LLM agents in both task completion and adaptability, and helps address issues such as hallucinations.
This study provides insights into developing more sophisticated, adaptable AI systems capable of dynamic interactions and problem-solving. These findings have potential applications extending beyond Minecraft and virtual environments to fields such as robotics, where collaboration and social awareness are crucial.

Files

Baptiste_Thesis.pdf

(pdf | 9.39 MB)

Unknown license