Improvements in Imitation Learning for Overcooked
Abstract
Arguably, the main goal of artificial intelligence is to create agents that can collaborate with humans to achieve a shared goal. It has been shown that agents that assume their partner to be optimal can converge to coordination protocols that humans do not understand. Taking human suboptimality into account is therefore imperative for performing well in a coordination task. One way to achieve this is imitation learning, where an agent is trained on recorded data of humans playing the game. I created several agents using different implementations of behavioral cloning: the dataset is reduced to state-action pairs, and a neural network is trained to predict the human's action in each state. To evaluate their performance, I used a coordination-focused environment based on the popular game Overcooked. Neither expanding nor reducing the feature space the agents were trained on yielded a significant improvement in performance. In fact, expanding the feature space to include historical data made the agents generalize worse; in particular, they failed when paired with partners using unfamiliar strategies. These limitations stem mostly from the available dataset, which was too small to support additional features and contained gameplay of too low quality to produce agents that perform exceptionally well.
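To make the behavioral-cloning setup concrete, below is a minimal sketch in PyTorch. The state dimensionality, network architecture, and training details are illustrative assumptions rather than the exact setup used in this work; the six discrete actions correspond to Overcooked's four movement directions plus stay and interact.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: a featurized Overcooked state vector and the
# six discrete actions (up, down, left, right, stay, interact).
STATE_DIM, NUM_ACTIONS = 96, 6

# Simple MLP policy mapping a state vector to action logits.
policy = nn.Sequential(
    nn.Linear(STATE_DIM, 64),
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, NUM_ACTIONS),
)

def behavioral_cloning(states, actions, epochs=50, lr=1e-3):
    """Fit the policy to recorded (state, action) pairs via
    cross-entropy, i.e. supervised classification of the human's
    chosen action in each state."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        logits = policy(states)          # states: float tensor (N, STATE_DIM)
        loss = loss_fn(logits, actions)  # actions: int labels, shape (N,)
        loss.backward()
        opt.step()
    return policy
```

Under this framing, the behavioral-cloning variants compared in this work would differ mainly in how the state vector is featurized (e.g., with a reduced feature set or with added historical data), not in the training loop itself.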