Improvements in Imitation Learning for Overcooked
Abstract
Arguably, the main goal of artificial intelligence is to create agents that can collaborate with humans to achieve a shared goal. It has been shown that agents that assume their partner to be optimal can converge to coordination protocols that humans do not understand. Taking human suboptimality into account is therefore imperative for performing well in a coordination task. One way to achieve this is imitation learning, where an agent is trained on recorded data of humans playing the game. I created several agents using different implementations of behavioral cloning: the dataset is reduced to state-action pairs, and a neural network is trained to predict the human's action in each state. To evaluate their performance, I used a coordination-focused environment based on the popular game Overcooked. Neither expanding nor reducing the feature space the agents were trained on yielded a significant improvement in performance. In fact, expanding the feature space to include historical data made the agents generalize worse; in particular, they failed when paired with partners using unfamiliar strategies. These limitations stem mostly from the available dataset, which was too small to support additional features and contained gameplay of too low quality to produce agents that perform exceptionally well.
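To make the behavioral-cloning setup concrete, below is a minimal sketch in PyTorch. The state dimensionality, network architecture, and training details are illustrative assumptions rather than the exact setup used in this work; the six discrete actions correspond to Overcooked's four movement directions plus stay and interact.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: a featurized Overcooked state vector and the
# six discrete actions (up, down, left, right, stay, interact).
STATE_DIM, NUM_ACTIONS = 96, 6

# Simple MLP policy mapping a state vector to action logits.
policy = nn.Sequential(
    nn.Linear(STATE_DIM, 64),
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, NUM_ACTIONS),
)

def behavioral_cloning(states, actions, epochs=50, lr=1e-3):
    """Fit the policy to recorded (state, action) pairs via
    cross-entropy, i.e. supervised classification of the human's
    chosen action in each state."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        logits = policy(states)          # states: float tensor (N, STATE_DIM)
        loss = loss_fn(logits, actions)  # actions: int labels, shape (N,)
        loss.backward()
        opt.step()
    return policy
```

Under this framing, the behavioral-cloning variants compared in this work would differ mainly in how the state vector is featurized (e.g., with a reduced feature set or with added historical data), not in the training loop itself.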