AI Alignment Dialogues
An Interactive Approach to AI Alignment in Support Agents
More Info
expand_more
expand_more
Abstract
This project proposes a different way of looking at AI alignment, namely by introducing AI Alignment Dialogues. We argue that alignment dialogues have a number of advantages in comparison to data-driven approaches, especially for behaviour support agents, which aim to support users in achieving their desired future behaviours rather than their current behaviours. The advantages of alignment dialogues include allowing the users to directly convey higher-level concepts to the agent and making the agent more transparent and trusted.