AI Alignment Dialogues

An Interactive Approach to AI Alignment in Support Agents

Abstract

This project proposes a different way of looking at AI alignment by introducing AI Alignment Dialogues. We argue that alignment dialogues have several advantages over data-driven approaches, especially for behaviour support agents, which aim to support users in achieving their desired future behaviours rather than their current ones. These advantages include allowing users to convey higher-level concepts directly to the agent and making the agent more transparent and trusted.

Files

3514094.3539531.pdf
(PDF | 0.805 MB)
Unknown license