Inverse Reinforcement Learning (IRL) in Presence of Risk and Uncertainty Related Cognitive Biases

To what extent can IRL learn rewards from expert demonstrations that exhibit loss aversion and risk aversion?


Abstract

A key issue in Reinforcement Learning (RL) research is the difficulty of defining rewards. Inverse Reinforcement Learning (IRL) addresses this challenge by learning rewards from expert demonstrations. In realistic settings, expert demonstrations are collected from humans, and it is important to acknowledge that these demonstrations can deviate from rationality due to systematic deviations known as cognitive biases. One group of cognitive biases, risk-sensitive cognitive biases, concerns individuals' attitudes and behaviors towards risk and uncertainty. This paper investigates the extent to which IRL can learn from demonstrations that contain risk-sensitive cognitive biases such as loss aversion and risk aversion. Modelling these biases using concepts from Prospect Theory and the dual-process (System 1 and System 2) model, and applying the Maximum Entropy IRL algorithm, this paper concludes that IRL can recreate solutions similar to the experts', but that inferring the underlying motivations and the interactions between them is an intricate problem requiring novel approaches.
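For concreteness, the loss- and risk-averse distortion that Prospect Theory describes can be sketched as follows. This is an illustrative sketch, not the paper's implementation: the functional form is the standard Kahneman-Tversky value function, and the parameter values (alpha = beta = 0.88, lam = 2.25) are the classic Tversky and Kahneman (1992) estimates rather than values taken from this work.

```python
import numpy as np

def prospect_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect Theory value function (Kahneman & Tversky).

    alpha, beta < 1 make the curve concave for gains and convex for
    losses (risk aversion over gains, risk seeking over losses);
    lam > 1 scales losses more steeply than gains (loss aversion).
    Defaults are the classic Tversky & Kahneman (1992) estimates.
    """
    x = np.asarray(x, dtype=float)
    # Compute both branches on clipped inputs to avoid NaN from raising
    # negative numbers to fractional powers.
    gains = np.clip(x, 0.0, None) ** alpha
    losses = -lam * np.clip(-x, 0.0, None) ** beta
    return np.where(x >= 0.0, gains, losses)

# Example: a biased "expert" perceives a symmetric 50/50 gamble as a net
# loss, so demonstrations over-avoid risky options with neutral expected value.
outcomes = np.array([+10.0, -10.0])
print(prospect_value(outcomes))         # ~[ 7.59, -17.07]
print(prospect_value(outcomes).mean())  # negative => gamble is avoided
```

In a setup like the paper describes, such distorted values would stand in for the true rewards experienced by the demonstrator, and an IRL algorithm such as Maximum Entropy IRL would then attempt to recover a reward function from the resulting biased trajectories.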